# Calculating similarity between users using MATLAB

1. Nov 19, 2012

### Minaxi

Hi all,
I have a database of 963 users.
The records of two users are:

uid gender occupation age
1 F student 23
2 M teacher 30

Now I need to calculate the similarity of each user with every other user as

sim(ui,uj)=0.8*sim(age) + 0.1*sim(gender) + 0.1*sim(occupation)

where sim(age)=1-[abs(Ai-Aj)/(agemax-agemin)];

agemax is the maximum age and agemin is the minimum age over all users.

sim(gender)=1 if Gi=Gj, else 0
sim(occupation)=1 if the occupations are the same, else 0.

Kindly tell me the code for it so that I can get a matrix of similarities (963*963).

2. Nov 22, 2012

### mikeph

The data import depends on the form of your database, but assuming you can do this and obtain a 4-by-963 cell array, you can compute the similarities using a pair of for loops, e.g.

Code (Text):
sim = zeros(963);                 % preallocate the 963-by-963 result
for i = 1:962
    for j = i+1:963               % only pairs with j > i
        sim(i,j) = [your specific sim function]
    end
end
This should give you a strictly upper triangular matrix holding the result of every possible pairing. If your similarity function is symmetric, sim + sim' fills in the lower half.
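
To make the loop concrete, here is a rough sketch with the similarity from the first post filled in. The four sample users, the variable names (age, gender, occup, S) and the use of abs() on the age difference are my assumptions, not something given in the thread. The data is held in separate arrays here rather than one 4-by-963 cell array, just to keep it short.

```matlab
% Hypothetical sample data (not from the thread): one entry per user.
age    = [23; 30; 23; 45];
gender = {'F'; 'M'; 'F'; 'M'};
occup  = {'student'; 'teacher'; 'student'; 'engineer'};

n = numel(age);
ageRange = max(age) - min(age);        % agemax - agemin

S = zeros(n);
for i = 1:n-1
    for j = i+1:n                      % each unordered pair once
        simAge    = 1 - abs(age(i) - age(j)) / ageRange;  % abs() is an assumption
        simGender = strcmp(gender{i}, gender{j});         % 1 if same, else 0
        simOccup  = strcmp(occup{i}, occup{j});           % 1 if same, else 0
        S(i,j) = 0.8*simAge + 0.1*simGender + 0.1*simOccup;
    end
end

S = S + S' + eye(n);   % mirror the upper triangle; a user compared with itself gives 1
```

With the real database, n would be 963 and the three arrays would come from whatever import you use.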

3. Dec 5, 2012

### samh

Hi, if I were you I'd encode all attributes as integers, so that all records can be stored as rows in a matrix of type int. I'll assume that you've done this, and your database is a matrix "db" of size [numusers 4]. I'll let you work out the details of that. First, let me give names to the columns:

Code (Text):
g = double(db(:,2)); % the genders, assuming male=1 female=2
o = double(db(:,3)); % the occupations, assuming integers
a = double(db(:,4)); % the ages (double, so the division below is not rounded)
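
In case the encoding step is the sticking point, one possible way to get from text labels to an integer-coded db is the third output of unique. This is only a sketch: the three sample records and the names uid, gender, occup, genderLabels and occupLabels are made up for illustration.

```matlab
% Hypothetical raw records (not from the thread), one entry per user.
uid    = [1; 2; 3];
gender = {'F'; 'M'; 'F'};
occup  = {'student'; 'teacher'; 'student'};
age    = [23; 30; 41];

% unique returns the sorted distinct labels and, as its third output,
% the position of each record's label in that sorted list.
[genderLabels, ~, g] = unique(gender);   % here 'F' -> 1, 'M' -> 2
[occupLabels,  ~, o] = unique(occup);    % here 'student' -> 1, 'teacher' -> 2

db = [uid, g, o, age];   % numeric (double) matrix of size [numusers 4]
```

Note that the codes follow alphabetical order, so here female=1 and male=2. That does not matter as long as the two genders are coded 1 and 2 and the later steps only test whether two codes are equal.
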
Now you can compute the pairwise similarity matrix, S, pretty quickly:

Code (Text):
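agemax = max(a); agemin = min(a);
G = g*g' ~= 1*2;                      % product is 2 only for a male/female pair
O = bsxfun(@minus,o,o') == 0;         % 1 where occupations match
A = 1-abs(bsxfun(@minus,a,a'))/(agemax-agemin); % abs so the age term is symmetric
S = 0.8*A + 0.1*G + 0.1*O;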
This'll give a nice speedup over for-loops when the number of users gets very large.
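
If you want to convince yourself that the vectorized form matches the pairwise definition, a small check along these lines may help. The four encoded records are hypothetical, the age term uses the absolute difference (my assumption), and the gender match is computed with a direct equality test instead of the product trick.

```matlab
% Hypothetical encoded records (not from the thread): uid, gender, occupation, age.
db = [1 2 1 23;
      2 1 2 30;
      3 2 1 23;
      4 1 3 45];

g = db(:,2);  o = db(:,3);  a = db(:,4);
agemax = max(a);  agemin = min(a);

% Vectorized similarity matrix.
G = bsxfun(@eq, g, g');                                   % 1 where genders match
O = bsxfun(@eq, o, o');                                   % 1 where occupations match
A = 1 - abs(bsxfun(@minus, a, a')) / (agemax - agemin);   % age similarity
S = 0.8*A + 0.1*G + 0.1*O;

% Reference: evaluate the definition pair by pair.
n = size(db, 1);
R = zeros(n);
for i = 1:n
    for j = 1:n
        R(i,j) = 0.8*(1 - abs(a(i) - a(j))/(agemax - agemin)) ...
               + 0.1*(g(i) == g(j)) + 0.1*(o(i) == o(j));
    end
end

maxDiff = max(abs(S(:) - R(:)));   % should be zero up to rounding
```

The same comparison on a few hundred rows of the real data is a cheap sanity check before trusting the full 963-by-963 result.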