
Calculating similarity between users using MATLAB

  1. Nov 19, 2012 #1
    Hi all,
    I have a database of 963 users.
    Here are the records of two users:

    uid gender occupation age
    1 F student 23
    2 M teacher 30

    Now I need to calculate the similarity of each user with every other user as

    sim(ui,uj)=0.8*sim(age) + 0.1*sim(gender) + 0.1*sim(occupation)

    where sim(age)=1-[|Ai-Aj|/(agemax-agemin)];

    agemax is the maximum age and agemin is the minimum age among all users.

    sim(gender)=1 if Gi=Gj, else 0
    sim(occupation)=1 if the occupations are the same, else 0

    Could you tell me the code for it, so that I can get a 963-by-963 matrix of similarities?
  2. jcsd
  3. Nov 22, 2012 #2
    The data import depends on the form of your database, but assuming you can do this and obtain a 4-by-963 cell array, you can compute the similarities using a pair of for loops, e.g.

    Code (Text):
    sim = zeros(963);
    for i = 1:962
        for j = i+1:963
            sim(i,j) = pairSim(i, j);  % pairSim is a placeholder for your specific sim function
        end
    end
    This should give you a strictly upper triangular matrix holding the result of every possible pairing.
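    As a rough sketch of what the loop body could look like, assuming the attributes have already been read into three vectors: the names age, gender and occ and the three sample users below are made up for illustration, and the absolute age difference is used so that the result is symmetric. The last line mirrors the upper triangle to get the full matrix, with each user's similarity to themselves set to 1.

    ```matlab
    % Hypothetical sample data, one entry per user (not from the real database)
    age    = [23; 30; 45];
    gender = {'F'; 'M'; 'F'};
    occ    = {'student'; 'teacher'; 'student'};

    n = numel(age);
    ageRange = max(age) - min(age);   % agemax - agemin

    sim = zeros(n);
    for i = 1:n-1
        for j = i+1:n
            simAge = 1 - abs(age(i) - age(j)) / ageRange;
            simGen = strcmp(gender{i}, gender{j});   % 1 if same gender, else 0
            simOcc = strcmp(occ{i}, occ{j});         % 1 if same occupation, else 0
            sim(i,j) = 0.8*simAge + 0.1*simGen + 0.1*simOcc;
        end
    end

    % Mirror the upper triangle and set the self-similarity on the diagonal to 1
    simFull = sim + sim' + eye(n);
    ```

    For the real data, n would be 963 and the three vectors would come from the imported database.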
  4. Dec 5, 2012 #3
    Hi, if I were you I'd encode all attributes as integers, so that all records can be stored as rows in a matrix of type int. I'll assume that you've done this, and your database is a matrix "db" of size [numusers 4]. I'll let you work out the details of that. First, let me give names to the columns:

    Code (Text):
    g = double(db(:,2)); % the genders, assuming male=1 female=2
    o = double(db(:,3)); % the occupations, assuming integers
    a = double(db(:,4)); % the ages (converted to double so later divisions are not rounded to integers)
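    For the encoding step itself, one possible approach (a sketch, not the only way) is to use the third output of unique, which maps each string to the index of its category. The raw vectors and their values below are invented for illustration. Note that unique numbers the categories in sorted order, so here 'F' becomes 1 and 'M' becomes 2; the particular numbering should not matter as long as equal values get equal codes.

    ```matlab
    % Hypothetical raw records (uid, gender, occupation, age)
    uid    = [1; 2; 3];
    gender = {'F'; 'M'; 'F'};
    occ    = {'student'; 'teacher'; 'student'};
    age    = [23; 30; 45];

    % Third output of unique: index of each entry in the sorted list
    % of distinct values, i.e. an integer code per category
    [genderNames, ~, gCode] = unique(gender);   % 'F' -> 1, 'M' -> 2
    [occNames,    ~, oCode] = unique(occ);      % 'student' -> 1, 'teacher' -> 2

    % One row per user; stored as double so later arithmetic is not rounded
    db = [uid, gCode(:), oCode(:), age];
    ```

    genderNames and occNames keep the original strings, so a code can be translated back with, for example, occNames{db(2,3)}.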
    Now you can compute the pairwise similarity matrix, S, pretty quickly:

    Code (Text):
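    a = double(a);                             % avoid integer rounding in the division
    agemax = max(a); agemin = min(a);          % age range over all users
    G = bsxfun(@eq, g, g');                    % 1 where the genders match
    O = bsxfun(@eq, o, o');                    % 1 where the occupations match
    A = 1 - abs(bsxfun(@minus, a, a')) / (agemax - agemin);  % abs keeps A symmetric and in [0,1]
    S = 0.8*A + 0.1*G + 0.1*O;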
    This'll give a nice speedup over for-loops when the number of users gets very large.
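    To sanity-check the vectorized approach on a tiny case, here is a sketch with three made-up users. The values in db are invented for illustration, and the code uses the absolute age difference and @eq comparisons so that S comes out symmetric regardless of which integers encode the attributes.

    ```matlab
    % Hypothetical encoded database: columns are uid, gender, occupation, age
    db = [1 1 1 23;
          2 2 2 30;
          3 1 1 45];

    g = db(:,2);  o = db(:,3);  a = db(:,4);
    ageRange = max(a) - min(a);                 % agemax - agemin, 22 for this data

    G = bsxfun(@eq, g, g');                     % same gender?
    O = bsxfun(@eq, o, o');                     % same occupation?
    A = 1 - abs(bsxfun(@minus, a, a')) / ageRange;
    S = 0.8*A + 0.1*G + 0.1*O;

    disp(S)
    ```

    Users 1 and 3 share gender and occupation but sit at opposite ends of the age range, so S(1,3) is 0.8*0 + 0.1 + 0.1 = 0.2, while every diagonal entry is 1.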