I am not sure this is the right forum for this -- I have a question about a particular paper:(adsbygoogle = window.adsbygoogle || []).push({});

http://www-users.cs.umn.edu/~sboriah/PDFs/ChandolaCBK2009.pdf

The authors describe 4 heuristics that can be derived from categorical data -- this is in order to map categorical data to numerical. These heuristics are d_m, f_m, n_x, f_x. They also provide two examples y and z and the values of the quantities above computed with respect to dataset in table 3. I am able to lock into their values exactly for d_m and f_m but I cannot reproduce n_x and f_x.

Could someone read this paper and try to derive these values? I basically take it their equation (3.3) shows summation of reciprocals of arity for A_x set (i.e. the set of mismatching attributes) -- I can't reproduce -5.45 and -7.90.

Please note I already contacted the authors -- one responded that Dr. Boriah is the person responsible for these calculations but he is apparently not reachable.

**Physics Forums | Science Articles, Homework Help, Discussion**

The friendliest, high quality science and math community on the planet! Everyone who loves science is here!

# A Question about a particular paper on categorical data

Have something to add?

Draft saved
Draft deleted

**Physics Forums | Science Articles, Homework Help, Discussion**