Matlab, logarithms and rounding small numbers to zero

  1. Dec 20, 2011 #1
    I'm a novice at MATLAB, so I apologize if this is a dumb question. I need to find the sum of A, B and C given X, Y and Z, where ln(A)=X, ln(B)=Y, etc. However, the values A, B and C are so small that when I try to use 'exp' the result is rounded to zero. This is a problem because later in my code I divide the sum A+B+C by a number, take the log of the result and multiply it by another number as part of a log-likelihood equation. Since you cannot take the log of 0, this rounding breaks my code. This part of the code is computing a binomial coefficient. I can do that with nchoosek and vpi, but it slows the code down to the point that it is unusable (I'm plugging the code into fminsearch to optimize parameters, and the search takes days or cuts itself off after a time). I've tried adding logs of numbers and subtracting the same amount later so that exp doesn't get tripped up, but I'm now faced with a situation where that doesn't work.

    g=exp(v); % this is where the numbers get rounded to zero
    gg=log(D); % log(0) evaluates to -Inf, which breaks the rest of the code

    Any advice would be appreciated.
  3. Dec 20, 2011 #2

    Simon Bridge

    Science Advisor
    Homework Helper

    So v must be a very large negative number for g to be that small?

    Either you have made a mistake, you need more precision, or you should change your representation - say: scale the units or pick a positive axis in the other direction?
  4. Dec 20, 2011 #3

    Simon Bridge

    Science Advisor
    Homework Helper

    I'm not sure your method matches your description.
    Put the numbers you want to handle into a vector.

    you want the sum of the elements of F=[A,B,C]
    which would be sum(F)

    but you have L=[ln(A),ln(B),ln(C)]=[X,Y,Z]

    so L=ln[F]

    so F=exp[L]

    so you need to do:
    L=[X' Y' Z']; % if X, Y, Z are row vectors already
    S=sum(exp(L')); % exponentiate each element first, then add: A+B+C

    what you did was:

    v=X+Y+Z; % which is lnA+lnB+lnC
    g=exp(v); % which is exp(lnA+lnB+lnC)=exp(lnA)exp(lnB)exp(lnC)=ABC
    D=sum(g); % which is still ABC
    gg=log(D); % which is ln(ABC)

    (if X, Y, Z are scalars)

    but you say you wanted A+B+C at the end.
    Last edited: Dec 20, 2011
  5. Dec 21, 2011 #4
    X, Y and Z are not scalars; they are vectors of length N, and the elements of each vector are logs. V=X-(Y+Z). I'm using logs because the numbers I am dealing with are of a size that would otherwise get rounded and eventually cause my code to produce NaNs. So I am left with a vector V whose elements are ln(A), ln(B), etc. What I need is S=sum(A,B...), i.e. the sum of the exponentials of the elements of V. I cannot compute this with 'exp' because the numbers are so small that they get rounded to 0. Sorry for not including enough information before, and thank you very much for taking the time to respond to my first post.
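
    One way to make the "add an amount, subtract it later" idea from the first post robust is to use the largest element of V as the shift, so that the biggest term becomes exp(0) = 1 and the sum can never round to zero. The following is only a sketch with made-up values for V; the names naive, m and logS are illustrative and not from the original code.

    ```matlab
    % Hypothetical log-values: exp() of each one underflows to 0 in double precision.
    V = [-1000, -1001, -1002];

    % Direct approach: exp(V) is all zeros, so the log of the sum is -Inf.
    naive = log(sum(exp(V)));

    % Shifted approach: subtract the largest element before exponentiating,
    % then add it back outside the log.
    m = max(V);
    logS = m + log(sum(exp(V - m)));   % log(A + B + ...), about -999.59

    disp(naive);
    disp(logS);
    ```

    Since the later steps only need the log of the sum divided by a number n, that would be logS - log(n), with no need to form the sum itself. If every element of V is -Inf the shift produces NaN, so that case would need separate handling.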
  6. Dec 21, 2011 #5

    Simon Bridge

    Science Advisor
    Homework Helper

    That's OK - I accounted for them being vectors... it makes no difference to the comments above.
    Your problem is not that they are too small - but that the ones in V are too negative.
    Using logs is not the way to solve that problem.

    The method you used still does not match your description of what you want to achieve - but at least the subtraction in V explains the negative numbers.
    But if you are using logs to bring really big numbers to a manageable scale, then you have made a mistake right there.

    Let's see if I can demonstrate...
    First, set up the demo for N=4. I don't have your data, so I'll make some up:

    a = 16 2 3 13
    b = 5 11 10 8
    c = 9 7 6 12

    we want to recover:
    > a+b+c
    ans =

    30 20 19 33

    Code (Text):
    > x=log(a);y=log(b);z=log(c);
    > v=x-(y+z)
    v =

      -1.0341  -3.6507  -2.9957  -1.9994

    > g=exp(v)
    g =

       0.355556   0.025974   0.050000   0.135417

    > d=sum(g)
    d =  0.56695
    > gg=log(d)
    gg = -0.56749
    ... see? It does not do what you want.
    In your case, Y+Z need not be a very big number (see below).
    Notice how, in my case, the components of v come out negative? That way the exponentials are less than one.

    Note exp(X-(Y+Z)) = A/(BC). Remember what logs do?

    Presumably your vectors are size >> 4 ... so try it with smaller vectors to check your working.

    Now - my way:
    Code (Text):
    > L=[x' y' z']
    L =

       2.77259   1.60944   2.19722
       0.69315   2.39790   1.94591
       1.09861   2.30259   1.79176
       2.56495   2.07944   2.48491

    > exp(L)
    ans =

       16.0000    5.0000    9.0000
        2.0000   11.0000    7.0000
        3.0000   10.0000    6.0000
       13.0000    8.0000   12.0000

    > sum(exp(L'))
    ans =

       30   20   19   33
    ... which is a+b+c ... what you say you want.

    Perhaps you can pm me the data you are using?
    You should realize that there are maximum and minimum numbers that matlab, and your computer, can handle. Check that your results are not going to exceed those bounds.

    The realmax and realmin will give you ballpark figures for this:

    Code (Text):
    > realmax
    ans =  1.7977e+308
    > realmin
    ans =  2.2251e-308
    > log(realmin)
    ans = -708.40

    ans = 0
    > exp(-745)
    ans =  4.9407e-324
    So you see, the exponent does not have to be all that negative for the exponential to fall below realmin.

    e.g. if a particular element of each is X=400, Y=700, Z=700, then V=-1000.
    This means that exp(V) will be 0, and so log(exp(V)) is -Inf.
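
    The underflow can be checked directly. This is a small sketch assuming IEEE double precision (the MATLAB/Octave default); the variable names tiny, zero and gg are illustrative.

    ```matlab
    tiny = exp(-745);     % still representable as a subnormal, about 4.9407e-324
    zero = exp(-746);     % underflows to exactly 0

    X = 400; Y = 700; Z = 700;
    V = X - (Y + Z);      % -1000
    gg = log(exp(V));     % log(0), which is -Inf rather than -1000

    % Each entry is 1 (true) if the statement in the comments above holds.
    disp([tiny > 0, zero == 0, isinf(gg) && gg < 0]);
    ```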

    But if you used my method, you get

    > sum(exp([400 700 700]))
    ans = 2.0285e+304

    note: if A B and C are so big that A+B+C > realmax, this will still give nonsense.
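
    Whether the sum will overflow can be estimated from the logs alone, before calling exp. A rough sketch, assuming double precision; limit, bound, safe, Lbig and unsafe are names made up for this illustration.

    ```matlab
    L = [400 700 700];               % the log-values used in the example above
    limit = log(realmax);            % about 709.78

    % Every term is at most exp(max(L)), so the log of the sum is at most
    % max(L) + log(number of terms).
    bound = max(L) + log(numel(L));  % about 701.10
    safe = bound < limit;            % true: sum(exp(L)) stays finite

    Lbig = [400 710 710];
    unsafe = max(Lbig) > limit;      % true: exp(710) alone overflows to Inf
    ```

    The bound is conservative: a sum can still be finite when bound is slightly above limit, but when safe is true the sum is finite.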

    That's why I suspect you need to change your representation.
    Scale to different units. You can make up any unit you want, so long as you are consistent, just to get the numbers into the computational range of your machine.

    > 1.7e308 + 2e308
    ans = Inf

    But if I introduce a scale factor of 10^308 (basically representing my data in units of 10^308 times whatever I used before), the sum becomes
    > 1.7 + 2
    ans = 3.7

    and, in my report, I write the answer as 3.7e308
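
    When the data is held as logs, the same change of units amounts to subtracting the log of the scale factor before exponentiating. A minimal sketch under that assumption; logA, logB, logScale, scaledSum and logSum are hypothetical names.

    ```matlab
    % Hypothetical log-values for 1.7e308 and 2e308. The second number cannot be
    % stored as a double, but its logarithm can.
    logA = log(1.7) + 308*log(10);
    logB = log(2)   + 308*log(10);

    logScale = 308*log(10);          % log of the chosen unit, 10^308

    % Sum in units of 10^308: about 3.7
    scaledSum = exp(logA - logScale) + exp(logB - logScale);

    % Log of the true sum, recovered without ever forming it: about 710.5
    logSum = logScale + log(scaledSum);
    ```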
    Last edited: Dec 21, 2011