Does this python script match the math?

In summary, the conversation discusses a problem with implementing a mathematical equation in Python and the poster is seeking help in verifying their code. They also mention trying to simplify the problem and using other programs such as Mathematica and MATLAB to solve it. The issue is possibly related to not owning Excel to compare results. The conversation also highlights the importance of being able to solve problems independently.
  • #1
member 428835
Hi PF!

I've posted a bunch lately and you have been SO helpful (seriously, thank you all). Can you corroborate if my python script below matches the math? I'd seriously appreciate it!

The math: $$\max_{x\in [0.7,1)}F(x) : F(x) := \sum_{t = 250}^{750} \log\left( P_{\nu=3} \left( \frac{y_t^2}{\sigma_t^2} \right) \right) : \sigma_t^2 = x\sigma_{t-1}^2 + (1-x)y_{t-1}^2,$$ $$ \sigma_{250}^2 = \sum_{t=1}^{250} y_t^2 / 250$$. Here ##P## is the student-t PDF with 3 degrees of freedom. The data ##y_t## are generated as

$$y_t = N(0,\hat\sigma_t) :\hat\sigma_t^2 = 0.95\hat\sigma_{t-1}^2 + (1-0.95)y_{t-1} $$

where ##y_0 = N(0,0.2)## where ##N(0,\hat\sigma)## is a normally random distributed variable with mean 0 and standard deviation ##\hat\sigma##. Do the maximizing values of this function correspond to the following python code (I minimize the negative of ##F##, which is the same as maximizing F)?

Python:
import numpy as np
from scipy.stats import t
import scipy.optimize as optimize
import matplotlib.pyplot as plt
from scipy.stats import t
# STEP 1: BUILD DATA
def build_data(N=750, lmda=0.95, sig_0=0.2):
    # PREALLOCATE AND INITIATE DATA
    sig_sq = np.zeros(N)
    sig_sq[0] = sig_0**2
    y = np.zeros(N)
    y[0] = np.random.normal(loc=0, scale=sig_0)
    # CONSTRUCT SUCCESSIVE DATA
    for i in range(1, N):
        sig_sq[i] = lmda*sig_sq[i-1] + (1-lmda)*y[i-1]**2
        y[i] = np.random.normal(loc=0, scale=np.sqrt(sig_sq[i]))
    plt.plot(y)
    plt.plot(sig_sq)
    plt.show()
    return y, sig_sq
# STEP 2: OPTIMIZE
def arg_max(M = 250):
    # GET DATA
    data = build_data(750, 0.95, 0.2)[0]
    y = data[M:]
    # OPTIMIZING FUNCTION
    def F(X):
        # BUILD pdf argument
        sig_sq_pre = np.sum((data[0:M]) ** 2) / M
        lmda_star = -np.log(t.pdf(y[0] ** 2 / sig_sq_pre, df=3))
        for i in range(len(y)):
            sig_sq = X*sig_sq_pre + (1-X)*(y[i-1])**2
            sig_sq_pre = sig_sq
            lmda_star += -np.log( t.pdf( y[i]**2/sig_sq, df=3 ))
        print(X)
        return lmda_star
    bnds = [(0.7, 0.999)]
    IC = np.array(0.75)
    result = optimize.minimize(F, method='TNC', bounds=bnds, x0=IC, options={'maxiter': 200})
    print(result)
    return result.x
if __name__ == '__main__':
    arg_max()

EDIT: question stem and python code accurate as of 8:18 PM EST (sorry, for some reason PF rarely shows me math mode and python mode until I've posted the problem, leading to tons of edits after. Maybe a Chrome issue, or ubuntu?
 
Last edited by a moderator:
Technology news on Phys.org
  • #2
@joshmccraney you should be checking that aspect. We could say it does but then you discover we were wrong.

I would place prints inside your loops to show variable values so you know the indexing and intermediate values are correct.
 
  • Like
Likes Vanadium 50, FactChecker and berkeman
  • #3
jedishrfu said:
@joshmccraney you should be checking that aspect. We could say it does but then you discover we were wrong.

I would place prints inside your loops to show variable values so you know the indexing and intermediate values are correct.
I've done this so many times it's unbelievable. I'm lead to believe it doesn't work (for alternative reasons I won't discuss), but I can't see where the issue is. I think I've implemented it correctly. I spent about 3 hours debugging, which didn't fix any bugs but lead to me writing it this way (which I think is the cleanest, aside from OOP, which I'm still learning)
 
  • #4
joshmccraney said:
I've done this so many times it's unbelievable. I'm lead to believe it doesn't work (for alternative reasons I won't discuss), but I can't see where the issue is. I think I've implemented it correctly. I spent about 3 hours debugging, which didn't fix any bugs but lead to me writing it this way (which I think is the cleanest, aside from OOP, which I'm still learning)
If that were happening to me, I would simplify the problem until I could get the results to agree, and then keep adding in more and more of the original complexity to see where it breaks.

This is an important learning lesson for you -- you will need to be able to figure these kinds of problems/puzzles out many times in your life, and PF will not always be able to help you (like when the problem is proprietary to your work... Been there and done that).
 
  • #5
berkeman said:
If that were happening to me, I would simplify the problem until I could get the results to agree, and then keep adding in more and more of the original complexity to see where it breaks.

This is an important learning lesson for you -- you will need to be able to figure these kinds of problems/puzzles out many times in your life, and PF will not always be able to help you (like when the problem is proprietary to your work... Been there and done that).
This is very simplified actually (sad I know).

The problem is a simple one isn't it? Find the minimizing value (under constraint) for a function of a single variable. Sure, there are some noisy constants, but nothing too bad truthfully. I honestly think a sophomore in college could solve this as I have, or am I missing something?
 
  • #6
Have you implemented that equation in Excel? You can compare what you get for each summation step result in Excel with single-stepping through your Python code...
 
  • Like
Likes jedishrfu
  • #7
berkeman said:
Have you implemented that equation in Excel? You can compare what you get for each summation step result in Excel with single-stepping through your Python code...
I don't actually own excel (but am about to code it in Mathematica, and will try MATLAB too). I'll report back when I get results.
 
  • #8
joshmccraney said:
I don't actually own excel
Ah, now we find the core problem. You're a Windows hater are you?! Well, that'll teach you!

(J/K, I just live with the hatred boiling in my gut every day, but I still use their tools to get my work done...)
 
  • Haha
  • Like
Likes jedishrfu and member 428835
  • #9
berkeman said:
Ah, now we find the core problem. You're a Windows hater are you?! Well, that'll teach you!

(J/K, I just live with the hatred boiling in my gut every day, but I still use their tools to get my work done...)
Hahahahahahahah jumbo shrimp, country music, microsoft works...oxymorons right there hahahaha I'm of course joking
 
  • Like
  • Haha
Likes DrClaude and berkeman
  • #10
There are alternatives in the free open office suite.

https://www.openoffice.org/

Using the spreadsheet is a great way to check things because each cell could be an intermediate value.

Where I've been tripped up by python is in converting 2.7 code to 3.x code where the meaning of the / operator changes. In 2.7 division of two integers produces an integer and division of one or more floats produces a float.

I had code that took advantage of the integer feature to compute indices into a random file database. The 3.x code failed miserably and I had to do the step by step thing to isolate the problem. It was an absolute nightmare.

Here's more on the change from / in 2.7 to two separate operators // and / for 3.x code

https://www.tutorialspoint.com/division-operators-in-python
 
  • #11
joshmccraney said:
Can you corroborate if my python script below matches the math
While a good code review by other insightful people can be very valuable, nothing beats the general approach of (unit) testing as this is a working proof that the code works as specified (in the tests). In your case I would recommend you look into unit testing in Python (e.g. https://docs.python.org/3/library/unittest.html) and then cover your code with as many test as you feel relevant of what you know must be true. Sometimes this means you have to break your code into smaller units in order to test it properly, but that is only a good thing.
 
  • Like
Likes member 428835
  • #12
joshmccraney said:
for some reason PF rarely shows me math mode and python mode
The maths is a known issue - if there isn't already LaTeX on the page maths doesn't render until you refresh the page. There's a workaround for replies, which is to enter some LaTeX, copy your text to clipboard (just in case) then go to preview and refresh the page. It doesn't work for new threads, unfortunately.

Not displaying code blocks is not an issue I've seen, and it seemed to work when I created a dummy thread just now. That one may be a browser issue, don't know.

Incidentally, your definition of ##\sigma_t##:
joshmccraney said:
$$\sigma_t^2 = x\sigma_{t-1}^2 + (1-x)\sigma_{t-1}^2,$$
simplifies to ##\sigma_t^2=\sigma_{t-1}^2##.

Looking at your maths you seem to have a lot of divisions and logs and sums over five hundred terms. Accumulated rounding errors may be an issue, particularly if your terms are highly variable in magnitude or if some terms can be negative and only slightly different from others (e.g. if a=1e17, then a+1 is also 1e17 because the 1 is too small for the machine to represent, and even worse (a+1)-(a-1)=0, not 2). What precision does np.zeros give you by default? Does increasing or decreasing it significantly change your result?

Can you test a simpler thing? Can whatever external sanity checks are complaining be run for ##\sum_{t=250}^{252}## or something? If so does the sanity check fail then, or only when you increase the number of terms?

Can you re-order the maths to be better behaved? Note that ##\sum_i\log(a_i)=\log\left(\prod_ia_i\right)##. Is the product of terms better behaved than the sum of their logs?
 
  • Like
Likes member 428835
  • #13
I think you are getting confused with what variables are vectors and what are (or should be) scalars (i.e. floats).

In particular your objective function F(X) is passed a np.array of the same size as the x0 parameter so what do you think (1 - X) is? Also F(X) should return a float but you are returning lambda_star which is set to the return value of np.log, a np.array.
 
  • #14
Ibix said:
The maths is a known issue - if there isn't already LaTeX on the page maths doesn't render until you refresh the page. There's a workaround for replies, which is to enter some LaTeX, copy your text to clipboard (just in case) then go to preview and refresh the page. It doesn't work for new threads, unfortunately.

Not displaying code blocks is not an issue I've seen, and it seemed to work when I created a dummy thread just now. That one may be a browser issue, don't know.

Incidentally, your definition of ##\sigma_t##:simplifies to ##\sigma_t^2=\sigma_{t-1}^2##.
I thought I wrote: ##\sigma_t^2=y_{t-1}^2##?

Interesting observation about the product! It might make things better behaved, so I'll have to check this out!

Funny you said sanity checks: I actually got this thing working in Mathematica and was able to do some sanity checks for 250:251,252,...260 and really look at the plots. Indeed, I think I've programmed it correctly (same result in both Mathematica and python, and VERY different languages) and am using the wrong test to check whether it works or not. Your post was very insightful; thank you!
 
  • #15
joshmccraney said:
'm lead to believe it doesn't work
Boy if someone asked me to check code for them and I found out they didn't think it worked, I'd be pretty cheesed off. Only when you think it's right should you be asking people to check it. If you think it's wrong, you should ideally check it yourself, not leaving it for your helpers to figure out. I'd be really, really cheesed off.

I agree with the suggestion to try Excel.

I also think it;s good to build up the calculation a little at a time, printing out the intermediate steps.
 
  • Like
Likes jedishrfu
  • #16
Vanadium 50 said:
Boy if someone asked me to check code for them and I found out they didn't think it worked, I'd be pretty cheesed off. Only when you think it's right should you be asking people to check it. If you think it's wrong, you should ideally check it yourself, not leaving it for your helpers to figure out. I'd be really, really cheesed off.

I agree with the suggestion to try Excel.

I also think it;s good to build up the calculation a little at a time, printing out the intermediate steps.
Turns out everything I had was correct. The issue is someone in a position of authority thought it should tend to a particular value, hence me thinking it was wrong. By my account and looking at the math alone, I indeed thought it was correct.
 
  • Like
Likes Vanadium 50 and jedishrfu

1. How do I know if my python script matches the math?

One way to determine if your python script matches the math is by testing it on known mathematical equations and comparing the results. You can also check the logic and syntax of your script to ensure it follows the mathematical principles you are trying to replicate.

2. What if my python script does not match the math?

If your python script does not match the math, you may need to review the code and make adjustments to ensure it follows the correct mathematical principles. You may also need to check for any errors or bugs in your code that could be causing the mismatch.

3. Can I use external libraries or modules in my python script to match the math?

Yes, you can use external libraries or modules in your python script to match the math. These libraries can provide advanced mathematical functions and calculations that can help you accurately replicate the desired mathematical outcome.

4. How can I debug my python script if it does not match the math?

You can debug your python script by using print statements to check the values of variables and see where the code may be going wrong. You can also use debugging tools such as breakpoints and step-throughs to analyze the code and identify any errors.

5. Are there any resources or references I can use to ensure my python script matches the math?

Yes, there are many online resources and references available for python and mathematics that can help you ensure your script accurately matches the math. These include tutorials, documentation, and forums where you can seek help and advice from other programmers and mathematicians.

Similar threads

  • Programming and Computer Science
Replies
3
Views
1K
  • Programming and Computer Science
Replies
5
Views
2K
  • Programming and Computer Science
Replies
2
Views
911
  • Programming and Computer Science
Replies
1
Views
2K
  • Programming and Computer Science
Replies
9
Views
2K
  • Programming and Computer Science
Replies
16
Views
1K
  • Programming and Computer Science
Replies
2
Views
894
  • Programming and Computer Science
Replies
8
Views
950
  • Programming and Computer Science
Replies
4
Views
4K
  • Programming and Computer Science
Replies
6
Views
2K
Back
Top