To empirically estimate the big-Oh in small programs

  • Thread starter Thread starter nomadreid
  • Start date Start date
  • Tags Tags
    Estimate Programs
Click For Summary

Discussion Overview

The discussion revolves around how to empirically estimate the big-Oh notation for short recursive programs written in Python. Participants explore methods for measuring the growth of these programs through empirical data collection, while also considering the theoretical aspects of algorithm complexity.

Discussion Character

  • Homework-related
  • Technical explanation
  • Exploratory

Main Points Raised

  • One participant suggests that simply counting operations may be incorrect due to varying time and memory costs of different operations.
  • Another participant proposes that increasing input size or running the program multiple times could yield better timing data.
  • A participant mentions the possibility of counting operations separately if they are executed different numbers of times, or combining counts if one operation is only slightly slower.
  • Code snippets are shared to demonstrate how to measure execution time using Python's time module, with a caution about the impact of printing on timing accuracy.
  • One participant expresses concern about assigning tasks they cannot solve, while acknowledging the collaborative nature of the course structure with a colleague handling the programming aspects.
  • A suggestion is made to explore Pyzo, a Python environment that could facilitate the integration of Python and mathematical libraries for the students.

Areas of Agreement / Disagreement

Participants express a mix of agreement on the need for empirical measurement methods, but there is no consensus on the best approach to take or the specifics of implementation. Some participants emphasize the importance of theoretical understanding alongside empirical methods, while others focus on practical coding solutions.

Contextual Notes

Participants note limitations such as the variability in execution time due to different machines and Python versions, as well as the potential inaccuracies introduced by printing during timing measurements.

Who May Find This Useful

This discussion may be useful for educators teaching discrete mathematics with a programming component, students learning about algorithm complexity, and individuals interested in empirical methods for analyzing code performance.

nomadreid
Gold Member
Messages
1,773
Reaction score
256

Homework Statement


First, I am a teacher preparing to give a discrete mathematics course, in which I will not do any computing, but the students will (in Python). I want them to make empirical estimates for the growth of their (short) recursive programs by putting in some sort of "counter" in their programs, and then run it on several sets of data of varying sizes. (I will also ask them to analyze theoretically, but that is not my question here.) But since I am not a programmer, I am not sure how to do this.

Homework Equations


N/A

The Attempt at a Solution


Simply have the program count the number of operations seems to me to be incorrect, because different operations will take different amounts of time and/or memory space. I am also aware that different machines (not to mention different versions of Python) will give different answers, but they should be within a linear factor of one another. Reading off the time for the completion of the program is not going to work, because the programs will be short. Some websites classify some operations complexity, but not all of them, and besides, that brings us back to a theoretical approach. Anyway, I don't want to actually give them the Python program construction (my students can program; I can't), but I will be suggesting the mathematical steps to put in. (Therefore an answer in mathematical language rather than computer language would be appreciated. I think the first mathematician to have seen a computer program reading "n=n+1" probably had a heart attack.)
Thanks for any hints.
 
Physics news on Phys.org
nomadreid said:
Reading off the time for the completion of the program is not going to work, because the programs will be short.
You can always make the input longer (or let the programs run multiple times).
Most of the time you have some set of loops with just one or two different operations executed many times. If they do not get called the same amount of times, you can count both separately. Or add them. If one is just a factor of 2-3 slower than the other, it does not matter much compared to a different complexity.
nomadreid said:
my students can program; I can't
I don't think it is a good idea to give students tasks you cannot solve.
 
  • Like
Likes   Reactions: nomadreid
you can also query for millisecs and and do stats on the deltas :

Code:
import time

def do_something_here():
  print "hello world"

# init timer
oldticks=time.clock()

# loop with timing
for i in range(1000):

  do_something_here()

  newticks=time.clock()
  delta=(newticks-oldticks)
  oldticks=newticks

  print "loop: ",i,"  delta: ",delta, " secs"
 
  • Like
Likes   Reactions: nomadreid
Printing can take some time on its own, so better do it outside the areas where time is evaluated.
 
  • Like
Likes   Reactions: nomadreid
Thanks, mfb and jedishrfu. I guess combining the answers, having the students run the program lots of times, subtracting the time taken to record the result from each run, and then doing stats on the results, should do it.
I do agree that giving students stuff that the instructor cannot solve is not a great idea; however, in this course, the computing part is relegated to a colleague with whom I split the course (I just do the maths part). That is, I know that the assignments I give are mathematically possible, but I need to give assignments that I know are computable in Python, even though my colleague will help them through the details. Therefore I am informing myself via this magnificent forum.
 
Since, you're doing python and math together you both might want to check out Pyzo at pyzo.org. Its a collection of python libs + an interactive display where the environment is setup and ready to go. They don't have any example code but searching the web should find examples of how to use the various libraries available namely numpy, scipy, matplotlib ...
 
  • Like
Likes   Reactions: nomadreid
Thanks, jedishrfu. The site looks very useful. In fact, the SymPy fits my idea what a computer should do. I shall recommend this to my students and to my colleague.
 

Similar threads

  • · Replies 3 ·
Replies
3
Views
2K
  • · Replies 12 ·
Replies
12
Views
2K
  • · Replies 1 ·
Replies
1
Views
1K
  • · Replies 102 ·
4
Replies
102
Views
4K
Replies
16
Views
3K
  • · Replies 29 ·
Replies
29
Views
4K
  • · Replies 6 ·
Replies
6
Views
3K
  • · Replies 10 ·
Replies
10
Views
2K
  • · Replies 5 ·
Replies
5
Views
3K
Replies
17
Views
2K