| New Reply |
Expectation value for CDF fails... |
Share Thread | Thread Tools |
| Jun22-12, 05:04 AM | #18 |
|
|
Expectation value for CDF fails...But I disagree that using polynomials makes the code "exceedingly slow" in fact polynomials are computationally very fast structures to calculate since you have a cost O(n-1) for nth degree polynomials. That means that most of the time the cost to calculate such function in R will be around O(4), which is great. A family of these tricks are based on doing things like alter the algorithm in ways that mathematically makes the algorithm slower but taking into account the machine architecture actually it becomes faster. So for these techniques you have to know your machine. Another family of techniques are based of breaking the code into pieces and then reuse calculations to avoid redundancy. For instance, if you need to calculate [itex]sin(x)[/itex] and [itex]cos(x)[/itex] (which yo do in BM) you do not want to actually use the functions [itex]sin(x)[/itex] and [itex]cos(x)[/itex], what you do is to break the code for [itex]sin(x)[/itex] and [itex]cos(x)[/itex] and merge it into a [itex]sincos(x)[/itex] function which will be twice as fast and, thus, will make your code much faster. So it goes for the [itex]log(x)[/itex] function, for BM you have to remember that the [itex]log(x)[/itex] is only applied for numbers within (0,1] that means that using some [itex]log(x)[/itex] implementations is a waste of resources since higher orders of the polynomial adds virtually nothing to the accuracy. So let's say that [itex]log(x)[/itex] uses a polynomial of degree 10, since you know you will never calculate numbers higher than 1, you can create a function [itex]log01(x)[/itex] which only uses a polynomial of degree 5 as machine accurate as the [itex]log(x)[/itex] function but twice as fast. So, I just explained a couple of tricks that can really speed up your code if you use them instead applying [itex]sin(x)[/itex] , [itex]cos(x)[/itex] and [itex]log(x)[/itex] right away. And actually the fun goes ooooon... But you're fun, right?
|
| Jun26-12, 03:57 PM | #19 |
|
|
"R" is using code from the 1960's-80's which was for CISC machines. Cache latencies and piplining issues, were not yet part of the programming milieu; Foss for example, has several *if* statements in the code deciding when to use which polynomial ... that's exactly what slowed my code for the MPM down horribly compared to my estimate of it's speed. (I'll explain later.) On most modern machines, the brute force operations of addition and multiplication both have the same time delay, so it's more like 7 operations to do a 4th degree polynomial. 7 operations is about the same time as it takes to compute sin or cos to the full double precision on my machine.... But a 4th degree poly is nowhere near able to achieve the same precision. If you look at the source code I already posted -- you'll notice that I use second and third degree polynomials to avoid computing exp(-0.5*x**2) as much as possible. I use these polynomials > 98% of the time, compute exp() for the remainder. I even did alignments for power of two operations where possible, since that doesn't even require a multiply instruction; but only an integer add... For a particular number of repetitions -- I think it was around a billion -- I have time measurements for each of the math--co operations on my machine. multiply and divide : 2 seconds (or less) sin() and cos(): 15 seconds sqrt(): 36 seconds. exp(): 46 seconds. log(): 87 seconds. In theory, then, I expected my program to 98% of the time to do polynomials at a weight of *less* than 8 seconds. And 2% of the time, to run exp at 46+2+2 ~ 50 seconds; That works out to <9 seconds average... Box muller (ignoring multiplies) would be 87+36+15=138 seconds. (Which isn't far off from reality...) Hence, the MPM method ought to be on the Order of 15 times fasterthan BM. But it's nowhere *NEAR* that much faster -- it is, in fact, only *slightly* faster. If anyone sees a coding mistake, let me know -- but I don't think there is anything wrong with the way I wrote the program. The main issue for my machine, and these issues are *VERY* common in all desktop machines whether powerPC, or Intel, and even some ARM processors... Is that failed branch prediction must break a very deep pipeline in the processors. On an Intel Celeron, the pipeline depth is around 40. That means, every time an "if" statement changes the flow of the machine -- the CPU runs around 40 times slower than normal. A single *if* statement can be very expensive, time wise. A 2.5GHz processor is reduced to 62MHz for a brief moment... There is no way to program preferred branch directions, either, in the C language. It's all up the CPU's automatic discretion.... (if it has any bug free ability at all...). I really do want to stay away from machine architecture dependent coding. That means I can't achieve the "highest" performance -- but it also means I can share the code with anyone; I know from hardware design experience that the relative timing of my machine's math co. hold for many other processors as well. Sin and Cos don't appear to be implementable by any method I know which will be faster using polynomials, than just using the built-in function. Log(), however, could easily be replaced -- and I have some experimental (non-polynomial) formulas that might be well worth curve fitting.... I'm learning a bunch about how a uniform random plugged into a formula produces a pdf ! eg: f( u1 ) --> a pdf of --> d/dx f^-1( x ). This appears to be key in designing a more easily computed Gaussian generator. There is much that I am studying in BoxMuller, some of which is just for knowledge -- other parts explorations for speed-up. One fairly interesting path is to consider sin and cos: Since a function expressed as sin() is at 90 degrees to one expressed with cos(); I was thinking it may be possible to use this relationship to compute an anti-function of something expressed as sin(f(x)). exp(-0.5*x**2) is slow to compute; but when I graph acos( exp( -0.5*x**2 ) ), The resulting graph appears to be a hyperbola of some kind that would be *VERY* easily computed using only multiplication, division, and addition. Hence, it would be around 2.5 times faster than computing the exponential itself, to compute the cosine of a different function... (And it may be *MUCH* faster than computing the anti-exponential directly as BoxMuller does...!) Does anyone happen to know what the formula of the hyperbola is for acos( exp(...) ) ? |
| Jun26-12, 06:01 PM | #20 |
|
Recognitions:
|
I guess if you look at x > 0 only, it does resemble a hyperbola. Taking the gradient near the origin to represent the other asymptote, the angle between asymptotes on the same branch is 3π/4. So the hyperbola should look like y2 = tan2(3π/8)x2 - c, rotated clockwise by 3π/8 and suitably translated. |
| Jun27-12, 07:02 AM | #21 |
|
|
Code:
/*-- use AS 241 --- */
/* double ppnd16_(double *p, long *ifault)*/
/* ALGORITHM AS241 APPL. STATIST. (1988) VOL. 37, NO. 3
Produces the normal deviate Z corresponding to a given lower
tail area of P; Z is accurate to about 1 part in 10**16.
(original fortran code used PARAMETER(..) for the coefficients
and provided hash codes for checking them...)
*/
if (fabs(q) <= .425) {/* 0.075 <= p <= 0.925 */
r = .180625 - q * q;
val = q * (((((((r * 2509.0809287301226727 +...
But anyway, since you also say So I am going to calculate the number 5000000 comparing 1 million additions of 0.5 vs 5000000 additions of 1 within 1 million if statements, these are the results: Code:
x = 0; 10000000.times{|i| x+=0.5 } #time 5.388s
x = 0; 10000000.times{|i| x+=1 if i%2 == 0} #time 4.236s
|
| Jun29-12, 05:53 PM | #22 |
|
|
![]() I take it ruby is an interpreted language? There might be something in the interface... I have no idea why doing more work would take less time... ! Your example is absolutely baffling...! Now, in Ruby, since you have nothing after the if statement -- am I understanding correctly -- the syntax to mean only add 1 when the index is even? eg: do nothing when odd? (I'm not very familiar with Ruby... I have done more PHP do be honest.) BUT: I'm sure that my code isn't a factor of 10x + faster; which in theory is ought to be... Of course, I didn't account for the random generator's percent of processing time -- I was just expecting a much quicker response than I got!; Hence I clearly made a mistake *somewhere* ; (P.S. my unsure nature is why I asked if anyone saw a mistake in my code) In any event: The issue is called "branch prediction". If the processor has hardware smart enough to decide which path is *more* likely, then the penalty only occurs on the occasional time where the prediction is wrong. The rest of the time, the "if" statements are *very* cheap. Supposedly (I haven't checked the comment made by a friend) Intel processors keep some kind of statistic to predict which is more likely on the next pass. That includes the Atom. However, even if my code were somehow wrong -- if your example is portable into C as your cup of tea (are YOU sure!) -- then my code ought to have been *much* *much* *much* ******MUUUUCHH********** faster ! However, Even if I just got the speedup from using sin() vs. using exp() -- I'd be quite happy!. It's quick enough.... I have come across an almost identical shape before in another project I did some time ago, and it ended up being a ratio of something like (this is going to be wrong... of course) x/(3+x); There may have been square terms in it, etc. I don't remember ... but there was something that caused the shape change at zero. I really am going to have to come back to that problem once I find my backup of the earlier work -- hopefully that has the answer I am seeking. It will be a good estimate, regardless ... I do find it irritating I can't just solve the problem analytically; I mean Euler's identity has the exponent in it already: cos( t ) + i*sin( t ) = exp( i*t ) -- so converting from exp to cos() or sin() just shouldn't be that hard. ! I am sure I have seen an exact answer to this problem in a CRC handbook or something. It wasn't a very complicated expression.... But, I don't see how to solve it analytically myself. If anyone does KNOW the exact answer -- please share, please.... ;) (Even if it ISNT a hyperbola..!) If his is faster and already has the tail solved correctly -- it might just be better to use his code and optimize the random generator. (I have a potential trick for that.... !) But: I have another issue I'm working out analytically, that is more important to solve first (next post.) Thanks so much for the link; that may be good enough. |
| Jun30-12, 03:22 AM | #23 |
|
|
Ok, I mentioned earlier that the product of Gaussians -- aren't Gaussians.
In this first part of the problem, I am interested in solutions where the Gaussians being multiplied aren't correlated. This problem comes up quite often in error propagation formulas; and so is interesting to study. As an example, I attached two such operations; appropriately labeled -- and they both obviously result in non-Gaussians. A typical reference (USA/NIST) indicates that for multiplication statistics one ought to use: [tex]s_{area }= \sqrt { width^{2} \times s _ {length} ^ {2} + length ^ {2} \times s _ {width} ^ 2 } [/tex] http://www.itl.nist.gov/div898/handb...ion5/mpc55.htm The note says that this particular formula ignores correlation -- which (since there is none) -- is irrelevant to my problem. BUT -- When I try to solve the problem analytically, I get a different formula than shown in NIST. NIST also list an exact formula, which includes correlation -- and is a solution I don't understand. It appears to have many terms that I wouldn't know exactly what they are.... If anyone can show if it reduces to the other formula in absence of correlation, I'd be interested. I looked around for other listings of error propagation formulas ... and I generally come across this same formula shown in NIST. eg: In Stower's research for medical applications... research.stowers-institute.org/efg/Report/PropagationOfErrors.pdf And, in typical papers I find elsewhere... there is one alternate variation I often see: http://www.physics.uc.edu/~bortner/l...tion%20htm.htm All they are doing in the alternate form (really) is converting to ratios or percentages, and then doing the same propagation formula as NIST does. However, that variation of the formula also will give a divide by zero for the problem shown in attachment #2... which is actually worse... So, what is *really* wrong here? :) I *SERIOUSLY* can't find what I did wrong when attempting to derive NIST's formula; and this is just the simple case.... ugh! In order to solve the equation, I am going to assume the normals are in a special form -- with standard deviation always equal to one. It makes for less variables during the calculus, and reduces the odds I will make a mistake.... *sigh*. So, start with two normals N(a,1) and N(b,1), and solve for the product... In this post, I'm going to solve for the mean; If no one sees a mistake, I'll then solve for the deviation. Begin: The probability of each multiplication element is: [tex] p(X,Y)={ 1 \over 2\pi } ( e^{-{ 1 \over 2}(x-a)^2} e^{-{ 1 \over 2}(y-b)^2} ) [/tex] The elements of the mean for multiplication are: [tex] \mu_{e} ( X,Y )=xy [/tex] Which makes the weighted mean's elements: [tex] p(X,Y) \mu_{e} ( X,Y ) = { 1 \over 2\pi } xy( e^{-{ 1 \over 2}(x-a)^2} e^{-{ 1 \over 2}(y-b)^2} ) [/tex] Combining yields: [tex] \mu = \int \limits _{ - \infty }^{ \infty } \int \limits_{ - \infty }^{ \infty }{ 1 \over 2 \pi } xy( e^{-{ 1 \over 2 }(x-a)^2} e^{-{ 1 \over 2 }(y-b)^2} ) dxdy [/tex] Changing to polar coordinates to allow fixed radii vs probability pairs: [itex] r^{2}=(x-a)^{2}+(y-b)^{2} [/itex] and [itex] x-a=cos( \theta )r [/itex] and [itex] y-b=sin( \theta )r [/itex] [tex]dxdy=r \times d \theta dr [/tex] [tex] \mu = \int \limits_{ 0 }^{ \infty } \int \limits _{0}^{ 2 \pi } { 1 \over 2 \pi } (cos( \theta )r+a)(sin( \theta )r+b) e^{-{ 1 \over 2}r^2} (r \times d \theta dr) [/tex] [tex] \mu = \int \limits_{ 0 }^{ \infty } \int \limits _{0}^{ 2 \pi } { 1 \over 2 \pi } (sin( \theta )cos( \theta )r^{2} + cos(\theta)br + sin( \theta )ar + ab) e^{-{ 1 \over 2}r^2} d \theta dr [/tex] All of the integrated trigonometric terms will have the same value at 0 and [itex]2 \pi[/itex] and may be safely ignored. [tex] \mu = { 1 \over 2 \pi } \int \limits_{ 0 }^{ \infty } \left . (ab \theta) \right | _{0}^{ 2 \pi } re^{-{ 1 \over 2}r^2} \times dr [/tex] [tex] \mu = ab \int \limits_{ 0 }^{ \infty } r e^{-{ 1 \over 2}r^2} \times dr [/tex] [tex] \mu = ab [/tex] So, when multiplying N(5,2) * N (3,7); it will be the same as 2*7*(N(5/2,1),N(3/7,1) And that would yield: 5/2*3/7 = 15/14 * 2*7 = 15. In general, then, the mean is always the product of the original means for uncorrelated Gaussians. That, everyone seems to agree to ... but did I mess up or is this, "so far, so good?"
|
| Jun30-12, 06:42 AM | #24 |
|
|
Code:
void main(int argc, char *argv[]) {
double x = 0;
for(int i=0; i<100000000; i++){ x += 0.5; } /* time 1.244s */
for(int i=0; i<100000000; i++){ if (i%2==0) {x += 1;}} /* time 0.904s */
printf("x is %f\n", x);
}
![]()
|
| Jun30-12, 08:28 AM | #25 |
|
|
You can learn to derive the general formula for the product of to variables with this paper: http://www.jstor.org/discover/10.230...47699110527507 In the general expression derived in this paper you can see that when X and Y are not correlated many terms disappear and the remaining expression is the one you post here: [itex]s_{area }= \sqrt { width^{2} \times s _ {length} ^ {2} + length ^ {2} \times s _ {width} ^ 2 } [/itex] |
| Jun30-12, 01:18 PM | #26 |
|
|
![]() I do understand why it's faster now; I didn't notice before that the predicate of the if statement was an integer. So, you are doing one integer operation or one floating point operation; although modulus is often two operations (division, multiplication, subtraction) -- I think that integer division has a remainder in assembly... If you switch the variable i, to a float or double, then your code has the same characteristics as mine does. The predicate of the float is a float... But just getting a rough estimate of the speed of the *if* with your loop (ignoring the for loop itself, as that is not a true if statement but a repeat statement in assembly -- perfect branch prediction) I am pretty sure my earlier estimate of the pipeline delay breaking (which is based on intel's own data, not measurement) doesn't explain why my code is so much slower than expected.... Hmmm.... Yes, your ideas *do* make sense. |
| Jun30-12, 01:52 PM | #27 |
|
|
And I see the next step for computing a variance is also the same as mine.... but then he uses simple algebra; and although I used calculus -- I should have gotten the same answer -- and I just double checked, and I still can't find my mistake. To make it worse, my answer actually agrees more with my posted experiments ... which could just be be a fluke.... I wonder, is there some kind of assumption in Goodman that I missed? This is my path to it: The variances for multiplication are: [tex] \sigma ^{2} ( X,Y )=( xy - ab )^{2} [/tex] And the weighted variances: [tex] p(X,Y) \sigma^{2} (X,Y) = { 1 \over 2\pi }( xy - ab )^{2} ( e^{-{ 1 \over 2}(x-a)^2} e^{-{ 1 \over 2}(y-b)^2} ) [/tex] [tex] \sigma ^{2} = \int \limits _{ - \infty }^{ \infty } \int \limits_{ - \infty }^{ \infty }{ 1 \over 2 \pi } (xy-ab)^2( e^{-{ 1 \over 2 }(x-a)^2} e^{-{ 1 \over 2 }(y-b)^2} ) dxdy [/tex] Changing to polar coordinates to allow fixed radii vs probability pairs: [itex] r^{2}=(x-a)^{2}+(y-b)^{2} [/itex] and [itex] x-a=cos( \theta )r [/itex] and [itex] y-b=sin( \theta )r [/itex] and [itex]dxdy=r \times d \theta dr [/itex] [tex] \sigma ^{2} = { 1 \over 2 \pi } \int \limits_{ 0 }^{ \infty } \int \limits _{0}^{ 2 \pi } ((cos( \theta )r+a)( sin( \theta )r+b) - ab)^{2} e^{-{ 1 \over 2}r^2} (r \times d \theta dr) [/tex] Expanding the xy substitution inside the parenthesis, factoring common terms... [tex] \sigma^{2} = { 1 \over 2 \pi } \int \limits_{ 0 }^{ \infty } \int \limits _{0}^{ 2 \pi } (sin( \theta )cos( \theta )r^{2} + cos(\theta)br + sin( \theta )ar + ab -ab)^2 re^{-{ 1 \over 2}r^2} d \theta dr [/tex] [tex] \sigma^{2} = { 1 \over 2 \pi } \int \limits_{ 0 }^{ \infty } \int \limits _{0}^{ 2 \pi } ( sin( \theta )cos( \theta )r + ( cos(\theta)b + sin( \theta )a ) )^2 r^3e^{-{ 1 \over 2}r^2} d \theta dr [/tex] [tex] \sigma^{2} = { 1 \over 2 \pi } \int \limits_{ 0 }^{ \infty } \int \limits _{0}^{ 2 \pi } g( r, \theta ) r^3e^{-{ 1 \over 2}r^2} d \theta dr [/tex] Expanding the trigonometric terms' square. [tex] g( r, \theta ) = sin^2( \theta )cos^2( \theta )r^2 + 2sin( \theta )cos^2( \theta )br + 2sin^2( \theta )cos( \theta )ar + (cos^{2}( \theta )b^{2} + 2cos( \theta )sin( \theta )ab + sin^{2}( \theta )a^{2}) [/tex] Integrating the expansion with respect to [itex] \theta [/itex], r being constant. [tex] \int \limits _{ 0 }^{ 2 \pi }g( \theta, r )d \theta = [/tex] [tex]\left . \left ( {(4 \theta - sin(4 \theta)) \over 32}r^{2} - { 2 \over 3 } cos^3( \theta )br + {2 \over 3} sin^{3}( \theta )ar + { \theta + sin( \theta )cos( \theta ) \over 2 }b^2 - {cos( 2 \theta ) \over 2}ab+{ \theta - sin( \theta )cos( \theta ) \over 2} a^2 \right ) \right | _{0}^{2 \pi } [/tex] Any trigonometric term will have the same value at 0 and [itex]2 \pi [/itex] and cancel, leaving. [tex] \int \limits _{ 0 }^{ 2 \pi }g( \theta )d \theta = \left . \left ( { \theta \over 8 }r^{2} + { \theta \over 2 }b^2 +{ \theta \over 2} a^2 \right ) \right | _{0}^{2 \pi } = 2 \pi \left ( {r^{2} \over 8} + { a^{2} \over 2 } + { b^{2} \over 2 } \right ) [/tex] Now, Back-substituting the result... [tex] \sigma^{2} = { 1 \over 2 \pi } \int \limits_{ 0 }^{ \infty } 2 \pi \left ( {r^{2} \over 8} + { a^{2} \over 2 } + { b^{2} \over 2 } \right ) r^3e^{-{ 1 \over 2}r^2} dr [/tex] [tex] \sigma^{2} = { 1 \over 8 } \int \limits_{ 0 }^{ \infty } r^5e^{-{ 1 \over 2}r^2} dr + { a^{2} + b^{2} \over 2 } \int \limits_{ 0 }^{ \infty } r^3e^{-{ 1 \over 2}r^2} dr [/tex] [tex] \sigma^{2} = { 1 \over 8 } \left . \left ( -e^{-{ x^2 \over 2 }}(x^4 + 4x^2 + 8 ) \right ) \right | _{ 0 }^{ \infty } + { a^{2} + b^{2} \over 2 } \left . \left ( -e^{-{ 1 \over 2 }x^{2} }( x^2 + 2 ) \right ) \right | _{ 0 }^{ \infty } [/tex] [tex] \sigma^{2} = { 1 \over 8 } \left ( 0 - -(1)(0+0+8) \right ) + { a^{2} + b^{2} \over 2 }( (0) - (-2) ) [/tex] [tex] \sigma^{2} = 1 + \left ( { a^2 + b^2 } \right ) [/tex] Taking the final result, and extending it to values with arbitrary stds; [tex] N( a, \sigma _ {a} ) \times N( b, \sigma _ {b} ) = \sigma _ {a}\sigma _ {b} \times N( { a \over \sigma _ {a} }, 1 ) \times N( { b \over \sigma _ {b} }, 1 ) [/tex] [tex] N( a, \sigma _ {a} ) \times N( b, \sigma _ {b} ) \rightarrow \mu = ab , \sigma = \sqrt { ( \sigma _ {a} b )^{2} + ( \sigma _ {a}\sigma _ {b} )^{2} + ( a \sigma _ {b} )^{2} } [/tex] So, my formula only differs in one term from that posted on NIST and all over the internet... SO HERE'S Andrew's megalomania version.... It's me against the world (and likely to loose...! again...)I get, [tex] \sigma _ {width \times length} = \sqrt{ width ^ {2} \times \sigma _ {length} ^ {2} + \sigma _ {length} ^ {2} \times \sigma _ {width} ^ {2} + length ^ {2} \times \sigma _ {width} ^ {2} }[/tex] Oy!, and so how did I blow it this time? (I thought I was getting better at the calculus up and until now...) |
| Jun30-12, 03:03 PM | #28 |
|
|
|
| Jun30-12, 03:57 PM | #29 |
|
|
![]() Well, after NIST getting wrong the second formula I should not have trusted the first either!! And now NIST goes out of my bookmarks and, yesss, there it goesssss. So I went back to check Goodman's formula cause I could not see anything wrong in your development, and there it was, the extra term that you've just calculated is there in Goodman's formula! to be precise is [itex]E(Δx^2Δy^2)[/itex] which happens not to be zero when X,Y are independent unlike the rest of the terms that go away in that situation! Oh my God... if you cannot trust random guys in the Internet who're you gonna trust now. |
| Jun30-12, 07:14 PM | #30 |
|
|
perhaps I ought to have counted the instructions in detail before making the comment.... In reality, I ought to re-measure *everything* -- and re-test; but I am hesitant to do so -- as fixing the random generator needs to be done first to avoid wasting a lot of time -- and that's why I haven't yet. No, I really can't -- if I set it to 1==1, the gcc compiler will remove the "if" altogether -- and since it takes the "if" to determine the percentage of execution of each path -- such a change must at least include a calculation that allows me to estimate the performance hit... I can only remove fixed computations, such as that for exp(). There are, inside intel's processor, several hardware optimizations that overlap; there is out-of-order execution of instructions, pipeline-delay, cache-delay, and math co-processor de-coupling. If I had made the hardware myself, it would be easy to weigh which is causing what -- but even professional software engineers tell me that the published specifications do not always agree with performance. eg: some were attempting to make an optimized movie player for Intel -- only to find that the optimized version ran slower than the un-optimized version because the processor did not optimize in the same way the intel engineers said it would. I would like to think that *none* of these are causing the problem -- however, I have no alternate explanation at this time and it doesn't look to be simple. (But I wouldn't be surprised if it was just a dumb mistake of mine or a transcription error made much earlier...) (P.S. Considering we have optimizations uncertainties in the loop -- *MULTIPLIED* by other uncertainties -- what formula ought I use to calculate the range (product) of the uncertainties? )Here's the code I wrote, it's short enough to inspect right here: PHP Code:
The first "*if*" statement returns ~48% of the time; A true multiply is not needed since a multiply by 2.0 is the same as an exponent adjust of +1; So -- This code *OUGHT* to be reaaaally fast. I'd guestimate -- 1% of BM time or better... So, in order for the whole algorithm to be only slightly faster than BM (say 80% of BM's time as a random figure...), we would need 1% * 0.48 + x%*0.52 = 80%; So, that's about x% = 152% ; eg: I would have to be slower than BM by 50% on average... ?! Considering that BM uses the log operation at 80+ seconds (not counting the rest of the algorithm)/billion ops. And I am only using very short polynomials at less than 2s/add or multiply -- I find it surprising that my code is even on the same order of magnitude... Scan through the code, and do an estimate of what you think the execution time ought to be on average.... or better; could you isolate the specific thing in the code which you suspect is causing the problem -- and I'll replace that with some kind of dummy statement and see what happens ? Cheers... |
| Jun30-12, 11:03 PM | #31 |
|
|
ME?!![]() I do understand ... but ... but ... That's mildly disturbing ... Everything from the medical research community, the universities (and especially the physics departments) all the way to the USA's political parties appear to be using that formula from NIST; .... and it's not quite right in the head, eh? ![]() I'd write to NIST -- but I doubt they even read their e-mail... hmmm...... I'm looking again at the page you linked to from Jstor; They give the first page to read -- and then the rest is purchase only. At 8 pounds, it looks to be rather expensive to just randomly read many articles from them. When the authors mention "a sketch of specializations and applications", do you know what they are referring to? I am puzzling over what happened a bit, and am interested in the other terms... Jstor breaks it into three lines, the first line is identical to what I wrote for un-correlated; the remaining two lines, then are for the correlated cases. Now, it's obvious that when there is no correlation -- that anything multiplied by a correlation constant of zero, goes away.... Third line goes *poof*. Q1: In the Jstor article, is the covariance constant a Pearson value, or something else? But the second line is not so obvious; This is why I had to trust what you said regarding what did and didn't go away.... (I'm humble, I've made my 10 mistakes already...) I am thinking that it goes to zero because the expectation value of Δx is by definition zero, and the same is true of delta y. I showed in the first part of my proof that for uncorrelated Gaussians, the mean of a product is the product of the means. But, once we have a term like Δy * Δy , THAT is NO longer a Gaussian, and so I am no longer sure why: E( Δx Δy**2 ) = 0; Assuredly E( Δx * Δy ) WOULD be zero; but I don't know about the combination.What do you think, Is it going to be true in general, or is it because there are two partial products that interact? E(x)*E( Δx Δy**2 ) + E(y)*E( Δx Δy**2 ) = 0. In my own graphed examples... I was noticing skew (and kurtosis?) in the final products of Gaussians (previous post), and I am also interested in knowing if there are generalized / simple ways to determine how skewed or how much kurtosis one can expect from a particular multiplication.... |
| Jul1-12, 03:35 AM | #32 |
|
|
![]() But yeah, I myself learn techniques to evaluate how efficient code should be and blah, blah but, at the end of the day, turns out that hardware, compiler, settings end up having such impact that "non-optimal" code turns out to be the fastest one. As one of my favorite wisdoms goes "Theory is better than practice in theory, but practice is better that theory in practice" (I think I already posted this wisdom in this thread... geee I'm growing old) You should place conditions in a way that you check first the most likely scenario, so in second place instead if (z>spin2) which barely happens you should check if (z<=spin2) for regions A2,A3,error... so check the order of the other if statements too and make sure that likely comes first. One comment, you use exp(-0.5*z*z) when if (z>spin) and if (z<=spin2) so roughly in [1..2], then you transform z = spin2 - z; so roughly again z is now in [0..1] and then, again -0.5*z*z is in [-0.5..0] so you calculate exp([-0.5..0]). So since you know this instead approximating exp() you should only need approximate exp within [-0.5..0]. In a parallel thread I asked about ways to approximate functions with ratios of polynomials and someone mentioned: Pade Aproximant Turns out that there are books with chapters dedicated only on how to approximate the exp function with this kind of ratios. Will this be faster than just using exp? I don't have that feeling but, anyway, there you go. |
| Jul1-12, 04:13 AM | #33 |
|
|
![]() First time I didn't see the gov in the URL and I just thought it was a random guy just copying the wrong formula but... reaaaaaaally? So many people rely in this formula? That's worrying because actually the error makes the variance over optimistic, so if some engineer out there is using this to assess risk in the device he is designing we are in trouble!! I mean, really?Physics departments too? Well, they say this year the LHC might announce the discovery of the Higgs boson, if they are using this formula somewhere you can be a hell of party popper "Yeah... LHC? Yeah, Andrew speking, I was bored and then I discovered the Internet is wrong so, yeah, since you already made a clown of yourselves with the faster than light neutrino thingy I just thought I might spare you some further embarrassment.... Have a nice day" ![]() |
| Jul3-12, 04:52 PM | #34 |
|
|
http://en.wikipedia.org/wiki/Propagation_of_uncertainty Their formula is: [tex]f = AB[/tex] [tex] \left(\frac{\sigma_f}{f}\right)^2 = \left(\frac{\sigma_A}{A}\right)^2 + \left(\frac{\sigma_B}{B}\right)^2 + 2\frac{\sigma_A\sigma_B}{AB}\rho_{AB}[/tex] So, if pair A,B is *un-correlated*, [itex]\rho_{AB}[/itex] is zero ... this formula looses product of deviations term, as well. Assuming Goodman is right (and I agree with him, and NIST actaully cites him...but doesn't quote him correctly...) -- then, the problem seems to show up where the technique of using partial derivatives to compute the error propagation formula is used; eg: as is shown in the Wikipedia article section "Non-linear Combinations". What do you think? I'm not so sure this is important for many typical applications, as the error term is going to be a small fraction of the total standard deviation; but I'd be interested in looking at LHC's work to see what formula they do use... (Not that I'd correct them if they were wrong ... I'd just like to see if they use it at all.) CERN's site, though, doesn't appear to have direct links to open papers showing their methods. I do like sites like ARXIV, as they allow/encourage public inspection of their work ... I wish there were more establishments that did that.
|
| New Reply |
| Thread Tools | |
Similar Threads for: Expectation value for CDF fails...
|
||||
| Thread | Forum | Replies | ||
| express moment / expectation value in lower order expectation values | General Math | 3 | ||
| C++ conditional testing fails, why? | Programming & Comp Sci | 4 | ||
| The Expectation of X and the Expectation of X squared (discrete math) | Calculus & Beyond Homework | 2 | ||
| If GP1 fails to favour GR | Cosmology | 9 | ||
| When L'Hopital's Rule fails... | Calculus | 1 | ||