Trivariate CDF, solving for variable within CDF itself?

veejl · May 23, 2011

So I've spent the better part of the last 2 days reading your forums (awesome btw) as well as scouring Google and other sites for the past week, trying to figure out what to do.

I have this equation here:

[tex]p(\text{injury}) = \Phi\left(\frac{\ln(F) - 2m - 3a + b}{0.8}\right)[/tex]

probability(injury) = cumulative distribution function evaluated at (ln(Force) - 2(mass) - 3(age) + constant)/0.8.

I'm trying to figure out a way to solve that eqn for Force, such that I have F = ...

I am hoping to apply this to a data set containing cases of known injuries with mass/age given for each injury. I plan to set the probability(injury) to 1, since I know that an injury did occur. My output would be a force for each case.

I honestly have no clue what to do in solving the equation.
- If I hold mass/age as constants, then my eqn is pretty much useless since it only looks at Force (assuming that solving the CDF requires taking a derivative of that entire eqn).
- Initially I thought to ignore the CDF on a friend's advice that I am looking at a single point versus a cumulative probability. So I got this: F = e^{0.8*p(injury) + 2m + 3a - b}. But I think this too is incorrect and some other mathematical manipulations need to be happening.
- Tried to convert it into a PDF, but really not sure what that accomplished.
- Would "point probability" be a way to go? (http://en.wikipedia.org/wiki/Cumulative_distribution_function#Point_probability)

Some other things that might be useful:
- I know that injuries happen at a minimum Force. So that could be useful as a lower limit or bound.

I am also worried that I can't justify stating probability(injury)=1, even though it is known that injury did occur. Any thoughts on this point?

any help or insight would be utterly fantastic. thanks in advance!

bpet · May 23, 2011

If you mean p = Phi( (ln(F)-2m-3a+b)/0.8 ), then the equation is solved by
F = exp( 0.8*Phi^{-1}(p)+2m+3a-b ), where Phi^{-1}(p) is the inverse function of the CDF, also called the quantile function. It depends on what Phi is, but if it is the normal distribution CDF then there is no simple closed-form expression for its inverse. It can still be calculated easily in most computing packages, for example with NORMSINV() in Excel.
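The inversion above can be sketched in Python as a sanity check. This is only an illustration: it assumes Phi is the standard normal CDF, uses `statistics.NormalDist` from the standard library in place of Excel's NORMSINV(), and the function names and the values of mass, age, and b are hypothetical placeholders that do not come from the thread.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def injury_probability(force, mass, age, b):
    """p = Phi((ln(F) - 2m - 3a + b) / 0.8), the formula from the first post."""
    z = (math.log(force) - 2 * mass - 3 * age + b) / 0.8
    return STD_NORMAL.cdf(z)


def force_for_probability(p, mass, age, b):
    """F = exp(0.8 * Phi^{-1}(p) + 2m + 3a - b), defined only for 0 < p < 1."""
    return math.exp(0.8 * STD_NORMAL.inv_cdf(p) + 2 * mass + 3 * age - b)


# Hypothetical inputs, chosen only so the numbers stay readable.
mass, age, b = 1.5, 2.0, 5.0

force = force_for_probability(0.9, mass, age, b)
round_trip = injury_probability(force, mass, age, b)
print(force, round_trip)  # round_trip should reproduce the 0.9 we started from
```

Feeding the recovered force back into the original formula is a cheap way to confirm that the algebra and the quantile function agree.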

Stephen Tashi · May 23, 2011

I'll re-state the problem, as I understand it.

The data underlying the problem consists of vectors, each with 4 components. The components are (J,A,M,F) defined by:

J : 1 if there was an injury in the incident and 0 otherwise.
A: age of person in the incident
M: mass involved in the incident
F : force involved in the incident

The probability that J is 1 is a known function Phi(A,M,F).

You only have data of the form: (1,a,m,?) where the 1 indicates there was an injury in the incident, a is the age of the person in incident and m is the mass involved in the incident. The "?" indicates that the force involved in the incident is unknown.

You want "an equation for force" as a function of age and mass. Age and mass don't determine a unique force. If you could find the probability distribution of force as a function of age and mass, you could state a single number like the average force or most probable force.

Setting p(injury) = 1 and solving for f in terms of m and a can't be justified by any mathematical reasoning that I see. I think it would tend to overestimate the force involved since, intuitively, that's trying to figure out how much force is needed to cause injury with certainty.
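To see the overestimation concretely, here is a small sketch that solves the posted formula for F at probabilities approaching 1. It assumes Phi is the standard normal CDF and uses made-up values for mass, age, and b; the helper name is hypothetical. Under those assumptions the required force grows without bound as p approaches 1, and p = 1 itself has no finite solution.

```python
import math
from statistics import NormalDist, StatisticsError


def force_for_probability(p, mass, age, b):
    # F = exp(0.8 * Phi^{-1}(p) + 2m + 3a - b), with Phi the standard normal CDF
    return math.exp(0.8 * NormalDist().inv_cdf(p) + 2 * mass + 3 * age - b)


mass, age, b = 1.5, 2.0, 5.0  # hypothetical values, for illustration only

levels = (0.5, 0.9, 0.99, 0.999999)
forces = [force_for_probability(p, mass, age, b) for p in levels]
for p, f in zip(levels, forces):
    print(f"p = {p}: F = {f:.1f}")

# The standard normal quantile function is undefined at p = 1 (it diverges),
# so the standard library refuses to evaluate it there.
try:
    force_for_probability(1.0, mass, age, b)
    p_one_is_finite = True
except StatisticsError:
    p_one_is_finite = False
print("finite force at p = 1:", p_one_is_finite)
```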

I'm not sure this problem is solvable. If it is, I think the solution involves using the data to estimate the joint distribution of (f,m,a). This probably involves making some further assumptions about the data. This is an interesting problem. I'll continue to think about it.

veejl · May 24, 2011

Hm.. interesting points.

@bpet - that's correct, I'm looking at a normal distribution CDF

@Stephen Tashi - The original eqn I'm working from is a probability distribution function of injury, taking into account force, age, and mass. I have probability curves from this, plotting force vs probability of injury, at 3 different ages/masses. Perhaps that helps?

Stephen Tashi · May 24, 2011

veejl said:

Hm.. interesting points.

@bpet - that's correct, I'm looking at a normal distribution CDF

Keep in mind that you are not using the formula for the CDF of a normal distribution as a CDF. You are using it for an entirely different purpose (to set the parameter of a Bernoulli random variable). So the theory of the normal distribution has nothing to do with your question.

@Stephen Tashi - The original eqn I'm working from is a probability distribution function of injury, taking into account force, age, and mass. I have probability curves from this, plotting force vs probability of injury, at 3 different ages/masses. Perhaps that helps?

Such plots have no information that is not already in the formula that you gave, unless they indicate the limits placed on the variables. For example, can age be greater than 100?

I don't know whether solving this problem is something that will merely be written up in a school term paper or whether you need an answer for some important practical purpose. This is not a simple problem. As best I can tell, it involves estimating the joint distribution of three variables from data that does not directly give the values of all three variables. You won't find this solved in an introductory statistics book.

veejl · May 25, 2011

Ok, I apprec the advice.

I've decided to tweak my original project idea a bit and work with the formula as is. So I will input injury force, and keep ramping that up until my probability of injury is >.5.

Would that be possible? And if so, any tips on tackling it?

Right now, in Excel, I have the following:
=NORM.DIST((ln(F)-2m-3a+b)/.8, 0, 1, TRUE), where mass(m) and age(a) are pulling information from a specific column, and force(F) is pulling from a specific cell that I am inputting a force into.

However, I think this gives the cumulative probability from 0-->F, which is not what I want. I need to find out at a given force, what is the probability of injury.

Any tips with this route?
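For reference, the force at which the probability first exceeds .5 can be written down without ramping, assuming Phi is the standard normal CDF (so that it is strictly increasing and Phi(0) = 0.5). A short sketch of the algebra, with z standing for the argument of Phi:

[tex]z = \frac{\ln(F) - 2m - 3a + b}{0.8}, \qquad \Phi(z) > 0.5 \iff z > 0 \iff \ln(F) > 2m + 3a - b \iff F > e^{2m + 3a - b}[/tex]

Under that assumption the 0.8 scale factor drops out at the 50% level, because dividing by a positive constant does not change the sign of z.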

Stephen Tashi · May 25, 2011

veejl said:

Ok, I apprec the advice.

I've decided to tweak my original project idea a bit and work with the formula as is. So I will input injury force, and keep ramping that up until my probability of injury is >.5.

Would that be possible? And if so, any tips on tackling it?

Perhaps the purpose of spreadsheets is for people to fool around with them without actually knowing what they are doing. In that sense, anything is possible.

The method you propose has no logical or mathematical basis. If there is some serious purpose behind your work, I suggest you hire a qualified consultant. (Apparently, you aren't going to think enough about probability theory to understand the problem yourself.) If the purpose of your work is not that serious then it's fine to enjoy doing various random calculations with Excel.

bpet · May 27, 2011

veejl said:

...Right now, in Excel, I have the following:
=NORM.DIST((ln(F)-2m-3a+b)/.8, 0, 1, TRUE), where mass(m) and age(a) are pulling information from a specific column, and force(F) is pulling from a specific cell that I am inputting a force into.

However, I think this gives the cumulative probability from 0-->F, which is not what I want. I need to find out at a given force, what is the probability of injury.

The Excel formula does give the probability of injury as a function of force. The purpose of the CDF here is to convert a number in the range (-inf, inf) to a probability in (0, 1).
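A rough Python equivalent of that spreadsheet cell makes the point visible. It assumes the standard normal CDF (what NORM.DIST(z, 0, 1, TRUE) computes) and uses hypothetical values for mass, age, and b. Each force maps to one probability strictly between 0 and 1, and the probability rises as the force rises.

```python
import math
from statistics import NormalDist


def injury_probability(force, mass, age, b):
    # Same expression as the Excel cell: NORM.DIST(z, 0, 1, TRUE) is Phi(z).
    z = (math.log(force) - 2 * mass - 3 * age + b) / 0.8
    return NormalDist(0, 1).cdf(z)


mass, age, b = 1.5, 2.0, 5.0  # hypothetical values, for illustration only

forces = [1, 10, 50, 100, 1000]
probabilities = [injury_probability(f, mass, age, b) for f in forces]
for f, p in zip(forces, probabilities):
    print(f"F = {f:>5}: p(injury) = {p:.4f}")
```

With these placeholder inputs the probability crosses 0.5 between F = 50 and F = 100, which is where ln(F) equals 2m + 3a - b.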

Stephen Tashi said:

...The method you propose has no logical or mathematical basis...

It's a probit model, perfectly valid!

chiro · May 27, 2011

So you want to find some kind of functional equation for F (i.e. force)?

From what you have posted the force is based on at least 3 variables (a,m,b) and another "injury" variable.

What kind of relationship do you want to find? Do you want an expression that is generic for all general a,b,m and injury, or do you want constraints on your function (for example fixing m to be a constant)?

Also, with regard to an injury actually occurring: the fact that something occurred does not mean its probability is 1. You have to remember that probability reflects a long-run experiment, in which the number of times something happens divided by the total number of trials converges to the probability. The only way you would get a probability of 1 is if every single time you tried something it happened.
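A quick simulation illustrates the long-run frequency idea. The value 0.3 is an invented injury probability for one fixed combination of force, mass, and age; it is not taken from the thread. Individual trials come out as injury or no injury, yet the fraction of injuries settles near 0.3 rather than near 1.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

true_p = 0.3  # hypothetical injury probability for one fixed (force, mass, age)


def relative_frequency(n_trials):
    # Each trial is a Bernoulli draw: injury with probability true_p.
    injuries = sum(random.random() < true_p for _ in range(n_trials))
    return injuries / n_trials


frequencies = {n: relative_frequency(n) for n in (10, 1_000, 100_000)}
for n, freq in frequencies.items():
    print(f"{n:>7} trials: fraction injured = {freq:.4f}")
```

Seeing one injury in the data corresponds to a single trial from this kind of experiment, so on its own it says very little about the underlying probability.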

Stephen Tashi · May 27, 2011

bpet said:

It's a probit model, perfectly valid!

I didn't say the model wasn't valid (although when you see probit models, they are usually based on such crude curve fits that it's hard to take them seriously). I said what he is doing with the model (to obtain "an equation for force") has no logical or mathematical basis.
