Looking for empirical equation in experimental data

AI Thread Summary
An engineering student is seeking an empirical equation for experimental data in nuclear research, specifically looking at a variable G dependent on A, B, and C. Initial attempts to model G as a simple linear combination of the variables proved inadequate due to non-constant slopes. Suggestions included using interaction terms and least-squares fitting to find a more accurate representation of the data. The student found that a more complex model, incorporating higher-order terms, yielded better correlation with measured values, though concerns about outliers in the data remain. The discussion emphasizes the importance of model complexity in accurately capturing the relationships between variables in empirical research.
Curran919
I am an engineering student working in nuclear research. I am running some experiments to find an empirical equation to apply to results from a test section, but am having trouble making a mental leap. Here is the core of the problem with all of the engineering 'fat' trimmed off:

I have a variable that depends on three others:
G = f(A,B,C)

I have shown that G is more or less linear with respect to each variable for multiple values of the other two (sorry, I'm an undergrad engineer, so my mathematical notation is lacking):
G=f(A) of O(1) for every B,C
G=f(B) of O(1) for every A,C
G=f(C) of O(1) for every A,B


I would like to say that because of this,
G = f(A)+g(B)+h(C)
or even,
G = aA+bB+cC+d where a,b,c,d are constants

but this would only be true if the slope of f(A) were constant regardless of B and C (and likewise for g(B) and h(C)). Of course, it isn't. Is what I've said correct, and if so, is there an alternative conclusion I can draw?

G = (A-a)(B-b)(C-c)?
 
A common model for empirical work is the linear model. Where things don't fit so well, the experimenter can include interaction terms. Thus, the model might be (for two independent variables x and y)

z = a*x + b*y + c*x*y

Here a, b, and c are constants that need to be fit to the data. You might check some books on the Design of Experiments, as this kind of modeling is often done in that context.
 
Consider the function f(A,B,C) = AB+AC+BC. It is linear in each variable separately, but it is not globally approximated by anything of the form aA+bB+cC for constants a, b and c. If you are only interested in local behavior, you should restrict A, B and C to a small enough region. Then you may be able to get an approximate linear form.
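To make this concrete, here is a short worked calculation using only the example function above; the reference point (A_0, B_0, C_0) is a symbol introduced here for illustration.

```latex
f(A,B,C) = AB + AC + BC

\frac{\partial f}{\partial A} = B + C, \qquad
\frac{\partial f}{\partial B} = A + C, \qquad
\frac{\partial f}{\partial C} = A + B

f(A,B,C) \approx f(A_0,B_0,C_0) + (B_0+C_0)(A-A_0) + (A_0+C_0)(B-B_0) + (A_0+B_0)(C-C_0)
```

With B and C held fixed, f is a straight line in A with slope B+C, so the slope changes whenever B or C changes. That is exactly the behavior described in the question, and it is why no single set of constants works globally, while the first-order expansion in the last line is a reasonable linear form close to (A_0, B_0, C_0).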

If you know some linear algebra, you can find the linear expression that is "closest" to your set of data points, in the sense that the sum of the squared differences between the data and the values of the linear expression is minimized. If you have values for f(x,y,z) at (x_1,y_1,z_1), (x_2,y_2,z_2),...,(x_n,y_n,z_n), find the least-squares solution to Mx=b, where
M = \begin{bmatrix} x_1 & y_1 & z_1 & 1 \\ x_2 & y_2 & z_2 & 1 \\ \vdots & \vdots & \vdots & \vdots \\ x_n & y_n & z_n & 1 \end{bmatrix}

and

b = \begin{bmatrix} f(x_1,y_1,z_1) \\ f(x_2, y_2, z_2) \\ \vdots \\ f(x_n , y_n , z_n) \end{bmatrix}.

That is, solve the normal equations M^TMx = M^Tb.

The solution x = \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} then gives an approximation f(x,y,z) \approx ax+by+cz+d on these data points. (The vector x here is the unknown in Mx=b, not the coordinate x.) The more "linear" your function behaves, the better the approximation.
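The linear fit described above can be sketched in Python with NumPy. This is only an illustration: the sample points and the "true" coefficients are made up so that the result can be checked, and `np.linalg.lstsq` is used to obtain the least-squares solution rather than forming M^TM by hand.

```python
import numpy as np

# Hypothetical sample points (x_i, y_i, z_i); replace with the measured settings.
rng = np.random.default_rng(0)
pts = rng.uniform(0.0, 10.0, size=(20, 3))

# Synthetic "measurements" from a known linear law (made-up a, b, c, d),
# so the recovered coefficients can be compared with the truth.
true_coeffs = np.array([2.0, -1.0, 0.5, 3.0])

# Design matrix M with rows [x_i, y_i, z_i, 1], and right-hand side b.
M = np.column_stack([pts, np.ones(len(pts))])
b = M @ true_coeffs

# Least-squares solution of M x = b; coeffs holds [a, b, c, d].
coeffs, residuals, rank, _ = np.linalg.lstsq(M, b, rcond=None)
print(coeffs)
```

With real data, b would be the vector of measured values, and the fitted coefficients would of course not reproduce the data exactly.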

If you suspect the function has some other form, such as a higher-degree polynomial or a linear combination of entirely different functions, the same approach works. If you think the function is approximately a linear combination of the functions g_1(x,y,z),...,g_k(x,y,z), use \begin{bmatrix} g_1(x_i,y_i,z_i) & \ldots & g_k(x_i,y_i,z_i) \end{bmatrix} as the i'th row of the matrix M, and solve for a k-vector x, whose entries are the coefficients of the functions. The linear form corresponds to the case where g_1(x,y,z) = x, g_2(x,y,z) = y, g_3(x,y,z) = z, and g_4(x,y,z) = 1.

Often M^TM will be invertible, giving the unique solution x = (M^TM)^{-1}M^Tb, and inverting it is not very difficult because M^TM is a k x k matrix, where k is the number of functions you are considering. You should probably limit the size of your data set so you can multiply the matrices without difficulty. Hope this helps, good luck.
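A minimal sketch of the general recipe, assuming NumPy: each basis function g_j fills one column of M, and the normal equations M^TMx = M^Tb are solved directly. The helper name `fit_basis`, the sample points, and the choice of basis are all hypothetical; the test data come from the earlier example f = xy+xz+yz so that the expected coefficients are known.

```python
import numpy as np

def fit_basis(points, values, basis):
    """Coefficients c minimizing sum_i (values_i - sum_j c_j * g_j(x_i, y_i, z_i))^2."""
    x, y, z = points.T
    # Column j of M holds g_j evaluated at every data point.
    M = np.column_stack([g(x, y, z) for g in basis])
    # Normal equations: (M^T M) c = M^T b.
    return np.linalg.solve(M.T @ M, M.T @ values)

# Hypothetical basis: pairwise interaction terms, linear terms, and a constant.
basis = [
    lambda x, y, z: x * y,
    lambda x, y, z: x * z,
    lambda x, y, z: y * z,
    lambda x, y, z: x,
    lambda x, y, z: y,
    lambda x, y, z: z,
    lambda x, y, z: np.ones_like(x),
]

# Made-up sample points, with values taken from f = xy + xz + yz.
rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(30, 3))
x, y, z = points.T
values = x * y + x * z + y * z

coef = fit_basis(points, values, basis)
print(np.round(coef, 6))
```

Forming M^TM explicitly mirrors the formula in the post; for poorly conditioned problems a routine such as `np.linalg.lstsq` applied to M directly is generally the more robust choice.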
 
Thanks, Jarle, very helpful.

Indeed, using f(x,y,z) \approx ax+by+cz+d gave a poor correlation between the estimated and the measured readings. I tried:

f(x,y,z) \approx axyz+bxy+cxz+dyz+ex+fy+gz+h

and the correlation appears much better. I think I have some outliers in the measurement data, so I will remove a few instances and see what happens. Is there an underlying explanation for the terms that I used, or is it just a mathematical catch-all (or more terms ≈ less error)? I tried nixing the terms that seemed to have a low correlation, which was okay for axyz, but removing any of the second-order terms introduced considerable error.
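One way to look at the "more terms ≈ less error" question is to compare the two models on the same data. The eight monomials above are exactly the terms a function can contain if it is linear in each variable separately (expanding (A-a)(B-b)(C-c) produces the same eight), and because the four-term linear model is contained in the eight-term one, the least-squares residual on the fitted points can never go up when the extra terms are added. The sketch below, assuming NumPy and entirely made-up data, illustrates this; the helper `rms_residual` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
x, y, z = rng.uniform(0.0, 5.0, size=(3, 40))

# Made-up "measurements": a trend with interaction terms plus small noise.
noise = rng.normal(0.0, 0.1, size=x.size)
g = 1.5 * x * y - 0.8 * y * z + 0.3 * x * z + 2.0 * x + 1.0 + noise

one = np.ones_like(x)
M_lin = np.column_stack([x, y, z, one])                               # ax+by+cz+d
M_full = np.column_stack([x * y * z, x * y, x * z, y * z, x, y, z, one])  # 8 terms

def rms_residual(M, b):
    """Root-mean-square misfit of the least-squares fit of M @ coef to b."""
    coef, *_ = np.linalg.lstsq(M, b, rcond=None)
    return float(np.sqrt(np.mean((M @ coef - b) ** 2)))

rms_lin = rms_residual(M_lin, g)
rms_full = rms_residual(M_full, g)
print(f"linear model RMS residual:  {rms_lin:.3f}")
print(f"8-term model RMS residual:  {rms_full:.3f}")
```

A smaller residual on the fitted points does not by itself show that an extra term is physically meaningful; checking the fit against measurements that were left out of the fit is a common way to tell the two cases apart.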
 