How would I correlate many variables to a few coefficients?

In summary: The poster has 550 asymmetrical sigmoid curves, each fitted to a function with 4 varying coefficients. The curves represent strength as a function of time and temperature for 550 different compounds, each made up of varying substances at varying concentrations. The goal is to relate the substances and their concentrations to the coefficients, but the poster is unsure how to do so with more than one substance, and suggests that fitting a system of multiple regression equations by least squares may be the solution. With only a few curves for each combination of substances, it may not be possible to do better than least squares.
  • #1
mcovalt
I have around 550 asymmetrical sigmoid curves fitted to a function with 4 varying coefficients. Each of these curves represents strength as a function of time and temperature for a different compound. Each compound is made up of varying substances at varying concentrations.

Overall, I have 550 curves of 550 different compounds, 20 substances, and each compound has no more than 7 of these 20 substances. I'm trying to correlate these substances and their concentrations with their coefficients.

If I had one substance at varying concentration, I know how I'd go about finding its correlation to the coefficients. I'd find the trend of:

coefficient_i(substance, concentration)

But I have no idea how I'd go beyond just one substance, and I've got a lot more than two! I'm hoping someone can point me in the right direction.
 
  • #2
mcovalt said:
I have around 550 asymmetrical sigmoid curves fitted to a function with 4 varying coefficients.

I'll suggest how to rewrite your question - see if the details are correct.

I'm trying to predict the "strength" vs "temperature" and "time" curves of some chemical compounds as a function of the concentrations of their component substances.

I have 20 different substances. From these, I created 550 chemical compounds by combining up to 7 of the substances. Among the different compounds, the concentrations of the various substances vary.

For each compound, I experimentally measured strength as a function of temperature and time. I have selected a particular family of functions to fit the experimental data. Each member of this family is specified by specifying the values of 4 constants.

I would like to predict the values of the 4 constants as a function of the compound's composition, i.e. the substances that are in it and their concentrations.

----

(Perhaps you don't want to predict the values of the 4 constants; perhaps you only want to find "correlations"?)
 
  • #3
Thank you Stephen! You have a way with words! That's exactly the scenario and problem.

Each experimentally produced curve follows an asymmetrical "S" shape. Each of the 550 curves has been fitted with an equation containing four constants, whose values are adjusted so that the function fits that curve.

I may have misused the word "correlation". My end goal is to produce four functions that estimate each of these four constants when given the substances, and their concentrations, of an imaginary compound.

Since writing the post, I believe I have stumbled upon an answer. I would use the method of least squares to fit a system of multiple regression equations. I believe this ought to do it.
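
A minimal sketch of that idea in Python with NumPy, run on synthetic data rather than the real measurements. It assumes that concentrations are expressed as fractions summing to 1 and that each coefficient depends linearly on them; both are assumptions for illustration only, and names such as `X`, `Y`, `B_hat` and `predict_coefficients` are hypothetical. Because the fractions in each row sum to 1, a separate intercept column would be redundant, so none is included.

```python
import numpy as np

rng = np.random.default_rng(0)
n_compounds, n_substances, n_coeffs = 550, 20, 4

# Synthetic stand-in for the real data: each compound contains between
# 1 and 7 of the 20 substances, with concentrations that sum to 1.
X = np.zeros((n_compounds, n_substances))
for row in X:                                   # each row is a view into X
    k = rng.integers(1, 8)                      # number of substances present
    present = rng.choice(n_substances, size=k, replace=False)
    weights = rng.random(k)
    row[present] = weights / weights.sum()

# Hypothetical "true" linear relationship plus a little noise, used only
# so that the example has something to recover.
B_true = rng.normal(size=(n_substances, n_coeffs))
Y = X @ B_true + 0.01 * rng.normal(size=(n_compounds, n_coeffs))

# One least-squares solve fits all four regression equations at once:
# column j of B_hat holds the weights that predict coefficient j.
B_hat, residuals, rank, _ = np.linalg.lstsq(X, Y, rcond=None)

def predict_coefficients(concentrations):
    """Estimate the 4 curve coefficients for a new composition."""
    return np.asarray(concentrations) @ B_hat

print(predict_coefficients(X[0]))
```

With real data, `X` would hold the measured concentrations and `Y` the four fitted constants for each of the 550 curves. If the dependence is not linear, extra columns (for example products of concentrations) could be appended to `X` in the same framework.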
 
  • #4
mcovalt said:
I would use the method of least squares to fit a system of multiple regression equations. I believe this ought to do it.

It might. Fitting equations to data by least squares is most often done when the variables being predicted have some sort of random error in their measurement. In your case, it isn't clear whether the error in fitting is mostly random. For example, if your experimental curves are very precise and a least squares fit produces a big error in the coefficients of one curve, it will always produce a big error for that curve, because the errors in the coefficients aren't random; they are fixed.
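
One way to see how large those errors are for compounds the fit has not seen is k-fold cross-validation. The sketch below is an illustration on synthetic data, not part of the original discussion; the function name `kfold_rmse` and the data are hypothetical. It does not by itself tell you whether the errors are random or fixed, only how big they are out of sample.

```python
import numpy as np

def kfold_rmse(X, Y, n_folds=5, seed=0):
    """Root-mean-square prediction error per output column, estimated
    by k-fold cross-validation of an ordinary least-squares fit."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(X))
    folds = np.array_split(order, n_folds)
    sq_err = np.zeros(Y.shape[1])
    for held_out in folds:
        train = np.setdiff1d(order, held_out)
        # Fit on the training rows only, then score the held-out rows.
        B, *_ = np.linalg.lstsq(X[train], Y[train], rcond=None)
        sq_err += ((X[held_out] @ B - Y[held_out]) ** 2).sum(axis=0)
    return np.sqrt(sq_err / len(X))

# Synthetic demonstration: a linear relationship plus noise of known size.
rng = np.random.default_rng(1)
X = rng.random((200, 5))
Y = X @ rng.normal(size=(5, 4)) + 0.1 * rng.normal(size=(200, 4))
rmse = kfold_rmse(X, Y)
print(rmse)
```

In this synthetic case the held-out error should come out close to the noise level of 0.1 that was put in. On real data, a held-out error much larger than the scatter you would expect from repeating an experiment suggests that the chosen regression form, not measurement noise, is the main source of error.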
 
  • #5


I understand your dilemma and it is a common challenge in correlating multiple variables to a few coefficients. In order to address this issue, there are a few approaches you can take:

1. Use statistical methods: You can use techniques such as multiple regression analysis to determine the relationship between multiple independent variables (substances and their concentrations) and the dependent variables (the four coefficients). This will allow you to identify which variables have the strongest influence on the coefficients and how they interact with each other.

2. Create a visual representation: You can also create a visual representation, such as a scatter plot or heat map, to visualize the relationship between the different substances, their concentrations, and the coefficients. This can help you identify any patterns or trends in the data that may not be evident from statistical analysis alone.

3. Conduct sensitivity analysis: This involves systematically varying the values of each independent variable while keeping the others constant, and observing the resulting changes in the coefficients. This can help you understand the individual contributions of each variable to the overall coefficients.

4. Consider machine learning techniques: You can also explore the use of machine learning algorithms, such as neural networks, to identify complex relationships between the variables and coefficients. This may be particularly useful if there are nonlinear relationships between the variables.
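
As an illustration of the sensitivity analysis in item 3, here is a minimal one-at-a-time sketch in Python. The function `toy_model` is a made-up stand-in for a fitted model of one curve coefficient, and all names are hypothetical.

```python
def one_at_a_time_sensitivity(model, baseline, step=0.01):
    """Finite-difference change in the model output per unit change in
    each input, varying one input at a time around `baseline`."""
    base_out = model(baseline)
    sensitivities = []
    for i in range(len(baseline)):
        perturbed = list(baseline)
        perturbed[i] += step
        sensitivities.append((model(perturbed) - base_out) / step)
    return sensitivities

# Hypothetical fitted model for one curve coefficient, for illustration
# only: linear in the first two concentrations plus an interaction term.
def toy_model(c):
    return 2.0 * c[0] - 1.0 * c[1] + 4.0 * c[0] * c[2]

print(one_at_a_time_sensitivity(toy_model, [0.5, 0.3, 0.2]))
```

For this toy model the three sensitivities come out close to 2.8, -1.0 and 2.0. Note that if concentrations are constrained to sum to a fixed total, one concentration cannot be changed while all the others stay constant, so the perturbation scheme would need to be adapted.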

In conclusion, correlating multiple variables to a few coefficients is a challenging task, but with the right approach and techniques, it is possible to gain insights and understand the relationships between these variables. I would recommend consulting with a statistician or data scientist for further guidance on the best approach for your specific data set.
 

Related to How would I correlate many variables to a few coefficients?

1) How do I choose the variables to include in my correlation analysis?

The best approach is to first identify the research question or hypothesis that you want to test. Based on this, select the variables that are most relevant and have a strong theoretical basis for being correlated. It is also important to consider the nature of the data and the sample size, as well as any potential confounding variables that may influence the results.

2) What is the significance of the correlation coefficient in a correlation analysis?

The correlation coefficient measures the strength and direction of the linear relationship between two variables. A value close to 1 indicates a strong positive correlation, while a value close to -1 indicates a strong negative correlation. A value of 0 indicates no linear correlation. The statistical significance of the coefficient is assessed with a p-value: the probability of obtaining a correlation at least as strong as the observed one if the two variables were in fact uncorrelated.
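
A small sketch of both quantities in plain Python, using made-up data. The p-value here is estimated with a permutation test, which is one of several ways to obtain it; the function names are hypothetical.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value: how often a random re-pairing of
    x and y gives |r| at least as large as the observed |r|. One is added
    to numerator and denominator so the estimate is never exactly zero."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    y_shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_shuffled)
        if abs(pearson_r(x, y_shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Made-up data: y rises with x, plus Gaussian noise.
rng = random.Random(42)
x = [i / 10 for i in range(30)]
y = [2.0 * xi + rng.gauss(0, 1) for xi in x]
r = pearson_r(x, y)
p = permutation_p_value(x, y)
print(r, p)
```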

3) Can I use correlation analysis to determine causation?

No, correlation does not imply causation. While a strong correlation between two variables may suggest a relationship, it does not prove that one variable causes the other. It is important to consider other factors and conduct further research to establish causality.

4) How can I interpret the results of a correlation analysis?

The results of a correlation analysis should be interpreted in the context of the research question or hypothesis. If the correlation is statistically significant and in line with the expected direction, it supports the research hypothesis. If the correlation is not significant or in the opposite direction, it may indicate that the variables are not related or that there are other factors at play.

5) What are the limitations of correlation analysis?

Correlation analysis is a useful tool for exploring relationships between variables, but it has its limitations. It does not establish causality, and the usual (Pearson) correlation coefficient measures only linear relationships. Non-linear or complex relationships may not be accurately captured by correlation analysis. Additionally, a correlation between two variables does not account for other variables that may influence their relationship.
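
The linearity limitation can be seen in a short Python example with made-up data, where y is completely determined by x and yet the Pearson coefficient is essentially zero:

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [i / 10 for i in range(-20, 21)]   # grid symmetric about zero
y = [xi ** 2 for xi in x]              # y depends on x exactly, but not linearly
r = pearson_r(x, y)
print(r)                               # zero up to floating-point rounding
```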
