Summing simple histograms to recreate a more complex one

In summary, the conversation discusses a problem of recreating a complex histogram by summing various simpler histograms from a database. The problem is similar to the animal feed mixing problem and can be modeled as a linear programming problem. However, there may not be a unique solution and other criteria may need to be considered. The problem becomes more complicated if there are histograms of different physical properties and the physical properties may not be additive.
  • #1
Master Sidoshi
I wouldn't be surprised if I've posted in the wrong section because in fact the reason for posting is to get help naming this problem. That being the first step to knowing where to look for a solution. Newbie to the forum so open to advice.

The problem: I have a complex histogram and a database of 600 or so less complex histograms. I know that all the 'ingredients' (less complex histograms) I need to recreate the complex one are in my database. Some histograms in my db have unique X/Y data pairs ('bars'), so they definitely belong in my 'reconstructed' histogram; the other bars or X/Y data pairs will have to be recreated by summing various 'ingredients' from my database. I need to iteratively sum histograms from my database until I have recreated the original histogram as closely as possible. What class of problem is this?

Real-world application: I have a perfume and I want to recreate it. In the laboratory I acquire a spectrum of the perfume (wavelength vs. intensity, chemical shift vs. intensity, m/z vs. intensity, etc.; basically x/y data points or histograms). In my database I have spectra (histograms) of Sandalwood Extract, Vanilla, Jasmine, Patchouli, Neroli, etc. I want to sum the Vanilla, Jasmine, etc. 'histograms' in various combinations/iterations until I've recreated the original perfume histogram.

NB: The Vanilla, Jasmine and other 'ingredient' histograms will each have multiple 'peaks' or 'bars', say a dozen, which cannot vary in relative intensity (y-axis); those relative values are fixed. Lemon oil and orange oil will have some x-axis values that overlap, so their intensities (y values) at those x values will sum if both are used in the final solution. The final solution is a histogram with hundreds of peaks.
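The constraint described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `mix_spectra`, the peak positions and the intensities are all invented, and each spectrum is assumed to be stored as a mapping from x value to intensity. Each ingredient is scaled by a single amount, so its relative peak heights stay fixed, and peaks that share an x value simply add.

```python
from collections import defaultdict

def mix_spectra(spectra, amounts):
    """Sum ingredient spectra, each scaled by one amount (hypothetical helper)."""
    total = defaultdict(float)
    for name, amount in amounts.items():
        for x, intensity in spectra[name].items():
            # One scale factor per ingredient keeps its relative peak heights fixed.
            total[x] += amount * intensity
    return dict(total)

# Invented peak lists: x position -> intensity.
spectra = {
    "lemon oil": {100: 3.0, 120: 1.0},
    "orange oil": {120: 2.0, 150: 4.0},
}

# The peak at x = 120 is shared, so the two contributions add there.
mixture = mix_spectra(spectra, {"lemon oil": 2.0, "orange oil": 0.5})
```

The unknowns in the real problem are the amounts; the spectra themselves are fixed data.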

I don't even know where to begin looking for a solution as I don't know what the problem is called. The best tags I could come up with were 'iterative' and 'optimization'.
 
  • #2
This is very much like the animal feed mixing problem, in which the relative proportions of numerous ingredients with known nutrition component profiles and costs are set so that the resultant feedstock has the required overall nutrition component profile and the cost of manufacture is minimised.

The actual optimisation calculations are done on the computer using linear programming methods.
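For reference, one common way to write the feed mixing problem as a linear programme is shown here. The notation is an assumption, not taken from the thread: x_j is the amount of ingredient j, c_j its unit cost, a_ij the amount of nutrient i per unit of ingredient j, and b_i the required amount of nutrient i in the mix.

```latex
\begin{aligned}
\text{minimise}\quad   & \sum_{j} c_j x_j \\
\text{subject to}\quad & \sum_{j} a_{ij}\, x_j = b_i \quad \text{for every nutrient } i, \\
                       & x_j \ge 0 \quad \text{for every ingredient } j .
\end{aligned}
```

In the histogram setting, i would index the bins (x values), a_ij would be the height of ingredient j's histogram in bin i, and b_i the height of the target histogram in that bin. Many feed formulations use inequalities (at least b_i) rather than equalities; the equality form is written here because the aim is to reproduce a target profile.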
 
  • #3
Master Sidoshi said:
I want to sum the Vanilla, Jasmine, etc. 'histograms' in various combinations/iterations until I've recreated the original perfume histogram.

A tempting way to model your problem is to say it amounts to expressing a vector (the target histogram) as a linear combination of other vectors (the ingredient histograms), where all the coefficients in the linear combination are non-negative numbers representing the amount of each vector that is used in the combination.
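When the match cannot be exact (noise, missing ingredients), a common stand-in for this model is non-negative least squares: choose non-negative coefficients that minimise the squared difference between the mixture and the target. The sketch below uses projected gradient descent in plain Python. It is one possible approach, not something proposed in the thread; the function name and the spectra are invented, and every spectrum is assumed to be sampled on the same bins. For a database of 600 spectra, a library routine such as `scipy.optimize.nnls` would likely be more practical.

```python
def nnls_projected_gradient(ingredients, target, iterations=2000):
    """Minimise ||sum_j w_j * ingredients[j] - target||^2 subject to w_j >= 0.

    Hypothetical helper; assumes at least one ingredient has a non-zero entry.
    """
    n, m = len(ingredients), len(target)
    # 1 / (sum of all squared entries) never exceeds 1 / (largest eigenvalue
    # of A^T A), so this fixed step size cannot overshoot.
    step = 1.0 / sum(v * v for spec in ingredients for v in spec)
    weights = [0.0] * n
    for _ in range(iterations):
        mix = [sum(weights[j] * ingredients[j][i] for j in range(n)) for i in range(m)]
        residual = [mix[i] - target[i] for i in range(m)]
        for j in range(n):
            grad = sum(ingredients[j][i] * residual[i] for i in range(m))
            # Gradient step, then clip at zero to keep the coefficient non-negative.
            weights[j] = max(0.0, weights[j] - step * grad)
    return weights

# Invented five-bin spectra; lemon and orange overlap in bins 1 and 2.
vanilla = [4.0, 0.0, 1.0, 0.0, 0.0]
lemon = [0.0, 3.0, 2.0, 0.0, 1.0]
orange = [0.0, 1.0, 2.0, 3.0, 0.0]

# Target built as 2*vanilla + 1*lemon + 0.5*orange, so the fit can be checked.
perfume = [8.0, 3.5, 5.0, 1.5, 1.0]

weights = nnls_projected_gradient([vanilla, lemon, orange], perfume)
```

In this toy case the three spectra are linearly independent, so the coefficients are recovered uniquely (approximately 2, 1 and 0.5). With hundreds of overlapping ingredients that is not guaranteed.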

In many real-life problems of this nature, there is no unique solution unless you add other requirements. For example, if you assign a cost to each ingredient, you could ask for the least costly combination of the ingredients that produces the desired final histogram. That would cast the problem as a "linear programming" problem.
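A toy example may make the non-uniqueness concrete. Everything in it is invented for illustration: two pure ingredients A and B, a pre-blended ingredient C whose histogram equals A + B, and made-up costs. Any weights of the form (t, t, 2 - t) with 0 <= t <= 2 reproduce the target exactly, and because the total cost is linear in t, the cheapest choice lies at one end of that range. The sketch checks a few members of the family by hand; it is not a general linear programming solver.

```python
def mix(ingredients, weights):
    """Weighted bin-by-bin sum of equally binned histograms (hypothetical helper)."""
    bins = len(ingredients[0])
    return [sum(w * spec[i] for w, spec in zip(weights, ingredients)) for i in range(bins)]

# Invented two-bin histograms: A, B, and a blend C equal to A + B.
ingredients = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = [2.0, 2.0]
costs = [1.0, 1.0, 3.0]  # invented cost per unit of A, B, C

# Three members of the solution family (t, t, 2 - t), including both endpoints.
candidates = [(t, t, 2.0 - t) for t in (0.0, 1.0, 2.0)]
all_match = all(mix(ingredients, w) == target for w in candidates)

def total_cost(weights):
    return sum(c * w for c, w in zip(costs, weights))

# The cost criterion singles out one of the otherwise equivalent recipes.
cheapest = min(candidates, key=total_cost)
```

With these costs the recipe that avoids the blend is cheapest; if C cost less than A and B together, the opposite endpoint would win.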

If you don't have a simple function (such as total cost) to minimize, then you need to decide what to do if there are many possible ways of achieving the desired total histogram. Do you have other criteria that would make one solution more plausible or desirable than another? You might get some hints from the mathematics used in the statistical problem of representing a probability distribution as a "mixture" of other distributions.

The problem is more complicated and more interesting if you have histograms of different physical properties, i.e. if you measure N different physical quantities, giving you N histograms for the unknown substance and N histograms for each of the known substances.

You must also consider whether the physical properties are actually additive. For example, light emitted when the atoms of one compound are excited might be absorbed by another compound in solution with it.
 

1. How can summing simple histograms recreate a more complex one?

If the measured property is additive, the complex histogram is a weighted sum of the simpler ones: each ingredient histogram is scaled by a non-negative amount and the scaled histograms are added bin by bin.

2. Can this method be used for any type of data?

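Only if the individual histograms are calculated using the same parameters and bin sizes and the underlying physical property is additive. If one component absorbs or otherwise alters the signal of another, a simple sum will not reproduce the mixture.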

3. What is the benefit of creating a more complex histogram?

A more complex histogram can provide a more detailed and accurate representation of the data, allowing for better analysis and understanding of the underlying patterns and trends.

4. Are there any limitations to summing simple histograms?

One limitation is that the decomposition may not be unique: several different combinations of ingredients can produce the same total, so extra criteria such as cost may be needed to choose among them. Additionally, if the measured property is not additive, the summed histogram may not accurately represent the mixture.

5. How does the number of simple histograms affect the complexity of the final histogram?

The more simple histograms that are summed, the more complex the final histogram will be. However, there is a point of diminishing returns where adding more simple histograms does not noticeably change the final histogram.
