Summing simple histograms to recreate a more complex one

  1. Jul 19, 2017 #1
    I wouldn't be surprised if I've posted in the wrong section, because my reason for posting is to get help naming this problem. That is the first step to knowing where to look for a solution. I'm new to the forum, so I'm open to advice.

    The problem: I have a complex histogram and a database of 600 or so less complex histograms. I know that all the 'ingredients' (less complex histograms) I need to recreate the complex one are in my database. Some histograms in my database have unique X/Y data pairs ('bars'), so they definitely belong in my 'reconstructed' histogram. The other bars (X/Y data pairs) will have to be recreated by summing various 'ingredients' from my database. I need to iteratively sum histograms from my database until I have recreated the original histogram as closely as possible. What class of problem is this?

    Real-world application: I have a perfume and I want to recreate it. In the laboratory I acquire a spectrum of the perfume (wavelength vs. intensity, chemical shift vs. intensity, m/z vs. intensity, etc.; basically x/y data points, or histograms). In my database I have spectra (histograms) of Sandalwood Extract, Vanilla, Jasmine, Patchouli, Neroli and so on. I want to sum the Vanilla, Jasmine, etc. 'histograms' in various combinations/iterations until I've recreated the original perfume histogram.

    NB: The Vanilla, Jasmine and other 'ingredient' histograms will each have multiple 'peaks' or 'bars', say a dozen, which cannot vary in relative intensity (y-axis); those relative values are fixed. Lemon oil and orange oil will have some x-axis values that overlap, so their intensities (y values) at those x values will sum if both are used in the final solution. The final solution is a histogram with hundreds of peaks.
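
    To make the setup concrete, here is a minimal Python sketch of how two ingredient histograms with an overlapping x value would add up. The peak positions, intensities and amounts are made-up numbers, and the function name `mix` is hypothetical; it only illustrates the assumption that intensities are additive.

```python
from collections import defaultdict


def mix(ingredients, amounts):
    """Sum ingredient histograms, each scaled by the amount used.

    ingredients: {name: {x_value: intensity}}, relative intensities fixed.
    amounts:     {name: non-negative amount of that ingredient}.
    """
    total = defaultdict(float)
    for name, amount in amounts.items():
        for x, y in ingredients[name].items():
            # Overlapping x values from different ingredients simply add.
            total[x] += amount * y
    return dict(total)


# Made-up spectra: both oils have a peak at x = 2.5.
ingredients = {
    "lemon oil": {1.0: 4.0, 2.5: 1.0},
    "orange oil": {2.5: 3.0, 4.0: 2.0},
}

perfume = mix(ingredients, {"lemon oil": 2.0, "orange oil": 1.0})
print(perfume)
```

    The task is the reverse of this: given only `perfume` and the database, find the amounts.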

    I don't even know where to begin looking for a solution as I don't know what the problem is called. The best tags I could come up with were 'iterative' and 'optimization'.
     
  3. Jul 19, 2017 #2

    Nidum

    Science Advisor
    Gold Member

    This is very much like the animal feed mixing problem, where the relative proportions of numerous ingredients with known nutrition component profiles and costs are set so that the resulting feedstock has the required overall nutrition component profile and the cost of manufacture is minimised.

    The actual optimisation calculations are done on a computer using linear programming methods.
     
    Last edited: Jul 19, 2017
  4. Jul 19, 2017 #3

    Stephen Tashi

    Science Advisor

    A tempting way to model your problem is to say it amounts to finding a way to express a vector as a linear combination of other vectors, where all the coefficients in the linear combination are non-negative numbers representing the amount of each vector that is used in the combination.
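
    One common name for fitting such a combination is non-negative least squares. The sketch below is only an illustration under stated assumptions: it uses plain projected gradient descent rather than a library solver, the three ingredient vectors and the target are made-up numbers on a shared 4-bin x axis, and the function name is hypothetical.

```python
def nnls_projected_gradient(ingredients, target, steps=2000):
    """Find weights w >= 0 minimising 0.5 * ||sum_j w[j]*ingredients[j] - target||^2."""
    n = len(ingredients)
    m = len(target)
    # The sum of squared entries bounds the largest eigenvalue of A^T A,
    # so its reciprocal is a safe (if conservative) step size.
    step = 1.0 / sum(v * v for ing in ingredients for v in ing)
    w = [0.0] * n
    for _ in range(steps):
        # Residual: current mixture minus target, bin by bin.
        residual = [
            sum(w[j] * ingredients[j][i] for j in range(n)) - target[i]
            for i in range(m)
        ]
        # Gradient of the objective with respect to each weight.
        grad = [
            sum(ingredients[j][i] * residual[i] for i in range(m))
            for j in range(n)
        ]
        # Gradient step, then project back onto w >= 0.
        w = [max(0.0, w[j] - step * grad[j]) for j in range(n)]
    return w


# Made-up ingredient histograms (fixed relative intensities per ingredient).
vanilla = [3.0, 1.0, 0.0, 0.0]
lemon = [0.0, 2.0, 4.0, 0.0]
orange = [0.0, 1.0, 0.0, 5.0]

# Target built as 2 * vanilla + 0.5 * lemon + 0 * orange.
target = [6.0, 3.0, 2.0, 0.0]

weights = nnls_projected_gradient([vanilla, lemon, orange], target)
print([round(x, 4) for x in weights])
```

    Here the ingredient vectors are linearly independent, so the weights are unique. With 600 overlapping ingredients that will generally not hold, which is the point of the next paragraph.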

    In many real-life problems of this nature, there is no unique solution unless you add other requirements. For example, if you assign a cost to each ingredient, you could ask for the least costly combination of the ingredients that produces the desired final histogram. That would cast the problem as a "linear programming" problem.
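
    Written out, the least-cost version might look as follows. The symbols are assumptions introduced here for illustration: a_{ij} is the intensity of ingredient j in bin i, b_i is the target histogram, c_j is a per-unit cost you would have to supply, and w_j is the unknown amount of ingredient j.

```latex
\begin{aligned}
\min_{w \in \mathbb{R}^n} \quad & \sum_{j=1}^{n} c_j w_j \\
\text{subject to} \quad & \sum_{j=1}^{n} a_{ij} w_j = b_i, \quad i = 1, \dots, m, \\
& w_j \ge 0, \quad j = 1, \dots, n.
\end{aligned}
```

    With measured spectra the equality constraints will rarely hold exactly, so in practice one would probably allow some deviation in each bin, for example by minimising the sum of absolute deviations, which can also be posed as a linear program.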

    If you don't have a simple function (such as total cost) to minimize, then you need to decide what to do if there are many possible ways of achieving the desired total histogram. Do you have other criteria that would make one solution more plausible or desirable than another? You might get some hints from the mathematics used in the statistical problem of representing a probability distribution as a "mixture" of other distributions.

    The problem is more complicated and more interesting if you have histograms of different physical properties, i.e. if you have N histograms of different physical quantities, giving you N total histograms for the unknown substance and N histograms for each of the known substances.

    You must also consider whether the physical properties are actually additive. For example, light emitted when the atoms of one compound are excited might be absorbed by another compound in solution with it.
     