
Creating a function based on data

  1. Aug 5, 2009 #1
    1. The problem statement, all variables and given/known data
    This isn't a homework question. It's something some others and I are looking into. Someone posted this function, and I'm not sure how he worked it out:

    Creating a function for Elvis' album sales based on the RIAA certifications of his albums that have been certified so far

    What he's done is use the figures from the certified albums to create an equation. If you substitute 1 for x in the function (x = 1 corresponds to 500,000 sales), you get the number of albums certified for over 500,000 sales. It's not exact, but it's a close enough function. How did he work that function out?

    2. Relevant equations

    3. The attempt at a solution

    I have no idea!
    Last edited: Aug 5, 2009
  3. Aug 7, 2009 #2


    Homework Helper

    Fitting a function to a set of data points is an age-old problem that is often as much art as it is science. Unless there is some theoretical basis that hints at the functional relationship, about the only option is intelligent guessing. Once you've zeroed in on a functional form, you can use least-squares analysis to determine its unknown parameters, provided there are more data points than unknown parameters.

    In this case, it isn't hard to tell that there is some sort of inverse relationship between y and x (as x gets bigger, y gets smaller). Thus a logical first guess might be y = a/x + b. After a few trial-and-error attempts, it isn't far-fetched to eventually try y = 1/(ax + b). Also, dividing x by 500,000 beforehand makes sense because the values are so large.
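One way to carry out the fit described above is to note that y = 1/(ax + b) is the same as 1/y = ax + b, so an ordinary least-squares line through the points (x, 1/y) gives a and b. The sketch below assumes that approach; it is not necessarily how the posted function was obtained. The helper names `linear_least_squares` and `fit_reciprocal` are made up for illustration, and the data is synthetic, since the thread's actual figures are not reproduced here.

```python
def linear_least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line through (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_reciprocal(xs, ys):
    """Fit y = 1/(a*x + b) by fitting a straight line to 1/y against x."""
    inv_ys = [1.0 / y for y in ys]
    return linear_least_squares(xs, inv_ys)


# Synthetic data generated from y = 1/(0.01*x + 0.02); NOT the thread's data.
# x is already expressed in units of 500,000 sales.
xs = [1, 2, 3, 4, 6, 8]
ys = [1.0 / (0.01 * x + 0.02) for x in xs]

a, b = fit_reciprocal(xs, ys)
print(f"a = {a:.4f}, b = {b:.4f}")
```

Fitting 1/y rather than y changes how the errors are weighted, so on noisy data the result can differ somewhat from a direct nonlinear least-squares fit of y itself.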
  4. Aug 16, 2009 #3
    A function can be created using a calculator, which presumably uses least-squares regression techniques.

    Using your data (and excluding the 4 million and 5 million data points), I get an exponential function a = 60.4(0.78^s), where a = number of albums and s = sales in units of 500,000 (so s = 1 means 500,000 sales). This fit has a correlation coefficient of -0.98, whereas linear regression achieves only -0.79.

    This line of best fit suits most of your data points; the one that is furthest off is the first data point.
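For anyone wanting to reproduce this kind of exponential regression without a calculator, a common approach is to fit a straight line to ln(a) against s and report the correlation coefficient of that line. The sketch below assumes that is what the calculator does, which the thread does not confirm. The function names are hypothetical, and the data is synthetic (generated from the fitted curve itself), not the actual RIAA figures.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def fit_exponential(ss, counts):
    """Fit counts = scale * ratio**s via a least-squares line on ln(counts)."""
    logs = [math.log(c) for c in counts]
    n = len(ss)
    mean_s = sum(ss) / n
    mean_log = sum(logs) / n
    sxx = sum((s - mean_s) ** 2 for s in ss)
    sxy = sum((s - mean_s) * (v - mean_log) for s, v in zip(ss, logs))
    slope = sxy / sxx
    intercept = mean_log - slope * mean_s
    # ln(counts) = intercept + slope*s  =>  counts = e^intercept * (e^slope)^s
    return math.exp(intercept), math.exp(slope), pearson_r(ss, logs)


# Synthetic data lying exactly on a = 60.4 * 0.78**s; NOT the thread's data.
ss = [1, 2, 3, 4, 6]
counts = [60.4 * 0.78 ** s for s in ss]

scale, ratio, r = fit_exponential(ss, counts)
print(f"a = {scale:.1f} * {ratio:.2f}^s, r = {r:.3f}")
```

A negative correlation coefficient here simply reflects that the album count falls as the sales threshold rises; real data would scatter around the curve and give a magnitude below 1.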
    Last edited: Aug 16, 2009