Creating a function based on data

In summary, someone created a function for Elvis' album sales based on the RIAA certifications of his albums. Using the certified figures, they arrived at the function 1/(1.23*x + 0.278), where x is the sales threshold in units of 500,000 (x = 1 for 500,000, x = 2 for 1 million, and so on). The function approximates how many albums sold over a given threshold. Creating such a function means fitting a function to a set of data points, using intelligent guessing to pick its form and least squares analysis to determine its parameters. A reply in the thread fits an exponential function to the same data instead and reports a correlation coefficient of -0.98, compared with -0.79 for linear regression.
  • #1
JFonseka

Homework Statement


Now this isn't a homework question; it's something some others and I are looking into. Someone posted this function, and I'm not sure how he worked it out:

Creating a function for Elvis' album sales based on the RIAA certifications of his albums that have been certified so far

69 albums - over 500,000 sales certified
39 albums - over 1,000,000
19 albums - over 2,000,000
11 albums - over 3,000,000
4 albums - over 4,000,000 and 5,000,000
2 albums - over 6,000,000
1 album - over 9,000,000

Let's exclude the 4 albums over 4 million, which could be estimated on their own and which disturb the otherwise regular spread of the figures.

I set X = 500,000*x, where X is the sales threshold.

Then Elvis' statistics are almost perfectly equal to the function 1/(1.23*x + 0.278).

Let x = 1 if you want to find albums that sold over 500,000
Let x = 2 if you want to find albums that sold over 1 million
etc.

So what he's done is use the figures from the certified albums to create an equation. If you substitute 1 for x in the function, which corresponds to 500,000 sales, you get approximately the number of albums certified for over 500,000 sales. It's not exact, but it's a close enough function. How did he work that function out?
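To check how close the quoted function is, it can be evaluated at each threshold and set beside the certified counts. This is only a sketch: the raw output at x = 1 is about 0.66, so the code assumes the result is read in hundreds of albums. That scaling is an inference from the numbers, not something the post states, and the names `quoted_function` and `predicted_albums` are made up for illustration.

```python
# Certified counts from the post, keyed by x (sales threshold = 500,000 * x).
# The 4M/5M entry is left out, as the quoted poster did.
certified = {1: 69, 2: 39, 4: 19, 6: 11, 12: 2, 18: 1}

def quoted_function(x):
    """The function quoted in the post: 1 / (1.23*x + 0.278)."""
    return 1.0 / (1.23 * x + 0.278)

def predicted_albums(x, scale=100):
    """Assumed reading: function output times 100 is a number of albums."""
    return scale * quoted_function(x)

for x, actual in certified.items():
    print(f"x={x:2d}  over {500_000 * x:>9,}  certified={actual:2d}  "
          f"predicted={predicted_albums(x):5.1f}")
```

Under that assumption the function tracks the counts most closely at the lower thresholds and drifts further off at 6 million and 9 million.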

Homework Equations


None

The Attempt at a Solution



I have no idea!
 
  • #2
Fitting a function to a set of data points is an age-old problem that is often as much art as it is science. Unless there is some theoretical basis that can be used to hint at the functional relationship, about the only option is intelligent guessing. Once you've zeroed in on a function, you can use least squares analysis to determine the unknown parameters, provided there are more data points than unknown parameters.

In this case, it isn't hard to tell that there is some sort of inverse relationship between y and x (as x gets bigger, y gets smaller). Thus a logical first guess might be y = a/x + b. After a few trial-and-error attempts, it isn't far-fetched to eventually try y = 1/(ax + b). Also, scaling x by 500,000 beforehand makes sense because the values are so large.
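One way to get a and b for y = 1/(ax + b) by least squares is to notice that 1/y = ax + b is a straight line in x, so an ordinary linear fit on the reciprocals does the job. The sketch below takes that route; it is not necessarily how the quoted coefficients were obtained, since the post doesn't say. The choice of points and the division of the counts by 100 are assumptions made only to have some input, so the output need not match 1.23 and 0.278.

```python
def linear_least_squares(xs, ys):
    """Return (slope, intercept) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_reciprocal(xs, ys):
    """Fit y = 1/(a*x + b) by fitting the straight line 1/y = a*x + b."""
    return linear_least_squares(xs, [1.0 / y for y in ys])

# Illustrative input only: the first four thresholds, counts divided by 100
# (both choices are assumptions, not taken from the thread).
xs = [1, 2, 4, 6]
ys = [0.69, 0.39, 0.19, 0.11]
a, b = fit_reciprocal(xs, ys)
print(f"y = 1/({a:.3f}*x + {b:.3f})")
```

Fitting the reciprocals minimizes the error in 1/y rather than in y itself, so small counts get more weight than they would in a direct fit.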
 
  • #3
A function can be created using a calculator, which presumably uses least squares regression techniques.

Using your data (and excluding the 4 million and 5 million data points), I get an exponential function a = 60.4(0.78^s), where a = number of albums and s = sales/500,000. This equation has a correlation coefficient of -0.98, whereas linear regression only achieves -0.79.

This line of best fit suits most of your data points; the one it misses by the most is the first data point.
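A calculator's exponential regression is commonly a linear regression on the logarithm of the counts, and that can be sketched directly. The code below assumes that method and the six data points left after dropping the 4 million and 5 million entry; the function names are made up for illustration. With those assumptions, the coefficients and both correlation values should come out close to the ones quoted above, where the -0.98 is the correlation between s and ln(a).

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

def fit_exponential(xs, ys):
    """Fit y = A * base**x by least squares on ln(y) = ln(A) + x*ln(base)."""
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    slope = sxl / sxx
    intercept = mean_l - slope * mean_x
    return math.exp(intercept), math.exp(slope)

# s = sales threshold / 500,000; the 4M/5M entry is excluded.
s_values = [1, 2, 4, 6, 12, 18]
albums = [69, 39, 19, 11, 2, 1]

A, base = fit_exponential(s_values, albums)
r_log = pearson_r(s_values, [math.log(a) for a in albums])
r_lin = pearson_r(s_values, albums)
print(f"a = {A:.1f} * {base:.2f}^s")
print(f"correlation on ln(a): {r_log:.2f}, on a directly: {r_lin:.3f}")
```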
 

1. How do you choose the right function for a given dataset?

Choosing the right function for a dataset depends on various factors such as the type of data, the relationship between the variables, and the purpose of the analysis. Some common types of functions used for data analysis include linear, quadratic, exponential, and logarithmic functions. It is important to carefully analyze the dataset and consider the assumptions and limitations of each function before making a decision.

2. What are some key steps involved in creating a function based on data?

The key steps for creating a function based on data include: identifying the variables and their relationship, selecting an appropriate function, fitting the function to the data using regression techniques, evaluating the accuracy of the function, and making necessary adjustments or transformations to improve the fit.

3. How can you test the accuracy of a function created from data?

There are various methods to test the accuracy of a function created from data, such as using statistical measures like the coefficient of determination (R-squared) or the root mean square error (RMSE), conducting hypothesis tests on the model parameters, and visually inspecting the fit of the function to the data using graphs or plots.
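As a small illustration of two of these measures, the sketch below computes the RMSE and R-squared of the exponential function from the thread, a = 60.4(0.78^s), against the certified counts. The helper names are made up for illustration, and the data points are the six left after excluding the 4 million and 5 million entry.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# s = sales threshold / 500,000, with the certified album counts.
s_values = [1, 2, 4, 6, 12, 18]
albums = [69, 39, 19, 11, 2, 1]
predicted = [60.4 * 0.78 ** s for s in s_values]

print(f"RMSE = {rmse(albums, predicted):.2f} albums")
print(f"R^2  = {r_squared(albums, predicted):.3f}")
```

Both numbers are computed on the album counts themselves, so they need not agree with a correlation coefficient computed on the logarithms.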

4. Can a function created from data be used for prediction?

Yes, a function created from data can be used for prediction if it has a good fit to the data and meets the assumptions of the prediction method. However, it is important to note that predictions based on a function are not always accurate and can be affected by changes in the data or underlying relationships.

5. What are some common challenges in creating a function based on data?

Some common challenges in creating a function based on data include dealing with missing or incomplete data, selecting an appropriate function for a complex dataset, overcoming non-linear relationships between variables, and choosing the right regression method. It is also important to consider potential sources of bias and ensure the assumptions of the chosen function and regression method are met.
