# How to interpret the cost function?

1. Oct 9, 2016

### Richard_Steele

I am just starting a course about machine learning and I don't know how to interpret the cost function.

When the teacher draws the straight line in the x and y coordinates, it looks like:

I see that theta zero is where the straight line starts on the Y axis (on the left side).
My question is what theta one modifies. Does theta one modify the inclination?

2. Oct 9, 2016

### Krylov

Yes, also called the slope.

However, how this is to be interpreted in a machine learning context is not clear to me, as I never really understood what that field is about. EDIT: Maybe somebody such as @StatGuy2000 can help you with the interpretation?

Last edited: Oct 9, 2016

3. Oct 9, 2016

### Richard_Steele

Of course, so Theta_1 modifies the slope.
But I don't understand the scale the slope is measured in. In the examples (the middle and right graphs I posted in post #1), Theta_1 = 0.5.
What kind of unit is 0.5? Degrees? I don't know how to relate 0.5 to the slope of the straight line.

In response to your question about machine learning: this algorithm is used in linear regression. You give a dataset to your algorithm (for example, x = the size of a house and y = the price of the house), and it has to calculate the straight line that best fits the dataset. Then you give the algorithm new values of X (the size of a house) and the software calculates the value of Y (the price of the house). This is used in supervised learning. Supervised means that you give the correct answers to the software, so the software can learn from the data. Unsupervised learning means that you give data to the software, but not classified data (no correct answers are given in the data).
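The fit-then-predict workflow described here can be sketched in a few lines. This is only an illustration: the sizes and prices are made-up numbers, the names `fit_line` and `predict` are hypothetical, and the fit uses the textbook closed-form least-squares formulas rather than whatever method the course itself uses.

```python
# Hypothetical training set (made-up values): house size (x) and price (y).
sizes = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 210.0, 250.0, 290.0]


def fit_line(xs, ys):
    """Return (theta0, theta1) of the least-squares straight line."""
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    theta1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    theta0 = mean_y - theta1 * mean_x
    return theta0, theta1


theta0, theta1 = fit_line(sizes, prices)


def predict(size):
    """Predicted price h(x) = theta0 + theta1 * x for a new house size."""
    return theta0 + theta1 * size


print("theta0 =", theta0, "theta1 =", theta1)
print("predicted price for size 90:", predict(90.0))
```

The training step learns the two parameters from examples with known answers; the prediction step then applies them to a size that was not in the training set.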

4. Oct 9, 2016

### Staff: Mentor

I moved the thread to our homework section, as it is homework-like.

It does not have units. It means if x (the horizontal coordinate) increases by 1, then h (the vertical coordinate) increases by 0.5. You can see that if you check the function definition and increase x by 1.
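A quick way to check this numerically, assuming the line from the thread with theta_0 = 1 and theta_1 = 0.5 (the function name `h` follows the course notation; the rest is an illustrative sketch):

```python
def h(x, theta0=1.0, theta1=0.5):
    """Hypothesis h(x) = theta0 + theta1 * x (a straight line)."""
    return theta0 + theta1 * x


# Stepping x by 1 changes h by theta1, whatever the starting x is.
for x in [0, 1, 2, 3]:
    print("x =", x, " h(x) =", h(x), " h(x+1) - h(x) =", h(x + 1) - h(x))
```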

5. Oct 9, 2016

### Ray Vickson

Just look at the three diagrams. What do you see when you compare the first two? To anchor your understanding, try the following little exercise for yourself: plot the line y=1+x, which uses $\theta_0=1, \theta_1=1$. Compare that new line with the third line plotted above. What do you see?

6. Oct 9, 2016

### Richard_Steele

I see that the straight line crosses the Y axis at 1 (Y = 1). When X increases by 1 unit, Y = X + 1 still holds, so Y is always one unit higher than X.
Right?

7. Oct 9, 2016

### Ray Vickson

Yes, but I asked you to compare the new line with the third line given in post #1. Take a sheet of graph paper; plot both lines on the same sheet. Now tell me what you see.

8. Oct 9, 2016

9. Oct 9, 2016

### Richard_Steele

OK, I will post the comparison here in a few minutes.

10. Oct 9, 2016

### Richard_Steele

I see a difference in the slope. The line 1 + 1X is steeper than 1 + 0.5X.

11. Oct 9, 2016

### Ray Vickson

Exactly. When x increases by 1, 1+x increases by 1 but 1 + .5*x increases by 1/2.

It would have been more revealing if you had (as I suggested) plotted both lines on the same sheet of paper. If the software you are using does not allow that, then do it by hand on an actual, physical sheet of paper. Alternatively, you can use the graphing packages in a typical spreadsheet to plot several lines or curves on the same plot.
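If neither graph paper nor a spreadsheet is at hand, a table of values for both lines gives the same comparison. A minimal sketch (the helper name `line` is just an illustrative choice):

```python
def line(theta0, theta1, x):
    """Value of the straight line theta0 + theta1 * x at x."""
    return theta0 + theta1 * x


xs = [0, 1, 2, 3, 4]
steep = [line(1, 1, x) for x in xs]      # y = 1 + x
shallow = [line(1, 0.5, x) for x in xs]  # y = 1 + 0.5x

print(" x   1+x   1+0.5x")
for x, a, b in zip(xs, steep, shallow):
    print(f"{x:2d} {a:5.1f} {b:8.1f}")
```

Both columns start at the same value for x = 0 (same theta_0), and the gap between them widens as x grows (different theta_1).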

12. Oct 9, 2016

### Richard_Steele

Graphs plotted on sheet of paper.

13. Oct 9, 2016

### Ray Vickson

Good; that really does show up the difference most dramatically.

14. Oct 9, 2016

### Richard_Steele

Yes, it's more clear when I draw both on the same graph.

I am reading about 'minimizing the cost function'. What does minimizing mean, and why is minimization used?

15. Oct 9, 2016

### Staff: Mentor

Your graphs ought to show the equations; that is, y = 1 + x and y = 1 + 0.5x.

Also, the axes are usually labelled on the positive ends. You have your labels for the x-axis on the negative end.

Any business that manufactures and sells a product is always interested in maximizing its profit. One way to do this is to minimize (make as small as possible) its costs.

16. Oct 9, 2016

### Richard_Steele

I am learning about the cost function as applied to machine learning, where I am using it in linear regression. So I don't know whether minimization has the same objective in manufacturing as in machine learning.

17. Oct 9, 2016

### Staff: Mentor

I don't know how it's related to machine learning, but maybe how long it takes for a program to learn something? Maybe that's what "cost" means in this situation.

18. Oct 9, 2016

### Richard_Steele

In the video, the teacher shows a Cartesian plane. The horizontal axis, X, is the size of the house. The vertical axis, Y, is the price of the house. That dataset is called 'the training set' (it contains the correct answers).

Then, with the training set, the program has to calculate the straight line I was asking about in post #1 of this thread. After that, it is necessary to 'minimize the function'. As I understand it, this means calculating the best values of theta zero and theta one, so as to produce a straight line that minimizes the error between the Y values (those from the training dataset) and h(x) (the hypothesis, the predicted Y value). This predicted Y value is called the prediction or the hypothesis. The only real Y values come from the real dataset (the training dataset).
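To make "the error between the Y values and h(x)" concrete, here is a small sketch that scores a few candidate lines. It assumes the squared-error cost with a 1/(2m) factor, which is a common convention in introductory courses but is not stated in this thread; the training points are made up, and `cost` is a hypothetical helper name.

```python
# Hypothetical training set (made-up values) lying on y = 1 + 0.5x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.5, 2.0, 2.5, 3.0]


def cost(theta0, theta1, xs, ys):
    """Assumed squared-error cost: J = (1 / (2m)) * sum((h(x) - y)^2)."""
    m = len(xs)
    squared_errors = ((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys))
    return sum(squared_errors) / (2 * m)


# Try a few candidate lines; the smaller the cost, the better the fit.
for theta0, theta1 in [(1, 1), (0, 0.5), (1, 0.5)]:
    print("theta0 =", theta0, "theta1 =", theta1,
          "cost =", cost(theta0, theta1, xs, ys))
```

Each pair (theta_0, theta_1) is one candidate straight line, and the cost is a single number saying how far that line's predictions are from the real Y values. Minimizing means searching for the pair with the smallest cost, which here is the line that passes through every training point.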

The question is what minimization does and why we should apply it.

19. Oct 9, 2016

### EnumaElish

The slope parameter is measured in Y units/X units. In the equation Y = a + b X, b = ∂Y/∂X. If Y is "meters traveled" and X is "seconds of time" then b corresponds to velocity measured in meters per second. (In an estimation context b would be called average velocity or average incremental distance.) If Y is dollars and X is square feet then b is the increase in dollars when area increases 1 sqft. In this case b is measured in dollars per sqft.
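As a worked instance of the last case, with made-up numbers used only for illustration: if a 1000 sqft house costs 200000 dollars and a 1500 sqft house costs 250000 dollars, then

$$b = \frac{\Delta Y}{\Delta X} = \frac{(250000 - 200000)\ \text{dollars}}{(1500 - 1000)\ \text{sqft}} = 100\ \text{dollars per sqft}$$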

Last edited: Oct 10, 2016

20. Oct 9, 2016

### Staff: Mentor

Where Y is a function of one variable, X, the slope would be dY/dX. Of course, in this case, the partial derivative you wrote would be the same as the derivative I wrote. However, as this thread is in the Precalc section, the OP might not be familiar with derivatives of any kind.