Python: Help with best-fit line and outliers

  • Thread starter DMT
  • #1
DMT

Main Question or Discussion Point

Outliers have been throwing off the best-fit line on my scatter plot in Python. I'm using numpy's polyfit function to calculate the slope and y-intercept of the best-fit line, but I always seem to get one or two points that shift the slope enough to make a noticeable difference. I've checked a few Python references and done a lengthy Google search, but haven't found a solution. Does anyone know a good way to fix this without limiting the interval or manually removing the bad points from my data?

Edit: A way to take errors into account would also be very helpful.

Thanks!
 

Answers and Replies

  • #2
I have not used the polyfit function in Python, but I have used it a lot in Matlab. If you have points that lie far from the best-fit line, the most I can say is that they are probably not good points. If you are plotting experimental data, they might be the result of a badly performed measurement. Python, like Matlab, will always return the least-squares best-fit line for the data it is given, outliers included. You said yourself that you found nothing on Google; to me this suggests that the software is fine and the problem is in your data.
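For what it's worth, one common way to keep a few outliers from dominating a fit without deleting them is iteratively reweighted least squares with Huber weights. The following is only a minimal sketch of that idea, not something built into numpy: the function name `robust_linefit`, the tuning constant `k = 1.345`, the iteration count, and the synthetic test data are all illustrative assumptions. The one numpy feature it relies on is the `w` argument of `np.polyfit`, which multiplies each residual before squaring (so `1/sigma` is the appropriate weight for Gaussian uncertainties). That same argument is how per-point errors can be taken into account.

```python
import numpy as np

def robust_linefit(x, y, sigma=None, k=1.345, n_iter=20):
    """Straight-line fit that down-weights outliers (Huber IRLS sketch).

    sigma: optional per-point uncertainties on y (hypothetical input).
    k, n_iter: illustrative tuning choices, not numpy defaults.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Base weights: 1/sigma if uncertainties are supplied, otherwise 1.
    if sigma is None:
        base_w = np.ones_like(y)
    else:
        base_w = 1.0 / np.asarray(sigma, dtype=float)
    w = base_w.copy()
    for _ in range(n_iter):
        slope, intercept = np.polyfit(x, y, 1, w=w)
        # Residuals measured in units of each point's uncertainty.
        r = (y - (slope * x + intercept)) * base_w
        # Robust scale estimate from the median absolute deviation.
        scale = 1.4826 * np.median(np.abs(r - np.median(r)))
        if scale == 0:
            break
        # Huber weight: 1 for small residuals, k*scale/|r| for large ones.
        huber = 1.0 / np.maximum(np.abs(r) / (k * scale), 1.0)
        # polyfit squares (w * residual), so pass the square root.
        w = base_w * np.sqrt(huber)
    return slope, intercept

# Synthetic data (made up for illustration): y = 2x + 1 plus one bad point.
rng = np.random.default_rng(0)
x = np.arange(20, dtype=float)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.2, size=x.size)
y[18] = 100.0  # a single outlier

plain_slope, plain_intercept = np.polyfit(x, y, 1)
robust_slope, robust_intercept = robust_linefit(x, y)
print("plain polyfit slope:", plain_slope)
print("robust fit slope:   ", robust_slope)
```

On this made-up data set the plain fit is pulled noticeably toward the outlier, while the reweighted fit stays close to the underlying slope of 2. Whether down-weighting is appropriate for real measurements still depends on why the points are off, so it is worth checking the data as well.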
 
