
Over-searching error (statistics)

  1. Nov 21, 2006 #1


    Science Advisor

    I guess this should properly go in the Programming forum, but I think I might get a better response here.

    My question is with respect to statistics (context of machine learning) about "over-searching" error. You have a search space S of all possible models, from which you choose a subset K. Then you have some test data D, and you evaluate how well each model in K fits D. You pick the best model in K and use that as your model.

    Over-searching says that exhaustive search, where K = S, is a bad idea. The model you end up with when K = S fits D better than the model you end up with when K is much smaller than S, yet for some reason it does not work as well when it's tested against new data that's not in D.

    I didn't quite catch the reason for this and I still don't understand it. I wrote down, "two or more search spaces contain different numbers of models. The maximum scores in each space are biased to different degrees." I understand this statement, but I don't see its relevance to over-searching.
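
The note quoted above can be checked with a small simulation. This is only a sketch under a strong simplifying assumption that is not part of the original question: every model in K is a coin flip, so its true accuracy is 0.5 no matter how it scores on D. The function names (`best_of_k`, `average_scores`) and the sizes used are hypothetical and chosen for illustration. It shows only that the best score on D is more optimistic the more models are searched; it does not model search spaces that contain genuinely better models.

```python
import random


def best_of_k(num_models, n_examples, rng):
    """Score num_models coin-flip models on D and pick the best one.

    Returns (score of the chosen model on D, score of that model on new data).
    """
    best_d_score = -1.0
    for _ in range(num_models):
        # Assumption: each model is right on each example with probability 0.5.
        d_score = sum(rng.random() < 0.5 for _ in range(n_examples)) / n_examples
        if d_score > best_d_score:
            best_d_score = d_score
    # The chosen model is still a coin flip on data that is not in D.
    new_score = sum(rng.random() < 0.5 for _ in range(n_examples)) / n_examples
    return best_d_score, new_score


def average_scores(num_models, n_examples=50, trials=100, seed=0):
    """Average the two scores over many repetitions of the search."""
    rng = random.Random(seed)
    d_total = 0.0
    new_total = 0.0
    for _ in range(trials):
        d_score, new_score = best_of_k(num_models, n_examples, rng)
        d_total += d_score
        new_total += new_score
    return d_total / trials, new_total / trials


if __name__ == "__main__":
    for k in (1, 10, 100):
        d_avg, new_avg = average_scores(k)
        print(f"|K|={k:4d}  best score on D={d_avg:.3f}  on new data={new_avg:.3f}")
```

Under these assumptions the average best score on D should rise as |K| grows, while the score on new data stays near 0.5, so the gap between the two is the selection bias the note describes.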