- #1

However, what if your goals are more modest: suppose that, instead of describing all the data, you only wish to find a "feature" in the data, such as identifying a shape. Let a "feature descriptor" be a program F that outputs a string S, such that S agrees with D at a large number of positions, and such that S contains a "dummy" symbol (a symbol not used in D) at every other position. For example, if D = 0101001111111111010100011, then one might let F be a program that outputs the string S = xxxxxx1111111111xxxxxxxxx, which agrees with D at 10 positions.
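The agreement count above can be made concrete with a small sketch (the function name and the choice of `x` as the dummy symbol are illustrative assumptions):

```python
def agreement(S, D, dummy="x"):
    """Number of positions where S matches D; dummy positions never count."""
    return sum(1 for s, d in zip(S, D) if s != dummy and s == d)

D = "0101001111111111010100011"
S = "xxxxxx1111111111xxxxxxxxx"
print(agreement(S, D))  # 10, matching the example above
```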

What is a criterion, along the lines of minimum description length, for finding valid features in the data? One possible criterion is to say that F is a valid feature descriptor if the length of F, in bits, is less than the number of positions where F's output S agrees with D. Is this a good criterion, or does it admit spurious features?
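A hedged sketch of this proposed criterion, as a checkable predicate. The convention of 8 bits per source character is an illustrative assumption, not part of the problem statement:

```python
def agreement(S, D, dummy="x"):
    """Number of positions where S matches D; dummy positions never count."""
    return sum(1 for s, d in zip(S, D) if s != dummy and s == d)

def is_valid_feature(source_code, S, D):
    """Proposed criterion: the program's length in bits must be strictly
    less than the number of positions where its output S agrees with D."""
    length_in_bits = 8 * len(source_code)  # assumption: 8 bits per character
    return length_in_bits < agreement(S, D)
```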

A good criterion for finding valid features should only very rarely certify a "feature" in a random string of data, no matter how long the string. So suppose we have an infinite string D of 0's and 1's, each chosen by an independent fair coin flip. What is the probability that, by examining D, we can write a computer program P (say, in Python) that is a valid feature descriptor under the above criterion (the length of P in bits is less than the number of positions where the output of P agrees with D)? Is this probability close to 0 (as we'd like) or close to 1?
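One way to probe the question empirically (not an answer) is to draw long random bit strings and test a single fixed candidate descriptor against the criterion. The candidate here is a stand-in, and its length L in bits is taken as a parameter rather than measured from real Python source:

```python
import random

def simulate(n, L, trials=1000, seed=0):
    """Fraction of random length-n strings on which the constant descriptor
    S = '1' * n (agreeing exactly where D has a 1) beats a hypothetical
    program length of L bits under the proposed criterion."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        D = [rng.randint(0, 1) for _ in range(n)]
        agreements = sum(D)  # S = '1'*n agrees exactly at D's 1-positions
        if L < agreements:
            wins += 1
    return wins / trials
```

Varying n relative to L in this sketch is one way to build intuition about how the criterion behaves on random data as the string grows.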