Computability meets Random Sampling

Stephen Tashi · Sep 2, 2012

Some things are physically impossible, but mathematically possible (like exactly bisecting a line segment). Some things are both physically and mathematically impossible (like exactly trisecting an angle "with a ruler and compass".) Things mathematically impossible are proven to be so within some abstract framework. ( For example, "with a ruler and compass" imposes certain restrictions on what mathematical processes are allowed.).

The theory of computability deals with mathematically possible and impossible things within an abstract framework. Has anyone has ever worked out what happens when it meets probability theory?

I have some questions and musings about that.

It is common in probability theory to talk about taking a sample from a continuous random variable, such as the uniform distribution on [0,1]. Probability theory just assumes you can do it and proceeds to analyze the consequences. But can it really be done?

First, is it physically possible? Can Nature can do it? I see nothing in Quantum Mechanics that forbids it. In fact, it seems to me that Quantum Mechanics uses some continuous probability distributions (Please orrect me, if I'm wrong.). However,if Nature does it, our ability to measure the outcomes is limited by their precision. So I don't think it is physically possible for beings to create a process that samples from a uniform distribution over all in the numbers in the interval [0,1].

Next, is it mathematically possible? Let's think about this in terms of whether some computer algorithm exists to do it. I suspect not, even if we allow the algorithm to have some external "random" input of finite precision..

I also suspect that proving this impossibility is tricky. For example, you can't claim "computers only represent rational numbers" because computers can manipulate symbols. An irrational number can be represented as a symbol like [itex] \pi [/itex] and an algorithm can generate symbolic expression like [itex] \sqrt{\pi} [/itex] or [itex] \pi^\pi[/itex]. Perhaps an argument based on cardinality would do.

When the status of solving problem X in computability theory is unknown, it is still of interest to prove theorems that say "If we had a way to compute an answer to problem X then we could compute an answer to problem Y".

If we could take a random sample from uniform distribution on the rational numbers in [0,1] then we could take a random sample from a [in some sense] uniform distribution on the positive integers. We know there are methods of making a 1-to-1 mapping between the rational numbers in [0,1] and the positive integers. You could pick your rational number "at random" then find out what integer it mapped to.

I don' know what measure theory has to say about this. I don't detect anything in measure theory that assumes computability. Measure theory would frown on claiming we have created a uniform distribution on the positive integers that gives each integer the same nonzero probability p of being selected. However, a uniform distribution on the rationals in [0,1] doesn't assign a nonzero probability to any individual rational number, so it isn't assigning a nonzero probability to the corresponding integer.

I don't think such a uniform distribution on the integers would provide answers to questions like "what is the probability that an integer selected at random is even?". There are many different ways to match the integers to the rationals. For example, you could count through the integers by listing two 2 odd integers followed by 3 even integers, like { 1,3, 2,4,6,5,7,8,10,12,...}. I don't see why all these ways would lead to the same answer and I'm not sure "the set of even integers" would be a measureable set.

chiro · Sep 2, 2012

The key thing when it comes to computability is that you need to look at two things.

The first is how something is defined structurally in terms of the information content, and the second is how you define the actual algorithm with respect to the way the computer functions in terms of how it executes those instructions.

The second thing is generally done under what is known as the Turing model. There are proposals of Quantum Turing models, but if you think about doing a computation on something like a PC or something with a similar instruction set and execution model, then you need to look at Turing type models of computation and the analysis thereof.

The structural issue (the first thing) is the thing that is often overlooked by mathematicians because mathematicians typically do not think about the internal structure of information, and instead just consider that the internal structuring of things is un-important (for example numbers are numbers, but the actual specific construction is not considered like it has to be when a programmer needs to implement something where they specify the exact description in memory and how things can be used in the context of transformations and algorithms).

The big lesson is this: when it comes to information you have to quantize it and the choice of quantization affects not only how and what you compute, but ultimately what you describe.

Now in regards to your question, you need to think in terms of computationally what the underlying intrinsic representation of something is in an explicit structurally representative information definition of that thing is.

Are you dealing with polynomials of degree n (where n is finite)? Ok then how are you representing the co-efficients? Are they numbers that are quantized with a fixed resolution? Are they solutions to some class of functions? Are they standard IEEE floating point numbers? Maybe fixed point numbers?

If you think about the above you'll start to see the real issue and it relates to the definitions used with regard to the internal binary representation of the structures themselves.

When you are forced to do this kind of thing in extensive depth, you'll see how important this very core issue is when you want to answer your question.

Stephen Tashi · Sep 3, 2012

I don't think the impossibility of taking a random sample from the uniform distribution (of real numbers) on the interval [0,1] is dependent on how the numbers are represented. The impossibility (if indeed it is impossible) is due to the fact that there are uncountable real numbers in that interval and the possible outputs of any algorithm with finite precision inputs that terminates in finite time will only form a countable infinity.

An interesting speculation about Nature is: Suppose Nature cannot take a sample from a continuous random variable. This does not imply that the random variables that Nature realizes are quantized because the fact that something is countably infinite or finite does not imply that the quantities are "evenly spaced". Can there be experiments precise enough to detect this, given that some quantities involved the experiments will be quantized and will have non-continuous distributions.

chiro · Sep 4, 2012

Representation is everything when it comes to the computability.

What is being represented and its interpretation is irrelevant in modern day Information Theory: information deals with completely context-independent representations of stuff in some finite alphabet that is a finite sentence with some probability characteristics.

But the interpretation of said data and what that means can be identified or inferenced by looking at the process that uses the data, and this is the real key for science.

The way that the data is used indirectly says what the data actually is and what it relates to. If the process in nature used data in a specific way, you can bet that the way it uses it has a direct impact on not only what the data intrinisically "is", but also what the structural characteristics of the data really are.

The computational aspect does this because it involves the decomposition and synthesis of structural components in order to make the necessary transformations that the algorithm is specified to do.

Computability requires that you have information that is quantized in some way.

The quantization doesn't have to be the kind that people automatically think like having even resolution (i.e. taking an interval and dividing it up evenly 2^n times with an n bit resolution): that is not what I mean.

But it does have to have some quantization to be analyzed and for new information to be synthesized.

We do this all the time without thinking about it: when we write an infinite series, we have quantized an entire infinite expression using a specific finite alphabet and a finite sentence string without even thinking about it.

In terms of detecting continuity, what I would suggest is to focus on something a little different: focus on trying to find an alphabet with a specific grammar (i.e a linguistic structure) that has optimal representation and takes into account the nature of the process and what it is actually "doing" to the data.

Basically what this does is shift the question of "does continuity exist" which is not something you can even test anyway with any computational process (since it requires a quantization in order to begin to analyze), but instead ask if such a linguistic framework exists to describe the information (un-contextual) and how it relates to the actual process that gives it meaning and contextualization.

When you look at the language, it's limitations, and how you can do further hypotheses and testing thereof, you can decide in the context of the structures and the processes what the actual representation should be and do experiments to test whether new results confirm this or provide evidence to reject it.

It's extremely subtle, but remember that every single time you analyze or every time a computation is made, it has to be done on some form of a quantized structure which means that when you are testing for continuity, the process is going to force you to quantize in your own analysis which means that you will chasing your tail without ever getting there.

Stephen Tashi · Sep 4, 2012

chiro said:

every single time you analyze or every time a computation is made, it has to be done on some form of a quantized structure which means that when you are testing for continuity, the process is going to force you to quantize in your own analysis which means that you will chasing your tail without ever getting there.

For once, I agree with you completely!

I'm not measure theory expert nor a logician, but I find the logical structure of probability theory interesting. In my non-expert standing, I don't see exactly what, if any, axiom of probability theory says that we can take random samples from a continuous probability distribution. There is certainly a body of theory that says we can define continuous probability distributions. So where does possibility of sampling them get established?

On explanation is that every theorem involving sampling has an "if...are random samples from..." clause in it. So the possibility of sampling is assumed over and over again, each time a theorem is stated.

On the one had, not being able to take a random sample from a continuous probability distribution isn't disturbing if our focus isn't on computability. For example, the inability to trisect and angle with ruler and compass doesn't disturb me from thinking about the trisectors of an angle existing mathematically. But on the other hand, when get into practical problems of statistics, we are talking about sampling processes that humans must perform.

I guess the attitude of many people is that "we can approximate random sampling from a continuous random variable to any desired degree of accuracy". But is that true for things like a Cauchy distribution? What would the precise mathematical definition of this assertion be? For example, there are many different ways that a convergence of a sequence of functions to another function can be defined. Is "approximate to any desired degree of accuracy" going to utilize one of those definitions?

chiro · Sep 4, 2012

Stephen Tashi said:

For once, I agree with you completely!

Haha, personally I'm glad people don't always agree: it would be a very boring world ;)

I'm not measure theory expert nor a logician, but I find the logical structure of probability theory interesting. In my non-expert standing, I don't see exactly what, if any, axiom of probability theory says that we can take random samples from a continuous probability distribution. There is certainly a body of theory that says we can define continuous probability distributions. So where does possibility of sampling them get established?

On explanation is that every theorem involving sampling has an "if...are random samples from..." clause in it. So the possibility of sampling is assumed over and over again, each time a theorem is stated.

On the one had, not being able to take a random sample from a continuous probability distribution isn't disturbing if our focus isn't on computability. For example, the inability to trisect and angle with ruler and compass doesn't disturb me from thinking about the trisectors of an angle existing mathematically. But on the other hand, when get into practical problems of statistics, we are talking about sampling processes that humans must perform.

I guess the attitude of many people is that "we can approximate random sampling from a continuous random variable to any desired degree of accuracy". But is that true for things like a Cauchy distribution? What would the precise mathematical definition of this assertion be? For example, there are many different ways that a convergence of a sequence of functions to another function can be defined. Is "approximate to any desired degree of accuracy" going to utilize one of those definitions?

IMO, I think the issue comes into play with the nature of the measuring device and how this corresponds to the probability space of a random variable.

We know that there is a zero probability of getting one specific value in a continuous distribution. In fact micro-mass showed a more rigorous proof of this in another thread so you can look at that (I think you commented in the same thread).

But ultimately, the issue is going to be how the measuring device corresponds to the quantization you force to take the said measurement and how the space of such measurements corresponds to the actual probability space and its events.

Indirectly it means that the measurement device used will have to have the same kind of physical properties that are correlated with a real number, but this is kind of counter-intuitive since the basis of physical intuition is that things are finite.

This idea comes down to probably the most important attribute of physics currently which states that everything reduces to atoms. Whether it's true or not is not relevant: the important thing is that this is what physics is based on as well as how people think so it's important to realize this assumption in order to look at attempts to answer your question.

When you keep this in mind, it seems that trying to get a physical intuition for infinity is rather hard: we are able to deal with finite concepts with ease and can even synthesize these concepts with relative ease.

But when it comes to the uncountable, this transcends any kind of physical intuition because we are craving some kind of finite reference point so that we can begin to analyze and make sense of it.

So ironically we are at a weird form of a paradox: how do you quantize the infinite?

Mathematicians do this largely through infinite series and sequences and more complex constructions built on these ideas. But physically, there is no real analog available.

The best way I can think of to solve your problem is to use a completely relative system and not something explicit.

Typically when you show someone a ruler, it is an explicit device: every little marking is explicit and every possible realization of length is drawn on the ruler so if you want to measure something you get the measurement that matches up with the right marker or tick on the ruler.

But you don't have to do this: you can do measurements another way and that's using a completely relative system.

The basic idea is that you have a duality for every single linguistic description of any object: you have a description for what an object is and what it is not.

From there you can go recursive and generate relativity between all kinds of objects and this can be infinite.

The thing about doing this is that you sacrifice something that scientists, statisticians, engineers, mathematicians, and basically everyone craves to have which is order and rank.

The above situation introduces a problem in that it doesn't give a natural way to order or rank realizations in a way consistent with what is done with a ruler or some other measuring device.

You can introduce a metric if you want, but the above doesn't require it.

For mathematics (and statistics) you need some kind of metric: some kind of way of taking the relative construction of potential realizations and creating a system of ranking, ordering, and sorting in the current way that mathematics is structured.

So what do you choose? The explicit way that allows you to analyze something in a way that is intuitive with our current development of mathematics, but is limited with respect of handling true continuity and uncountability? Or do you use the relative way and sacrifice the potential way to construct a metric that would allow you to do the same sort of thing that was intuitive using the current construction of mathematics?

There are trade-offs for both approaches, but the nature of infinity changes the whole game forever. As we have discovered in roughly the last century (especially since people like Hilbert and Von-Neumann worked on formalizing this kind of thing, where people like Cantor really got the ball rolling just a short time prior), infinity has to be treated differently: it can not be treated in the same way that we treat it with the non-infinite cases.

Finally, it means that isn't just related to quantity itself: it's related to language in general and mathematics is but one small subset of the many kinds of descriptive capacities that exist.

ImaLooser · Sep 8, 2012

Stephen Tashi said:

It is common in probability theory to talk about taking a sample from a continuous random variable, such as the uniform distribution on [0,1]. Probability theory just assumes you can do it and proceeds to analyze the consequences. But can it really be done?

First, is it physically possible? Can Nature can do it? I see nothing in Quantum Mechanics that forbids it. In fact, it seems to me that Quantum Mechanics uses some continuous probability distributions (Please orrect me, if I'm wrong.). However,if Nature does it, our ability to measure the outcomes is limited by their precision. So I don't think it is physically possible for beings to create a process that samples from a uniform distribution over all in the numbers in the interval [0,1].

As of today is physically impossible to take a random sample from an infinite set with a uniform distribution. The problem is that each element would have to be denoted by an infinite set, which with today's technology would take an infinite number of bits to store. There is no place to do that today. It might be possible in the future, but I don't see how.

Stephen Tashi said:

Next, is it mathematically possible?

Yes. If you like, you can simply assume that you can do it. It's called the Axiom of Choice.

Stephen Tashi said:

Let's think about this in terms of whether some computer algorithm exists to do it. I suspect not, even if we allow the algorithm to have some external "random" input of finite precision..

It's imaginable, but as of today computer science always assumes that each step of the computation takes some minimum amount of time. So you can't calculate an infinity in a finite amount of time.

Let's say we waive this rule for the sake of argument. Then we calculate each step n in 1/n^2 seconds. You can then calculate an infinite number of steps in 2 seconds. But in models of computability we normally agree not to do that.

Stephen Tashi said:

I don' know what measure theory has to say about this. I don't detect anything in measure theory that assumes computability.

Right. Measure theory is about sets, it has nothing to do with procedures.

Stephen Tashi said:

Measure theory would frown on claiming we have created a uniform distribution on the positive integers that gives each integer the same nonzero probability p of being selected.

The sum of the probabilities would be infinite. In all reasonable theories this probability is limited to one.

Stephen Tashi said:

However, a uniform distribution on the rationals in [0,1] doesn't assign a nonzero probability to any individual rational number, so it isn't assigning a nonzero probability to the corresponding integer.

I don't think such a uniform distribution on the integers would provide answers to questions like "what is the probability that an integer selected at random is even?".

Yes it does. That is what measure theory is for, to assign probability to all of the interesting infinite subsets of infinite sets.

Stephen Tashi said:

S

There are many different ways to match the integers to the rationals. For example, you could count through the integers by listing two 2 odd integers followed by 3 even integers, like { 1,3, 2,4,6,5,7,8,10,12,...}. I don't see why all these ways would lead to the same answer and I'm not sure "the set of even integers" would be a measureable set.

That is correct. In measure theory bijections have nothing to do with cardinality. You can make a bijection between a set of measure zero and a set of measure one if you like.

chiro · Sep 8, 2012

ImaLooser said:

Right. Measure theory is about sets, it has nothing to do with procedures.

Procedures work on data: understanding the nature and structure of the data is required to understand the transformation and computation of said data.

Consequently (but not surprisingly) there is a link between the cardinality of the relevant sets and the nature of the computation itself and its complexity.

Computability meets Random Sampling

1. What is "Computability meets Random Sampling"?

2. How does random sampling contribute to computability?

3. What are some real-world applications of "Computability meets Random Sampling"?

4. How does "Computability meets Random Sampling" relate to artificial intelligence?

5. What are the limitations of "Computability meets Random Sampling"?

Similar threads

Hot Threads

Recent Insights