Relation between Sparse Primes and Curse of Dimensionality?

  • Context: Graduate 
  • Thread starter Thread starter FallenApple
  • Start date Start date
  • Tags Tags
    Primes Relation
Click For Summary

Discussion Overview

The discussion explores the potential relationship between sparse primes and the curse of dimensionality, examining how the distribution of prime numbers might relate to concepts in vector spaces and dimensionality in mathematics. Participants delve into theoretical implications, mathematical representations, and analogies between these concepts.

Discussion Character

  • Exploratory
  • Technical explanation
  • Conceptual clarification
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • Some participants suggest that as the number line increases, primes become less common, which may relate to the curse of dimensionality where adding more basis vectors increases the dimensionality of the spanned space.
  • One participant proposes that each new prime could be viewed as adding a new dimension, impacting the likelihood of forming composite numbers from earlier primes.
  • Another participant discusses the representation of numbers as vectors composed of basis vectors, noting that the dimension of the number space increases with the discovery of new primes.
  • A participant raises the idea of compressing the natural number line to just the primes, suggesting that the ordering of numbers is encoded in the prime basis vectors.
  • One participant introduces a definition of an inner product based on the number of common prime factors between numbers, proposing a metric for measuring distance or correlation in this context.
  • Concerns are raised about the redundancy in representing numbers with multiple prime factors, leading to a search for metrics that account for unique basis vectors.

Areas of Agreement / Disagreement

Participants express various viewpoints on the relationship between sparse primes and dimensionality, with no clear consensus reached. The discussion includes competing ideas and interpretations of how these concepts might be related.

Contextual Notes

Participants acknowledge limitations in their arguments, such as the dependence on definitions of vector spaces and the unresolved nature of certain mathematical steps related to the proposed metrics and inner products.

FallenApple
Messages
564
Reaction score
61
Is there a relation between these concepts? We know that as we move down the number line, the primes become less common. This makes sense: as we get more and more numbers, they could be constituted using all the primes that came before early on and less likely to be constituted by all the recent primes.

Although the curse of dimensionality is more of an issue related to basis vectors. The curse is this: as more and more unrelated( i.e independent) vectors are added, the dimension of the spanned space goes up while each increment of data(i.e column vector) stays the same, hence it has to occupy more hyper volume for same increment of data. Hence the data becomes more sparse for each basis added.

This is analogous because each new number can be written as a multiplicative combination of older primes or it can be a new prime, with unknown chance. If there is a new prime, then this is analogous to adding a dimension(new building block) which could in principle have a non zero probability to build another number down the line.

But I don't know if we can think of a multiplicative building block in terms of a basis. Unless there is some way to code each prime in terms of 1s and 0s in vectors that somehow make the prime vectors orthogonal also add up to build composite numbers(in vector 0s and 1s form as well). Perhaps there's a way to use log transformations to map from multiplication to addition? But if we were to do this, how would we incorporate the randomness(chance of being prime or not) of the next prime into this picture since there isn't really a fixed probability of nth number being prime. Maybe we can use Bayesian Statistics to assign some weight to a prime based on past occurrences of primes?

In short, are the ideas analogous?
 
Last edited:
Mathematics news on Phys.org
What about the primes as basis vectors? You don't get a proper vector space unless you include all rational numbers (apart from 0), but let's ignore that point.
1 is (0,0,0,...)
2 is (1,0,0,...)
3 is (0,1,0,...)
4 is (2,0,0,...)
60 is (2,1,1,0,0,...)
 
  • Like
Likes   Reactions: FallenApple
mfb said:
What about the primes as basis vectors? You don't get a proper vector space unless you include all rational numbers (apart from 0), but let's ignore that point.
1 is (0,0,0,...)
2 is (1,0,0,...)
3 is (0,1,0,...)
4 is (2,0,0,...)
60 is (2,1,1,0,0,...)

Ah, so then since each number is now a vector of data composed of sums of basis vectors which themselves are vectors, the dimension of the number space keeps increasing as we keep discovering primes. So say we find that the n-th number is prime. Since the next number, n+1, after is not likely to have the recently discovered prime in it, but most likely made of much earlier ones. Maybe its just that for the next number, there is just so many more ways that the previous basis vectors can add to make it that it's just more likely to find a composite instead of a prime. That is probably the generative process here.

But if we look at cross sectional neighborhoods of the discrete natural number line, we can see the sparseness.

say that m>n, for some m and n that are natural numbers. Let's scan the number line left to right and add a extra basis to the basis set every time we discover a prime. This is for a fair comparison, looking at the dimension for the ith number only.

From this, we see that n is more likely to made of less basis vectors than m. Not always, but just more likely for the reasons described above. This is the same as saying that n is more likely to live in a lower dimensional space than m. So at least heuristically, from this argument, the neighborhood of m is more likely to be more sparse than the neighborhood of n. For m>>>n, this is even more likely.
 
Last edited:
Also, since new prime numbers are just surrounded by redundant information made from previous basis primes, can we compress this data ,that is the natural number line, down to just the primes by removing the composites?

Notice something interesting. Since 1=(0,0,0,...), the entire notion of ordering is encoded on the prime basis vectors. So we do not get new info until a new prime pops up, because every time we use 1, its just adding the zero vector.
 
Last edited:
We can even define an inner product to be the number of primes they have in common, multiplicity included.

6=2+3=(1,1,0,0,...)
15=3+5=(0,0,1,1,...)

<6,15>=( 2+3)(3+5)=<2,3>+<3,3>+<2,5>+<3,5>=0+1+0+0=1

This makes sense since the only prime factor in common between 6 and 15 is 3. So we can define the metric to be the number of common denominator that is prime. That is, the number of overlapping basis vectors.

So the notion of distance, or correlation, mutual information, or whatever, between two vectors on this space is just the number of primes they have in common.

With some math, we can verify that we have a complete inner product space over the field of integers(negatives correspond to division).
<1,1>=0. iff 1=(0,0,0,...) which is true. <10,3>=<2+5,3>=<2,3>+<5,3>=0+0=0 which is True and so on. So the axioms are satisfied, making this an inner product space.

Then with this, it makes sense that if we analyze two neighborhoods, once for a cluster centered at n and once for an equally sized cluster centered at m where m>n, we see that the amount of mutual information should be higher on average for the cluster around m just because there are more combinations of primes available, at least asmptotically. So for computations, if we think of the ith number as time(which is reasonable for a computer), we see that the entropy increases with time. For numbers, the entropy of the state space(cross section at a neighborhood) increases with number order.

I'm not sure how to get rid of the multiplicity though. The inner product as I have defined only retrieves overlapping information. I need another metric does not involve duplicates to give an honest assessment of increasing dimensionality for increasing n. <8,2>=<2+2+2 , 2>=<2,2>+<2,2>+<2,2>=3 but clearly there is redundant information. So we a metric that gives only the amount of unique basis vectors between two vectors. Maybe divide by sqrt(<8,8>)?
<8,2>/sqrt(<8,8>)=1. It seems to work.
 
Last edited:

Similar threads

  • · Replies 3 ·
Replies
3
Views
2K
Replies
7
Views
2K
  • · Replies 4 ·
Replies
4
Views
3K
  • · Replies 1 ·
Replies
1
Views
2K
  • · Replies 3 ·
Replies
3
Views
3K
  • · Replies 1 ·
Replies
1
Views
1K
  • · Replies 9 ·
Replies
9
Views
3K
  • · Replies 1 ·
Replies
1
Views
2K
  • · Replies 1 ·
Replies
1
Views
3K
  • · Replies 14 ·
Replies
14
Views
6K