Curse of dimensionality - what is it that grows?

  • Context: Graduate 
  • Thread starter Thread starter schniefen
  • Start date Start date
Click For Summary

Discussion Overview

The discussion revolves around the concept of the "curse of dimensionality," particularly focusing on how the volume outside a hypersphere inscribed in a unit hypercube behaves as the dimension increases. Participants explore the implications of dimensionality in various contexts, including mathematical reasoning, Monte Carlo integration, and the paradoxes associated with higher dimensions.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • Some participants note that the volume outside a hypersphere inscribed in a unit hypercube converges to 1 as the dimension increases, suggesting that space "gets bigger" in some sense.
  • Others argue that cardinality is not related to volume measurements, emphasizing that different intervals can have the same cardinality but vastly different volumes.
  • There is a discussion about the need for more points in higher dimensions when estimating integrals using methods like Monte Carlo, implying that the density of points decreases as dimensions increase.
  • Participants mention the relevance of the curse of dimensionality in machine learning, where more attributes lead to longer evaluation times.
  • Some contributions clarify the mathematical expressions for the volumes of hyperspheres and hypercubes, with corrections regarding the distinction between volume and surface area.
  • One participant raises a question about the Axiom of Choice and its relation to the Banach-Tarski paradox, indicating a deeper philosophical inquiry into the nature of points in higher dimensions.
  • Another participant discusses the probability interpretation of the volume ratio between a unit sphere and its surrounding cube, suggesting that the average distance between points increases with more dimensions.

Areas of Agreement / Disagreement

The discussion contains multiple competing views and remains unresolved on several points, particularly regarding the implications of dimensionality on volume and density, as well as the interpretations of mathematical concepts involved.

Contextual Notes

Participants express uncertainty about the relationships between cardinality, volume, and density in higher dimensions, and there are unresolved mathematical steps in the discussion of hypersphere and hypercube volumes.

schniefen
Messages
177
Reaction score
4
TL;DR
The volume outside a hypersphere inscribed in a unit hypercube converges to 1 as the dimension increases. It seems that the space somehow “gets bigger” as the dimension increases. However, ##\mathbb{R}## has the same cardinality as ##\mathbb{R^2}##.
The volume outside a hypersphere inscribed in a unit hypercube converges to 1 as the dimension increases. It seems that the space somehow “gets bigger” as the dimension increases. However, given data points in the interval ##[0,1]##, the density of points in the interval compared to the density of points for the same points in the unit square doesn’t increase, since ##\mathbb{R}## has the same cardinality as ##\mathbb{R^2}##. Then, what is it that “gets bigger”, gets more? Is sparseness a more relevant term than density?
 
Physics news on Phys.org
Cardinality is not related volume measurements at all basically, and you should just not think about it. [0,1] and [0,10000000000] have the same cardinality, but the former has almost zero volume compared to the latter.
 
  • Like
Likes   Reactions: mfb and schniefen
Suppose one would like to estimate an integral of a 1D and 2D function in the unit interval and square respectively, using say Monte Carlo. Why are more points needed in the latter vs the former? It seems again the space has grown and the density of points decreased.
 
The curse of dimensionality is often used in the context of machine learning where more attributes to be considered means more dimensions to evaluate and thus longer evaluation times. In ML it’s good to simplify.
 
schniefen said:
Summary:: The volume outside a hypersphere inscribed in a unit hypercube converges to 1 as the dimension increases. It seems that the space somehow “gets bigger” as the dimension increases. However, ##\mathbb{R}## has the same cardinality as ##\mathbb{R^2}##.

The volume outside a hypersphere inscribed in a unit hypercube converges to 1 as the dimension increases.
Can you explain this? The hypercube has volume ##1## and the hyperspheres have volume ##\dfrac{2\pi^{n/2}r^{n-1}}{\Gamma(n/2)}.##

We get strange behaviors if we increase the dimension, cp.
https://www.physicsforums.com/threads/math-challenge-december-2019.981217/page-2#post-6274751 f.
 
schniefen said:
Suppose one would like to estimate an integral of a 1D and 2D function in the unit interval and square respectively, using say Monte Carlo. Why are more points needed in the latter vs the former? It seems again the space has grown and the density of points decreased.
I remember a colloquium about Banach-Tarski. The lecturer liked to emphasize, that it is less AC which contributes to the paradox, but rather our poor understanding of points.
 
fresh_42 said:
I remember a colloquium about Banach-Tarski. The lecturer liked to emphasize, that it is less AC which contributes to the paradox, but rather our poor understanding of points.
What is AC? What’s up with the points?

With the volume outside a hypersphere inscribed in the unit hypercube converging to 1, I meant that the volume of the hypersphere goes to zero as ##n## increases, right?
 
fresh_42 said:
Can you explain this? The hypercube has volume ##1## and the hyperspheres have volume ##\dfrac{2\pi^{n/2}r^{n-1}}{\Gamma(n/2)}.##

We get strange behaviors if we increase the dimension, cp.
https://www.physicsforums.com/threads/math-challenge-december-2019.981217/page-2#post-6274751 f.
The unit sphere has radius 1, so to be fair the unit cube needs to have "radius" 1 also (and hence volume ##2^n##). A correct claim is the volume of a sphere inscribed in a hypercube divided by the volume of that hypercube goes to zero as n goes to infinity.
 
schniefen said:
What is AC? What’s up with the points?
Banach-Tarski uses the Axiom of Choice as well as the concept of what a point is.
 
  • Like
Likes   Reactions: schniefen
  • #10
Is it correct to say that the density of points decreases as the dimension increases, for a fixed number of points? It doesn’t sound right, but yet I’ve come across phrases like “lower data density...” in higher dimensional space.
 
  • #11
I don't think you get that. If you set r=1, you have ##\pi^{n/2}\Gamma(n/2)##.

Taking log, and letting k=n/2, we get ##k \ln(\pi)- \ln(k!)##. Stirling's approximation says ##ln(k!) \approx k \ln(k)-k##. The ##k \ln(k)## term dominates the whole thing, so log of the volume goes to negative infinity as k goes to infinity, and hence the volume has to go to zero.Although now I realize there is a slight misstatements. Your formula is for the surface area of the hypersphere, the thing people normally talk about is the volume of a ball with radius 1 (though I don't think this changes the result, the ball is the thing that is geometrically a more tractable and important result).

The layman's description of what is happening is that as the dimension goes to infinity, all the volume of the ball is actually concentrated in points where all the coordinates look like ##1/\sqrt{n}##, so the volume of the ball becomes small relative to the cube it's inscribed in.
 
Last edited by a moderator:
  • #12
Yes, I saw my mistakes. It was the volume of the sphere, not the ball, and I automatically associated the exponential function as I saw ##\frac{c^n}{n!}## without recognizing that there is no summation. :frown:
 
  • #13
The volume ratio between unit sphere and surrounding cube is the probability that the sum of squares of uniform random numbers from 0 to 1 is below 1. For two dimensions, it's the chance that ##x^2+y^2<1## where x and y are both uniform random numbers in [0,1]. Obviously, as we add more and more terms that probability gets smaller and smaller. Every non-zero value of a third summand restricts the range x and y can have.
In other words, the average distance between your points increases, even though the distance in each dimension keeps following the same distribution. If you just have a fixed number of points then your coverage of the volume gets poorer and poorer the more dimensions you have.
 

Similar threads

  • · Replies 4 ·
Replies
4
Views
3K
  • · Replies 1 ·
Replies
1
Views
2K
  • · Replies 1 ·
Replies
1
Views
1K
  • · Replies 4 ·
Replies
4
Views
2K
  • · Replies 66 ·
3
Replies
66
Views
8K
  • · Replies 3 ·
Replies
3
Views
2K
  • · Replies 2 ·
Replies
2
Views
1K
  • · Replies 1 ·
Replies
1
Views
2K
  • · Replies 1 ·
Replies
1
Views
3K
  • · Replies 24 ·
Replies
24
Views
3K