Topological Data Analysis - Persistent Homology

phys_student1 · Jun 3, 2013

Hi,

I am not a mathematician, but I have noticed some recent papers on this seemingly new field, called Topological Data Analysis (see this relevant paper).

I have read an overview of the applications, and it seems that when you have data points sampled from some source (e.g. an image), you can use Persistent Homology to visualize what the data look like in higher dimensions (this is my understanding).

I am still unsure what this really means. Will any data set have a higher-dimensional shape or geometry?

fresh_42 · Jun 11, 2019

This is built on thin ice. "Topological" here means the absence of scales, metrics and coordinates. But data are measurements of some kind, which come with natural coordinates, even though the author says otherwise. With topology you also lose all analytical tools, and a finite set of data points only admits trivial topologies unless some hypotheses are added.

I wouldn't take the paper very seriously; a closer examination of these additional conditions is due. Topology is a rather new field of mathematics - only 100 years old - so people are still looking for applications outside mathematics. Of course this is a personal opinion, so let's wait and see.

TeethWhitener · Jun 15, 2019

I’m by no means an expert, but I have noticed a significant increase in interest around the field of topological data analysis by a number of US funding agencies.

As I understand it, the basic technique is to take high-dimensional data and find lower-dimensional features that are persistent over several different length scales (according to some relevant metric). Features that are persistent are presumed to be related functionally in some way, whereas features that aren't are generally disregarded as noise. I have no idea how useful the technique is, but I wanted to chime in to point out that it seems to have caught funders' attention here in the US.
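
To make "persistent over several length scales" concrete, here is a minimal sketch for the simplest case, connected components (0-dimensional homology). It is only an illustration under assumptions, not taken from any of the papers mentioned in this thread: the function name `h0_persistence` and the toy points are made up, and the scale is taken to be the plain Euclidean distance between points.

```python
from itertools import combinations
from math import dist


def h0_persistence(points):
    """Scales at which connected components merge as the length scale grows.

    Every point starts as its own component at scale 0. Two components merge
    at the distance between their closest points (single linkage), and each
    merge ends the bar of one component.
    """
    parent = list(range(len(points)))

    def find(i):
        # Walk up to the component representative, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise distances, processed from the smallest scale upward.
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )

    deaths = []
    for scale, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j  # two components become one
            deaths.append(scale)
    return deaths  # n - 1 finite bars; the last component never dies


# Hypothetical toy data: two tight clusters that are far apart.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
deaths = h0_persistence(points)
print(sorted(deaths))
```

On this toy data, four of the five finite bars end near 0.1 (points inside a cluster merging), while one survives until the two clusters merge at a scale of about 7. In the language above, the long bar is the persistent feature and the short ones would be treated as noise.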

statdad · Jun 20, 2019

Don't be put off by people who dismiss topology because it is 'new' -- judge topological data analysis on its own merits.

A basic discussion is here.

https://towardsdatascience.com/from-tda-to-dl-d06f234f51d
A little more discussion is next (I tend to agree with the commentary here: It's an interesting idea, and could bring us some powerful mathematical ideas for sorting structure out from noise in high-dimensional data, but as of now results are mixed.)

https://rviews.rstudio.com/2018/11/...rspective-on-topological-data-analysis-and-r/

madness · Jun 28, 2019

There have been some interesting applications in neuroscience, for example:

https://www.biorxiv.org/content/biorxiv/early/2019/01/09/516021.full.pdf

WWGD · Jun 30, 2019

This may be close to an appeal to authority, but I have heard people who seem knowledgeable and otherwise smart enough endorse it. You associate a complex to the data you are given. Features that "persist" across dimensions are thought to be "structural"; the rest are considered noise.

WWGD · Jul 25, 2021

This is what I have understood: We somehow "functorially" assign a simplicial complex K to a data set S, together with a filtration of K, meaning a nested sequence of subcomplexes: if i<j, the i-th complex ## K_i ## is a subcomplex of the j-th complex ## K_j ##. The filtration usually arises from a real-valued function ##f: K \rightarrow \mathbb R ## chosen to mimic or model the problem of interest, which gives one subcomplex for every real number a through the sublevel set ## f^{-1}((-\infty, a]) ##. Then the k-th persistent homology group is the image of the map ## H_k(K_i) \rightarrow H_k(K_j) ## induced by inclusion.

We ultimately use the fundamental theorem on the decomposition of finitely generated modules over a PID. The PID here is the polynomial ring ## F[t] ## over a field F, so the free part consists of copies of ## F[t] ## and corresponds to features that persist forever, while the torsion part corresponds to features that eventually die. We exploit the fact that there is a correspondence between ## F[t] ##-modules ("F adjoin t" modules *) and "bar codes". Bar codes are collections of intervals, each describing the persistence of an element of homology. Persistence means that the homomorphism given by inclusion has a non-zero image.

Hope I did not make it more confusing. Will try to rewrite this more clearly when I can.
* I am not sure about these, but I believe an ## F[t] ##-module is a module in which "multiplication" by t is given by some fixed linear transformation.
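
For reference, the decomposition being described is usually written as below. This is a sketch under assumptions: the filtration is indexed by integers, coefficients are taken in a field F, the module is finitely generated, and t acts through the maps induced by inclusion. The symbols ## b_i ##, ## d_j ## and the grading shift ## \Sigma ## are notation introduced here, not taken from the post.

```latex
% Structure theorem for a finitely generated graded module over F[t], F a field.
% \Sigma^{b} shifts the grading up by b (notation assumed for this sketch).
H_k(K_\bullet; F) \;\cong\;
  \Bigl( \bigoplus_{i} \Sigma^{b_i} F[t] \Bigr)
  \;\oplus\;
  \Bigl( \bigoplus_{j} \Sigma^{b_j} F[t] / (t^{\,d_j - b_j}) \Bigr)
% free summand     <->  bar [b_i, \infty): born at b_i, never dies
% torsion summand  <->  bar [b_j, d_j):    born at b_j, dies at d_j
```

Read this way, a bar code is just the list of intervals ## [b_i, \infty) ## and ## [b_j, d_j) ## attached to the summands.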

springbottom · Aug 12, 2021

I feel like people are making it out to be more complicated than it is: the whole idea behind persistent homology is that
1) high-dimensional structured data (e.g. images) often live on some submanifold of the total space, and these manifolds often have nontrivial topological data associated, e.g. nontrivial homology groups
2) we can compute these groups effectively via persistent homology, which measures the homology that persists as you take a sample of points in your high-dimensional space and grow balls around them. (e.g. if you have a 1-hole represented in H_1 in your data, then as you grow balls about the points, the 1-hole will persist for a while, then die off. The persistent features encode the homology of the underlying manifold)
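
The "1-hole persists for a while, then dies off" picture can be checked on a toy example. The sketch below is one possible way to do it, under stated assumptions: it uses a Vietoris-Rips filtration (a simplex enters at the length of its longest edge) instead of literal balls, coefficients in Z/2, and the usual column reduction of the boundary matrix. The function names `rips_filtration` and `barcode`, the `min_length` cutoff and the 8-point circle are all made up for illustration.

```python
from itertools import combinations
from math import cos, dist, pi, sin


def rips_filtration(points, max_dim=2):
    """List (scale, simplex) pairs of a Vietoris-Rips filtration, in order.

    A simplex enters at the length of its longest edge, i.e. as soon as all
    of its vertices are pairwise within the current scale.
    """
    simplices = []
    for size in range(1, max_dim + 2):  # size = number of vertices
        for verts in combinations(range(len(points)), size):
            scale = max(
                (dist(points[a], points[b]) for a, b in combinations(verts, 2)),
                default=0.0,  # a single vertex is present from scale 0
            )
            simplices.append((scale, verts))
    # Sort by scale, then dimension, so every face precedes its cofaces.
    simplices.sort(key=lambda s: (s[0], len(s[1]), s[1]))
    return simplices


def barcode(simplices, min_length=1e-9):
    """Finite bars (dim, birth, death) from Z/2 boundary-matrix reduction."""
    index = {verts: i for i, (_, verts) in enumerate(simplices)}
    reduced = {}  # pivot (largest row index) -> reduced column owning it
    bars = []
    for death, verts in simplices:
        column = set()
        if len(verts) > 1:
            # Boundary mod 2: the faces obtained by dropping one vertex.
            column = {index[f] for f in combinations(verts, len(verts) - 1)}
        # Add earlier columns until the pivot is new or the column vanishes.
        while column and max(column) in reduced:
            column ^= reduced[max(column)]
        if column:
            pivot = max(column)
            reduced[pivot] = column
            birth = simplices[pivot][0]
            # This simplex kills the class created by simplex number `pivot`.
            # Bars shorter than min_length (ties, rounding) are dropped.
            if death - birth > min_length:
                bars.append((len(verts) - 2, birth, death))
    return bars


# Hypothetical sample: 8 evenly spaced points on the unit circle.
circle = [(cos(2 * pi * k / 8), sin(2 * pi * k / 8)) for k in range(8)]
bars = barcode(rips_filtration(circle))
h1_bars = [(birth, death) for dim, birth, death in bars if dim == 1]
print(h1_bars)
```

For this sample there should be a single H_1 bar, born when neighbouring points get connected (scale about 0.77) and dying once points three steps apart are joined (scale about 1.85). That one long bar is the hole of the circle. Classes that never die, such as the last connected component, are not reported by this sketch.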

WWGD · Aug 16, 2021

But how do you explain the ideal match between simplicial homology and bar codes? Why simplicial and not other types?
