I Looking for advice on clustering

Frank Einstein · Oct 15, 2023

Hello everyone. I have a machine with a series of sensors. All sensors send a signal each minute. I want to know if any of those sensors are redundant. The data is available as an Excel file, where the columns are the variables and the rows are the measurements. I have 1000 rows.

To do this, I used DBSCAN in Python as follows:

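[CODE lang="python" title="Data clusterization"]from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

# `data` holds the measurements: one row per minute, one column per sensor
scaler = StandardScaler()
data_normalized = scaler.fit_transform(data)
# Transpose so each sensor (column) becomes one sample to cluster
data_normalized = data_normalized.T
dbscan = DBSCAN(eps=15, min_samples=2)
clusters = dbscan.fit_predict(data_normalized)[/CODE]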

However, I think there must be a better way to find relationships between the variables (each sensor is a column of the data file).

Could someone please point me towards a methodology more suitable for my goals?
Any answer is appreciated.
Thanks for reading.
Best regards.
Frank.

Dale · Oct 15, 2023

You can just look at the correlation matrix. If two inputs are highly correlated then you can probably drop one.
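A minimal sketch of that idea, assuming the measurements are already in a pandas DataFrame with one column per sensor. The function name `redundant_pairs`, the 0.95 threshold, and the synthetic sensor columns are illustrative assumptions, not something taken from the original data.

```python
import numpy as np
import pandas as pd


def redundant_pairs(data: pd.DataFrame, threshold: float = 0.95):
    """List column pairs whose absolute Pearson correlation meets the threshold.

    The threshold is an assumed cut-off; pick it to suit the application.
    """
    corr = data.corr()  # Pearson correlation between columns by default
    cols = corr.columns
    pairs = []
    # Visit only the upper triangle so each pair is reported once
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            r = corr.iloc[i, j]
            if abs(r) >= threshold:
                pairs.append((cols[i], cols[j], float(r)))
    return pairs


# Synthetic stand-in for the sensor table (hypothetical column names)
rng = np.random.default_rng(0)
a = rng.normal(size=1000)
demo = pd.DataFrame({
    "sensor_a": a,
    "sensor_b": 2.0 * a + rng.normal(scale=0.05, size=1000),  # nearly a copy of sensor_a
    "sensor_c": rng.normal(size=1000),  # unrelated to the others
})

pairs = redundant_pairs(demo)
print(pairs)
```

Pearson correlation only captures linear relationships, so two sensors linked in a strongly nonlinear way could slip past this check.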

Frank Einstein · Oct 15, 2023

Dale said:

You can just look at the correlation matrix. If two inputs are highly correlated then you can probably drop one.

Thanks. I can calculate them with ease as well.
