Looking for advice in clusterization

  • Thread starter: Frank Einstein
  • Tags: Time series
AI Thread Summary
To identify redundant sensors from a dataset with 1000 rows, the discussion suggests using a correlation matrix to analyze relationships between sensor signals. High correlation between two inputs indicates that one may be dropped without losing significant information. While DBSCAN was initially used for clustering, participants recommend that examining the correlation matrix may provide a more straightforward solution for this specific goal. Calculating the correlation matrix is noted as an easy task. This approach could streamline the process of sensor redundancy evaluation.
Frank Einstein
TL;DR Summary
I need to know how to cluster data measured at different time instants.
Hello everyone. I have a machine with a series of sensors. All sensors send a signal each minute. I want to know if any of those sensors are redundant. The data is available as an Excel file, where the columns are the variables and the rows are the measurements. I have 1000 rows.

To do this, I have used DBSCAN in Python as follows:

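[CODE lang="python" title="Data clusterization"]from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

# data: rows are measurements, columns are sensors
scaler = StandardScaler()
data_normalized = scaler.fit_transform(data)
# Transpose so that each sensor becomes one point to cluster
data_normalized = data_normalized.T
dbscan = DBSCAN(eps=15, min_samples=2)
clusters = dbscan.fit_predict(data_normalized)[/CODE]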

However, I think that there has to be a better way to find relationships between the variables (that is, between the sensors, which are the columns of the data file).

Could someone please point me towards a methodology more suitable for my goals?
Any answer is appreciated.
Thanks for reading.
Best regards.
Frank.
 
You can just look at the correlation matrix. If two inputs are highly correlated then you can probably drop one.
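
A minimal sketch of this suggestion, assuming the Excel sheet has already been loaded into a pandas DataFrame with one column per sensor (for example with `pd.read_excel`). The function name `correlated_pairs`, the 0.95 threshold, and the synthetic sensor data are illustrative assumptions, not part of the thread. Note that Pearson correlation only captures linear relationships between signals.

```python
import numpy as np
import pandas as pd


def correlated_pairs(data: pd.DataFrame, threshold: float = 0.95):
    """Return (sensor_1, sensor_2, |r|) for every pair of columns whose
    absolute Pearson correlation exceeds `threshold` (hypothetical helper)."""
    corr = data.corr().abs()  # absolute correlation matrix, sensors x sensors
    cols = list(corr.columns)
    pairs = []
    # Visit only the strict upper triangle so each pair is reported once
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            r = corr.iloc[i, j]
            if r > threshold:
                pairs.append((cols[i], cols[j], float(r)))
    return pairs


# Synthetic stand-in for the 1000-row measurement file (assumption)
rng = np.random.default_rng(0)
base = rng.normal(size=1000)
demo = pd.DataFrame({
    "sensor_a": base,
    "sensor_b": 2.0 * base + 0.01 * rng.normal(size=1000),  # nearly a copy of sensor_a
    "sensor_c": rng.normal(size=1000),                      # unrelated signal
})

for first, second, r in correlated_pairs(demo):
    print(f"{first} and {second} are highly correlated (|r| = {r:.3f})")
```

Each reported pair is a candidate for dropping one of its two sensors. The threshold is a judgment call and should be chosen with the application in mind.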
 
Dale said:
You can just look at the correlation matrix. If two inputs are highly correlated then you can probably drop one.
Thanks. I can calculate them with ease as well.
 