Data collected from different devices: how to combine for analysis?

  • Context: Undergrad
  • Thread starter: Mikki123
  • Tags: Data
SUMMARY

This discussion covers the difficulty of combining current measurements recorded by three different devices for analysis. Participants stress that the samples must be equally spaced in time and free from jitter before comparisons are meaningful. The original poster asks whether a Fourier transform, in addition to normalization and feature extraction, would make the data sets similar enough to improve machine learning model performance; one reply suggests using the Fourier transform to look for interference. The thread concludes that understanding the measurement context and how each instrument works is essential for meaningful data analysis.

PREREQUISITES
  • Understanding of Fourier transformation techniques
  • Knowledge of data normalization methods
  • Familiarity with feature extraction in machine learning
  • Concept of data sampling and time series analysis
NEXT STEPS
  • Research Fourier transformation applications in data preprocessing
  • Explore advanced normalization techniques for time series data
  • Learn about feature extraction methods specific to sensor data
  • Study best practices for designing experiments and data collection protocols
USEFUL FOR

Data scientists, machine learning practitioners, and researchers working with multi-device data analysis and time series data who seek to enhance their understanding of data preprocessing techniques.

Mikki123
Hi Everyone,
I'm working on a project where I have current measurements from three different devices, recorded both with no arc present and with an arc produced by an arc generator. When I plot them, the traces all look different because the data comes from different devices. Is there anything I can do to make them comparable, for example by making them look similar, so that I can perform further analysis?
 
In your plot, where y is the current value, what is x?
 
It is just the sample index, running from 0 up to the number of samples.
 
You can still compute statistics, e.g. the average and standard deviation, as a purely mathematical treatment, but a sample number has no physical meaning. You would do better to pick some physical quantity from the samples to plot against, e.g. the same device under different physical conditions, or the same condition with different devices.
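As a minimal sketch of the statistics mentioned here, the snippet below computes the mean and sample standard deviation for each device and rescales each trace to zero mean and unit standard deviation. The readings and the names `summarize` and `standardize` are made up for illustration; nothing in the thread says the real devices differ only by offset and gain.

```python
import statistics


def summarize(samples):
    """Return (mean, sample standard deviation) of a list of current values."""
    return statistics.fmean(samples), statistics.stdev(samples)


def standardize(samples):
    """Rescale samples to zero mean and unit standard deviation (z-scores)."""
    mean, std = summarize(samples)
    if std == 0:
        raise ValueError("constant signal: standard deviation is zero")
    return [(x - mean) / std for x in samples]


# Hypothetical readings: device_b is device_a with a different gain and offset.
device_a = [1.0, 2.0, 3.0, 4.0, 5.0]
device_b = [10.0, 30.0, 50.0, 70.0, 90.0]

mean_a, std_a = summarize(device_a)
z_a = standardize(device_a)
z_b = standardize(device_b)
```

In this toy case the two standardized traces coincide, because a pure gain and offset difference is exactly what this rescaling removes. It does nothing about differences in sampling rate, filtering, or noise.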
 
All good. I will try doing that. Thank you :smile:
 
Mikki123 said:
it is just indexes starting from 0 to the number of samples
Are they collected over time? Then x is time, isn't it? Are these samples collected at regular intervals? Then it is just a matter of knowing the sampling frequency, no?
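If the samples really are taken at regular intervals, turning the index into time is a one-line conversion once the sampling frequency is known. The sketch below assumes uniform sampling; the rates are hypothetical placeholders, since the thread never states them, and `time_axis` and `duration_s` are illustrative names.

```python
def time_axis(num_samples, sample_rate_hz, t0=0.0):
    """Map sample indexes 0..num_samples-1 to times in seconds."""
    if sample_rate_hz <= 0:
        raise ValueError("sample_rate_hz must be positive")
    return [t0 + i / sample_rate_hz for i in range(num_samples)]


def duration_s(num_samples, sample_rate_hz):
    """Total record length in seconds, assuming uniform sampling."""
    return num_samples / sample_rate_hz


# Hypothetical rates: look up the real value for each device.
t = time_axis(5, 1000.0)                   # 0 ms, 1 ms, 2 ms, 3 ms, 4 ms
record_a = duration_s(800_000, 10_000.0)   # 80 s at 10 kHz
record_b = duration_s(800_000, 50_000.0)   # 16 s at 50 kHz
```

The same 800,000 samples cover very different time spans at different rates, which alone could make index-based plots from different devices look unalike.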
 
Hi Borek,
I just have a single column of 800,000 current values. The x values should be time, I suppose. I have the same from three different devices, but when I plotted them they looked very different. I want to train my machine learning model with this data for further processing, and because the data sets look so different I'm getting poor performance. Do you think applying a Fourier transform to all three would make them look similar, so that I can train my model better? I'm looking for any kind of preprocessing apart from normalization and feature extraction.
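For reference, a rough sketch of what the Fourier-transform step would produce: a magnitude spectrum with a frequency axis in Hz. It uses a direct discrete Fourier transform on a made-up test signal and assumes uniform sampling at a known rate; `magnitude_spectrum` is an illustrative name. Whether spectra from the three devices would actually resemble each other is not something the thread establishes.

```python
import cmath
import math


def magnitude_spectrum(samples, sample_rate_hz):
    """Return (frequencies_hz, magnitudes) for the non-negative-frequency DFT bins."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]  # remove the DC offset first
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(centered))
        freqs.append(k * sample_rate_hz / n)
        mags.append(abs(acc) / n)
    return freqs, mags


# Hypothetical test signal: 50 Hz sine plus a DC offset, 200 samples at 1 kHz.
rate = 1000.0
signal = [2.0 + math.sin(2 * math.pi * 50.0 * i / rate) for i in range(200)]
freqs, mags = magnitude_spectrum(signal, rate)
peak_hz = freqs[mags.index(max(mags))]
```

The direct sum costs O(n²) operations, so it only suits short records; for 800,000 samples a library FFT such as `numpy.fft.rfft` would be the practical choice.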
 
Mikki123 said:
The x values should be time, I suppose.
You should probably know this (?????). It matters that the samples are equally spaced, with no "jitter".
Then look at the Fourier transform (of the difference) to find interference.
When you say feature extraction what exactly do you mean?
Why do you expect similar results?
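A minimal sketch of a spacing check, assuming the device also logs a timestamp per sample (the original data is described as a single column of values, so this may not be available). The timestamps and the name `spacing_report` are hypothetical.

```python
import statistics


def spacing_report(timestamps_s):
    """Return (nominal interval, worst relative deviation from it)."""
    intervals = [b - a for a, b in zip(timestamps_s, timestamps_s[1:])]
    nominal = statistics.median(intervals)
    worst = max(abs(dt - nominal) for dt in intervals)
    return nominal, worst / nominal


# Hypothetical timestamps in seconds: one regular record, one with jitter.
regular = [i * 0.001 for i in range(6)]
jittery = [0.0, 0.001, 0.0021, 0.003, 0.0039, 0.005]

nominal_r, rel_r = spacing_report(regular)
nominal_j, rel_j = spacing_report(jittery)
```

A relative deviation near zero supports treating the index as a uniform time axis; a large one means the samples would need resampling onto a regular grid before any Fourier analysis.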
 
Good data requires a detailed understanding of what it really means; how the measurement instrument works; what it is REALLY measuring. Good lab work is more about experiment design and selecting and/or researching the instrumentation than it is about collecting the data. This would be especially true if you are trying to represent the same physical quantity with different methods.

Without this prior engineering it is likely that the data is meaningless, or has unknown meaning. Bad data can be combined however you like. Garbage in, garbage out applies from the very beginning of experimentation and analysis.

A set of numbers isn't data, it's just numbers. Data has an associated meaning and context.

You will not get useful answers from us if we don't know, in detail, what you are measuring, why, and how.
 