Modelling Long Sets of Data: Measuring "Harshness"

AI Thread Summary
Extrapolating from 500,000 binary data points to define "harshness" in audio datasets is challenging due to the high dimensionality and limited data. The relationship between data vectors and harshness may not be straightforward, making it difficult to predict harshness in other datasets without a clear physical basis. While various approaches exist, including physical explanations and machine learning techniques, the lack of certainty necessitates a realistic understanding of goals through probability theory. Transitioning from binary representations to more complex data, such as spectrums, adds further complexity to the analysis. Ultimately, measuring harshness remains uncertain and requires careful consideration of the methodologies employed.
clemon!
say i have 500,000 0s or 1s.
say i have 50 such sets, each that i have ranked or assigned a value to - "harshness".

can i then extrapolate - is that the right word - to find the perfect dataset that instantiates the property of harshness?
and can i measure the harshness of other datasets?



thanks for any help - I've asked quite a few dumb questions of the board already :) !
 


no help - not even anything i can google?
sorry - i keep changing what i want to be doing haha :)
 


With that amount of data? Realistically, it's very unlikely. The data is extremely high dimensional, and yet you have very little of it. Unless the relationship between each data vector and "harshness" is extremely simple (e.g. more ones = more harsh), you're going to have trouble finding meaningful relationships. Is this audio data of some sort?
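
To make the sample-size problem concrete, here's a minimal sketch (Python, with stand-in random data, so the numbers themselves mean nothing) of about the only kind of analysis 50 examples can support: collapse each 500,000-bit vector to a single summary feature and fit a line to it:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(50, 500_000))  # 50 datasets of 500,000 0s/1s (stand-in)
harshness = rng.uniform(0, 10, size=50)     # the 50 hand-assigned ratings (stand-in)

# One simple feature per dataset: the fraction of ones.
frac_ones = X.mean(axis=1)

# Least-squares line: harshness ~ a * frac_ones + b
a, b = np.polyfit(frac_ones, harshness, deg=1)
r = np.corrcoef(frac_ones, harshness)[0, 1]
print(f"slope={a:.3f}, intercept={b:.3f}, correlation={r:.3f}")
```

Anything much richer than a one- or two-feature fit will happily memorise 50 examples without learning anything that transfers to new data.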
 


yeah it's audio...
 


clemon! said:
yeah it's audio...

Then why is it a string of zeros and ones? That's not a very good way to represent audio for analysis.
 


clemon! said:
can i then extrapolate - is that the right word - to find the perfect dataset that instantiates the property of harshness?
and can i measure the harshness of other datasets?

There is no mathematical guarantee that you can accomplish those goals. For example, suppose you assign the property of harshness randomly. Then there is no formula that would predict the harshness of other datasets.

If you believe there are physical causes for how you rate the harshness of a data set, then there might be a way to predict the harshness of future data sets. There are many ways to approach this task, and whether a way works depends on the physical facts of the situation, not on any universal mathematical laws.

The approaches range from specific physical explanations of harshness to curve-fitting approaches or "black box" approaches (such as using simulated neural nets).

Since you are dealing with samples of data, you can't expect to have certainty about any answer you get. So you have to define your goals realistically using the language of probability theory. This is another complicated aspect of the problem.
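
To put the curve-fitting idea and the probabilistic framing together, here is a hedged sketch (assuming you have already reduced each dataset to a handful of summary features; the features and ratings below are random stand-ins) using a regularised linear fit with cross-validation, which reports generalisation as a score with a spread rather than a certainty:

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
features = rng.normal(size=(50, 5))      # 5 summary features per dataset (stand-in)
harshness = rng.uniform(0, 10, size=50)  # the hand-assigned ratings (stand-in)

model = Ridge(alpha=1.0)  # regularisation keeps 50 samples from being memorised
scores = cross_val_score(model, features, harshness, cv=5, scoring="r2")
print(f"cross-validated R^2: {scores.mean():.2f} +/- {scores.std():.2f}")
```

On random stand-in data the score will hover around or below zero; on real features it tells you whether the fit predicts anything at all.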
 


well the other [quite odd] thing about this is that i was thinking of mostly working with square waves... that's why 1s and 0s anyway.


but i think i changed my mind and want to work with spectrums. again it'll be a lot of data tho... maybe less than 500,000 cells but now it's not 1s or 0s.

i can export time/ifft to excel with a program called sigview, which is a good start. but this is now 3-dimensional data plus a ranking. and i have no idea how to start looking for a trend in rank... i might in theory be able to reduce the amount of data, but yeah...
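
For what it's worth, once you're working with spectra you can sidestep the 3-dimensional-data problem by collapsing each spectrum to a few numbers before looking for a trend in rank. The sketch below uses a synthetic 220 Hz square wave; the two features (spectral centroid and share of energy above 4 kHz) are guesses at harshness proxies, not an established metric:

```python
import numpy as np

fs = 44_100                                    # sample rate in Hz
t = np.arange(fs) / fs                         # one second of samples
signal = np.sign(np.sin(2 * np.pi * 220 * t))  # 220 Hz square wave (stand-in sound)

spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), d=1 / fs)
power = spectrum ** 2

# Power-weighted spectral centroid: a rough "brightness" number.
centroid = np.sum(freqs * power) / np.sum(power)
# Fraction of spectral energy above an arbitrary 4 kHz cutoff.
hf_ratio = power[freqs > 4000].sum() / power.sum()
print(f"centroid: {centroid:.0f} Hz, energy above 4 kHz: {hf_ratio:.1%}")
```

Fifty sounds times two or three features is a table small enough to eyeball, or to feed into the curve fitting discussed above.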



ideal result is a way of measuring, plus i suppose the most harsh sound. can anyone give me an idea of the legwork involved in this task? i have no maths training but was pretty good at it at high school :)


and yeah, i am aware there's no guarantee that "harshness" can be measured like this :) !
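
One concrete way to test that, for any candidate measure: compute the Spearman rank correlation between the measure and your 50 hand-assigned scores. A value near +1 means it orders sounds the way you do; near 0 means it doesn't. A sketch, again with stand-in data:

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
my_rankings = rng.permutation(50)  # your hand-assigned harshness order (stand-in)
candidate = rng.normal(size=50)    # e.g. spectral centroid per sound (stand-in)

rho, p_value = spearmanr(my_rankings, candidate)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.2f})")
```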
 