Data Distribution

  • Thread starter ipmax
  • Start date
  • #1
4
0

Main Question or Discussion Point

Hi folks. I am trying out outlier detection techniques in Python. I have tested my algorithm on sinusoidally distributed data, and now I need data from some other kind of distribution to check that the algorithm works. Can you give me examples of other well-known distributions (like sine, Gaussian, binomial, etc.) that I can use for outlier detection?

IPMAX
 

Answers and Replies

  • #2
Simfish
Gold Member
818
2
Zipf, Poisson
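
A minimal sketch of drawing test samples from these two distributions with NumPy. The shape parameter `a`, the rate `lam`, the sample size, and the seed are illustrative assumptions, not values from the thread.

```python
import numpy as np

# Seeded generator so the test data is reproducible (seed is arbitrary).
rng = np.random.default_rng(0)

n_samples = 10_000  # assumed sample size
a = 2.0             # assumed Zipf shape parameter (must be > 1)
lam = 4.0           # assumed Poisson rate (expected count per interval)

# Zipf samples are positive integers with a heavy right tail.
zipf_samples = rng.zipf(a, size=n_samples)

# Poisson samples are non-negative integers clustered around lam.
poisson_samples = rng.poisson(lam, size=n_samples)

print("Zipf    min/median/max:", zipf_samples.min(),
      np.median(zipf_samples), zipf_samples.max())
print("Poisson mean/variance: ", poisson_samples.mean(), poisson_samples.var())
```

Zipf data is heavy-tailed, so legitimate samples can be very large, which makes it a hard case for any detector that assumes roughly Gaussian data.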
 
  • #3
673
2
What type of data do you have/what do you expect to see?

scipy.stats has a whole bunch of distributions you can test against and a bunch of tests for trying to figure out how your data is distributed.
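
As a rough sketch of what that could look like, the snippet below fits a few candidate `scipy.stats` distributions to a sample and compares them with a Kolmogorov-Smirnov test. The synthetic Gaussian sample and the choice of candidates are assumptions made for illustration; real data would replace `samples`.

```python
import numpy as np
from scipy import stats

# Placeholder data: a synthetic Gaussian sample (an assumption for this
# sketch). Replace with your own measurements.
rng = np.random.default_rng(0)
samples = rng.normal(loc=5.0, scale=2.0, size=1000)

# Candidate distributions to compare (an arbitrary illustrative selection).
candidates = {
    "norm": stats.norm,
    "expon": stats.expon,
    "uniform": stats.uniform,
}

results = {}
for name, dist in candidates.items():
    # Maximum-likelihood estimate of the distribution's parameters.
    params = dist.fit(samples)
    # KS test of the sample against the fitted distribution.
    res = stats.kstest(samples, name, args=params)
    results[name] = (res.statistic, res.pvalue)

# A smaller KS statistic means the fitted CDF is closer to the empirical CDF.
best = min(results, key=lambda n: results[n][0])
for name, (stat, p) in results.items():
    print(f"{name:8s} KS statistic = {stat:.3f}, p-value = {p:.3g}")
print("closest candidate:", best)
```

Because the parameters are fitted to the same data that is then tested, the p-values are optimistic; the KS statistic is better read here as a relative ranking than as a strict hypothesis test.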
 
  • #4
4
0
I saw the scipy.stats module, but I am confused about which function would be appropriate. I am dealing with currents. What would be a good distribution for a current variable? I tried sinusoidal (that's what I could come up with) :P
 
  • #5
673
2
The way I've done it is to plot my data and then see which distributions it looks like. If you plot yours and post the graph, it may be easier to give you suggestions. Right now, I'd guess that sinusoidal does sound about right.
 
  • #6
4
0
You misunderstood my post. My whole point is to generate a current dataset from a certain distribution, mix random outliers into it, and then detect those outliers. I have tried a sinusoidal distribution as one possible dataset and run the detection on it. Now I need to devise a dataset with some other distribution. I only know of the sinusoidal current dataset. What other data distribution could plausibly be called a current dataset?
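
The workflow described here (generate a signal, inject outliers, detect them) can be sketched as below. This is not the poster's algorithm, which is not shown in the thread. The detector is a stand-in that compares each sample with the median of its neighbours, and all amplitudes, rates, and thresholds are assumed values.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed for reproducibility


def make_current(n=4000, freq_hz=50.0, fs_hz=10_000.0, amp=5.0, noise=0.1):
    """Noisy sinusoidal current; every parameter is an illustrative assumption."""
    t = np.arange(n) / fs_hz
    return amp * np.sin(2 * np.pi * freq_hz * t) + rng.normal(0.0, noise, n)


def inject_outliers(signal, n_outliers=20, magnitude=5.0):
    """Add +/- magnitude spikes at random positions; return data and positions."""
    corrupted = signal.copy()
    idx = rng.choice(signal.size, size=n_outliers, replace=False)
    corrupted[idx] += rng.choice([-1.0, 1.0], size=n_outliers) * magnitude
    return corrupted, np.sort(idx)


def detect_outliers(signal, half_width=3, threshold=8.0):
    """Flag samples that sit far from the median of their neighbours."""
    padded = np.pad(signal, half_width, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, 2 * half_width + 1)
    # Drop the centre column so a sample is never compared with itself.
    neighbours = np.delete(windows, half_width, axis=1)
    residual = signal - np.median(neighbours, axis=1)
    # Robust noise scale from the median absolute deviation (MAD).
    mad = np.median(np.abs(residual - np.median(residual)))
    return np.flatnonzero(np.abs(residual) > threshold * 1.4826 * mad)


current = make_current()
corrupted, injected_idx = inject_outliers(current)
detected_idx = detect_outliers(corrupted)

found = np.intersect1d(detected_idx, injected_idx).size
print(f"injected {injected_idx.size}, detected {detected_idx.size}, correct {found}")
```

Swapping `make_current` for a generator of a different waveform leaves the injection and scoring steps unchanged, which makes it easy to compare a detector across several kinds of test data.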
 
  • #7
673
2
"I just know sinusoidal current dataset... what else could be a data distribution that would be favorable to be called a current dataset?"
Depends on the device or system you're trying to simulate: digital currents will likely look like the derivative of a square wave (that is, a train of impulse functions), MOSFETs look sort of like the curves at http://en.wikipedia.org/wiki/Current%E2%80%93voltage_characteristic, etc. You may need outlier detection for some distributions and not others.
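
To illustrate the square-wave remark, here is a small sketch that builds a sampled square wave and takes its first difference as a discrete stand-in for the derivative. The sample count and period are assumptions chosen so the edges fall on exact sample boundaries.

```python
import numpy as np

n = 1000           # assumed number of samples
half_period = 100  # assumed samples per half cycle

# Square wave alternating between +1 and -1 every half_period samples.
k = np.arange(n)
square = np.where((k // half_period) % 2 == 0, 1.0, -1.0)

# First difference as a discrete derivative: zero on the flat parts and a
# single nonzero sample at each rising or falling edge.
impulses = np.diff(square, prepend=square[0])

edge_positions = np.flatnonzero(impulses)
print("edges at samples:", edge_positions)
print("impulse heights: ", impulses[edge_positions])
```

A signal like this is almost always zero with occasional large spikes, so a naive threshold detector would flag every legitimate edge as an outlier. That may be one reason outlier detection suits some waveforms better than others.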
 
  • #8
4
0
What about an exponential current, or a sinc? Is a sinc current plausible in the real world?
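
Whether or not these occur in a particular device, both waveforms are easy to generate as test data. A minimal sketch follows, in which the initial current `i0`, the time constant `tau`, and the time ranges are assumed values.

```python
import numpy as np

# Exponentially decaying current: i(t) = i0 * exp(-t / tau).
i0 = 2.0   # assumed initial current
tau = 0.5  # assumed time constant
t_exp = np.linspace(0.0, 5.0, 501)
exp_current = i0 * np.exp(-t_exp / tau)

# Sinc-shaped current. np.sinc is the normalized sinc, sin(pi*x) / (pi*x),
# which equals 1 at x = 0 and 0 at every other integer x.
t_sinc = np.linspace(-5.0, 5.0, 1001)
sinc_current = np.sinc(t_sinc)

print("exponential: start =", exp_current[0], "end =", exp_current[-1])
print("sinc: peak =", sinc_current.max(), "min =", sinc_current.min())
```

Either array can then be mixed with noise and artificial outliers in the same way as the sinusoidal dataset.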
 