Time-based differential analysis

AI Thread Summary
The discussion focuses on identifying an appropriate term for analyzing network and system log data to detect significant events. The user seeks a term that better encapsulates their method than "rare events." They describe a system of scheduled jobs that monitor for infrequent occurrences, such as unusual software installations reported by antivirus clients, running at intervals from every 5 minutes to 24 hours. The analysis technique resembles anomaly detection, specifically referencing Grubbs' test for outliers, but with key differences: it detects multiple outliers per iteration, operates on multivariate datasets, and incorporates identified outliers as known events for future analysis rather than removing them. The conversation highlights the nuances of event detection in data analytics without formal training in the field.
stoomart
is the fancy term I've been using to describe how I mine various network and system log data for interesting events that I want to be aware of. Since I never pursued a college degree or studied big data analytics, I'm hoping someone can help me identify a more appropriate term to use besides the one above or "rare events", which I don't think captures the essence of what I'm doing.

I have built many scheduled jobs that run anywhere from every 5 minutes to every 24 hours, depending on the data and its importance. These jobs notify me of events that occurred since their last run but were not seen in the previous x days (the range depends on the data). One example is looking for infrequent software installations reported by our antivirus clients.
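The "events not seen in the previous x days" check described above can be sketched as a simple set difference between the current interval and a baseline window. This is a minimal illustration, not the poster's actual code; the event dictionary layout and the `"key"` field (e.g. a software package name) are assumptions.

```python
# Hypothetical sketch: report events from the current run interval whose
# key was not seen at all in the preceding lookback window.
from datetime import datetime, timedelta

def find_new_events(events, now, lookback_days=30, interval_minutes=5):
    """Return events from the current interval whose key did not appear
    in the preceding lookback window."""
    window_start = now - timedelta(days=lookback_days)
    interval_start = now - timedelta(minutes=interval_minutes)

    # Keys seen during the baseline window, excluding the current interval.
    baseline = {e["key"] for e in events
                if window_start <= e["time"] < interval_start}

    # Events in the current interval whose key is previously unseen.
    return [e for e in events
            if e["time"] >= interval_start and e["key"] not in baseline]

# Example: a routine update appears in the baseline, a new tool does not.
now = datetime(2024, 1, 31, 12, 0)
events = [
    {"key": "chrome_update", "time": datetime(2024, 1, 15)},
    {"key": "chrome_update", "time": datetime(2024, 1, 31, 11, 58)},
    {"key": "weird_tool",    "time": datetime(2024, 1, 31, 11, 59)},
]
print([e["key"] for e in find_new_events(events, now)])  # → ['weird_tool']
```

A scheduled job would call something like this each run, with `lookback_days` tuned per data source as the post describes.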

Any ideas what this analysis technique is called?
 
Thank you for the link @Nidum, it is very interesting to learn about the different methods for anomaly detection. It sounds like the closest fit for the method I'm using is Grubbs' test, with the following differences:

- Any number of outliers can be detected for each iteration, rather than a single outlier.
- Jobs often work on multivariate datasets, rather than univariate datasets.
- Outliers are added to the dataset as "known events" for subsequent iterations, rather than being expunged.
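The variant described in the list above can be sketched as follows. This is a hedged illustration, not the poster's implementation: it substitutes a plain z-score cutoff for the full Grubbs' statistic, flags any number of outliers per run, and retains flagged values in a `known` set so later runs keep them in the baseline instead of deleting them. The threshold and function names are assumptions.

```python
# Sketch of multi-outlier detection with "known events" retention.
import statistics

def flag_outliers(values, known, threshold=3.0):
    """Return values whose z-score exceeds the threshold and are not
    already known; flagged values are added to `known` rather than
    removed from the dataset."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    new_outliers = []
    for v in values:
        if sd > 0 and abs(v - mean) / sd > threshold and v not in known:
            new_outliers.append(v)
            known.add(v)  # retain as a known event for future iterations
    return new_outliers

# First run flags the spike; a second run over the same data stays quiet
# because the spike is now a known event.
known = set()
data = [10] * 20 + [100]
print(flag_outliers(data, known))  # → [100]
print(flag_outliers(data, known))  # → []
```

For multivariate data, the same pattern would be applied per variable (or to a combined distance measure such as Mahalanobis distance), with one `known` set per job.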
 