Time-based differential analysis

SUMMARY

The discussion centers on time-based differential analysis, a method for mining network and system log data to identify significant events. The user has built scheduled jobs that run at intervals from every 5 minutes to every 24 hours and notify them of events that occurred since the last run but were not seen in the preceding lookback period. The closest established technique is Grubbs' test for outliers, with key differences: the method can detect multiple outliers per iteration, it works on multivariate datasets, and it keeps outliers as known events for future analysis.

PREREQUISITES
  • Understanding of time-based data analysis techniques
  • Familiarity with Grubbs' test for outliers
  • Knowledge of multivariate datasets
  • Experience with scheduling jobs in data processing tools
NEXT STEPS
  • Research Grubbs' test for outliers in detail
  • Explore multivariate anomaly detection techniques
  • Learn about scheduling jobs in Apache Airflow
  • Investigate best practices for logging and monitoring network events
USEFUL FOR

This discussion is beneficial for data analysts, system administrators, and cybersecurity professionals who are involved in event monitoring and anomaly detection in network and system logs.

stoomart
"Time-based differential analysis" is the fancy term I've been using to describe how I mine various network and system log data for interesting events that I want to be aware of. Since I never pursued a college degree or studied big data analytics, I'm hoping someone can help me identify a more appropriate term than this one or "rare events", which I don't think captures the essence of what I'm doing.

I have built many scheduled jobs that run anywhere from every 5 minutes to every 24 hours, depending on the data and its importance. Each job notifies me of events that occurred since its last run and were not seen in the previous x days (the range depends on the data). One example is a job that looks for infrequent software installations reported by our antivirus clients.
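To make the idea concrete, here is a minimal sketch of one such job in Python. It is an illustration only, not the actual jobs described above: the function name `novel_events`, the `(timestamp, key)` event layout, and the sample software names are all assumptions made for the example.

```python
from datetime import datetime, timedelta

def novel_events(events, last_run, now, lookback_days):
    """Return keys seen in (last_run, now] that were not seen in the
    lookback window ending at last_run. `events` is a list of
    (timestamp, key) tuples; this layout is assumed for illustration."""
    window_start = last_run - timedelta(days=lookback_days)
    # Keys seen during the lookback window count as already known.
    baseline = {key for ts, key in events if window_start <= ts <= last_run}
    # Keys seen since the last run are candidates for notification.
    recent = {key for ts, key in events if last_run < ts <= now}
    return sorted(recent - baseline)

# Made-up sample data: one routine install and one unusual one.
now = datetime(2017, 1, 31, 12, 0)
events = [
    (now - timedelta(days=3), "7-Zip"),
    (now - timedelta(minutes=2), "7-Zip"),
    (now - timedelta(minutes=1), "UnknownToolbar"),
]
alerts = novel_events(events, last_run=now - timedelta(minutes=5),
                      now=now, lookback_days=30)
```

With this sample data, only "UnknownToolbar" is reported, because "7-Zip" already appeared inside the 30-day lookback window. A real job would read the events from a log store and persist `last_run` between runs.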

Any ideas what this analysis technique is called?
 
Thank you for the link @Nidum; it is very interesting to learn about the different methods for anomaly detection. It sounds like the closest fit for the method I'm using is Grubbs' test, with the following differences:

- Any number of outliers can be detected for each iteration, rather than a single outlier.
- Jobs often work on multivariate datasets, rather than univariate datasets.
- Outliers are added to the dataset as "known events" for subsequent iterations, rather than being expunged.
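The three differences can be sketched as a small stateful tracker. This is a hedged illustration, not the real implementation and not Grubbs' test itself (which is a statistical test on a single numeric variable): the class name `KnownEvents`, the `(host, software)` key tuples, and the pruning of keys older than the lookback window are assumptions made for the example.

```python
from datetime import datetime, timedelta

class KnownEvents:
    """Tracks keys seen within a lookback window (hypothetical helper)."""

    def __init__(self, lookback_days):
        self.lookback = timedelta(days=lookback_days)
        self.last_seen = {}  # key tuple -> most recent timestamp

    def run(self, batch, now):
        """Process one iteration and return every novel key in the batch."""
        cutoff = now - self.lookback
        # Forget keys that have not been seen within the lookback window.
        self.last_seen = {k: ts for k, ts in self.last_seen.items()
                          if ts >= cutoff}
        novel = []
        for ts, key in sorted(batch, key=lambda event: event[0]):
            if key not in self.last_seen:
                novel.append(key)  # any number of outliers per iteration
            # Keep the outlier as a known event instead of expunging it.
            self.last_seen[key] = max(ts, self.last_seen.get(key, ts))
        return novel

# Made-up sample data; each key combines several fields (multivariate).
t0 = datetime(2017, 1, 1)
tracker = KnownEvents(lookback_days=30)
first = tracker.run([(t0, ("host-a", "7-Zip")),
                     (t0, ("host-b", "UnknownToolbar"))], now=t0)
second = tracker.run([(t0 + timedelta(minutes=4), ("host-a", "7-Zip")),
                      (t0 + timedelta(minutes=4), ("host-c", "7-Zip"))],
                     now=t0 + timedelta(minutes=5))
```

In this sketch the first run reports both keys, while the second run reports only `("host-c", "7-Zip")`, since `("host-a", "7-Zip")` was retained as a known event from the first iteration.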
 