Time-based differential analysis

  1. Mar 8, 2017 #1
    "Time-based differential analysis" is the fancy term I've been using to describe how I mine various network and system log data for interesting events that I want to be aware of. Since I never pursued a college degree or studied big data analytics, I'm hoping someone can help me identify a more appropriate term to use besides the one above or "rare events", which I don't think captures the essence of what I'm doing.

    I have built many scheduled jobs that run anywhere from every 5 minutes to every 24 hours depending on the data and its importance. These jobs will notify me if events occurred since their last run that were not seen in the previous x days (the range depends on the data). One example is to look for infrequent software installations reported by our antivirus clients.
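
    To make the description above concrete, here is a minimal sketch of one such job in Python. It is an illustration only, not the actual jobs: the function name `find_new_events`, the `(timestamp, key)` event layout, and the sample software names are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def find_new_events(events, last_run, lookback_days):
    """Return events at or after last_run whose key was not seen in the
    lookback window [last_run - lookback_days, last_run).

    events: list of (timestamp, key) pairs (hypothetical layout). The key
    can be any hashable value, e.g. a software name or a tuple of fields.
    """
    window_start = last_run - timedelta(days=lookback_days)
    # Keys observed during the baseline window before the last run.
    baseline = {key for ts, key in events if window_start <= ts < last_run}

    alerts = []
    for ts, key in sorted(events, key=lambda e: e[0]):
        if ts >= last_run and key not in baseline:
            alerts.append((ts, key))
            # Remember the key so repeats in the same run are not re-reported.
            baseline.add(key)
    return alerts


# Made-up sample data: software installations reported by antivirus clients.
last_run = datetime(2017, 3, 8, 12, 0)
events = [
    (last_run - timedelta(days=3), "7-Zip"),
    (last_run - timedelta(days=1), "7-Zip"),
    (last_run + timedelta(minutes=2), "7-Zip"),
    (last_run + timedelta(minutes=3), "Wireshark"),
    (last_run + timedelta(minutes=4), "Wireshark"),
]
alerts = find_new_events(events, last_run, lookback_days=30)
```

    With this sample data, only the first "Wireshark" installation is reported, because "7-Zip" already appears in the 30-day baseline and the second "Wireshark" event is treated as known by then.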

    Any ideas what this analysis technique is called?
  3. Mar 8, 2017 #2


  4. Mar 8, 2017 #3
    Thank you for the link, @Nidum; it is very interesting to learn about the different methods for anomaly detection. It sounds like the closest fit for the method I'm using is Grubbs' test, with the following differences:

    - Any number of outliers can be detected for each iteration, rather than a single outlier.
    - Jobs often work on multivariate datasets, rather than univariate datasets.
    - Outliers are added to the dataset as "known events" for subsequent iterations, rather than being expunged.
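
    For reference, the differences listed above are relative to the standard two-sided form of Grubbs' test, which assumes a univariate, approximately normal sample of N values with mean \bar{x} and sample standard deviation s, and tests only the single most extreme value per pass. The notation below is the usual textbook form, not something taken from the jobs described in this thread:

```latex
G = \frac{\max_{i=1,\dots,N} \lvert x_i - \bar{x} \rvert}{s},
\qquad
\text{flag an outlier at significance level } \alpha \text{ if }\;
G > \frac{N-1}{\sqrt{N}}
    \sqrt{\frac{t^{2}_{\alpha/(2N),\,N-2}}{N-2+t^{2}_{\alpha/(2N),\,N-2}}}
```

    Here t_{\alpha/(2N), N-2} is the upper critical value of the t-distribution with N-2 degrees of freedom. The approach in this thread compares categorical events against a set of previously seen events rather than computing a distance from a mean, so the similarity to Grubbs' test is loose.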
