NormShield Blog

Machine Learning in Cyber Security Domain – 6: False Alarm Rate Reduction

An IDS/IPS may classify an event correctly or incorrectly. In the literature, classified events are evaluated in four categories:

  1. True Positives (TP): intrusive and anomalous,
  2. False Negatives (FN): intrusive but not anomalous,
  3. False Positives (FP): not intrusive but anomalous,
  4. True Negatives (TN): not intrusive and not anomalous.

TP and TN represent correctly classified events; FN and FP represent wrongly classified events. Recognizing an FN (intrusive but not anomalous) is very hard: the system cannot detect it by itself, so a human must be involved in the mechanism to recognize this type of event. An FP (not intrusive but anomalous) is an event classified as intrusive that is actually a normal user’s event. This is a very common occurrence in today’s systems. False alarm rate reduction is one of the challenging problems, especially for IDS/IPS systems used commercially.
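The four categories can be made concrete with a small sketch. It uses the standard confusion-matrix convention, in which a false negative is a missed intrusion and a true negative is a normal event that raised no alarm, and it computes the false alarm rate as FP / (FP + TN). The function names and the sample events are hypothetical and only serve as illustration.

```python
from collections import Counter


def categorize(is_intrusive: bool, is_alarm: bool) -> str:
    """Map the ground truth and the IDS verdict to one of TP, FP, FN, TN."""
    if is_alarm:
        return "TP" if is_intrusive else "FP"
    return "FN" if is_intrusive else "TN"


def false_alarm_rate(events) -> float:
    """Return FP / (FP + TN): the share of normal events that raised an alarm."""
    counts = Counter(categorize(intrusive, alarm) for intrusive, alarm in events)
    normal_events = counts["FP"] + counts["TN"]
    return counts["FP"] / normal_events if normal_events else 0.0


# Hypothetical labelled events as (is_intrusive, is_alarm) pairs.
events = [
    (True, True),    # TP: real intrusion, alarm raised
    (False, True),   # FP: normal event, alarm raised
    (False, False),  # TN: normal event, no alarm
    (False, False),  # TN
    (True, False),   # FN: real intrusion, no alarm
    (False, True),   # FP
]

rate = false_alarm_rate(events)
```

Computing this rate requires labelled events, which a live system usually does not have. That is why the filter has to work without labels.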

In general, to reduce the false alarm rate, an extra module (also known as a filter) is applied to the IDS/IPS output before it reaches the network administrator. In this way, false alarms are eliminated from the output, and the network administrator only has to handle a small number of alarms that may really be intrusion attempts. Thus, time and manpower are saved. This chapter explains how the filter module works and how it reduces false alarms.

Most researchers have proposed alarm correlation solutions for anomaly-based techniques, since purely anomaly-based techniques trigger more alarms than other techniques. Although a hybrid approach improves the visibility and performance of the system, it makes alarm correlation more complicated. Alarm management for the hybrid detection methods now in use still needs more attention from researchers.

There are two main assumptions behind anomaly-based IDSs: first, that intrusion events exhibit anomalous behavior, and second, that a user's profile does not change much in a short amount of time. False alarms occur when the boundaries of these assumptions are not well defined. Basically, the output of the IDS/IPS consists of two classes of events: attack events that are classified correctly, and normal events that are falsely classified as attacks. In reality, both attack events and normal events consist of many classes. Since we want to separate them into real alarms and false alarms, we treat the output data as having two classes.

Now we have the output data, but we do not know which alarms are real and which are not. In machine learning terminology, this means that the data have no labels. Because of this, we can use unsupervised techniques (also known as clustering techniques) to create two clusters for our purpose. Many clustering algorithms have been developed. In general, they use distance metrics to evaluate the similarity between samples, and every sample is grouped with similar samples, so each cluster contains samples that are similar to each other. After the algorithm runs, we have two classes of alarm data: one represents normal events, the other represents attack events. Based on the two main assumptions explained above, we can infer that the smaller cluster represents attack events.
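The idea can be sketched in a few lines of Python. This is a minimal illustration, not a production filter: it uses a plain k-means loop with k = 2 as a stand-in for whichever clustering algorithm is chosen, with Euclidean distance as the similarity metric. The feature vectors, the deterministic initialisation, and the function names are all assumptions made for the example.

```python
import math


def two_means(points, iterations=20):
    """Split points into two clusters with a plain k-means loop (k = 2)."""
    # Assumed deterministic start: the first point and the point farthest from it.
    first = points[0]
    farthest = max(points, key=lambda p: math.dist(p, first))
    centroids = [first, farthest]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each sample joins the nearest centroid.
        labels = [
            0 if math.dist(p, centroids[0]) <= math.dist(p, centroids[1]) else 1
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in (0, 1):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                centroids[k] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels


def flag_small_cluster(points):
    """Return True for samples in the smaller cluster (treated as likely attacks)."""
    labels = two_means(points)
    small = 0 if labels.count(0) < labels.count(1) else 1
    return [label == small for label in labels]


# Hypothetical, already-scaled alarm features, e.g. (connection rate, payload size).
alarms = [
    (1.0, 1.2), (0.9, 1.0), (1.1, 0.8), (1.2, 1.1), (0.8, 0.9), (1.0, 1.0),
    (8.0, 9.0), (9.0, 8.5),
]

likely_attacks = flag_small_cluster(alarms)
```

With this toy data, the two distant samples end up in the smaller cluster and are flagged, while the six similar samples are treated as false alarms. Real alarm data would need feature selection and scaling first, and the smaller-cluster rule holds only as far as the two assumptions above do.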

The approach explained in this chapter is one of the most basic ones; it is presented here to give a good understanding of the methodology. Many different, more complex, and successful approaches have been developed in the literature. In recent studies, researchers have combined more than one technique instead of using a single algorithm to reduce the false alarm rate. For example, in two-layered clustering, the first layer separates suspicious events from non-suspicious events, and the second layer makes the final decision for the clusters. Many hybrid approaches of this kind have been developed in the literature.