Confusion Matrix to solve Cyber Crime

Daksh Jain
2 min read · Jun 6, 2021

A confusion matrix is a technique for summarizing the performance of a classification algorithm. Classification accuracy alone can be misleading if you have an unequal number of observations in each class or if you have more than two classes in your dataset.
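To see why accuracy alone can mislead, here is a minimal sketch with made-up labels (not taken from any real data set): 95 normal connections, 5 attacks, and a model that simply predicts "normal" every time.

```python
# Hypothetical, made-up labels: 0 = normal, 1 = attack.
y_true = [0] * 95 + [1] * 5
# A useless model that predicts "normal" for every connection.
y_pred = [0] * 100

# Fraction of predictions that match the actual label.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
# Number of attacks the model actually flagged.
attacks_caught = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))

print(f"accuracy = {accuracy:.2f}, attacks caught = {attacks_caught}")
```

The accuracy looks good, yet not a single attack is detected. That gap is what the confusion matrix makes visible.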

Confusion Matrix

True Positive: You predicted positive and the actual class is positive.
True Negative: You predicted negative and the actual class is negative.
False Positive (Type I error): You predicted positive but the actual class is negative.
False Negative (Type II error): You predicted negative but the actual class is positive.
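The four cells above can be counted directly from a list of actual labels and a list of predictions. This is a small sketch in plain Python; the function name `confusion_counts` and the sample labels are made up for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP and FN for a binary classification problem."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative (Type I)
        else:
            if actual == positive:
                fn += 1  # predicted negative, actually positive (Type II)
            else:
                tn += 1  # predicted negative, actually negative
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn}


# Hypothetical labels: 1 = attack, 0 = normal traffic.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
counts = confusion_counts(y_true, y_pred)
print(counts)
```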

KDD Cup '99 Data Set Description

This data set was prepared by Stolfo et al. and is built on the data captured in the DARPA'98 IDS evaluation program. DARPA'98 consists of about 4 gigabytes of compressed raw (binary) tcpdump data from 7 weeks of network traffic, which can be processed into about 5 million connection records of about 100 bytes each.

For each TCP/IP connection, 41 quantitative (continuous) and qualitative (discrete) features were extracted. Of these 41 features, 34 are numeric and 7 are symbolic.
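As a rough illustration of the numeric/symbolic split, the sketch below separates the fields of one connection record. The record holds only six of the 41 features, and its values are made up. The `SYMBOLIC` set is an assumption covering just this subset; a full pipeline would need to list all seven symbolic columns.

```python
# Assumption: the symbolic (categorical) fields within this small subset.
SYMBOLIC = {"protocol_type", "service", "flag"}

# Hypothetical connection record with made-up values.
record = {
    "duration": 0,
    "protocol_type": "tcp",
    "service": "http",
    "flag": "SF",
    "src_bytes": 181,
    "dst_bytes": 5450,
}


def split_features(rec, symbolic_names=SYMBOLIC):
    """Return (numeric, symbolic) dictionaries for one record."""
    symbolic = {k: v for k, v in rec.items() if k in symbolic_names}
    numeric = {k: v for k, v in rec.items() if k not in symbolic_names}
    return numeric, symbolic


numeric, symbolic = split_features(record)
print("numeric: ", numeric)
print("symbolic:", symbolic)
```

Symbolic fields are kept apart because most classifiers, SVMs included, need them encoded as numbers before training.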

To analyze the different results, there are standard metrics that have been developed for evaluating network intrusion detection. Detection rate (DR) and false alarm rate are the two most widely used. DR is computed as the ratio between the number of correctly detected attacks and the total number of attacks, while the false alarm (false positive) rate is computed as the ratio between the number of normal connections that are incorrectly classified as attacks and the total number of normal connections.
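In confusion matrix terms, with "attack" as the positive class, DR is TP / (TP + FN) and the false alarm rate is FP / (FP + TN). Here is a short sketch with made-up counts; the function names are chosen only for illustration.

```python
def detection_rate(tp, fn):
    """Correctly detected attacks divided by all attacks."""
    total_attacks = tp + fn
    return tp / total_attacks if total_attacks else 0.0


def false_alarm_rate(fp, tn):
    """Normal connections flagged as attacks divided by all normal connections."""
    total_normal = fp + tn
    return fp / total_normal if total_normal else 0.0


# Hypothetical counts: 100 attacks and 100 normal connections.
tp, fn, fp, tn = 90, 10, 5, 95
dr = detection_rate(tp, fn)
far = false_alarm_rate(fp, tn)
print(f"DR = {dr:.2f}, false alarm rate = {far:.2f}")
```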

In the parallel SVM, we first reduced the non-classified feature data using a distance matrix of the binary pattern. Building on this, the cascade structure initializes the problem as a number of independent smaller optimizations, and the partial results are combined in later stages in a hierarchical way, as shown in Figure 1, supposing the training data subsets are independent of one another.
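The hierarchical combination can be sketched in a few lines. This shows only the cascade structure, not the implementation behind the article. The `train` argument is a placeholder for an SVM solver that returns its support vectors, and `toy_train` is a made-up stand-in for 1-D separable data that keeps the closest point of each class.

```python
def cascade(subsets, train):
    """Train on each subset independently, then merge results pairwise."""
    if not subsets:
        return []
    # First layer: independent smaller optimizations, one per subset.
    layer = [train(s) for s in subsets]
    # Later layers: combine partial results two at a time and retrain.
    while len(layer) > 1:
        merged = []
        for i in range(0, len(layer), 2):
            pair = layer[i] + (layer[i + 1] if i + 1 < len(layer) else [])
            merged.append(train(pair))
        layer = merged
    return layer[0]


def toy_train(samples):
    """Stand-in for an SVM: keep the boundary point of each class.

    Assumes 1-D samples (x, label) with labels -1 / +1, where every
    negative x lies below every positive x.
    """
    neg = [s for s in samples if s[1] == -1]
    pos = [s for s in samples if s[1] == 1]
    kept = []
    if neg:
        kept.append(max(neg))  # negative point closest to the boundary
    if pos:
        kept.append(min(pos))  # positive point closest to the boundary
    return kept


# Four hypothetical training subsets.
subsets = [
    [(0.1, -1), (0.9, 1)],
    [(0.3, -1), (0.7, 1)],
    [(0.2, -1), (0.8, 1)],
    [(0.4, -1), (0.6, 1)],
]
result = cascade(subsets, toy_train)
print(result)
```

Each first-layer call sees only its own subset, so those calls could run in parallel. Only the retained points move on to the next stage.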

So yeah, machine learning and the confusion matrix do play a big role in fighting cyber crime and making the internet a safer place for all!
