🧠 Anomaly Detection

InfraSight includes a machine learning–based anomaly detection module that identifies previously unseen attacks and abnormal container behavior in real time. It learns online using the River Python library, so the models adapt continuously as new data arrives.

🎯 Overview & Purpose

The anomaly detection subsystem continuously monitors container activity to detect deviations from normal behavior. Two complementary models are used:

  • Resource Usage Model: Detects unusual patterns in CPU, memory, and I/O usage.
  • Syscall Frequency Model: Identifies abnormal syscall invocation rates.

Both models aim to detect zero-day or otherwise previously unseen attack behaviors by learning the baseline behavior of each container directly from live event streams.

⚙️ Architecture & Workflow

The anomaly detection service consumes data directly from Kafka, where InfraSight’s eBPF-based client and server publish telemetry events.

The workflow proceeds in three stages:

  1. Data Ingestion: Events are streamed from Kafka topics corresponding to resource usage and syscall frequencies.

  2. Learning Phase: A dedicated model is created for each container. The model observes the container’s initial events to establish its baseline behavior, which takes approximately 5–15 minutes depending on the container’s activity level.

  3. Detection Phase: Once the warmup period ends, the model begins producing anomaly scores and emits alerts when unusual patterns are detected.

🧩 Each container maintains its own model instance, so detection stays context-aware and sensitive to that container’s behavior profile.
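The per-container isolation described above can be pictured as a registry that creates a model lazily the first time a container appears. The following is a minimal sketch only: `ContainerModel`, `ModelRegistry`, the `kind` labels, and the event dictionary layout are all hypothetical names chosen for illustration, and the placeholder model merely counts events instead of learning.

```python
from dataclasses import dataclass


@dataclass
class ContainerModel:
    """Hypothetical stand-in for one container's online model."""
    kind: str
    events_seen: int = 0

    def observe(self, features: dict) -> None:
        # A real online model would update its parameters here.
        self.events_seen += 1


class ModelRegistry:
    """Lazily creates one model per (container ID, model kind) pair."""

    def __init__(self) -> None:
        self._models = {}

    def get(self, container_id: str, kind: str) -> ContainerModel:
        key = (container_id, kind)
        if key not in self._models:
            self._models[key] = ContainerModel(kind=kind)
        return self._models[key]

    def __len__(self) -> int:
        return len(self._models)


# Assumed event shape; the actual Kafka payload is not specified here.
events = [
    {"container_id": "web-1", "kind": "resource", "features": {"cpu": 0.2}},
    {"container_id": "web-1", "kind": "syscall", "features": {"read": 120}},
    {"container_id": "db-1", "kind": "resource", "features": {"cpu": 0.7}},
    {"container_id": "web-1", "kind": "resource", "features": {"cpu": 0.3}},
]

registry = ModelRegistry()
for event in events:
    registry.get(event["container_id"], event["kind"]).observe(event["features"])
```

Keying on both the container ID and the model kind keeps the resource usage and syscall frequency baselines separate, so a noisy container cannot shift the baseline of a quiet one.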

🧮 Model Details

InfraSight currently uses two online models for anomaly scoring:

| Model | Description | Parameters |
|---|---|---|
| One-Class SVM | Learns normal behavior and assigns a signed score to each event. | Default configuration |
| Quantile Filter | Flags events above the 99th percentile of previously seen scores. | q = 0.99 |
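To make the quantile rule concrete, here is a small standard-library sketch of a filter that flags a score when it exceeds the q-quantile of the scores seen so far. It is not River's implementation: the class name `RunningQuantileFilter` is hypothetical, and it stores every score in a sorted list, which is simple and exact but grows without bound.

```python
import bisect


class RunningQuantileFilter:
    """Illustrative filter: flags scores above the q-quantile of past scores."""

    def __init__(self, q: float = 0.99) -> None:
        if not 0.0 < q < 1.0:
            raise ValueError("q must be strictly between 0 and 1")
        self.q = q
        self._sorted_scores = []

    def threshold(self):
        """Return the current q-quantile, or None before any score is seen."""
        if not self._sorted_scores:
            return None
        n = len(self._sorted_scores)
        index = min(int(self.q * n), n - 1)
        return self._sorted_scores[index]

    def classify(self, score: float) -> bool:
        """True if the score exceeds the quantile of previously seen scores."""
        limit = self.threshold()
        return limit is not None and score > limit

    def update(self, score: float) -> None:
        """Record a score so that later thresholds take it into account."""
        bisect.insort(self._sorted_scores, score)


quantile_filter = RunningQuantileFilter(q=0.99)
for value in range(1, 101):
    quantile_filter.update(float(value))
```

Calling `classify` before `update` for each event keeps the comparison restricted to previously seen scores, which matches the table's wording.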

Feature Extraction

The features used by each model are defined in the database schema:

Warmup Phase

  • The first 50 events per container are used exclusively for model initialization and are not evaluated for anomalies.
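A warmup rule like this one amounts to a per-container counter. The sketch below assumes a hypothetical `WarmupGate` helper; only the 50-event threshold comes from the text above.

```python
WARMUP_EVENTS = 50  # threshold taken from the warmup rule above


class WarmupGate:
    """Counts events and reports when evaluation may begin (hypothetical helper)."""

    def __init__(self, warmup_events: int = WARMUP_EVENTS) -> None:
        self.warmup_events = warmup_events
        self.seen = 0

    def admit(self) -> bool:
        """Count one event; return True only after the warmup has completed."""
        self.seen += 1
        return self.seen > self.warmup_events


gate = WarmupGate()
# Events 1 to 50 are used for learning only; event 51 is the first one evaluated.
evaluated = [gate.admit() for _ in range(52)]
```

One gate would be kept per container model, so a newly started container goes through its own warmup regardless of how long other containers have been running.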

Scoring and Alerting

An event is classified as anomalous when:

  • The Quantile Filter detects that the score exceeds the 99th percentile of previous scores, or
  • The One-Class SVM score is below 0.
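The two conditions combine with a logical OR. The sketch below returns the reasons that fired, so an empty list means the event is treated as normal. The function name and the reason labels are hypothetical.

```python
def anomaly_reasons(svm_score: float, above_quantile: bool) -> list:
    """Return the triggered conditions; an empty list means 'not anomalous'."""
    reasons = []
    if above_quantile:
        # The Quantile Filter reported a score above the 99th percentile.
        reasons.append("quantile_exceeded")
    if svm_score < 0:
        # The One-Class SVM produced a negative score.
        reasons.append("negative_svm_score")
    return reasons


def is_anomalous(svm_score: float, above_quantile: bool) -> bool:
    """An event is anomalous when at least one condition fires."""
    return bool(anomaly_reasons(svm_score, above_quantile))
```

Returning the reasons instead of a bare boolean makes it easy to report which condition caused a given alert.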

🚨 Alerting & Output

When an anomaly is detected:

  • An alert is logged to standard output with details such as container ID, timestamp, and anomaly type.
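As an illustration of such an alert, the sketch below writes one JSON object per line to standard output. The JSON layout, the field names, and the `format_alert` and `emit_alert` helpers are assumptions; the actual log format is not specified here.

```python
import json
import sys
from datetime import datetime, timezone


def format_alert(container_id: str, anomaly_type: str, timestamp=None) -> str:
    """Build a single-line JSON alert (assumed format, for illustration only)."""
    moment = timestamp or datetime.now(timezone.utc)
    record = {
        "container_id": container_id,
        "timestamp": moment.isoformat(),
        "anomaly_type": anomaly_type,
    }
    return json.dumps(record, sort_keys=True)


def emit_alert(container_id: str, anomaly_type: str, timestamp=None) -> None:
    """Write the alert to standard output and flush so log collectors see it."""
    print(format_alert(container_id, anomaly_type, timestamp), file=sys.stdout, flush=True)


emit_alert("web-1", "syscall_frequency", datetime(2024, 1, 1, tzinfo=timezone.utc))
```

One object per line keeps the output easy to parse for whatever collects the container's standard output.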