Neural Network Simulation

Interactive visualization of how our ML models process CVE data and make predictions

Early Premium (MLP, 25 features) — Best Overall

AUC-ROC: 0.9913
Architecture: 25-256-128-64-1
Best For: SPARSE (71.6%)
Availability: At disclosure

Uses vendor/product rates and description similarity. Works immediately when a CVE is published.

Key Features: desc_similarity, vendor_rate, product_rate, version_rate

Full MLP (MLP, 66 features) — Full Data

AUC-ROC: 0.9719
Architecture: 66-256-128-64-1
Best For: RICH (3.2%)
Availability: After enrichment

Uses all available signals including EPSS, sightings, and ATT&CK features. Requires NVD enrichment.

Key Features: epss_score, sightings, attack_tactics, plus the 25 early features

GNN GraphSAGE (Graph Neural Network) — Interpretable

AUC-ROC: 0.9344
Architecture: 18-128-128-1
Best For: MODERATE (25.2%)
Availability: Requires graph edges

Learns from the CVE-CWE-CAPEC-ATT&CK knowledge graph. Provides interpretable reasoning chains.

Graph Nodes: CVE, CWE, CAPEC, ATT&CK

Why Three Models?

Different CVEs have different amounts of data available. Our adaptive ensemble selects the best model based on data availability.

71.6% — SPARSE Regime (limited NVD data)

Most CVEs at publication: no CWE, no CPE. Early Premium dominates with a 66.7% similarity weight.

25.2% — MODERATE Regime (partial enrichment)

Some enrichment data is available. The GNN gets 16.9% weight; knowledge-graph reasoning adds value.

3.2% — RICH Regime (full EPSS + sightings)

All signals are available. The Full MLP gets 39.9% weight, and the ensemble draws on all three models.
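The regime-based selection above can be sketched in Python. This is an illustrative toy, not the production ensemble: the field names (`epss_score`, `sightings`, `cwe`, `cpe`) and the function are assumptions, and only the single dominant weight quoted per regime in the text is included.

```python
# Illustrative sketch of the adaptive ensemble described above.
# Field names and regime tests are hypothetical; the real system's
# blending logic may differ.

def classify_regime(cve):
    """Assign a CVE to a data-availability regime (hypothetical fields)."""
    if cve.get("epss_score") is not None and cve.get("sightings") is not None:
        return "RICH"        # full EPSS + sightings available
    if cve.get("cwe") or cve.get("cpe"):
        return "MODERATE"    # partial NVD enrichment
    return "SPARSE"          # publication-time data only

# Dominant per-regime weights quoted in the text (the remaining mass,
# split among the other models, is not specified and is omitted).
REGIME_WEIGHTS = {
    "SPARSE":   {"early_premium": 0.667},
    "MODERATE": {"gnn": 0.169},
    "RICH":     {"full_mlp": 0.399},
}

cve = {"cwe": None, "cpe": None}   # a just-published CVE
print(classify_regime(cve))        # → SPARSE
```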

Detailed Comparison

| Model | AUC-ROC | Features | Architecture | Best For | Key Advantage |
|---|---|---|---|---|---|
| Early Premium (Recommended) | 0.9913 | 25 | 25-256-128-64-1 | SPARSE (71.6%) | Works at disclosure |
| Full MLP | 0.9719 | 66 | 66-256-128-64-1 | RICH (3.2%) | Uses all signals |
| GNN GraphSAGE | 0.9344 | 18 + KG | 18-128-128-1 | MODERATE (25.2%) | Interpretable |

MLP Forward Pass Visualization

Select a model to simulate data flow through the network

Input → Hidden 1 (256 neurons, ReLU + BN + Dropout) → Hidden 2 (128 neurons, ReLU + BN + Dropout) → Hidden 3 (64 neurons, ReLU + BN + Dropout) → Output (Sigmoid → probability)
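The forward pass above can be sketched in NumPy. This is a shape-and-activation illustration only, with random untrained weights, and BatchNorm simplified to per-batch standardization:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hidden_block(x, w, b, train=True, drop=0.3):
    """One hidden block: Linear -> ReLU -> (simplified) BatchNorm -> Dropout."""
    h = relu(x @ w + b)
    h = (h - h.mean(axis=0)) / (h.std(axis=0) + 1e-5)  # stand-in for BatchNorm
    if train:
        mask = rng.random(h.shape) >= drop              # 30% dropout
        h = h * mask / (1.0 - drop)                     # inverted-dropout scaling
    return h

sizes = [25, 256, 128, 64]                 # Early Premium: 25-256-128-64-1
params = [(rng.standard_normal((a, b)) * 0.05, np.zeros(b))
          for a, b in zip(sizes, sizes[1:])]
w_out, b_out = rng.standard_normal((64, 1)) * 0.05, np.zeros(1)

x = rng.standard_normal((8, 25))           # batch of 8 CVE feature vectors
h = x
for w, b in params:
    h = hidden_block(h, w, b)
p = sigmoid(h @ w_out + b_out)             # exploitation probability per CVE
print(p.shape)                             # (8, 1), each value in (0, 1)
```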

Network Stats

Input Features, Architecture, and Total Parameters vary with the selected model.
Dropout Rate: 30%

Key Differences

Early Premium (25 features)
  • Vendor/product historical rates
  • Version exploitation patterns
  • Description similarity (Sentence-BERT)
  • Available immediately at disclosure
Full MLP (66 features)
  • All 25 Early Premium features, plus:
  • EPSS scores, sighting counts
  • ATT&CK technique mappings
  • Requires NVD enrichment (days/weeks)

GNN Message Passing

GraphSAGE Aggregation Formula
h_v^(k) = σ( W · [ h_v^(k-1) ‖ AGG({ h_u^(k-1) : u ∈ N(v) }) ] )

Each node aggregates features from its neighbors and combines with its own

CVE node (18 features) → CWE neighbors (MEAN aggregation) → CAPEC patterns (2-hop neighbors) → ATT&CK techniques (threat context)

Layer 1: Aggregate from CWEs

Each CVE collects feature vectors from its connected CWE nodes. These are averaged (MEAN aggregation) and concatenated with the CVE's own features.

Layer 2: Aggregate from CAPEC/ATT&CK

Information flows from 2-hop neighbors (CAPEC patterns linked to CWEs). CVEs with CWEs leading to severe attack patterns inherit higher risk signals.
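The two layers above can be sketched as plain NumPy message passing. This is a toy: 18 features per node as in the card, weight shapes following the 18-128-128-1 stack, but everything is random and untrained, and the real model's edge handling may differ.

```python
import numpy as np

rng = np.random.default_rng(1)

def sage_layer(h, neighbors, w):
    """One GraphSAGE layer with MEAN aggregation:
    h_v' = ReLU(W @ [h_v || mean(h_u for u in N(v))])."""
    out = []
    for v, nbrs in enumerate(neighbors):
        agg = h[nbrs].mean(axis=0) if nbrs else np.zeros(h.shape[1])
        out.append(np.maximum(0.0, np.concatenate([h[v], agg]) @ w))
    return np.array(out)

# Toy graph: node 0 is a CVE, nodes 1-2 are its CWE neighbors.
h = rng.standard_normal((3, 18))              # 18 input features per node
neighbors = [[1, 2], [0], [0]]
w1 = rng.standard_normal((36, 128)) * 0.05    # concat doubles width: 18*2 -> 128
h1 = sage_layer(h, neighbors, w1)             # layer 1: aggregate from CWEs
w2 = rng.standard_normal((256, 128)) * 0.05   # 128*2 -> 128
h2 = sage_layer(h1, neighbors, w2)            # layer 2: 2-hop info arrives
print(h2.shape)                               # (3, 128)
```

After two layers, node 0's embedding mixes in features that originated two hops away, which is how CAPEC/ATT&CK context reaches a CVE.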

Knowledge Graph Statistics

  • CVE Nodes
  • CWE Nodes
  • CAPEC Nodes
  • ATT&CK Techniques
  • Total Edges

Monte Carlo Dropout Uncertainty

Measure prediction confidence by running multiple forward passes

How Monte Carlo Dropout Works

1. Keep Dropout On

Unlike normal inference, keep dropout enabled during prediction.

2. Run 50 Passes

Each pass uses a different random dropout mask, so each gives a slightly different prediction.

3. Measure the Spread

Low variance means a high-confidence prediction; high variance means an uncertain one.
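The three steps above can be sketched with a toy one-layer network standing in for the MLP (the real model has the full 25-256-128-64-1 stack; only the MC Dropout mechanics are shown):

```python
import numpy as np

rng = np.random.default_rng(2)

def predict_once(x, w, drop=0.3):
    """One stochastic forward pass with dropout left ON (MC Dropout)."""
    mask = rng.random(x.shape) >= drop           # step 1: dropout stays active
    h = np.maximum(0.0, x * mask / (1 - drop))
    return 1.0 / (1.0 + np.exp(-(h @ w)))        # sigmoid probability

x = rng.standard_normal(25)                      # one CVE's 25-feature vector
w = rng.standard_normal(25) * 0.3                # toy weights

# Step 2: 50 passes, each with a different random mask.
preds = np.array([predict_once(x, w) for _ in range(50)])

# Step 3: the spread of the 50 predictions measures confidence.
mean, std = preds.mean(), preds.std()
print(f"p = {mean:.3f} ± {std:.3f}")
```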

Training Visualization

Training Loss

Validation AUC-ROC

Tracked per run: Best Epoch, Best AUC, Early Stop Epoch
Max Epochs: 100

Training Configuration

Optimizer: Adam
Learning Rate: 0.001
Batch Size: 512
Early Stopping: 15 epochs patience
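The early-stopping rule in the configuration can be sketched as follows. The Adam optimizer, learning rate, and batch size live in the training framework; only the patience logic is shown, driven by a made-up validation-AUC curve:

```python
# Sketch of early stopping with 15 epochs of patience, as configured above.
# The AUC curve is synthetic, purely for illustration.

def train_with_early_stopping(val_aucs, patience=15, max_epochs=100):
    """Stop when validation AUC has not improved for `patience` epochs."""
    best_auc, best_epoch, waited = 0.0, 0, 0
    for epoch, auc in enumerate(val_aucs[:max_epochs], start=1):
        if auc > best_auc:
            best_auc, best_epoch, waited = auc, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break   # early stop: no improvement for `patience` epochs
    return best_epoch, best_auc

# Synthetic curve: improves for 30 epochs, then plateaus.
curve = [0.90 + 0.003 * min(e, 30) for e in range(1, 101)]
best_epoch, best_auc = train_with_early_stopping(curve)
print(best_epoch)   # → 30 (training stops well before the 100-epoch cap)
```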

Activation Functions

ReLU in the hidden layers; Sigmoid on the output.

Dropout

Rate: 30%. Dropout is also kept on at inference time for Monte Carlo Dropout uncertainty estimation.

Batch Normalization

Normalizes each layer's inputs, stabilizing and speeding up training.

Loss Function

Binary Cross-Entropy Loss

L = -[y·log(p) + (1-y)·log(1-p)]

Measures how well predicted probabilities match actual exploitation outcomes (0 or 1).

💡 Heavily penalizes confident wrong predictions. If model predicts 95% and CVE wasn't exploited, loss is high.
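The penalty described above is easy to verify numerically with the formula as given:

```python
import math

def bce(y, p):
    """Binary cross-entropy: L = -[y·log(p) + (1-y)·log(1-p)]."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# Confident and right: exploited CVE (y=1) predicted at 0.95 -> tiny loss.
print(round(bce(1, 0.95), 3))   # → 0.051
# Confident and wrong: not exploited (y=0) but predicted at 0.95 -> large loss.
print(round(bce(0, 0.95), 3))   # → 2.996
```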