Process Flow Visualization Tool for Audit Logs

This tool analyzes and visualizes process relationships, behaviors, and timelines from Linux audit logs generated by Auditbeat. It combines traditional rule-based analysis with machine-learning anomaly detection to analyze process behavior and flag suspicious activity.

Features

  • Parses auditbeat NDJSON log files
  • Builds process hierarchy trees and activity timelines
  • Identifies suspicious process behaviors and security concerns
  • Generates visualizations using Mermaid.js
  • Provides multiple visualization views:
    • Process Tree View: Shows hierarchical relationships
    • Timeline View: Shows process activities over time
    • Analysis Comparison: Compares traditional and ML analysis results
  • Color-coded process classification:
    • 🟢 Green: Root processes (PID 1)
    • 🔵 Blue: Normal processes
    • 🟠 Orange: Privileged processes (running as root)
    • 🔴 Red: Suspicious processes
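The parsing, tree-building, and color-classification steps above can be sketched as follows. This is a minimal illustration, not the project's implementation. The field names (`process.pid`, `process.ppid` or `process.parent.pid`, `user.id`) are assumptions about the Auditbeat event layout and may differ between versions. The function names and the precedence of the color rules (suspicious, then PID 1, then root) are also hypothetical.

```python
import json
from collections import defaultdict


def parse_ndjson(text):
    """Parse NDJSON: one JSON object per non-empty line; skip malformed lines."""
    events = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            events.append(json.loads(line))
        except json.JSONDecodeError:
            continue  # tolerate truncated or corrupt lines
    return events


def build_tree(events):
    """Return (procs, children): pid -> info, and ppid -> sorted child pids."""
    procs = {}
    for ev in events:
        proc = ev.get("process", {})
        pid = proc.get("pid")
        if pid is None:
            continue
        # Assumed field names; older events use ppid, newer ones parent.pid.
        ppid = proc.get("ppid", proc.get("parent", {}).get("pid"))
        procs.setdefault(pid, {
            "name": proc.get("name", "?"),
            "ppid": ppid,
            "uid": str(ev.get("user", {}).get("id", "")),
        })
    children = defaultdict(list)
    for pid, info in procs.items():
        if info["ppid"] is not None:
            children[info["ppid"]].append(pid)
    return procs, {ppid: sorted(kids) for ppid, kids in children.items()}


def classify(pid, info, suspicious_pids=frozenset()):
    """Map a process to a color name (rule precedence is an assumption)."""
    if pid in suspicious_pids:
        return "red"
    if pid == 1:
        return "green"
    if info["uid"] == "0":
        return "orange"
    return "blue"


sample = "\n".join([
    '{"process": {"pid": 1, "ppid": 0, "name": "systemd"}, "user": {"id": "0"}}',
    '{"process": {"pid": 200, "ppid": 1, "name": "sshd"}, "user": {"id": "0"}}',
    '{"process": {"pid": 300, "ppid": 200, "name": "bash"}, "user": {"id": "1000"}}',
    'not json',
])
procs, children = build_tree(parse_ndjson(sample))
colors = {pid: classify(pid, info) for pid, info in procs.items()}
```

Here `children` maps each parent PID to its child PIDs, and `colors` assigns one color per process, which is the information a tree renderer needs.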

Security Analysis

The tool's security analysis covers the following areas:

Process Name Analysis

  • Detection of hex-encoded and obfuscated names
  • Recognition of known suspicious process names
  • Identification of attack indicators
  • Analysis of unusual character distributions
  • Detection of Unicode/non-ASCII characters
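As a rough sketch of how name checks like these might look, the example below flags hex-encoded names, non-ASCII characters, known-bad names, and unusually high character entropy. The regular expression, the `known_bad` set, and the entropy threshold are illustrative assumptions, not values taken from the project.

```python
import math
import re
from collections import Counter

# At least four hex byte pairs, e.g. "2f62696e2f7368" (assumed heuristic).
HEX_NAME = re.compile(r"^(?:[0-9a-fA-F]{2}){4,}$")


def shannon_entropy(s):
    """Shannon entropy of the character distribution, in bits per character."""
    if not s:
        return 0.0
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in Counter(s).values())


def name_flags(name, known_bad=frozenset({"nc", "ncat", "xmrig"}),
               entropy_threshold=3.5):
    """Return a list of heuristic flags raised by a process name."""
    flags = []
    if HEX_NAME.match(name):
        flags.append("hex_encoded")
    if not name.isascii():
        flags.append("non_ascii")
    if name.lower() in known_bad:
        flags.append("known_suspicious")
    # Short names carry too little signal for an entropy estimate.
    if len(name) >= 8 and shannon_entropy(name) > entropy_threshold:
        flags.append("high_entropy")
    return flags
```

Heuristics of this kind produce false positives (a legitimate binary named `deadbeef` would match the hex rule), so they are better treated as inputs to a score than as verdicts.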

Behavior Analysis

  • System call pattern analysis
  • Privilege escalation attempts
  • File system operations
  • Network activity patterns
  • Process manipulation
  • Resource usage patterns
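One simple way to implement rule-based behavior analysis is to map syscall names to behavior categories and count hits per process. The categories and syscall sets below are a hypothetical rule set chosen for illustration. The project's actual rules are not documented here.

```python
from collections import Counter

# Hypothetical rule set: behavior category -> syscalls that indicate it.
RULES = {
    "privilege_change": {"setuid", "setgid", "setreuid", "setresuid"},
    "process_manipulation": {"ptrace", "kill", "process_vm_writev"},
    "network_activity": {"socket", "connect", "bind", "listen", "accept"},
    "file_tampering": {"unlink", "rename", "chmod", "chown"},
}


def behavior_summary(syscalls):
    """Count how many observed syscalls fall into each behavior category."""
    counts = Counter()
    for name in syscalls:
        for category, members in RULES.items():
            if name in members:
                counts[category] += 1
    return dict(counts)


summary = behavior_summary(["execve", "setuid", "connect", "connect", "ptrace"])
```

Syscalls that match no rule (such as `execve` here) are ignored, so the summary only lists categories that were actually observed.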

Machine Learning Analysis

  • Unsupervised anomaly detection
  • Process behavior pattern learning
  • Automatic feature extraction
  • Syscall frequency analysis
  • Temporal pattern recognition
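Syscall frequency analysis usually starts by turning each process's syscall list into a fixed-length vector. The sketch below shows one such feature extraction step. The helper names are hypothetical and the project may use different or additional features.

```python
from collections import Counter


def syscall_vocab(per_process):
    """Sorted list of every syscall name seen across all processes."""
    return sorted({name for calls in per_process.values() for name in calls})


def frequency_vector(syscalls, vocab):
    """Relative frequency of each vocabulary syscall for one process."""
    total = len(syscalls)
    if total == 0:
        return [0.0] * len(vocab)
    counts = Counter(syscalls)
    return [counts[name] / total for name in vocab]


per_process = {
    200: ["read", "read", "openat", "execve"],
    300: ["execve", "read"],
}
vocab = syscall_vocab(per_process)
features = {pid: frequency_vector(calls, vocab)
            for pid, calls in per_process.items()}
```

Normalizing by the total call count makes long-running and short-lived processes comparable, at the cost of discarding absolute activity levels.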

Security Alerts

  • Suspicious execution patterns
  • Failed operations
  • Privilege abuse
  • Network abuse patterns
  • File system tampering
  • Command injection attempts
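As one example of how an alert such as "Failed operations" could be derived, the sketch below counts failed audit events per process and reports those at or above a threshold. It assumes events carry an `auditd.result` field with the value `"fail"` and a `process.pid` field. The function name and threshold are hypothetical.

```python
from collections import Counter


def failed_operation_alerts(events, threshold=3):
    """Return {pid: failure_count} for processes with many failed operations."""
    failures = Counter()
    for ev in events:
        # Assumed field layout: auditd.result is "success" or "fail".
        if ev.get("auditd", {}).get("result") != "fail":
            continue
        pid = ev.get("process", {}).get("pid")
        if pid is not None:
            failures[pid] += 1
    return {pid: n for pid, n in failures.items() if n >= threshold}


events = (
    [{"auditd": {"result": "fail"}, "process": {"pid": 300}}] * 3
    + [{"auditd": {"result": "fail"}, "process": {"pid": 200}}]
    + [{"auditd": {"result": "success"}, "process": {"pid": 300}}]
)
alerts = failed_operation_alerts(events)
```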

Installation

  1. Clone the repository:
     git clone https://github.com/toopieare/CS5231_Proj.git
     cd CS5231_Proj
  2. Create a virtual environment (recommended):
     python -m venv .venv
     source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  3. Install dependencies:
     pip install -r requirements.txt

Usage

  1. Place your Auditbeat log file in the input directory and update config.py with the correct filename.
  2. Run the tool:
     python main.py
  3. View the visualizations:
     • Open output/process_flow.html for the process hierarchy view
     • Open output/process_gantt.html for the timeline view

Project Structure

CS5231_Proj/
├── config.py           # Configuration settings
├── main.py            # Main application entry point
├── requirements.txt   # Python dependencies
├── src/
│   ├── analysis/      # Analysis modules
│   │   ├── behavior_analyzer.py    # Traditional behavior analysis
│   │   ├── ml_behavior_analyzer.py # Machine learning analysis
│   │   ├── process_tree.py        # Process hierarchy building
│   │   ├── security_analyzer.py    # Security checks
│   │   └── analysis_reporter.py    # Analysis comparison reporting
│   ├── data/          # Data processing
│   │   ├── data_processor.py      # Log data processing
│   │   └── log_loader.py          # Log file handling
│   ├── utils/         # Utility functions
│   └── visualization/ # Visualization generators
│       ├── html_generator.py      # HTML output generation
│       └── mermaid_generator.py   # Diagram generation
├── input/             # Input log files
└── output/           # Generated visualizations

Understanding the Visualizations

Process Tree View

  • Shows hierarchical relationships between processes
  • Indicates process states and security concerns
  • Provides detailed process information and alerts
  • Color-coded for quick status identification

Timeline View

  • Shows process lifetimes and activities
  • Groups processes by type and behavior
  • Indicates activity levels and suspicious patterns
  • Provides temporal context for process behaviors
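A timeline like this can be expressed as a Mermaid gantt chart, with one section per process group and one task per process lifetime. The sketch below is an assumption about how such a chart might be generated. The input dictionary keys (`group`, `start`, `end`, `suspicious`) are hypothetical, and using Mermaid's `crit` tag to highlight suspicious processes is an illustrative choice.

```python
from datetime import datetime


def mermaid_gantt(processes, title="Process timeline"):
    """Render process lifetimes as Mermaid gantt source text."""
    fmt = "%Y-%m-%d %H:%M:%S"
    lines = [
        "gantt",
        f"    title {title}",
        "    dateFormat YYYY-MM-DD HH:mm:ss",
        "    axisFormat %H:%M:%S",
    ]
    for group in sorted({p["group"] for p in processes}):
        lines.append(f"    section {group}")
        for p in processes:
            if p["group"] != group:
                continue
            # "crit" marks the task as critical, which Mermaid draws in red.
            tags = "crit, " if p.get("suspicious") else ""
            lines.append(
                f"    {p['name']} pid {p['pid']} :{tags}p{p['pid']}, "
                f"{p['start'].strftime(fmt)}, {p['end'].strftime(fmt)}"
            )
    return "\n".join(lines)


processes = [
    {"pid": 200, "name": "sshd", "group": "daemons",
     "start": datetime(2024, 1, 1, 10, 0, 0),
     "end": datetime(2024, 1, 1, 10, 0, 30)},
    {"pid": 300, "name": "bash", "group": "shells",
     "start": datetime(2024, 1, 1, 10, 0, 5),
     "end": datetime(2024, 1, 1, 10, 0, 20), "suspicious": True},
]
chart = mermaid_gantt(processes)
```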

Edge and Node Styles

  • Normal edges (-->) indicate standard relationships
  • Bold edges (==>) indicate suspicious process relationships
  • Red nodes indicate security concerns
  • Orange nodes indicate privileged operations
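The edge and node conventions above map directly onto Mermaid flowchart syntax. The following sketch generates such a diagram from a small process table. The input layout, the hex color values, and the rule that an edge is bold when the child process is red are assumptions made for illustration.

```python
# Assumed fill colors for the four process classes.
FILL = {"green": "#2e7d32", "blue": "#1565c0",
        "orange": "#ef6c00", "red": "#c62828"}


def mermaid_tree(procs):
    """Render {pid: {"name", "ppid", "color"}} as Mermaid flowchart source."""
    lines = ["graph TD"]
    for pid in sorted(procs):
        name = procs[pid]["name"]
        lines.append(f'    p{pid}["{name} ({pid})"]')
    for pid in sorted(procs):
        ppid = procs[pid]["ppid"]
        if ppid in procs:
            # Bold edge when the child is suspicious, normal edge otherwise.
            arrow = "==>" if procs[pid]["color"] == "red" else "-->"
            lines.append(f"    p{ppid} {arrow} p{pid}")
    for pid in sorted(procs):
        lines.append(f"    style p{pid} fill:{FILL[procs[pid]['color']]},color:#fff")
    return "\n".join(lines)


procs = {
    1: {"name": "systemd", "ppid": 0, "color": "green"},
    200: {"name": "sshd", "ppid": 1, "color": "orange"},
    300: {"name": "bash", "ppid": 200, "color": "red"},
}
diagram = mermaid_tree(procs)
```

The resulting text can be placed inside an element with the `mermaid` class in an HTML page that loads Mermaid.js.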

ML-Based Analysis

  • Uses an autoencoder for anomaly detection
  • Learns normal process behavior patterns
  • Identifies unusual syscall patterns
  • Provides anomaly scores
  • Adapts to system-specific patterns
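To illustrate the idea behind autoencoder-based anomaly scoring, the sketch below trains a deliberately tiny linear autoencoder (one hidden unit, pure Python) on "normal" syscall-frequency vectors and uses the reconstruction error as the anomaly score. It is not the project's model: the architecture, learning rate, and sample data are all assumptions chosen to keep the example self-contained.

```python
def train_autoencoder(samples, epochs=500, lr=0.1):
    """Fit encoder w and decoder v so that v * (w . x) approximates x."""
    dim = len(samples[0])
    w = [0.5] * dim  # encoder weights
    v = [0.5] * dim  # decoder weights
    for _ in range(epochs):
        for x in samples:
            h = sum(wi * xi for wi, xi in zip(w, x))       # encode
            err = [vj * h - xj for vj, xj in zip(v, x)]    # reconstruction error
            back = sum(ej * vj for ej, vj in zip(err, v))  # error seen by encoder
            # Gradient steps on the squared reconstruction error.
            v = [vj - lr * 2 * ej * h for vj, ej in zip(v, err)]
            w = [wi - lr * 2 * back * xi for wi, xi in zip(w, x)]
    return w, v


def anomaly_score(x, model):
    """Mean squared reconstruction error; higher means more unusual."""
    w, v = model
    h = sum(wi * xi for wi, xi in zip(w, x))
    return sum((vj * h - xj) ** 2 for vj, xj in zip(v, x)) / len(x)


# Hypothetical frequency vectors over three syscalls.
normal = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.65, 0.25, 0.1], [0.75, 0.15, 0.1]]
model = train_autoencoder(normal)
normal_scores = [anomaly_score(x, model) for x in normal]
outlier_score = anomaly_score([0.1, 0.1, 0.8], model)
```

Because the model only learns to reconstruct the patterns it was trained on, a process whose syscall mix differs sharply from the training data reconstructs poorly and receives a higher score.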

Comparative Visualization

  • Interactive scatter plot of analysis scores
  • Score distribution histograms
  • Detailed process-level metrics
  • Category-based score breakdowns
  • Debug information for validation