Multi-day entity churn analysis with intelligent ephemeral pattern detection.
⚠️ DEPRECATED: Use entity-analysis churn instead.
# Old (deprecated):
vault-audit entity-churn day1.log day2.log day3.log
# New (recommended):
vault-audit entity-analysis churn day1.log day2.log day3.log
See entity_analysis for the unified command.
Tracks entity lifecycle across multiple audit log files (compressed or uncompressed) to identify:
- New entities appearing each day
- Returning vs. churned entities
- Entity persistence patterns
- Authentication method usage trends
- Ephemeral entities using data-driven pattern learning
§Usage
# Analyze entity churn across a week (compressed files)
vault-audit entity-churn day1.log.gz day2.log.gz day3.log.gz day4.log.gz day5.log.gz day6.log.gz day7.log.gz
# With baseline for accurate new entity detection
vault-audit entity-churn *.log --baseline baseline_entities.json
# With entity mappings for enriched display names
vault-audit entity-churn *.log --baseline baseline.json --entity-map entity_mappings.json
# Export detailed churn data with ephemeral analysis
vault-audit entity-churn *.log --output entity_churn.json
# Export as CSV format
vault-audit entity-churn *.log --output entity_churn.csv --format csv
Compressed File Support: Automatically handles .gz and .zst files - no manual decompression required. Mix compressed and uncompressed files freely.
§Ephemeral Pattern Detection
The command uses a sophisticated two-pass analysis to detect ephemeral entities (e.g., CI/CD pipeline entities, temporary build entities) with confidence scoring:
Pass 1: Data Collection
- Track all entities across log files
- Record first/last seen times and files
- Count login activity per entity
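The data collection pass can be pictured with a minimal Python sketch. It is not the command's actual implementation. It assumes each log line is a JSON audit entry carrying `type`, `time`, `auth.entity_id`, and `request.path`, and that only `response` entries are counted so a login is not counted twice; `EntityRecord` and `collect` are hypothetical names.

```python
import json
from dataclasses import dataclass, field


@dataclass
class EntityRecord:
    """Per-entity state gathered during the first pass (hypothetical shape)."""
    entity_id: str
    first_seen_file: str
    first_seen_time: str
    last_seen_file: str
    last_seen_time: str
    files_appeared: list = field(default_factory=list)
    total_logins: int = 0


def collect(files):
    """files: iterable of (file_name, iterable of JSON lines), in day order."""
    records = {}
    for name, lines in files:
        for line in lines:
            entry = json.loads(line)
            # Assumption: count only response entries to avoid double counting.
            if entry.get("type") != "response":
                continue
            path = (entry.get("request") or {}).get("path", "")
            entity_id = (entry.get("auth") or {}).get("entity_id")
            # Only login operations with a known entity are tracked.
            if not entity_id or not path.endswith("/login"):
                continue
            when = entry.get("time", "")
            rec = records.get(entity_id)
            if rec is None:
                rec = records[entity_id] = EntityRecord(
                    entity_id, name, when, name, when
                )
            rec.last_seen_file, rec.last_seen_time = name, when
            if name not in rec.files_appeared:
                rec.files_appeared.append(name)
            rec.total_logins += 1
    return records
```

Passing the files in chronological order matters here, since "first seen" and "last seen" are derived from iteration order rather than by comparing timestamps.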
Pass 2: Pattern Learning & Classification
- Learn patterns from entities that appeared on only 1-2 days
- Identify naming patterns (e.g., github-repo:org/repo:ref:branch)
- Calculate confidence scores (0.0-1.0) based on:
  - Days active (1 day = high confidence, 2 days = medium)
  - Similar entities on same mount path
  - Activity levels (low login counts)
  - Gaps in activity (reduces confidence for sporadic access)
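To make the scoring factors concrete, here is a rough Python sketch of how such a score could be combined. Every weight and threshold below is an illustrative assumption, not the value the command uses, and `ephemeral_confidence` is a hypothetical function name.

```python
def ephemeral_confidence(days_active, similar_on_mount, total_logins, has_gaps):
    """Return a score in [0.0, 1.0]; all weights are illustrative assumptions."""
    if days_active == 1:
        score = 0.7          # one day active: high starting confidence
    elif days_active == 2:
        score = 0.5          # two days active: medium starting confidence
    else:
        return 0.0           # patterns are learned only from 1-2 day entities
    if similar_on_mount >= 5:    # assumed threshold for "similar entities"
        score += 0.15
    if total_logins <= 10:       # assumed threshold for "low login counts"
        score += 0.1
    if has_gaps:                 # sporadic access reduces confidence
        score -= 0.2
    return max(0.0, min(1.0, round(score, 2)))
```

For example, under these assumed weights a one-day entity with many similar neighbours on its mount and few logins scores near the top of the range, while a two-day entity with a gap and heavy login activity falls below the flagged levels.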
§Output
§Entity Lifecycle Classification:
- new_day_N: Entities first seen on day N (not in baseline)
- pre_existing_baseline: Entities that existed before analysis period
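The two lifecycle labels can be sketched as a small Python function. It assumes day numbers are 1-based and follow the order in which log files are passed on the command line; `lifecycle` and its parameters are hypothetical names.

```python
def lifecycle(entity_id, first_seen_index, baseline_ids):
    """Classify an entity's lifecycle.

    first_seen_index: 0-based position of the first log file the entity
    appeared in (assumed to correspond to day N = index + 1).
    baseline_ids: set of entity IDs loaded from the --baseline file.
    """
    if entity_id in baseline_ids:
        return "pre_existing_baseline"
    return f"new_day_{first_seen_index + 1}"
```

Without a baseline the set is empty, so every entity seen in the first file would be labelled `new_day_1` in this sketch, which is why a baseline gives more accurate new-entity detection.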
§Activity Patterns:
- consistent: Appeared in most/all log files
- sporadic: Appeared intermittently with gaps
- declining: Activity decreased over time
- single_burst: Appeared only once
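One way to derive these four labels is sketched below in Python. The ordering of the checks and the "halved activity" rule are assumptions made for illustration; the sketch also treats any gap-free presence as consistent rather than modelling the "most/all" cutoff separately.

```python
def activity_pattern(presence, logins_per_file):
    """presence: list[bool] per log file, in order.
    logins_per_file: list[int] of login counts per log file, same order."""
    days = sum(presence)
    if days <= 1:
        return "single_burst"
    first = presence.index(True)
    last = len(presence) - 1 - presence[::-1].index(True)
    # A gap is any absent file between the first and last appearance.
    if not all(presence[first:last + 1]):
        return "sporadic"
    active = [n for n, seen in zip(logins_per_file, presence) if seen]
    # Assumption: activity that falls below half its starting level is declining.
    if active[-1] < active[0] / 2:
        return "declining"
    return "consistent"
```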
§Ephemeral Detection:
- Confidence levels: High (≥70%), Medium (50-69%), Low (40-49%)
- Detailed reasoning for each classification
- Top ephemeral entities by confidence
- Pattern statistics and mount path analysis
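The confidence bands listed above map onto the 0.0-1.0 score roughly as in this Python sketch. Returning `None` below 0.40 is an assumption, since no band is documented for lower scores; `confidence_level` is a hypothetical name.

```python
def confidence_level(score):
    """Map a 0.0-1.0 confidence score to its documented band."""
    if score >= 0.70:
        return "High"
    if score >= 0.50:
        return "Medium"
    if score >= 0.40:
        return "Low"
    return None  # assumption: scores under 0.40 fall outside the listed bands
```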
§JSON Output Fields
When using --output, each entity record includes:
- entity_id: Vault entity identifier
- display_name: Human-readable name
- first_seen_file / first_seen_time: When first observed
- last_seen_file / last_seen_time: When last observed
- files_appeared: List of log files entity was active in
- total_logins: Total login count across all files
- lifecycle: Entity lifecycle classification
- activity_pattern: Behavioral pattern classification
- is_ephemeral_pattern: Boolean flag for ephemeral detection
- ephemeral_confidence: Confidence score (0.0-1.0)
- ephemeral_reasons: Array of human-readable reasons
Only tracks entities that performed login operations (paths ending in /login).
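As a usage illustration, the exported records can be post-processed with a few lines of Python. This sketch assumes the JSON export is a top-level array of entity records using the field names listed above; the sample data and the `top_ephemeral` helper are hypothetical.

```python
import json


def top_ephemeral(records, min_confidence=0.7, limit=10):
    """Return flagged records at or above min_confidence, highest first."""
    flagged = [
        r for r in records
        if r.get("is_ephemeral_pattern")
        and r.get("ephemeral_confidence", 0.0) >= min_confidence
    ]
    flagged.sort(key=lambda r: r.get("ephemeral_confidence", 0.0), reverse=True)
    return flagged[:limit]


# Hypothetical sample data standing in for the contents of entity_churn.json.
sample = json.loads("""
[
  {"entity_id": "a1", "display_name": "ci-job-1", "total_logins": 2,
   "is_ephemeral_pattern": true, "ephemeral_confidence": 0.9},
  {"entity_id": "b2", "display_name": "alice", "total_logins": 40,
   "is_ephemeral_pattern": false, "ephemeral_confidence": 0.0},
  {"entity_id": "c3", "display_name": "ci-job-2", "total_logins": 1,
   "is_ephemeral_pattern": true, "ephemeral_confidence": 0.75}
]
""")

for rec in top_ephemeral(sample):
    print(rec["entity_id"], rec["display_name"], rec["ephemeral_confidence"])
```

With a real export, the `sample` assignment would be replaced by `json.load` on the file written via `--output`, provided the file has the assumed array layout.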