calebevans
Show HN: Cordon – Reduce large log files to anomalous sections (github.com)

Cordon uses transformer embeddings and density scoring to identify what's semantically unique in log files, filtering out repetitive noise.

The core insight: a critical error repeated 1000x is "normal" (semantically dense). A strange one-off event is anomalous (semantically isolated).
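
To make the density idea concrete, here's a minimal sketch of one way to score lines by semantic isolation, assuming sentence-transformers embeddings and a k-nearest-neighbour cosine similarity; the model name, k, and function are illustrative, not Cordon's actual pipeline:

    # Illustrative density scoring over log lines (an assumption about the
    # approach, not Cordon's code). Requires: sentence-transformers, numpy.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    def isolation_scores(lines, k=5):
        """Higher score = fewer close semantic neighbours = more anomalous."""
        model = SentenceTransformer("all-MiniLM-L6-v2")       # model choice is arbitrary here
        emb = model.encode(lines, normalize_embeddings=True)  # unit vectors, shape (n, d)
        sims = emb @ emb.T                                     # pairwise cosine similarities
        np.fill_diagonal(sims, -1.0)                           # exclude self-matches
        density = np.sort(sims, axis=1)[:, -k:].mean(axis=1)   # mean similarity to k nearest lines
        return 1.0 - density                                   # isolation = inverse of density

Under this kind of scoring, a line from a block repeated 1000x sits in a dense cluster and scores near zero, while a one-off stack trace has no close neighbours and scores high.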

Outputs XML-tagged blocks with anomaly scores. Designed as a pre-processing step that shrinks large logs before LLM analysis.
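
For illustration only, tagging the retained blocks might look like the snippet below; the <anomaly> tag and its attributes are hypothetical, not Cordon's documented schema:

    # Hypothetical tagging helper; the tag and attribute names are made up
    # for illustration, not Cordon's actual output format.
    def tag_blocks(lines, scores, keep):
        return "\n".join(
            f'<anomaly score="{score:.3f}">{line}</anomaly>'
            for line, score, kept in zip(lines, scores, keep)
            if kept
        )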

Architecture: https://github.com/calebevans/cordon/blob/main/docs/architec...

Benchmark: https://github.com/calebevans/cordon/blob/main/benchmark/res...

Trade-offs: repetitive patterns are intentionally ignored (so an error that repeats constantly won't be surfaced), and thresholds are percentile-based (relative to each log's score distribution, not absolute).
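
As a sketch of what a relative threshold means in practice (the 95th percentile and function name are assumptions, continuing the scoring sketch above):

    # Percentile-based selection: the cutoff adapts to each log's score
    # distribution, so roughly the same fraction of lines is kept no matter
    # how "bad" the log is in absolute terms.
    import numpy as np

    def keep_mask(scores, percentile=95.0):
        cutoff = np.percentile(scores, percentile)
        return np.asarray(scores) >= cutoff

The flip side of a relative cutoff is that a perfectly healthy log still surfaces its top few percent of lines, while a uniformly noisy one still drops most of them.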

