Overview
The KOSMOS logs stack is built on three components: Vector, Grafana, and ClickHouse.
Vector
Vector is a high-performance observability data pipeline, originally built by Timber.io and now maintained by Datadog, which acquired the company. It collects, transforms, and routes logs, metrics, and traces from various sources to destinations such as monitoring platforms, storage backends, and analytics tools.
Vector acts as a lightweight, vendor-agnostic agent that can run in various environments (bare metal, containers, serverless, etc.) to stream and process observability data in real-time.
Key Features
- Built in Rust → extremely fast and memory-efficient
- Designed to operate with minimal overhead
- Backpressure-aware for safe and reliable throughput
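- Agent role: runs on each host or node and collects observability data locally
- Aggregator role: runs centrally and receives, transforms, and routes data sent by agents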
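To make the backpressure point concrete, here is a minimal Python sketch of the general idea, not Vector's actual implementation: a producer and a slow sink share a bounded buffer, and the producer is forced to wait whenever the buffer is full. The function name `run_pipeline` and its parameters are hypothetical and chosen for illustration only.

```python
import queue
import threading
import time


def run_pipeline(events, buffer_size=2, sink_delay=0.001):
    """Push events through a bounded buffer; return (delivered, max_depth)."""
    # A bounded queue: put() blocks while the buffer already holds buffer_size items.
    buffer = queue.Queue(maxsize=buffer_size)
    delivered = []
    max_depth = 0
    done = object()  # sentinel telling the sink thread to stop

    def sink():
        while True:
            item = buffer.get()
            if item is done:
                break
            time.sleep(sink_delay)  # simulate a slow destination
            delivered.append(item)

    worker = threading.Thread(target=sink)
    worker.start()
    for event in events:
        buffer.put(event)  # blocks when the buffer is full: this is the backpressure
        max_depth = max(max_depth, buffer.qsize())
    buffer.put(done)
    worker.join()
    return delivered, max_depth


if __name__ == "__main__":
    delivered, max_depth = run_pipeline(list(range(20)))
    print(len(delivered), "events delivered; buffer never exceeded", max_depth)
```

Because the producer slows down to the pace of the sink, no event is dropped and memory use stays bounded by the buffer size, which is the property the feature list refers to.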
Vector Sink Types
Observability & Monitoring Tools
- Datadog
- Splunk HEC
- New Relic Logs
- Loki (Grafana Loki)
- Honeycomb
- Prometheus Remote Write
- StatsD
Storage & Search Systems
- Elasticsearch / OpenSearch
- ClickHouse
- Amazon S3
- Google Cloud Storage
- Azure Blob Storage
- Snowflake
- PostgreSQL
- MongoDB Atlas (via custom HTTP)
- BigQuery (via GCP HTTP or custom route)
Message Buses / Streaming Platforms
- Kafka
- NATS
- Amazon Kinesis
- Google Pub/Sub
- RabbitMQ
- Redis
- AWS SQS
- Azure Event Hubs
Networking / Generic
- HTTP (Generic HTTP API)
- gRPC
- Socket
- Syslog (RFC 3164 & 5424)
- Unix Datagram Socket
- File (flat files, JSON lines, etc.)
Debug & Development
- Console (stdout)
- File (local disk for debugging or archiving)
- Blackhole (drops all data; useful for testing)
Many sinks support batching, retries, TLS, and compression. You can define multiple sinks in a single config, routing different data to different destinations.
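As an illustration of several sinks in one configuration, the following Python sketch builds a small Vector config as a dictionary and prints it as JSON, one of the file formats Vector accepts. Everything specific here is an assumption made for the example: the component names (`app_logs`, `by_level`, `clickhouse_all`, `console_errors`), the log path, the ClickHouse endpoint, database, and table, and the presence of a `level` field on each event. It is not the actual KOSMOS configuration.

```python
import json

# Hypothetical pipeline: one file source, one route transform, two sinks.
config = {
    "sources": {
        "app_logs": {"type": "file", "include": ["/var/log/app/*.log"]},
    },
    "transforms": {
        "by_level": {
            "type": "route",
            "inputs": ["app_logs"],
            # Assumes events carry a `level` field; the condition is a VRL expression.
            "route": {"errors": '.level == "error"'},
        },
    },
    "sinks": {
        # Every event goes to ClickHouse (assumed local HTTP endpoint).
        "clickhouse_all": {
            "type": "clickhouse",
            "inputs": ["app_logs"],
            "endpoint": "http://localhost:8123",
            "database": "logs",
            "table": "app_logs",
        },
        # Only events matching the `errors` route are printed to stdout.
        "console_errors": {
            "type": "console",
            "inputs": ["by_level.errors"],
            "encoding": {"codec": "json"},
        },
    },
}


def undefined_inputs(cfg):
    """Return (component, input) pairs whose input is not a defined source or transform."""
    known = set(cfg["sources"]) | set(cfg["transforms"])
    missing = []
    for section in ("transforms", "sinks"):
        for name, component in cfg[section].items():
            for ref in component["inputs"]:
                # Route outputs are referenced as "<transform>.<route>".
                if ref.split(".", 1)[0] not in known:
                    missing.append((name, ref))
    return missing


if __name__ == "__main__":
    assert undefined_inputs(config) == []
    print(json.dumps(config, indent=2))
```

The printed JSON could be saved to a file and checked with `vector validate` before use. The routing itself comes from each sink's `inputs` list: both sinks live in the same config but subscribe to different upstream components.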
Grafana
Grafana is an open-source platform for data visualization and monitoring, widely used to create interactive, customizable dashboards.
Key Features
- Data Visualization: Build graphs, charts, maps, and other visualizations from multiple data sources (databases, APIs, etc.)
- Dynamic Dashboards: Intuitive interface for organizing and sharing real-time data
- Wide Data Source Support: Works with Prometheus, InfluxDB, MySQL, PostgreSQL, Elasticsearch, and many others
- Alerting & Notifications: Set up alerts based on thresholds or specific events
- Plugins & Extensibility: Large library of plugins to extend functionality
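The threshold-based alerting mentioned above can be sketched in a few lines of Python. This is a conceptual illustration only, not Grafana's API or rule engine: the function `evaluate_threshold`, its parameters, and the state names it returns are hypothetical. The idea is that a series of samples is reduced to a single value, which is then compared against a threshold.

```python
from statistics import mean


def evaluate_threshold(samples, threshold, reducer=mean):
    """Reduce samples to one value and compare it against the threshold."""
    if not samples:
        return "no_data"  # nothing to evaluate in this window
    return "alerting" if reducer(samples) > threshold else "ok"


if __name__ == "__main__":
    # Hypothetical error-rate samples for one evaluation window.
    print(evaluate_threshold([0.02, 0.03, 0.20], threshold=0.05))
    print(evaluate_threshold([0.02, 0.03, 0.20], threshold=0.05, reducer=max))
```

The choice of reducer matters: averaging smooths out short spikes, while taking the maximum fires on any single sample above the threshold.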
Use Cases
- IT infrastructure monitoring (servers, networks, applications)
- Performance and log analysis
- IoT and industrial data monitoring
- Business metrics tracking (sales, web traffic, etc.)
ClickHouse
ClickHouse is an open-source columnar database management system (DBMS) optimized for online analytical processing (OLAP) and real-time analytical queries.
Key Features
- High Performance: Designed to execute complex analytical queries on massive datasets with minimal latency
- Columnar Storage: Data is stored by columns rather than rows, significantly speeding up analytical queries
- Horizontal Scalability: Supports sharding and replication, so a cluster can grow by adding nodes
- Extended SQL: Uses a SQL dialect close to standard SQL, with additional analytical functions and extensions
- Easy Integration: Works with Kafka, MySQL, PostgreSQL, and other tools through table engines and integrations
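The benefit of columnar storage can be illustrated with a small Python sketch. It is a simplified model, not how ClickHouse stores data on disk; the sample rows and the helper names (`to_columns`, `total_bytes_row_store`, `total_bytes_column_store`) are made up for the example. An aggregate over one field has to visit every full row in a row layout, but only one contiguous list in a column layout.

```python
# Hypothetical log records in a row-oriented layout: one dict per event.
rows = [
    {"timestamp": 1, "level": "info", "bytes": 120},
    {"timestamp": 2, "level": "error", "bytes": 300},
    {"timestamp": 3, "level": "info", "bytes": 80},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_bytes_row_store(records):
    # Touches every row object, including fields the query does not need.
    return sum(record["bytes"] for record in records)


def total_bytes_column_store(columns):
    # Reads only the single column the query needs.
    return sum(columns["bytes"])


columns = to_columns(rows)

if __name__ == "__main__":
    print(total_bytes_row_store(rows), total_bytes_column_store(columns))
```

Both functions return the same total, but the columnar version never reads `timestamp` or `level`. On wide log tables with many fields, that difference is what makes analytical queries faster.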
Use Cases
- Real-time log and monitoring analysis
- Marketing and behavioral data analytics
- IoT and telemetry data processing
- Business intelligence and reporting