Logs are the easiest observability signal to generate and the easiest to overspend on. This guide covers the trade-offs — structured vs unstructured, hot vs cold storage, sampling strategies, and how the major log tools price in 2026.
What is Log Management?
Log management is the practice of collecting, transporting, storing, indexing, and querying log data — the timestamped text events your services emit. It covers everything from "tail this file" to running a managed petabyte-scale log analytics platform.
Logs answer the question metrics and traces can't: what exactly happened? When metrics say latency spiked at 2:14 PM, logs tell you the cause was database connection pool exhaustion triggered by a slow query against the orders table.
Structured vs Unstructured Logs
Structured logs use a consistent format (usually JSON or logfmt) with named fields. Unstructured logs are free-form text strings.
Example (unstructured):
2026-05-16 14:23:45 ERROR Failed to process order 42 for user alice@example.com
Example (structured, JSON):
{"ts":"2026-05-16T14:23:45Z","level":"error","msg":"failed to process order","order_id":42,"user":"alice@example.com","service":"orders-api"}
Structured wins because you can query by any field: "all errors for user X", "all orders with order_id > 1000", "errors per service per minute". With unstructured logs, you are running regexes against raw text, which is slow and brittle.
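As a minimal sketch of how a service might emit the structured form shown above, the following uses only Python's standard `logging` and `json` modules. The `JsonFormatter` class, the `build_logger` helper, the `fields` key passed through `extra=`, and the hard-coded `service` value are all illustrative assumptions, not part of any particular logging library.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname.lower(),
            "msg": record.getMessage(),
            "service": "orders-api",  # assumed static field for this sketch
        }
        # Anything passed as extra={"fields": {...}} becomes record.fields.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


def build_logger(stream) -> logging.Logger:
    """Return a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("orders-api-sketch")
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


# Write to an in-memory buffer so the emitted line can be inspected.
buffer = io.StringIO()
log = build_logger(buffer)
log.error(
    "failed to process order",
    extra={"fields": {"order_id": 42, "user": "alice@example.com"}},
)
line = buffer.getvalue().strip()
print(line)
```

In a real service the stream would typically be stdout, with a collector shipping the lines to whichever backend is in use.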
The Cost Problem
Log management is one of the easiest places to overspend in observability. Three multiplying factors:
- Volume: every request, every error, every audit event. A medium-sized API can generate 100-500 GB/day.
- Indexing: turning text into searchable structure. Elastic-style full-text indexing can add 10-50% on top of raw storage.
- Retention: keeping logs searchable for weeks or months. Cold archive storage is 10-100x cheaper than keeping the same data actively indexed.
Typical pricing per GB ingested in 2026:
- Grafana Loki (self-hosted on S3): $0.005-0.02/GB
- Grafana Cloud Loki: $0.05/GB
- New Relic Logs: $0.30-0.50/GB
- Datadog Logs (Ingest + Indexing): $0.10/GB ingest + $1.06-1.59 per million events indexed (15-day retention)
- Splunk Cloud: $0.50-2.00/GB depending on workload pricing tier
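To make the per-GB rates above concrete, here is a rough back-of-the-envelope estimator. It assumes a 30-day month and counts ingest only (no retention or query charges). Datadog is left out because part of its price is per million events, which cannot be derived from GB alone. The function and dictionary names are illustrative.

```python
# Per-GB ingest rates in USD, copied from the list above as (low, high) ranges.
RATES_USD_PER_GB = {
    "Grafana Loki (self-hosted on S3)": (0.005, 0.02),
    "Grafana Cloud Loki": (0.05, 0.05),
    "New Relic Logs": (0.30, 0.50),
    "Splunk Cloud": (0.50, 2.00),
}


def monthly_ingest_cost(gb_per_day: float, usd_per_gb: float, days_per_month: int = 30) -> float:
    """Ingest-only monthly cost; the 30-day month is an assumption."""
    return gb_per_day * days_per_month * usd_per_gb


def estimate_all(gb_per_day: float) -> dict:
    """Return {tool: (low_cost, high_cost)} for every tool in the rate table."""
    return {
        tool: (
            monthly_ingest_cost(gb_per_day, low),
            monthly_ingest_cost(gb_per_day, high),
        )
        for tool, (low, high) in RATES_USD_PER_GB.items()
    }


if __name__ == "__main__":
    for tool, (low, high) in estimate_all(500).items():
        print(f"{tool}: ${low:,.0f} to ${high:,.0f} per month")
```

At 500 GB/day, the $0.05/GB rate works out to about $750/month, and the $2.00/GB upper bound to about $30,000/month, before any retention or indexing charges.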
Log Tools Landscape (2026)
| Tool | Best for | Cost shape |
|---|---|---|
| Grafana Loki | Cost-conscious K8s teams | Cheap. Index only labels, content in S3-compatible storage |
| Elasticsearch / OpenSearch | Full-text search heavy workflows | Expensive at scale, fast search |
| Datadog Logs | Teams already on Datadog APM | Pricey, deeply integrated |
| Splunk | Large enterprise, security + compliance | Premium pricing, mature ecosystem |
| New Relic Logs | Teams already on New Relic APM | Per-GB, integrates with traces |
| CloudWatch Logs | Pure AWS shops, simple needs | Cheap to ingest, expensive to query |
Retention Strategy
A common mistake: keeping every log fully searchable for 12 months. The fix: tier your storage.
- Hot (0-14 days) — fully indexed, sub-second queries. Where engineers actually look. This is the expensive tier.
- Warm (14-90 days) — searchable but slower. Used for trend analysis, recurring incident investigation.
- Cold (90+ days) — archived to S3 Glacier or similar. Used only for compliance audit. Re-hydratable into hot if needed.
Compliance requirements set the floor (commonly cited figures: SOC 2 about 1 year, PCI-DSS 1 year with the most recent 3 months immediately accessible, HIPAA 6 years). Engineering value sets the ceiling. The right policy keeps logs searchable for as long as they're useful, then moves them to cold storage.
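The tiering above can be sketched as a small function that maps a log event's age to a tier. The 14-day and 90-day boundaries come from the list above. The `retention_days` parameter, the `"expired"` result, and the choice to treat exactly 14 days as warm are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

HOT_DAYS = 14    # fully indexed, sub-second queries
WARM_DAYS = 90   # searchable but slower


def storage_tier(event_time: datetime, now: datetime, retention_days: int = 365) -> str:
    """Return 'hot', 'warm', 'cold', or 'expired' based on event age in whole days."""
    age_days = (now - event_time).days
    if age_days < 0:
        raise ValueError("event_time is in the future")
    if age_days >= retention_days:
        return "expired"  # past the compliance floor; eligible for deletion
    if age_days < HOT_DAYS:
        return "hot"
    if age_days < WARM_DAYS:
        return "warm"
    return "cold"


if __name__ == "__main__":
    now = datetime(2026, 5, 16, tzinfo=timezone.utc)
    for days in (3, 30, 200, 400):
        print(days, storage_tier(now - timedelta(days=days), now))
```

Setting `retention_days` to the strictest compliance requirement that applies keeps the floor explicit instead of buried in a storage lifecycle rule.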
Also see
- Metrics & Observability — cheaper to store, better for dashboards
- Distributed Tracing — when logs are too noisy to follow a single request
- AWS Pricing — CloudWatch Logs and S3 storage costs that underlie self-hosted log stacks
- Uptime Monitoring — outside-in checks that don't need log infrastructure
Log Management FAQ
How much does log management cost?
Managed log services typically charge per GB ingested plus per GB-month retained. Typical rates: Datadog $0.10/GB ingested plus indexing fees, Splunk Cloud $0.50-2.00/GB depending on plan, New Relic $0.30-0.50/GB, Grafana Cloud Loki $0.05/GB. Self-hosted Loki on S3 can be under $0.01/GB at scale. An application logging 500 GB/day can cost $5,000-30,000/month on a managed service.
Structured vs unstructured logging — which should I use?
Use structured logging (JSON or logfmt) in production. You can query by field, filter without regex, and aggregate. Unstructured text logs are fine for local dev but slow to search and hard to aggregate reliably. Most modern logging libraries support structured output, so turn it on.
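As a rough illustration of querying by field, the sketch below parses JSON log lines and answers two questions without any regex. The first sample line is the structured example from earlier in this guide. The other two lines and the function names are hypothetical.

```python
import json
from collections import Counter

# First line is the guide's own example; the rest are made-up sample data.
SAMPLE_LINES = [
    '{"ts":"2026-05-16T14:23:45Z","level":"error","msg":"failed to process order","order_id":42,"user":"alice@example.com","service":"orders-api"}',
    '{"ts":"2026-05-16T14:23:46Z","level":"info","msg":"order processed","order_id":1001,"user":"bob@example.com","service":"orders-api"}',
    '{"ts":"2026-05-16T14:23:47Z","level":"error","msg":"token expired","user":"alice@example.com","service":"auth-api"}',
]


def errors_per_service(lines):
    """Count error-level events grouped by the 'service' field."""
    counts = Counter()
    for raw in lines:
        event = json.loads(raw)
        if event.get("level") == "error":
            counts[event.get("service", "unknown")] += 1
    return counts


def orders_above(lines, threshold):
    """Return order_id values greater than threshold, skipping events without one."""
    ids = []
    for raw in lines:
        event = json.loads(raw)
        order_id = event.get("order_id")
        if order_id is not None and order_id > threshold:
            ids.append(order_id)
    return ids


if __name__ == "__main__":
    print(errors_per_service(SAMPLE_LINES))
    print(orders_above(SAMPLE_LINES, 1000))
```

A log backend runs the equivalent of these functions over an index rather than a list, but the field-level access is the part unstructured text cannot offer.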
What is the difference between Loki and Elasticsearch?
Both store logs but with different trade-offs. Loki indexes only labels, which keeps the index small, and stores compressed log content in S3-compatible object storage: cheap to run, but slower for full-text search. Elasticsearch indexes the full content of every document: expensive at scale, but very fast for full-text search. Loki is better for cost-conscious teams; Elasticsearch for search-heavy workflows.
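The trade-off can be illustrated with a toy model. This is not how either system is implemented internally; it is a simplified sketch in which one store indexes only labels and scans content at query time, while the other builds an inverted index over every token. All class names and sample data are hypothetical.

```python
from collections import defaultdict


class LabelIndexStore:
    """Toy label-only store: small index, content scanned at query time."""

    def __init__(self):
        self.chunks = defaultdict(list)  # label set -> raw log lines

    def add(self, labels: dict, line: str) -> None:
        self.chunks[tuple(sorted(labels.items()))].append(line)

    def query(self, label: str, value: str, needle: str) -> list:
        hits = []
        for key, lines in self.chunks.items():
            if (label, value) in key:
                # Content is not indexed, so matching lines are found by scanning.
                hits.extend(l for l in lines if needle in l)
        return hits


class InvertedIndexStore:
    """Toy full-text store: every token indexed, so lookups skip the scan."""

    def __init__(self):
        self.docs = []
        self.index = defaultdict(set)  # token -> document ids

    def add(self, labels: dict, line: str) -> None:
        doc_id = len(self.docs)
        self.docs.append(line)
        for token in line.lower().split():
            self.index[token].add(doc_id)

    def query(self, token: str) -> list:
        return [self.docs[i] for i in sorted(self.index.get(token.lower(), ()))]


SAMPLE = [
    ({"service": "orders-api"}, "failed to process order 42"),
    ({"service": "orders-api"}, "order 43 processed"),
    ({"service": "auth"}, "login failed for user alice"),
]

label_store, text_store = LabelIndexStore(), InvertedIndexStore()
for labels, line in SAMPLE:
    label_store.add(labels, line)
    text_store.add(labels, line)
```

The label store's index grows with the number of distinct label sets, while the inverted index grows with the vocabulary of the log content, which is where the cost difference in this toy model comes from.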
How long should I retain logs?
Most production teams keep hot/searchable logs for 7-30 days, then archive to cheaper storage for compliance. SOC 2 generally requires 1 year of access logs. PCI-DSS requires 1 year with 3 months immediately searchable. Engineers rarely look at logs older than 14 days — beyond that, the cost-of-keeping usually outweighs the value-of-having.
What is sampling for logs?
Like trace sampling, log sampling drops a fraction of low-value logs to reduce cost. Common strategies: (1) Sample by level — keep 100% of WARN/ERROR/FATAL, sample INFO/DEBUG. (2) Sample by route — health check logs at 1%, business logic at 100%. (3) Tail sampling — keep all logs from a request if the request errored or was slow.
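The three strategies above could be combined roughly as follows. This is a sketch under stated assumptions: the route names, the 10% default rate for unlisted routes, the 1000 ms slowness threshold, and the injectable `rng` parameter (used to make the behavior testable) are illustrative choices, not recommendations from any specific tool.

```python
import random

ALWAYS_KEEP_LEVELS = {"WARN", "ERROR", "FATAL"}

# Per-route keep rates for low-severity logs; route names are hypothetical.
ROUTE_RATES = {
    "/healthz": 0.01,  # health checks at 1%
    "/orders": 1.0,    # business logic at 100%
}
DEFAULT_RATE = 0.10    # assumed rate for routes not listed above


def keep_log(level: str, route: str, rng=random.random) -> bool:
    """Level sampling first, then route sampling for INFO/DEBUG events."""
    if level.upper() in ALWAYS_KEEP_LEVELS:
        return True
    rate = ROUTE_RATES.get(route, DEFAULT_RATE)
    return rng() < rate


def tail_sample(request_logs, errored: bool, duration_ms: float,
                slow_ms: float = 1000.0, rng=random.random) -> list:
    """Keep every log for a request that errored or was slow; otherwise sample each."""
    if errored or duration_ms >= slow_ms:
        return list(request_logs)
    return [e for e in request_logs if keep_log(e["level"], e["route"], rng)]


if __name__ == "__main__":
    logs = [
        {"level": "INFO", "route": "/healthz", "msg": "ok"},
        {"level": "DEBUG", "route": "/orders", "msg": "loaded cart"},
    ]
    print(tail_sample(logs, errored=False, duration_ms=40))
    print(tail_sample(logs, errored=True, duration_ms=40))
```

Tail sampling requires buffering a request's logs until the request finishes, so the decision usually lives in the application or a collector rather than in the logging call itself.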