0%

Loki Storage Optimization

1. Use Cold/Hot Storage Separation

Approach:

  • Hot Data (logs from the last few days or weeks): Stored in fast storage (like SSDs) for frequent access.
  • Cold Data (historical logs): Stored in object storage (like MinIO, S3) for long-term archiving and infrequent access.

Example Loki Configuration:

1
2
3
4
5
6
7
8
9
10
11
storage_config:
boltdb_shipper:
active_index_directory: /var/loki/index # Hot data directory
shared_store: s3 # Use MinIO/S3 as cold data storage
cache_location: /var/loki/cache # Cache directory

aws:
s3: http://minio-service.minio.svc.cluster.local:9000 # Address of MinIO
bucketnames: loki-logs
access_key_id: minio
secret_access_key: minio123

Optimization Effect:

  • Reduces local storage pressure by moving historical logs to object storage.
  • Improves query performance: prioritizes hot data queries, with slightly higher latency for cold data queries but at lower costs. ---

2. Compress Logs and Indexes

  • Enable Log Compression: Loki compresses logs by default using chunks, and you can adjust the chunk size and compression level to optimize storage space.
  • Choose Appropriate Compression Algorithm (like Snappy or GZIP):
1
2
3
4
ingester:
chunk_encoding: snappy # Use Snappy to compress chunks
chunk_idle_period: 5m # Set the closing time for idle chunks
max_chunk_age: 1h # Maximum lifespan for each chunk
  • Adjust Chunk Size: Larger chunks can reduce the number of index entries but will increase memory usage. Tune according to log volume and access patterns.

3. Optimize Index Configuration

  • Boltdb-shipper: It is recommended to use Boltdb-shipper to store indexes, reducing reliance on external databases (like Cassandra).
  • Reduce the Number of Index Entries: Lower the index size by disabling unnecessary labels or reducing the number of labels in the Loki configuration:
1
2
3
4
limits_config:
max_label_name_length: 1024
max_label_value_length: 2048
max_streams_per_user: 10000
  • Index Granularity Control: Reducing the frequency of index refreshes can decrease the write pressure on index data:
1
2
3
4
5
6
7
8
9
schema_config:
configs:
- from: 2023-01-01
store: boltdb-shipper
object_store: filesystem
schema: v11
index:
prefix: loki_index_
period: 24h # Refresh index every 24 hours

4. Use Object Storage to Optimize Long-Term Storage (MinIO/S3)

If the storage volume for Loki is very large, consider using object storage systems like MinIO/S3 to lower costs:

  • Automatic Migration of Cold/Hot Storage: Automatically migrate hot data older than a certain time to cold storage.
  • Partitioning and Sharding of Object Storage: Improve read performance in MinIO/S3 through partitioning and sharding.

5. Sharding and Partitioning

  • Horizontally Scale Loki's Ingesters, achieving parallel writes and queries through sharding:
    • Deploy multiple Ingester replicas, each responsible for different shards.
1
2
3
4
ingester:
lifecycler:
ring:
replication_factor: 3 # Ensure replicas write to multiple Ingester nodes
  • Optimize Query Performance: Distribute query requests across multiple Ingesters to enhance concurrent query speed.

6. Configure Appropriate Retention Policy (Log Retention Policy)

  • Set the retention period for logs to timely clean up expired logs and free up storage space:
1
2
limits_config:
retention_period: 30d # Retain logs for 30 days
  • If you need to retain some long-term data, you can combine the layered strategy of object storage to migrate old logs to cold storage.

7. Optimize Data Transfer Between Promtail and Loki

  • Enable Batch Sending: Reduce network overhead and increase throughput:
1
2
3
4
clients:
- url: http://loki:3100/loki/api/v1/push
batchsize: 102400 # Maximum size of each log batch
batchwait: 1s # Maximum wait time
  • Use gRPC Compression: Enable gRPC and Protobuf compression to further reduce the amount of data transferred:
1
2
3
clients:
- url: grpc://loki:3100
grpc_compression: snappy # Enable Snappy compression

8. Adjust Query Concurrency and Rate Limiting

  • Set query concurrency and rate limits to prevent the Loki cluster from crashing under high query loads:
1
2
3
4
limits_config:
max_concurrent_queries: 20 # Maximum number of concurrent queries
max_query_timeout: 2m # Query timeout duration
max_entries_per_query: 5000 # Maximum number of entries per single query

9. Use Headless Service to Improve Throughput

In the Loki cluster, use a Headless Service to avoid the load balancer becoming a bottleneck:

1
2
3
4
5
6
7
8
9
10
11
apiVersion: v1
kind: Service
metadata:
name: loki
spec:
clusterIP: None # Headless Service
selector:
app: loki
ports:
- port: 3100
name: http

Summary

By implementing the above optimization strategies, you can achieve efficient storage and querying in Loki, including:

  1. Cold/Hot Data Layering: Use MinIO/S3 as cold storage to reduce local storage pressure.
  2. Compression and Chunk Optimization: Adjust chunk size and compression algorithms to minimize storage costs.
  3. Optimize Index Strategies: Use Boltdb-shipper, and reduce the number of labels and indexing frequency.
  4. Sharding and Partitioning: Horizontally scale Ingesters to enhance system throughput.
  5. Transfer Optimization: Enable batch sending and gRPC compression to reduce network overhead.