Label Storage
String Interning
String interning means: instead of storing the same string over and over, store it
once in a byte buffer and give each copy a tiny 4-byte integer that points to the original.
In metric storage, labels like region=us-west-2 appear in thousands of series.
Without interning, each copy wastes memory.
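As a rough illustration of the idea, here is a minimal Python sketch of an interner. The class name `Interner`, its fields, and the use of a Python dict for the "seen before?" check are assumptions made for this example, not the page's actual implementation. The ID here is an index into a small table of (offset, length) pairs, and would fit comfortably in 4 bytes in a compact layout.

```python
class Interner:
    """Store each distinct string once and hand out small integer IDs."""

    def __init__(self):
        self.buffer = bytearray()  # all unique strings, stored back to back
        self.spans = []            # spans[id] = (offset, length) into buffer
        self.ids = {}              # string -> id, used to detect repeats

    def intern(self, s: str) -> int:
        existing = self.ids.get(s)
        if existing is not None:
            return existing        # seen before: reuse the ID, store nothing
        data = s.encode("utf-8")
        new_id = len(self.spans)
        self.spans.append((len(self.buffer), len(data)))
        self.buffer += data
        self.ids[s] = new_id
        return new_id

    def lookup(self, string_id: int) -> str:
        offset, length = self.spans[string_id]
        return self.buffer[offset:offset + length].decode("utf-8")


interner = Interner()
a = interner.intern("region=us-west-2")
b = interner.intern("job=api")
c = interner.intern("region=us-west-2")  # same ID as a; buffer does not grow
```

Interning the same label twice returns the same ID, and the buffer only holds one copy of its bytes.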
Generate Kubernetes-style metrics and watch memory usage collapse as strings are reused.
A series is one stream of metric data uniquely identified by its labels.
Labels are key=value pairs like job=api or region=us-west.
A monitoring system tracking 10,000 pods might have 50,000+ series, but most share
the same small set of label values, making interning highly effective.
① Generate Series
Create realistic Kubernetes metric series with shared labels. Adjust the count and hit Generate.
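One way such a generator might look is sketched below in Python. The function name `generate_series`, the label keys, and the specific label values are illustrative assumptions, chosen only to show how a handful of shared values gets repeated across many series while the pod name keeps each series unique.

```python
import random


def generate_series(count, seed=0):
    """Build `count` label sets that mostly draw from small shared pools."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    metrics = ["http_requests_total", "container_cpu_usage_seconds_total",
               "container_memory_working_set_bytes"]
    jobs = ["api", "worker", "ingress"]
    namespaces = ["default", "payments", "monitoring"]
    regions = ["us-west-2", "us-east-1", "eu-central-1"]
    series = []
    for i in range(count):
        series.append({
            "__name__": rng.choice(metrics),
            "job": rng.choice(jobs),
            "namespace": rng.choice(namespaces),
            "region": rng.choice(regions),
            "pod": f"pod-{i}",  # unique per series, so every label set differs
        })
    return series


series = generate_series(1000)
# Every key and every value counts as one stored string in naive storage.
occurrences = sum(2 * len(labels) for labels in series)
unique = {s for labels in series for pair in labels.items() for s in pair}
print(f"{occurrences} string occurrences, {len(unique)} unique strings")
```

Only the pod names contribute one new string per series; the keys and the remaining values come from a fixed pool that is reused throughout.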
② At a Glance
③ Naive vs Interned
Side-by-side: every series stores full strings (left) vs. integer IDs into a shared buffer (right).
Naive Storage
Interned Storage
④ Memory Comparison
How much space does interning actually save?
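The comparison can be approximated with a short calculation. The sketch below is a simplified model, not the page's own accounting: it counts only the UTF-8 bytes of label strings and an assumed 4-byte ID per reference, and ignores the hash table, offset tables, and any per-object overhead. The function names and the sample data are hypothetical.

```python
def naive_bytes(series):
    """Every series stores its own copy of every key and value."""
    return sum(len(k.encode("utf-8")) + len(v.encode("utf-8"))
               for labels in series for k, v in labels.items())


def interned_bytes(series, id_size=4):
    """Each unique string is stored once; every use costs one fixed-size ID."""
    unique = set()
    references = 0
    for labels in series:
        for k, v in labels.items():
            unique.update((k, v))
            references += 2
    buffer_size = sum(len(s.encode("utf-8")) for s in unique)
    return buffer_size + references * id_size


names = ["http_requests_total", "http_request_duration_seconds",
         "container_cpu_usage_seconds_total",
         "container_memory_working_set_bytes"]
# 4 metric names x 250 pods = 1000 series with heavily shared labels.
series = [{"__name__": name, "job": "api", "region": "us-west-2",
           "pod": f"pod-{p}"}
          for name in names for p in range(250)]

naive = naive_bytes(series)
interned = interned_bytes(series)
print(f"naive: {naive} B, interned: {interned} B, "
      f"saved: {100 * (naive - interned) / naive:.1f}%")
```

Under this model the saving depends on how long the strings are relative to the ID size and on how often each one repeats.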
⑤ Intern a String
Step through the FNV-1a hash → open-addressing → store-or-reuse pipeline.
To check whether a string is already stored, we use a hash table, a data structure that offers constant-time lookups on average:
- Hash: run the string through a hash function (FNV-1a) and reduce the result modulo the table size to get a slot number
- Probe: check that slot; if taken by a different string, check the next slot (linear probing)
- Insert or reuse: if the string is new, store it and record its position; if it already exists, return the existing position
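The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: it uses the 32-bit variant of FNV-1a, a fixed-capacity table with no resizing, and the byte offset into the buffer as the returned "position". The class name `ProbingInterner` and its fields are hypothetical.

```python
FNV_OFFSET_BASIS = 0x811C9DC5  # standard 32-bit FNV-1a offset basis
FNV_PRIME = 0x01000193         # standard 32-bit FNV prime


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a: XOR in each byte, then multiply by the prime."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF  # keep the result in 32 bits
    return h


class ProbingInterner:
    def __init__(self, capacity=1024):
        self.buffer = bytearray()       # unique strings, back to back
        self.slots = [None] * capacity  # each slot: (offset, length) or None

    def intern(self, s: str) -> int:
        data = s.encode("utf-8")
        capacity = len(self.slots)
        slot = fnv1a_32(data) % capacity           # Hash
        for _ in range(capacity):
            entry = self.slots[slot]
            if entry is None:                      # Insert: string is new
                offset = len(self.buffer)
                self.buffer += data
                self.slots[slot] = (offset, len(data))
                return offset
            offset, length = entry
            if self.buffer[offset:offset + length] == data:
                return offset                      # Reuse: already stored
            slot = (slot + 1) % capacity           # Probe: try the next slot
        raise RuntimeError("hash table is full")   # sketch does not resize


table = ProbingInterner(capacity=8)
first = table.intern("region=us-west-2")
second = table.intern("job=api")
again = table.intern("region=us-west-2")  # returns the same position as first
```

A production table would grow before it fills up, since linear probing slows down sharply as the load factor approaches 1.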
⑥ Cardinality Impact
Naive storage grows with the total number of string occurrences, while interned storage grows with cardinality (the number of unique strings) plus a fixed 4 bytes per reference.
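A back-of-the-envelope model makes the relationship concrete. The sketch below assumes a uniform average string length of 12 bytes and 4-byte IDs. Both values, and the function name `storage_bytes`, are assumptions for illustration; real label sets have varied lengths and extra table overhead.

```python
def storage_bytes(occurrences, unique, avg_len=12, id_size=4):
    """Return (naive, interned) byte counts under a uniform-length model."""
    naive = occurrences * avg_len                         # one copy per use
    interned = unique * avg_len + occurrences * id_size   # one copy + IDs
    return naive, interned


occurrences = 100_000
for unique in (100, 1_000, 10_000, 100_000):
    naive, interned = storage_bytes(occurrences, unique)
    print(f"{unique:>7} unique: naive={naive:>9,} B  interned={interned:>9,} B")
```

In this model the naive figure stays flat as cardinality rises, while the interned figure climbs toward it and eventually overtakes it: when every string is unique, the IDs are pure overhead.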