BG Image
Semi-Structured Data
Jan 21, 2026

JSON in Snowflake: Flatten for Performance Keep VARIANT for Flexibility

VARIANT columns are convenient but can be expensive for frequently-accessed data. Learn when to flatten JSON and when to keep it semi-structured.

Raj
CEO, MaxMyCloud

JSON in Snowflake: Flatten for Performance, Keep VARIANT for Flexibility

VARIANT columns are convenient but can be expensive for frequently-accessed data. Learn when to flatten JSON and when to keep it semi-structured.

Cost Comparison

VARIANT Approach:

SELECT
event_data:user_id::INTEGER,
event_data:event_type::VARCHAR
FROM events
WHERE event_data:event_type::VARCHAR = 'purchase';
-- Cost: Scans entire VARIANT column (100GB)

Flattened Approach:

SELECT user_id, event_type
FROM events
WHERE event_type = 'purchase';
-- Cost: Scans only 2 columns (5GB) - 95% reduction

When to Flatten

  • Fields are frequently queried
  • Need to filter/aggregate on fields
  • Field types are known and consistent
  • Want optimal query performance
  • High query volume on specific fields

When to Keep VARIANT

  • Schema is unknown or highly variable
  • Infrequent access
  • Need full flexibility
  • Data exploration phase
  • Complex nested structures

Hybrid Approach (Best Practice)

CREATE TABLE api_logs (
log_id INTEGER,
timestamp TIMESTAMP,
-- Flatten commonly-queried fields
user_id INTEGER,
endpoint VARCHAR(200),
response_code INTEGER,
-- Keep full payload for detailed analysis
request_payload VARIANT,
response_payload VARIANT
);

Performance Impact

  • VARIANT query: 45 seconds, scans 100GB
  • Flattened query: 3 seconds (15x faster), scans 5GB
  • 95% cost reduction on this query

Real-World Example

A SaaS company stored 500M event records with 50+ JSON fields. Original: 2TB compressed storage, 100 credits/day query cost, 30-60 seconds average performance.

After flattening top 10 fields (80% of queries): 1.4TB compressed (30% reduction), 15 credits/day (85% reduction), 3-8 seconds average (5-10x faster).

Key Takeaways

  • VARIANT is flexible but has performance costs
  • Flatten frequently-queried fields to native types
  • Keep VARIANT for rarely-accessed or variable schema data
  • Hybrid approach (flatten common fields, keep full JSON) is best
  • Can reduce query costs by 80-95% and improve performance 5-10x

Recent blogs

Start Optimizing Your Snowflake Costs Today

Uncover hidden inefficiencies and start reducing Snowflake spend in minutes no disruption, no risk.