add-cache-headers.sh

Adds cache control headers to Azure Blob Storage blobs whose header is missing or set to a different value.

Description

This script scans all blobs in the specified container and adds the cache control header Cache-Control: public, max-age=3600, must-revalidate to files that don’t have it or have a different value. It skips files that already have the correct header set.
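The skip-or-update decision described above can be sketched as a small shell function. This is a minimal illustration rather than the script's actual code; the function name `needs_update` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: decide whether a blob needs its header rewritten.
#   $1 = the blob's current Cache-Control value (empty if unset)
#   $2 = the desired Cache-Control value
needs_update() {
  local current="$1" desired="$2"
  # Update when the header is missing or differs from the desired value.
  [ "$current" != "$desired" ]
}

CACHE_CONTROL="public, max-age=3600, must-revalidate"

if needs_update "" "$CACHE_CONTROL"; then
  echo "update"   # header missing
fi
if ! needs_update "$CACHE_CONTROL" "$CACHE_CONTROL"; then
  echo "skip"     # header already correct
fi
```

Comparing the full header string keeps the check simple, but it also means a blob with an equivalent header written differently (for example, with the directives in another order) would count as needing an update.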

Prerequisites

  • Azure CLI installed and configured
  • jq for JSON processing
  • Authenticated Azure session (az login)
  • Write access to the Azure Storage account

Configuration

Edit these variables in the script:
ACCOUNT_NAME="testdinostr"
CONTAINER_NAME="docs"
CACHE_CONTROL="public, max-age=3600, must-revalidate"

Usage

Basic Syntax

./scripts/add-cache-headers.sh [PARALLEL]

Parameters

PARALLEL (optional, default: 10)
  • Number of parallel operations to run simultaneously
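An optional positional argument with a default is commonly handled with shell parameter expansion. The sketch below shows one way to do it and to reject invalid values; `resolve_parallel` is a hypothetical name, and the real script may validate its input differently or not at all.

```shell
#!/usr/bin/env bash
# Hypothetical helper: resolve the PARALLEL argument, defaulting to 10.
resolve_parallel() {
  local value="${1:-10}"
  # Reject anything that is not a positive integer (no leading zeros).
  case "$value" in
    *[!0-9]*|0*)
      echo "PARALLEL must be a positive integer" >&2
      return 1
      ;;
  esac
  echo "$value"
}

resolve_parallel       # prints 10
resolve_parallel 20    # prints 20
```

Inside a script this could be used as `PARALLEL=$(resolve_parallel "$1") || exit 1`.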

Examples

Default Usage (10 Parallel Operations)

./scripts/add-cache-headers.sh

High Parallelism

./scripts/add-cache-headers.sh 20

Conservative Approach

./scripts/add-cache-headers.sh 5

Output

Adding cache headers to blobs
Account: testdinostr
Container: docs
Cache-Control: public, max-age=3600, must-revalidate
Parallel operations: 10

Fetching storage account key...
Storage key retrieved successfully

Fetching blob list...
Found 156 blobs

Processing blobs...

✓ Updated: docs/ai-insights/emerging-failures.webp
⊘ Skipped (already set): docs/ai-insights/key-metrics.webp
✗ Failed: docs/some-file.webp
...

✓ Complete!

Summary:
  Total blobs: 156
  Updated: 89
  Skipped: 65
  Failed: 2

Features

  • Smart Detection: Only updates blobs that need it
  • Parallel Processing: Updates multiple blobs simultaneously
  • Progress Tracking: Real-time status with symbols:
    • ✓ Updated successfully
    • ⊘ Skipped (already has correct header)
    • ✗ Failed to update
  • Summary Report: Shows counts at the end
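The parallel processing and summary report listed above can be approximated with `xargs -P`. The sketch below is an assumption about how such a script might be structured, not its actual implementation: the worker is a stand-in that makes no Azure calls, the blob names are made up, and plain-text status words replace the symbols.

```shell
#!/usr/bin/env bash
# Stand-in for the per-blob work; a real script would call the Azure CLI here.
# Demo rule (an assumption): names ending in "-ok" already have the header.
worker='case "$1" in
  *-ok) echo "SKIPPED $1" ;;
  *)    echo "UPDATED $1" ;;
esac'

PARALLEL=4

# xargs -P runs up to $PARALLEL workers at once, one blob name per worker.
results=$(printf '%s\n' a.webp b-ok c.webp d-ok e.webp |
  xargs -P "$PARALLEL" -I{} sh -c "$worker" _ {})

# Tally the status lines for the summary report.
updated=$(printf '%s\n' "$results" | grep -c '^UPDATED')
skipped=$(printf '%s\n' "$results" | grep -c '^SKIPPED')
echo "Updated: $updated"
echo "Skipped: $skipped"
```

Because the workers run concurrently, the order of the status lines is not guaranteed, which is why the summary is computed by counting lines rather than by relying on their position.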

Cache Control Header

The script sets:
Cache-Control: public, max-age=3600, must-revalidate
Meaning:
  • public - Can be cached by browsers and CDNs
  • max-age=3600 - Cache for 1 hour (3600 seconds)
  • must-revalidate - Once the cached copy expires, caches must revalidate it with the server before reusing it

Customizing Cache Duration

Edit the script to change cache duration:
# 1 hour (default)
CACHE_CONTROL="public, max-age=3600, must-revalidate"

# 1 day
CACHE_CONTROL="public, max-age=86400, must-revalidate"

# 1 week
CACHE_CONTROL="public, max-age=604800, must-revalidate"

# 1 year
CACHE_CONTROL="public, max-age=31536000, must-revalidate"
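The max-age values above are plain second counts, so they can be derived with shell arithmetic instead of being typed by hand. A minimal sketch, where `cache_control_for_days` is a hypothetical helper that is not part of the script:

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a Cache-Control value from a duration in days.
cache_control_for_days() {
  local days="$1"
  local seconds=$((days * 24 * 60 * 60))
  echo "public, max-age=${seconds}, must-revalidate"
}

cache_control_for_days 1     # public, max-age=86400, must-revalidate
cache_control_for_days 7     # public, max-age=604800, must-revalidate
cache_control_for_days 365   # public, max-age=31536000, must-revalidate
```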

Performance

Blob Count    Recommended Parallel
< 100         5-10
100-500       10-15
500-1000      15-20
> 1000        20-30
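The recommendations in the table above can be expressed as a lookup function. This is only an illustrative sketch: `recommended_parallel` is a hypothetical name, and because the table's ranges share their boundary values, it assumes that exactly 500 and exactly 1000 blobs fall into the lower band.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a blob count to the recommended PARALLEL range.
# Assumption: boundary counts (500, 1000) are assigned to the lower band.
recommended_parallel() {
  local count="$1"
  if   [ "$count" -lt 100 ];  then echo "5-10"
  elif [ "$count" -le 500 ];  then echo "10-15"
  elif [ "$count" -le 1000 ]; then echo "15-20"
  else                             echo "20-30"
  fi
}

recommended_parallel 156    # prints 10-15
```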

Execution Time

Approximate times (actual times depend on network conditions):
  • 100 blobs: ~30 seconds (10 parallel)
  • 500 blobs: ~2 minutes (15 parallel)
  • 1000 blobs: ~3-4 minutes (20 parallel)

Error Handling

Failed to Fetch Storage Key

Failed to fetch storage account key
Solution: Run az login and verify that your account has permission to list the storage account's keys

Failed Updates

Individual blob update failures are tracked and reported in the summary. Common causes:
  • Blob is locked
  • Insufficient permissions
  • Blob doesn’t exist (deleted during operation)

Missing jq

jq: command not found
Solution:
# macOS
brew install jq

# Linux (Debian/Ubuntu)
sudo apt-get install jq
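A missing tool is easier to diagnose when it is reported before any work starts. The sketch below shows one way to check for the prerequisites up front; `missing_tools` is a hypothetical helper, and whether the script performs such a check is not stated here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each named command that is not on the PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing=$(missing_tools az jq)
if [ -n "$missing" ]; then
  echo "Missing required tools:" >&2
  echo "$missing" >&2
  # A real script would typically stop here, e.g. with: exit 1
fi
```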

Verification

Check a Specific Blob

az storage blob show \
  --account-name testdinostr \
  --container-name docs \
  --name "docs/ai-insights/key-metrics.webp" \
  --query "properties.contentSettings.cacheControl"

List All Blobs Without Cache Headers

az storage blob list \
  --account-name testdinostr \
  --container-name docs \
  --query "[?properties.contentSettings.cacheControl==null].name"
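The query above finds only blobs with no header at all, whereas the script also rewrites headers that have a different value. A jq filter can list both cases. The sketch below runs against made-up sample data shaped like the JSON output of `az storage blob list`; the blob names and values are illustrative only.

```shell
#!/usr/bin/env bash
CACHE_CONTROL="public, max-age=3600, must-revalidate"

# Sample data shaped like `az storage blob list --output json`;
# the blob names and header values are made up for illustration.
sample='[
  {"name": "a.webp", "properties": {"contentSettings": {"cacheControl": null}}},
  {"name": "b.webp", "properties": {"contentSettings": {"cacheControl": "no-cache"}}},
  {"name": "c.webp", "properties": {"contentSettings": {"cacheControl": "public, max-age=3600, must-revalidate"}}}
]'

# Print the names of blobs whose header is missing or differs from the target.
stale=$(printf '%s' "$sample" |
  jq -r --arg want "$CACHE_CONTROL" \
    '.[] | select(.properties.contentSettings.cacheControl != $want) | .name')
echo "$stale"
```

To run the same filter against a real container, replace the sample with the piped output of the `az storage blob list` command, adding `--output json`.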

Use Cases

Initial Setup

Run once to add cache headers to all existing blobs:
./scripts/add-cache-headers.sh

After Bulk Upload

After uploading new files without cache headers:
./scripts/add-cache-headers.sh

Periodic Maintenance

Run periodically to ensure all blobs have proper caching:
# Add to crontab: runs every Sunday at 02:00
0 2 * * 0 /path/to/scripts/add-cache-headers.sh

Best Practices

  1. Test First: Run on a test container before production
  2. Monitor Progress: Watch for failed updates
  3. Verify Results: Check a few blobs manually after completion
  4. Schedule Wisely: Run during low-traffic periods
  5. Back Up First: Consider backing up blob metadata before bulk updates

Troubleshooting

All Updates Failing

  • Verify write permissions on storage account
  • Check if container has immutability policy
  • Ensure blobs aren’t locked

Slow Performance

  • Reduce the parallel count if requests are being throttled
  • Check network latency to Azure region
  • Run from Azure VM in same region for best speed

Inconsistent Results

  • Some blobs may be updated by other processes
  • Re-run the script to catch any missed blobs