S3 Intelligent-Tiering vs Glacier: A Cost Analysis

Understanding S3 Intelligent-Tiering vs Glacier for large-scale storage is critical for cost efficiency. This article details their mechanisms and trade-offs.

Ahmet Çelik


Most teams prioritize immediate data accessibility and often default to S3 Standard for all data. But this leads to significant, often unnecessary, cloud spend at scale when a substantial portion of that data is rarely accessed or qualifies as archival. Optimizing this can unlock substantial budget for other critical infrastructure.


TL;DR

  • S3 Intelligent-Tiering automatically moves objects between access tiers based on access patterns, optimizing costs for data with unknown or changing usage.

  • Glacier storage classes are purpose-built for long-term archiving with infrequent access, offering the lowest storage costs but with higher retrieval complexity and latency.

  • Choosing between Intelligent-Tiering and Glacier depends on data access predictability and acceptable retrieval times; one is for dynamic access, the other for static archives.

  • Intelligent-Tiering incurs a small per-object monitoring and automation fee, which can add up for billions of small objects.

  • Glacier has minimum storage durations and tiered retrieval costs, making it unsuitable for data requiring frequent or immediate access.


The Problem: Uncontrolled Storage Costs in Large-Scale Systems


In large-scale production environments, data grows exponentially. For backend engineers managing petabytes of logs, backups, analytics data, or media assets, uncontrolled storage costs quickly become a primary concern. Storing all data in S3 Standard, while convenient, fails to account for the reality that much of this data cools down over time. Over-provisioning high-cost storage for cold data can inflate cloud bills by 30-50% for data-intensive applications, directly impacting operational budgets. Teams commonly report that 70% of data older than 90 days sees less than 5% of all access requests, yet it often resides in expensive, readily accessible tiers.


This challenge isn't about avoiding storage altogether; it's about matching storage class characteristics to actual data access patterns. Misalignment means paying a premium for instant access to data that may only be needed once a quarter, or worse, once a year. The goal is to implement a robust, automated strategy that ensures data availability aligns with business needs while minimizing expenses.


How It Works: Navigating AWS Storage Tiers


AWS offers a range of S3 storage classes, each designed for specific access patterns and cost profiles. For large-scale storage optimization, S3 Intelligent-Tiering and the Glacier family of classes are often central to cost-saving strategies. Understanding their operational mechanics and ideal use cases is crucial for effective implementation.


S3 Intelligent-Tiering: Adaptive Cost Efficiency


S3 Intelligent-Tiering is an S3 storage class designed to automatically optimize storage costs for data with unknown, changing, or unpredictable access patterns. When you configure an object to use Intelligent-Tiering, S3 monitors its access patterns and moves it between two access tiers:

  • Frequent Access Tier: Designed for regularly accessed objects, with a similar per-GB cost to S3 Standard.

  • Infrequent Access Tier: For objects that have not been accessed for 30 consecutive days, offering a lower storage cost. Retrieval from this tier is still immediate.


Crucially, if an object in the Infrequent Access Tier is accessed, it's automatically moved back to the Frequent Access Tier. This automatic movement eliminates manual lifecycle management and the need for complex analytics to predict access patterns.
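You can also opt an object into Intelligent-Tiering at upload time rather than via a lifecycle transition. The sketch below builds the `put_object` parameters for boto3; the bucket and key names are hypothetical, and the actual upload call (commented out) requires boto3 and AWS credentials.

```python
# Build put_object parameters that place an object directly into
# Intelligent-Tiering. "INTELLIGENT_TIERING" is a valid value for the
# S3 StorageClass request parameter.

def upload_to_intelligent_tiering(bucket: str, key: str, body: bytes) -> dict:
    """Return the keyword arguments for s3.put_object(**params)."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "StorageClass": "INTELLIGENT_TIERING",
    }

params = upload_to_intelligent_tiering(
    "my-backendstack-data-archive-2026",  # hypothetical bucket name
    "logs/2026/01/app.log",               # hypothetical key
    b"example payload",
)

# To actually upload (needs credentials):
# import boto3
# boto3.client("s3").put_object(**params)
```

Objects uploaded this way start in the Frequent Access tier and are monitored from day one, with no lifecycle rule needed.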


Beyond these two tiers, Intelligent-Tiering also offers optional Archive Access and Deep Archive Access tiers, designed for even lower storage costs for data that becomes rarely or never accessed. These tiers have higher retrieval costs and longer retrieval times, similar to the standalone Glacier classes. You can configure objects to move to these tiers automatically after a minimum of 90 days (Archive Access) or 180 days (Deep Archive Access) without access, providing a fully automated, multi-tiered solution for evolving data needs.


The primary trade-off with Intelligent-Tiering is the per-object monitoring and automation fee. While small per object, this fee can accumulate significantly for billions of small objects. Engineers must consider this overhead against potential savings, especially for data with stable and predictable access patterns.
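A quick back-of-envelope calculation makes the monitoring fee concrete. The price below is the approximate published Intelligent-Tiering monitoring and automation charge at the time of writing (about $0.0025 per 1,000 monitored objects per month in us-east-1); treat it as an assumption and check current AWS pricing. Note that objects smaller than 128 KB are not monitored and not charged this fee.

```python
# Estimate the monthly Intelligent-Tiering monitoring fee.
# Assumed price, subject to change: ~$0.0025 per 1,000 monitored
# objects per month. Objects < 128 KB are excluded from monitoring.

def monthly_monitoring_fee(num_objects: int, fee_per_1000: float = 0.0025) -> float:
    """USD per month for monitoring `num_objects` eligible objects."""
    return num_objects / 1000 * fee_per_1000

# One billion monitored objects cost roughly $2,500/month in fees alone:
fee = monthly_monitoring_fee(1_000_000_000)
```

At that scale, the fee can rival the storage savings for small objects, which is why aggregation (discussed later) matters.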


Glacier Storage Classes: Deep Archive for Predictable Access


The Glacier storage classes—S3 Glacier Instant Retrieval, S3 Glacier Flexible Retrieval (formerly S3 Glacier), and S3 Glacier Deep Archive—are purpose-built for long-term, cost-effective archiving. These classes are ideal for data that is rarely accessed (once a quarter or less) and where retrieval latency is acceptable.


  • S3 Glacier Instant Retrieval: Offers millisecond retrieval and suits archives that still need immediate access, like medical images or news media assets. It is the most expensive of the Glacier classes and has a 90-day minimum storage duration.

  • S3 Glacier Flexible Retrieval: Provides flexible retrieval options, ranging from minutes (Expedited) to hours (Standard) or even 5-12 hours (Bulk). This class is excellent for backups, disaster recovery, or long-term archives where retrieval isn't time-critical. It has a minimum storage duration of 90 days.

  • S3 Glacier Deep Archive: The lowest-cost storage option in AWS, designed for long-term data archival that might be accessed once or twice a year, if at all. Retrieval typically takes 12-48 hours. It has a minimum storage duration of 180 days.


The key characteristic of Glacier classes is their extremely low storage cost, offset by higher retrieval costs and explicit retrieval times. Unlike S3 Standard or Intelligent-Tiering, retrieving data from Glacier isn't instantaneous or free. You explicitly initiate a retrieval job, choose a retrieval speed (and associated cost), and wait for the data to become available. This makes Glacier a poor choice for operational data, active logs, or anything requiring frequent or real-time access.
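The retrieval job described above is initiated through the S3 `RestoreObject` API. The sketch below builds the `RestoreRequest` parameter for boto3's `restore_object`; the bucket and key in the usage comment are hypothetical.

```python
# Build the RestoreRequest parameter for s3.restore_object. Tier is one
# of "Expedited", "Standard", or "Bulk" for Glacier Flexible Retrieval;
# Deep Archive supports only "Standard" and "Bulk". `Days` is how long
# the restored copy stays available in S3 before expiring.

def build_restore_request(days: int, tier: str) -> dict:
    if tier not in ("Expedited", "Standard", "Bulk"):
        raise ValueError(f"invalid retrieval tier: {tier}")
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}

# Usage (needs boto3 + credentials; names are hypothetical):
# import boto3
# boto3.client("s3").restore_object(
#     Bucket="my-archive-bucket",
#     Key="backups/2020/db.tar.gz",
#     RestoreRequest=build_restore_request(days=7, tier="Bulk"),
# )
```

Choosing "Bulk" here is the cheapest option and usually the right default for planned, non-urgent restores.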


Interactions and Trade-offs for Large-Scale Data


When deciding between Intelligent-Tiering and Glacier for large-scale storage, the fundamental driver is data access predictability.


  • Intelligent-Tiering is superior when access patterns are unknown or change frequently. It removes the operational burden of predicting data usage, automatically moving data to the most cost-effective tier. This is ideal for analytics lakes where query patterns evolve, user-generated content with varied popularity, or logs that might be accessed frequently initially but cool down over time. The monitoring fee is its primary cost consideration beyond storage.

  • Glacier is the undisputed champion for known archival data where infrequent access and longer retrieval times are acceptable. Think regulatory compliance archives, long-term backups, or scientific research data that's rarely re-analyzed. The cost savings per GB are significant, but this comes with strict minimum storage durations and tiered retrieval costs that must be factored into total cost of ownership. Accidentally retrieving a large amount of data from Glacier Deep Archive can quickly negate years of storage savings.


Consider a scenario involving a petabyte-scale data lake. Data ingested daily might be active for the first 30-60 days, then less frequently accessed for another 6 months, before becoming deep archive.

  • For the initial dynamic phase, Intelligent-Tiering would manage the transitions seamlessly.

  • For the truly deep archive data (e.g., historical records beyond one year), a lifecycle rule transitioning objects directly to S3 Glacier Deep Archive 365 days after creation, or after they've spent time in Intelligent-Tiering's deepest tiers, offers maximum cost efficiency.


The critical interaction point is using S3 lifecycle rules to transition data from Intelligent-Tiering (or S3 Standard) to Glacier classes. This allows for a staged approach: active data in Standard, dynamic data in Intelligent-Tiering, and cold, archival data in Glacier. This strategy maximizes cost savings without compromising on the appropriate level of accessibility for different data lifecycles.
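To see why this staging matters, compare a 1 PB data lake stored entirely in S3 Standard against the staged split sketched above. The per-GB prices are approximate us-east-1 figures at the time of writing and will drift; the 10/20/70 split is an illustrative assumption, not a measured distribution.

```python
# Rough monthly storage cost for 1 PB: all-Standard vs. a staged split.
# Prices are approximate us-east-1 USD/GB-month and change over time --
# verify against current AWS pricing before relying on them.

PRICES_PER_GB = {
    "STANDARD": 0.023,        # S3 Standard
    "IT_INFREQUENT": 0.0125,  # Intelligent-Tiering Infrequent Access tier
    "DEEP_ARCHIVE": 0.00099,  # S3 Glacier Deep Archive
}

def monthly_cost(gb: float, storage_class: str) -> float:
    return gb * PRICES_PER_GB[storage_class]

PB = 1_000_000  # GB (decimal)

all_standard = monthly_cost(PB, "STANDARD")
staged = (monthly_cost(0.1 * PB, "STANDARD")        # hot, first ~60 days
          + monthly_cost(0.2 * PB, "IT_INFREQUENT") # cooling, ~6 months
          + monthly_cost(0.7 * PB, "DEEP_ARCHIVE")) # true archive
# all_standard is roughly $23,000/month; staged roughly $5,500/month.
```

Under these assumed prices, the staged layout cuts the monthly storage bill by roughly three quarters, before accounting for monitoring fees and retrieval costs.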


Step-by-Step Implementation: Automating Tiering with Terraform


Implementing S3 Intelligent-Tiering and lifecycle rules for Glacier transitions effectively requires infrastructure as code. Terraform provides a robust way to define these configurations, ensuring consistency and version control.


This example creates an S3 bucket configured for Intelligent-Tiering and includes a lifecycle rule to transition objects older than one year to S3 Glacier Deep Archive.


1. Define the S3 Bucket and Intelligent-Tiering Configuration


First, define your S3 bucket and specify the Intelligent-Tiering configuration. This tells S3 to automatically monitor and move objects between the Frequent and Infrequent Access tiers.


main.tf

resource "aws_s3_bucket" "my_data_archive_bucket" {
  # Buckets are private by default; the deprecated `acl` argument is
  # replaced by the standalone aws_s3_bucket_acl resource in provider v4+.
  bucket = "my-backendstack-data-archive-2026" # Unique bucket name

  tags = {
    Environment = "Production"
    Project     = "BackendStack"
  }
}

resource "aws_s3_bucket_versioning" "my_data_archive_bucket_versioning" {
  bucket = aws_s3_bucket.my_data_archive_bucket.id

  versioning_configuration {
    status = "Enabled" # Versioning is a best practice for data integrity
  }
}

resource "aws_s3_bucket_intelligent_tiering_configuration" "my_intelligent_tiering_config" {
  bucket = aws_s3_bucket.my_data_archive_bucket.id
  name   = "default" # No filter, so this applies to the entire bucket

  # Movement between the Frequent and Infrequent Access tiers (after 30
  # days without access) is automatic and needs no configuration. The
  # tiering blocks below opt in to the optional archive tiers; 90 and
  # 180 days are the minimum allowed values for each tier.
  tiering {
    access_tier = "ARCHIVE_ACCESS"
    days        = 90 # Move to Archive Access after 90 days of no access
  }

  tiering {
    access_tier = "DEEP_ARCHIVE_ACCESS"
    days        = 180 # Move to Deep Archive Access after 180 days of no access
  }

  # For truly static archives, the lifecycle rule in the next step
  # transitions objects directly to Glacier Deep Archive instead.
}

output "bucket_name" {
  value = aws_s3_bucket.my_data_archive_bucket.bucket
}


This Terraform configuration creates an S3 bucket, enables versioning, and sets up Intelligent-Tiering. If an object isn't accessed for 30 days, S3 moves it to the Infrequent Access tier automatically; that transition needs no configuration. The `tiering` blocks opt objects into the optional Archive Access tier after 90 days without access and the Deep Archive Access tier after 180 days, the minimum values AWS allows for each.


2. Add a Lifecycle Rule for Glacier Deep Archive Transition


Next, define a lifecycle rule that will transition objects to S3 Glacier Deep Archive after a specified period. This is crucial for true archival data, taking advantage of Glacier's lowest cost.


main.tf (continued)

resource "aws_s3_bucket_lifecycle_configuration" "my_archive_lifecycle" {
  bucket = aws_s3_bucket.my_data_archive_bucket.id

  rule {
    id     = "archive_old_data_to_deep_glacier"
    status = "Enabled"

    # An empty filter applies the rule to all objects in the bucket;
    # add a prefix to scope it down.
    filter {}

    transition {
      days          = 365 # Transition to Glacier Deep Archive 365 days after creation
      storage_class = "DEEP_ARCHIVE"
    }

    # Optional: expire even older data, e.g., after 10 years
    # expiration {
    #   days = 3650
    # }

    # Manage non-current versions separately (requires versioning)
    noncurrent_version_transition {
      noncurrent_days = 90
      storage_class   = "DEEP_ARCHIVE"
    }

    noncurrent_version_expiration {
      noncurrent_days = 365
    }
  }
}


This rule tells S3 that any current object version in `my-backendstack-data-archive-2026` should move to `DEEP_ARCHIVE` 365 days after creation, and that non-current versions should move there 90 days after being superseded and be deleted after 365 days. This setup leverages both automatic tiering for dynamic data and explicit archival for truly cold data.
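Once the rule has run, you can verify where an object actually lives with `head_object`. One subtlety worth coding around: S3 omits the `StorageClass` field from the response for S3 Standard objects, so an absent key means `STANDARD`. The bucket and key in the usage comment are hypothetical.

```python
# Determine an object's storage class from a head_object response.
# S3 returns the StorageClass header for all classes *except* Standard,
# so a missing key means the object is in S3 Standard.

def storage_class_of(head_object_response: dict) -> str:
    return head_object_response.get("StorageClass", "STANDARD")

# Usage (needs boto3 + credentials; names are hypothetical):
# import boto3
# resp = boto3.client("s3").head_object(
#     Bucket="my-backendstack-data-archive-2026", Key="logs/2024/app.log")
# print(storage_class_of(resp))

# Simulated responses for illustration:
assert storage_class_of({"StorageClass": "DEEP_ARCHIVE"}) == "DEEP_ARCHIVE"
assert storage_class_of({}) == "STANDARD"
```

Spot-checking a few objects this way after the first lifecycle run is a cheap sanity check that the rule matched the objects you intended.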


3. Apply the Terraform Configuration


Initialize Terraform, review the plan, and apply it.


$ terraform init

Expected output (truncated):

Initializing the backend...
Initializing provider plugins...
...
Terraform has been successfully initialized!

$ terraform plan

Expected output (truncated):

Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # aws_s3_bucket.my_data_archive_bucket will be created
  + resource "aws_s3_bucket" "my_data_archive_bucket" {
      + arn    = (known after apply)
      + bucket = "my-backendstack-data-archive-2026"
      ...
    }

  # aws_s3_bucket_versioning.my_data_archive_bucket_versioning will be created
  ...

  # aws_s3_bucket_intelligent_tiering_configuration.my_intelligent_tiering_config will be created
  + resource "aws_s3_bucket_intelligent_tiering_configuration" "my_intelligent_tiering_config" {
      + bucket = (known after apply)
      + id     = (known after apply)
      + name   = "default"

      + tiering {
          + access_tier = "ARCHIVE_ACCESS"
          + days        = 90
        }
      + tiering {
          + access_tier = "DEEP_ARCHIVE_ACCESS"
          + days        = 180
        }
    }

  # aws_s3_bucket_lifecycle_configuration.my_archive_lifecycle will be created
  + resource "aws_s3_bucket_lifecycle_configuration" "my_archive_lifecycle" {
      + bucket = (known after apply)
      + id     = (known after apply)

      + rule {
          + id     = "archive_old_data_to_deep_glacier"
          + status = "Enabled"

          + transition {
              + days          = 365
              + storage_class = "DEEP_ARCHIVE"
            }

          + noncurrent_version_transition {
              + noncurrent_days = 90
              + storage_class   = "DEEP_ARCHIVE"
            }

          + noncurrent_version_expiration {
              + noncurrent_days = 365
            }
        }
    }

Plan: 4 to add, 0 to change, 0 to destroy.

$ terraform apply --auto-approve

Expected output (truncated):

Apply complete! Resources: 4 added, 0 changed, 0 destroyed.

Outputs:

bucket_name = "my-backendstack-data-archive-2026"


Common mistake: Not enabling versioning when dealing with lifecycle rules, particularly `noncurrent_version_transition`. Versioning is crucial for data durability and preventing accidental overwrites, and many advanced lifecycle rules depend on it to manage previous object versions. Ensure `aws_s3_bucket_versioning` is configured and enabled for the bucket.


Production Readiness: Monitoring, Costs, and Edge Cases


Deploying intelligent tiering and archival strategies into production demands careful consideration of monitoring, cost control, and potential failure modes.


Monitoring and Alerting


While Intelligent-Tiering automates data movement, it does not obviate the need for monitoring.

  • Cost Monitoring: Regularly review AWS Cost Explorer and S3 Storage Lens. Track how each bucket's data is distributed across storage classes to observe movement between the S3 Intelligent-Tiering tiers (Frequent, Infrequent, Archive Access, Deep Archive Access) and Glacier Deep Archive. Set up cost anomaly detection alerts for unexpected spikes. Pay particular attention to the per-object monitoring fee for Intelligent-Tiering; if you have billions of small objects, this fee can become significant.

  • Access Patterns: Use S3 Server Access Logging or AWS CloudTrail to monitor retrieval patterns for data residing in Intelligent-Tiering. If data in the Infrequent Access Tier is being accessed far more frequently than expected, it suggests a miscalibration, or a change in application behavior. This data can inform policy adjustments.

  • Glacier Retrievals: For data in Glacier Deep Archive, monitor retrieval requests and costs. Unexpectedly large or frequent Glacier retrievals signal an architectural issue where data that should be cold is being accessed for operational purposes. Implement CloudWatch alarms on Glacier retrieval request metrics and bytes returned to catch costly events.
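S3 publishes daily `BucketSizeBytes` metrics to CloudWatch, broken down by a `StorageType` dimension, which makes per-tier size tracking scriptable. The sketch below builds a `get_metric_statistics` query; the `StorageType` values in the usage comment (e.g., `IntelligentTieringIAStorage`, `DeepArchiveStorage`) are taken from the S3 CloudWatch metrics documentation and are worth double-checking against the current list, and the bucket name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Build a CloudWatch get_metric_statistics query for S3's daily
# per-storage-type bucket size metric.

def bucket_size_query(bucket: str, storage_type: str) -> dict:
    now = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/S3",
        "MetricName": "BucketSizeBytes",
        "Dimensions": [
            {"Name": "BucketName", "Value": bucket},
            {"Name": "StorageType", "Value": storage_type},
        ],
        "StartTime": now - timedelta(days=2),
        "EndTime": now,
        "Period": 86400,           # the metric is reported once per day
        "Statistics": ["Average"],
    }

# Usage (needs boto3 + credentials; bucket and StorageType values assumed):
# import boto3
# cw = boto3.client("cloudwatch")
# for st in ("StandardStorage", "IntelligentTieringIAStorage", "DeepArchiveStorage"):
#     print(st, cw.get_metric_statistics(**bucket_size_query("my-bucket", st)))
```

Plotting these per-tier sizes over time shows whether lifecycle rules are actually draining data out of the expensive tiers.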


Cost Optimization Beyond Tiering


  • Minimum Storage Durations: S3 Glacier Flexible Retrieval has a 90-day minimum storage duration, and S3 Glacier Deep Archive has 180 days. Deleting objects before these durations elapse incurs a pro-rated early-deletion charge. Plan your lifecycle rules accordingly; only transition data to these tiers if you are confident it will remain there for at least the minimum duration.

  • Small Objects: Intelligent-Tiering charges a per-object monitoring fee. If your application stores vast numbers of very small objects (e.g., kilobyte-sized files), this fee can erode the savings from tiering. Consider aggregating small objects into larger archives (e.g., using tar or Parquet) before uploading to S3 or transitioning to Glacier if suitable.

  • Retrieval Costs: Glacier retrieval costs are complex and vary by tier and speed. Expedited retrievals from Glacier Flexible Retrieval are significantly more expensive than Bulk retrievals. For Intelligent-Tiering, retrievals from its Archive Access or Deep Archive Access tiers also incur retrieval charges and latency. Always estimate retrieval costs before initiating large jobs.
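The small-object advice above is straightforward to implement with the standard library: bundle many small payloads into one compressed tar archive in memory and upload it as a single object. This sidesteps the per-object monitoring fee and the per-object metadata overhead Glacier adds for each archived object. A minimal sketch:

```python
import io
import tarfile

# Pack many small objects into one in-memory gzip'd tar archive, so a
# single put_object call replaces thousands of tiny ones.

def bundle_small_objects(objects: dict) -> bytes:
    """objects maps archive member name -> payload bytes."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, payload in objects.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()

archive = bundle_small_objects({"a.log": b"alpha", "b.log": b"beta"})
# `archive` can now be uploaded with a single put_object call.
```

The trade-off is retrieval granularity: restoring one member means retrieving the whole archive, so group objects that are likely to be restored together.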


Security Considerations


  • IAM Policies: Ensure IAM policies restrict who can access and retrieve data from specific S3 storage classes. For Glacier, strictly control `s3:RestoreObject` and related actions.

  • Encryption: All S3 storage classes support encryption at rest (SSE-S3, SSE-KMS, SSE-C). Ensure encryption is consistently applied across all tiers to maintain data confidentiality, especially for sensitive archival data.

  • Deletion Protection: S3 Object Lock can prevent objects from being deleted or overwritten for a fixed amount of time or indefinitely, providing an additional layer of protection, especially for compliance-driven archival data in Glacier.


Edge Cases and Failure Modes


  • Rapid Access Spikes on Glacier: If an emergency or new requirement suddenly dictates immediate access to petabytes of data in Glacier Deep Archive, be prepared for significant retrieval costs and multi-day waiting periods. This scenario highlights why Glacier is not for operational data. Architect your applications to use Glacier only for truly non-urgent archives.

  • "Hot" Data in Archive Tiers: Misconfigured lifecycle rules can push frequently accessed data into Glacier or Intelligent-Tiering's deeper archive tiers. This results in either expensive retrieval costs (for Glacier) or performance degradation (for Intelligent-Tiering's deeper tiers). Regular access pattern analysis is vital to prevent this.

  • Cost of Re-ingestion: If data is mistakenly deleted from S3 and needs to be restored from an offsite backup (not managed by S3 itself), the cost of re-ingesting and re-processing that data can far outweigh any storage savings. Comprehensive data lifecycle planning includes robust backup and recovery strategies, not just cost-tiering.


Summary & Key Takeaways


Navigating S3 Intelligent-Tiering and Glacier storage classes effectively optimizes costs for large-scale data while maintaining appropriate access.


  • Automate tiering for unpredictable access: Use S3 Intelligent-Tiering for datasets where access patterns are unknown or change frequently, allowing AWS to manage data movement between frequent and infrequent access tiers automatically.

  • Reserve Glacier for deep archives: Implement S3 Glacier or Glacier Deep Archive for data with known, extremely infrequent access needs and where retrieval latency (hours to days) is acceptable.

  • Monitor per-object costs: Be aware of the per-object monitoring fee in Intelligent-Tiering; it can impact cost-effectiveness for buckets with billions of tiny objects.

  • Respect minimum storage durations: Avoid premature deletion of objects from Glacier classes to prevent pro-rated charges for unmet minimum storage durations (90 days for Glacier Flexible Retrieval, 180 days for Deep Archive).

  • Validate lifecycle rules regularly: Continuously monitor access patterns and costs, adjusting S3 lifecycle policies and Intelligent-Tiering configurations to align with evolving data usage and optimize your AWS spend.

WRITTEN BY

Ahmet Çelik

Former AWS Solutions Architect, 8 years in cloud and infrastructure. Computer Engineering graduate, Bilkent University. Lead writer for AWS, Terraform, and Kubernetes content.
