If you’re not familiar with Amazon S3 Glacier, I’d suggest reading What is Amazon S3 Glacier? before going through the FAQ.

  • Amazon S3 Glacier is an extremely low-cost storage service that provides secure, durable, and flexible storage for data backup and archival. With Amazon S3 Glacier, customers can reliably store their data for as little as $0.004 per gigabyte per month.

  • You can also deploy compliance storage controls with Vault Lock to store regulatory and compliance archives in an immutable, Write Once Read Many (WORM) format.

  • You store data in Amazon S3 Glacier as an archive. Each archive is assigned a unique archive ID that can later be used to retrieve the data. An archive can represent a single file or you may choose to combine several files to be uploaded as a single archive. You upload archives into vaults. Vaults are collections of archives that you use to organize your data.

  • Individual archives are limited to a maximum size of 40 terabytes.

  • Amazon S3 Glacier offers a 10 GB retrieval free tier. You can retrieve 10 GB of your Amazon S3 Glacier data per month for free. The free tier allowance can be used at any time during the month and applies to Standard retrievals.

  • For Initiate multipart and Upload part, you will be charged at S3 Standard PUT and POST request rates. For Complete multipart, you will be charged the S3 Glacier PUT and POST request rate.

  • An archive is a durably stored block of information. You store your data in Amazon S3 Glacier as archives. You may upload a single file as an archive, but your costs will be lower if you aggregate your data. TAR and ZIP are common formats that customers use to aggregate multiple files into a single file before uploading to Amazon S3 Glacier.

  • When uploading large archives (100 MB or larger), you can use multipart upload to achieve higher throughput and reliability. Multipart uploads allow you to break your large archive into smaller chunks that are uploaded individually. Once all the constituent parts are successfully uploaded, they are combined into a single archive.

  • A vault is a way to group archives together in Amazon S3 Glacier. You organize your data in Amazon S3 Glacier using vaults. Each archive is stored in a vault of your choice. You may control access to your data by setting vault-level access policies using the AWS Identity and Access Management (IAM) service. You can also attach notification policies to your vaults.

  • You can create up to 1,000 vaults per account per region.

  • Amazon S3 Glacier allows you to tag your Glacier vaults for easier resource and cost management. Tags are labels that you can define and associate with your vaults, and using tags adds filtering capabilities to operations such as AWS cost reports.

  • You may delete any S3 Glacier vault that does not contain any archives using the AWS Management Console, the Amazon Glacier direct APIs or the SDKs. Once a vault has been deleted, you can then re-create a vault with the same name. If your vault contains archives, you must delete all the archives before deleting the vault.

  • Access permissions can be assigned in two ways: as user-based permissions or as resource-based permissions. Access control based on IAM policies is user-based where you would assign IAM policies to IAM users or groups to control the read, write, and delete permissions on your S3 Glacier vaults. Access control with vault access policies is resource-based where you would attach an access policy directly on a vault to govern access to all users. Vault access policies can make certain use cases simpler.

  • Vault access policies and Vault Lock policies both govern access controls to your vault; however, a Vault Lock policy can be made immutable and provides strong enforcement for your compliance controls. You can use the Vault Lock policy to deploy regulatory and compliance controls that are typically restrictive and are “set and forget” in nature. In conjunction, you can use the vault access policy to implement access controls that are not compliance related, temporary, and subject to frequent modification. The two policies can be used in tandem to achieve governance and flexibility.

  • There are three options for retrieving data, with varying access times and costs: Expedited, Standard, and Bulk retrievals.

  • Standard retrievals allow you to access any of your archives within several hours. Standard retrievals typically complete within 3 – 5 hours.

  • To make a Standard retrieval, set the “Tier” parameter in the InitiateJob API request to “Standard”. If no tier is specified, the request will default to Standard.

  • Bulk retrievals typically complete within 5 – 12 hours.

  • Expedited retrievals allow you to quickly access your data when occasional urgent requests for a subset of archives are required. For all but the largest archives (250 MB+), data accessed using Expedited retrievals is typically made available within 1 – 5 minutes. There are two types of Expedited retrievals: On-Demand and Provisioned. On-Demand requests are like EC2 On-Demand instances and are available the vast majority of the time. Provisioned requests are guaranteed to be available when you need them.

  • To make an Expedited retrieval, set the “Tier” parameter in the InitiateJob API request to “Expedited”. There is no need to designate whether an Expedited retrieval is On-Demand or Provisioned. If you have purchased provisioned capacity, then all Expedited retrievals will automatically be served via your Provisioned capacity.

  • Amazon S3 Glacier data retrieval policies let you define your own data retrieval limits with a few clicks in the AWS console. You can limit retrievals to “Free Tier Only”, or if you wish to retrieve more than the free tier, you can specify a “Max Retrieval Rate” to limit your retrieval speed and establish a retrieval cost ceiling. In both cases, Amazon S3 Glacier will not accept retrieval requests that would exceed the retrieval limits you defined. Retrieval policies apply to Standard retrievals.

  • You can set one data retrieval policy for each AWS region, which will govern all data retrieval activities in that region under your account. Data retrieval policies are region-specific because data retrieval costs vary across AWS regions.

  • Amazon S3 Glacier Select is a feature that allows you to run queries on your data stored in Amazon S3 Glacier, without the need to restore the entire object to a hotter tier like Amazon S3. With Amazon S3 Glacier Select, you can now perform filtering and basic querying using a subset of SQL directly against your data in Amazon S3 Glacier. You provide a SQL query and list of Amazon S3 Glacier objects, and Amazon S3 Glacier Select will run the query in-place and write the output results to a bucket you specify in Amazon S3.

  • You can use Amazon S3 Glacier Select when you need to perform pattern matching or custom analytics on your archived data stored in S3 Glacier.
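
To make the multipart upload point above more concrete, here is a rough sketch of how you might pick a part size and compute the byte ranges for each part. It assumes the limits I recall from the Glacier multipart documentation (part sizes of 1 MiB times a power of two, up to 4 GiB, and at most 10,000 parts per upload), so please verify those against the current docs. The function names choose_part_size and part_ranges are my own and not part of any AWS SDK.

```python
MIB = 1024 * 1024
MAX_PARTS = 10_000               # assumed per-upload part limit
MAX_PART_SIZE = 4 * 1024 * MIB   # assumed maximum part size (4 GiB)


def choose_part_size(archive_size: int) -> int:
    """Return the smallest power-of-two part size (bytes) that fits in MAX_PARTS parts."""
    if archive_size <= 0:
        raise ValueError("archive_size must be positive")
    part_size = MIB
    while part_size * MAX_PARTS < archive_size:
        part_size *= 2
        if part_size > MAX_PART_SIZE:
            raise ValueError("archive is too large for a single multipart upload")
    return part_size


def part_ranges(archive_size: int, part_size: int):
    """Yield inclusive (start, end) byte offsets; only the last part may be shorter."""
    for start in range(0, archive_size, part_size):
        yield start, min(start + part_size, archive_size) - 1


if __name__ == "__main__":
    size = 2 * MIB + 1
    print(choose_part_size(100 * MIB))      # 1048576
    print(list(part_ranges(size, MIB)))     # [(0, 1048575), (1048576, 2097151), (2097152, 2097152)]
```

Note that 10,000 parts of 4 GiB each works out to roughly the 40 terabyte archive limit mentioned earlier, which is presumably where that number comes from.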

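The retrieval tier bullets above boil down to one “Tier” field in the InitiateJob request. Below is a minimal sketch of building that request body in Python. The helper archive_retrieval_params is a hypothetical name of mine; the field names follow the InitiateJob job parameters as I understand them. The commented-out boto3 call uses a made-up vault name and archive ID and needs real AWS credentials to run.

```python
VALID_TIERS = ("Expedited", "Standard", "Bulk")


def archive_retrieval_params(archive_id: str, tier: str = "Standard") -> dict:
    """Build job parameters for an archive retrieval with the given tier."""
    if tier not in VALID_TIERS:
        raise ValueError(f"tier must be one of {VALID_TIERS}, got {tier!r}")
    return {
        "Type": "archive-retrieval",
        "ArchiveId": archive_id,
        # Set explicitly here, although the service defaults to Standard when omitted.
        "Tier": tier,
    }


if __name__ == "__main__":
    params = archive_retrieval_params("EXAMPLE-ARCHIVE-ID", tier="Expedited")
    print(params["Tier"])  # Expedited

    # Hypothetical usage with boto3 (not executed here):
    # import boto3
    # glacier = boto3.client("glacier")
    # glacier.initiate_job(accountId="-", vaultName="my-vault", jobParameters=params)
```

There is nothing in the request that distinguishes On-Demand from Provisioned, which matches the FAQ: if you have provisioned capacity, Expedited requests use it automatically.
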
Reference: Amazon S3 Glacier FAQs