Karl Robinson
March 26, 2021
Karl is CEO and Co-Founder of Logicata – he’s an AWS Community Builder in the Cloud Operations category, and AWS Certified to Solutions Architect Professional level. Knowledgeable, informal, and approachable, Karl has founded, grown, and sold internet and cloud-hosting companies.
Confused about the myriad of AWS data storage options? In this post, we take a look at the whole range of Amazon storage services, as well as the different types of storage available. By the end of the article, you should know your EBS from your EFS and be better equipped to choose the right storage services for your specific needs.
But before we dive into AWS cloud storage options, let’s first take a look at the different types of storage in general, so we can relate AWS services back to industry standard terms.
TL;DR warning: this post may be too long for some, so here are some quick links to jump straight to the different AWS storage solutions:
- Amazon Elastic Block Store (EBS)
- Amazon Elastic File System (EFS)
- Amazon FSx For Windows File Server (FSx)
- Amazon FSx For Lustre (FSx)
- Amazon Simple Storage Service (S3)
- AWS Snow Family
- AWS Storage Gateway
- AWS Backup
Different Types of AWS Storage
There are three AWS storage types: block, file and object. We’ll start by offering a high-level comparison of each before going into more detail on each one individually.
Block Storage
Block storage is the oldest type of data storage, where data is stored in fixed-length blocks. Block storage is usually accessed by SCSI, SAS, SATA or fibre channel interfaces, and is typically used for hosting operating system data. Therefore, disk images attached to virtual machines or cloud instances will run on block storage.
Block storage can also be used for hosting databases, applications, virtual machines and containers. It is the fastest type of storage available, as it does not have the file searching overhead required by both object and file storage.
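To make "fixed-length blocks" a little more concrete, here is a minimal sketch that splits some data into blocks and reads one back by its number. The 4-byte block size and the function names are invented for illustration; real devices use far larger blocks (commonly 512 bytes or 4 KiB).

```python
BLOCK_SIZE = 4  # hypothetical and tiny on purpose; real devices use e.g. 512 B or 4 KiB


def to_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Split data into fixed-length blocks, zero-padding the last one."""
    blocks = []
    for offset in range(0, len(data), block_size):
        chunk = data[offset:offset + block_size]
        blocks.append(chunk.ljust(block_size, b"\x00"))
    return blocks


def read_block(blocks: list, index: int) -> bytes:
    """Fetch a block directly by its number: no file names, no directory lookup."""
    return blocks[index]


blocks = to_blocks(b"hello world")
print(len(blocks))            # 3
print(read_block(blocks, 1))  # b'o wo'
```

Addressing by block number is what lets an operating system or database lay out its own structures on top of the raw device.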
File Storage
File storage uses a file system to map where data is stored on the storage device. File storage generally resides on a network or direct attached storage, and it takes care of organizing data and presenting it to the user.
Users can navigate a hierarchical file system from top to bottom using a unique locator such as a filename, location or URL. File storage generally sits on top of block storage, but access to the underlying blocks is restricted.
File storage can be used to store files that need to be accessed by multiple users or machines, application binaries, databases and virtual machines.
Object Storage
Object storage is a repository for unstructured data. An object consists of the data, which can be a file or multiple files together, and some metadata—data about the data, for example, the age or size of the object.
Every object has a unique identifier, which means that users or applications can access objects without knowing where they are stored. Object data is accessed via APIs.
Object data is contained in Object Stores, which are very flat in structure compared to file systems. This means that Object Stores can scale to many petabytes, while still delivering high-speed access to the object data. Object Stores are typically used to store large volumes of photos, video, audio, logs and analytics data.
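As a rough illustration of the idea, the toy class below keeps objects in one flat namespace, gives each a unique identifier, and stores metadata alongside the data. The class and method names are made up for this sketch and are not any real service's API.

```python
import uuid


class ObjectStore:
    """Toy flat object store: one namespace, no directories."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes, **metadata) -> str:
        """Store data with its metadata and return a generated unique identifier."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = {
            "data": data,
            "metadata": {"size": len(data), **metadata},
        }
        return object_id

    def get(self, object_id: str) -> bytes:
        """Return the object's data; the caller never needs to know where it lives."""
        return self._objects[object_id]["data"]

    def head(self, object_id: str) -> dict:
        """Return only the metadata, without the data."""
        return self._objects[object_id]["metadata"]


store = ObjectStore()
photo_id = store.put(b"\x89PNG...", content_type="image/png")
print(store.head(photo_id))  # {'size': 7, 'content_type': 'image/png'}
```

Because lookups go straight from identifier to object, there is no directory tree to walk, which is part of why object stores scale so well.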
So, now we have a better understanding of the different types of data storage, let’s take a look at the various AWS data storage options for each of these types of storage.
AWS CLOUD STORAGE SERVICES
AWS Block Storage
Amazon Elastic Block Store (EBS)
Block storage on AWS is provided by EBS: Amazon Elastic Block Store. EBS is a scalable, high-performance block storage solution designed to be used with Amazon EC2 instances.
Essentially it provides the virtual disks for your virtual machines running in AWS. EBS can be used to host operating systems, databases, enterprise applications, containerized applications, file systems and more.
EBS volumes are durable and highly available—they are replicated within a single availability zone and offer 5 9’s availability and 99.8% to 99.999% durability depending on the volume type chosen.
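If those percentages feel abstract, a little arithmetic helps. The sketch below assumes availability is measured over a 365-day year and reads durability as one minus the annual failure rate of a volume; both are simplifying assumptions for illustration, not AWS's formal definitions.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def max_downtime_minutes(availability_percent: float) -> float:
    """Minutes per year a service may be down at the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def expected_volume_failures(durability_percent: float, volumes: int) -> float:
    """Expected volume failures per year, reading durability as 1 - annual failure rate."""
    return volumes * (1 - durability_percent / 100)


print(round(max_downtime_minutes(99.999), 2))            # 5.26
print(round(expected_volume_failures(99.8, 1000), 2))    # 2.0
print(round(expected_volume_failures(99.999, 1000), 2))  # 0.01
```

In other words, five nines of availability allows for a little over five minutes of downtime a year, and across 1,000 volumes the gap between 99.8% and 99.999% durability is roughly two expected failures a year versus one every hundred years.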
At the time of writing, EBS is available in five different SSD-based volume types and two HDD-based volume types. The volume you choose will depend on your specific workload—always remember to balance price and performance, and choose the appropriate volume type for your use case.
SSD-Based EBS Volumes
Volume Type | Volume Description | Size | Durability | Max IOPS per Volume | Max Throughput per Volume |
---|---|---|---|---|---|
io2 | EBS Provisioned IOPS SSD | 4GB – 16TB | 99.999% | 64,000 | 1,000 MB/s |
io2 Block Express (Preview) | EBS Provisioned IOPS SSD | 4GB – 64TB | 99.999% | 256,000 | 4,000 MB/s |
io1 | EBS Provisioned IOPS SSD | 4GB – 16TB | 99.8% – 99.9% | 64,000 | 1,000 MB/s |
gp3 | EBS General Purpose SSD | 1GB – 16TB | 99.8% – 99.9% | 16,000 | 1,000 MB/s |
gp2 | EBS General Purpose SSD | 1GB – 16TB | 99.8% – 99.9% | 16,000 | 250 MB/s |
HDD-Based EBS Volumes
Volume Type | Volume Description | Size | Durability | Max IOPS per Volume | Max Throughput per Volume |
---|---|---|---|---|---|
st1 | Throughput Optimized HDD | 125GB – 16TB | 99.8% – 99.9% | 500 | 500 MB/s |
sc1 | Cold HDD | 125GB – 16TB | 99.8% – 99.9% | 250 | 250 MB/s |
For full specifications and pricing head on over to the EBS product pages on the AWS website.
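One way to use the two tables above is as a simple filter: given the IOPS and throughput a workload needs, which volume types can deliver it from a single volume? The sketch below copies the per-volume limits from the tables; the function name is invented, and it deliberately ignores price, volume size and other constraints.

```python
# (max IOPS, max throughput in MB/s) per volume, copied from the two tables above.
VOLUME_LIMITS = {
    "sc1": (250, 250),
    "st1": (500, 500),
    "gp2": (16_000, 250),
    "gp3": (16_000, 1_000),
    "io1": (64_000, 1_000),
    "io2": (64_000, 1_000),
    "io2 Block Express": (256_000, 4_000),
}


def candidate_volume_types(iops: int, throughput_mb_s: int) -> list:
    """Return every volume type whose per-volume limits cover the workload."""
    return [
        name
        for name, (max_iops, max_tput) in VOLUME_LIMITS.items()
        if iops <= max_iops and throughput_mb_s <= max_tput
    ]


print(candidate_volume_types(iops=10_000, throughput_mb_s=400))
# ['gp3', 'io1', 'io2', 'io2 Block Express']
```

From the shortlist, the cheapest type that meets the requirement is usually the sensible starting point.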
Amazon EBS Encryption
Amazon EBS encryption enables all EBS volumes to be encrypted without the need for a separate key management solution. EBS volumes can be encrypted using Amazon-managed keys, or customer keys created and managed with AWS Key Management Service (KMS).
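In API terms, encryption is just a couple of extra fields on the request that creates the volume. The sketch below builds the keyword arguments for the EC2 CreateVolume call as a plain dictionary; the helper name, availability zone and key alias are hypothetical, and nothing is sent to AWS.

```python
from typing import Optional


def encrypted_volume_params(az: str, size_gib: int, kms_key_id: Optional[str] = None) -> dict:
    """Build keyword arguments for the EC2 CreateVolume call with encryption on."""
    params = {
        "AvailabilityZone": az,
        "Size": size_gib,      # GiB
        "VolumeType": "gp3",
        "Encrypted": True,
    }
    if kms_key_id is not None:
        # Customer managed key; when omitted, the account's default EBS key is used.
        params["KmsKeyId"] = kms_key_id
    return params


# Hypothetical availability zone and key alias.
params = encrypted_volume_params("eu-west-2a", 100, kms_key_id="alias/my-ebs-key")
print(params)

# With boto3 installed and credentials configured, this could be sent as:
#   import boto3
#   boto3.client("ec2").create_volume(**params)
```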
Amazon EBS Snapshots
Amazon EBS snapshots are a simple and cost-effective way to protect your data stored on EBS volumes, or indeed on any block storage in any location. EBS snapshots are incremental, which means only the blocks that have changed since the previous snapshot are stored.
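The incremental idea can be shown with a toy model in which a volume is a dictionary of block number to contents. This is only a sketch of the concept under that assumption, not a description of how EBS implements snapshots internally, and it ignores deleted blocks.

```python
def take_snapshot(volume: dict, previous_view: dict) -> dict:
    """Return only the blocks that differ from the previous snapshot's view."""
    return {idx: data for idx, data in volume.items() if previous_view.get(idx) != data}


def restore(snapshots: list) -> dict:
    """Rebuild the volume by replaying snapshots oldest to newest."""
    view = {}
    for snap in snapshots:
        view.update(snap)
    return view


volume = {0: b"boot", 1: b"data", 2: b"logs"}
snap1 = take_snapshot(volume, previous_view={})  # first snapshot: all 3 blocks

volume[2] = b"LOGS"                              # one block changes
snap2 = take_snapshot(volume, previous_view=restore([snap1]))

print(len(snap1), len(snap2))             # 3 1
print(restore([snap1, snap2]) == volume)  # True
```

The second snapshot holds a single block, yet the full volume can still be rebuilt from the chain.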
EBS Snapshots are stored on Amazon S3 object storage for long-term retention, which means they benefit from S3’s 11 9’s (99.999999999%) durability—the chances of losing your snapshots are extremely low!
EBS snapshots can be managed by the Data Lifecycle Manager (DLM), which enables the creation of policies for the creation, deletion, retention and sharing of snapshots. Logicata recommends a daily snapshot with 30-day retention as a default starting point for our AWS Managed Services clients.
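To show what a daily schedule with 30-day retention means in practice, the sketch below works out which snapshots would have aged out on a given day. DLM applies this kind of rule for you; the function name and dates here are purely illustrative.

```python
from datetime import date, timedelta


def expired_snapshots(snapshot_dates, today, retention_days=30):
    """Return the snapshot dates that are retention_days old or older."""
    cutoff = today - timedelta(days=retention_days)
    return [d for d in snapshot_dates if d <= cutoff]


today = date(2021, 3, 26)
# One snapshot per day for the last 35 days (today included).
daily = [today - timedelta(days=n) for n in range(35)]

to_delete = expired_snapshots(daily, today)
print(len(to_delete))               # 5
print(len(daily) - len(to_delete))  # 30
```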
If you’re looking to protect block storage outside of AWS, this can be achieved with the EBS API. This means you can snapshot your non-AWS block stores to AWS, and quickly recover to an EBS volume in AWS—a simple and cost-effective way to achieve basic disaster recovery.
Like EBS volumes, EBS snapshots can easily be encrypted.
Amazon EBS Elastic Volumes
Amazon EBS elastic volumes enable users to change the performance and size attributes of an EBS volume with zero downtime—ensuring that your block storage remains aligned with business requirements.
This removes much of the headache of long-term capacity planning for block storage volumes, as they can be easily modified at a later date.
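A change like this maps onto a single EC2 ModifyVolume request. The sketch below only assembles the keyword arguments as a dictionary; the helper name and volume ID are hypothetical, and nothing is sent to AWS.

```python
def modify_volume_params(volume_id, size_gib=None, volume_type=None, iops=None, throughput=None):
    """Build keyword arguments for the EC2 ModifyVolume call, skipping unset fields."""
    optional = {
        "Size": size_gib,
        "VolumeType": volume_type,
        "Iops": iops,
        "Throughput": throughput,
    }
    params = {"VolumeId": volume_id}
    params.update({key: value for key, value in optional.items() if value is not None})
    return params


# Hypothetical volume ID: grow to 500 GiB and move to gp3 with 6,000 IOPS.
params = modify_volume_params("vol-0123456789abcdef0", size_gib=500, volume_type="gp3", iops=6000)
print(params)
# {'VolumeId': 'vol-0123456789abcdef0', 'Size': 500, 'VolumeType': 'gp3', 'Iops': 6000}

# With boto3 installed and credentials configured, this could be sent as:
#   import boto3
#   boto3.client("ec2").modify_volume(**params)
```

Bear in mind that volumes can be grown but not shrunk this way, and after a size increase the file system on the volume generally still needs to be extended from within the instance.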
AWS File Storage
AWS has a couple of different file storage options, the choice of which is driven by whether you are a Windows or Linux shop.
Amazon Elastic File System (EFS)
Amazon Elastic File System (EFS) is a managed service providing NFS-shared file system storage for Linux. EFS supports the NFS 4.0 and 4.1 protocols, enabling connections from thousands of EC2 instances across multiple availability zones and regions.
EFS is almost infinitely scalable, scaling to petabytes or even exabytes (EFS shares reflect in Logicata’s Datadog monitoring as having over 8 exabytes of capacity!).
As an EFS file system grows, IOPS and throughput also scale in line with capacity, and burst capacity is available for higher throughput. If sustained higher throughput is required, this can be achieved with provisioned throughput, and EFS file systems can scale to multiple GB/s of throughput.
Customers are billed for the volume of data stored in EFS, and EFS file systems are elastic, meaning they grow and shrink as files are added and removed, negating the need for any capacity planning.
EFS filesystems are highly available and durable. EFS is designed for 11 9’s durability, and by default everything stored in EFS is replicated across multiple availability zones. Don’t need this level of availability? AWS has you covered with EFS one-zone storage classes, which save up to 47% on standard EFS costs.
Storing data that isn’t accessed often? Again, AWS has you covered with EFS infrequent access storage classes, which can save up to 92% over EFS standard pricing. But be warned, you’ll be charged for accessing data in EFS infrequent access.
Here are the four storage classes offered by EFS:
- EFS Standard
- EFS Standard Infrequent Access
- EFS One Zone
- EFS One Zone Infrequent Access
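To get a feel for what those savings could mean, the sketch below applies the "up to" percentages quoted above to a made-up baseline price. The $0.30 per GB-month figure is a placeholder rather than current AWS pricing, and the result covers storage only, not the access charges that apply to the infrequent access classes.

```python
STANDARD_PRICE = 0.30  # USD per GB-month, hypothetical placeholder; check current pricing


def monthly_cost(gb: float, saving_percent: float = 0.0, price: float = STANDARD_PRICE) -> float:
    """Storage-only monthly cost after applying a percentage saving."""
    return round(gb * price * (1 - saving_percent / 100), 2)


print(monthly_cost(1000))                     # 300.0  (EFS Standard baseline)
print(monthly_cost(1000, saving_percent=47))  # 159.0  (One Zone, best case)
print(monthly_cost(1000, saving_percent=92))  # 24.0   (infrequent access, best case)
```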
Data stored in EFS file systems can be encrypted both in transit, using Transport Layer Security (TLS) and at rest, using KMS encryption keys.
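Encryption at rest, like the throughput mode, is chosen when the file system is created. The sketch below assembles keyword arguments for the EFS CreateFileSystem call as a plain dictionary; the helper name, token and key alias are hypothetical, and nothing is sent to AWS.

```python
from typing import Optional


def efs_params(token: str, kms_key_id: Optional[str] = None,
               provisioned_mibps: Optional[float] = None) -> dict:
    """Build keyword arguments for the EFS CreateFileSystem call."""
    params = {
        "CreationToken": token,        # idempotency token chosen by the caller
        "Encrypted": True,             # encryption at rest
        "ThroughputMode": "bursting",
    }
    if kms_key_id is not None:
        params["KmsKeyId"] = kms_key_id
    if provisioned_mibps is not None:
        # Switch to provisioned throughput for sustained higher throughput.
        params["ThroughputMode"] = "provisioned"
        params["ProvisionedThroughputInMibps"] = provisioned_mibps
    return params


# Hypothetical token and key alias.
params = efs_params("shared-app-data", kms_key_id="alias/my-efs-key", provisioned_mibps=256)
print(params)

# With boto3 installed and credentials configured, this could be sent as:
#   import boto3
#   boto3.client("efs").create_file_system(**params)
```

Encryption in transit is not part of this request; it is selected when the file system is mounted.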
EFS now also supports containers and serverless compute options—apps running in ECS and EKS can access shared file systems, as can apps running on Lambda.
Data stored in EFS file systems can be backed up with AWS Backup, which is covered later in this post. Additionally, AWS Transfer Family and AWS DataSync can be used to rapidly transfer on-premises files to EFS.
For full specs and pricing visit the Amazon EFS pages.
Amazon FSx For Windows File Server
Amazon FSx for Windows File Server is a fully managed file storage service accessible over the Server Message Block (SMB) protocol. As with all AWS storage services, FSx is designed to be scalable, highly available and durable. FSx can scale up to 64TB per file system, and DFS Namespaces can span multiple FSx file systems.
Throughput can scale to multiple GB/s, and additional throughput capacity can be purchased if required. Data de-duplication is available on FSx, which could save between 30% and 80% on data storage costs, depending on the type of data stored.
FSx is built on Microsoft Windows Server and offers Active Directory integration with both on-premises Active Directory and AWS Managed Microsoft AD.
FSx for Windows File Server is available in both single and multi-AZ deployments, with either SSD or HDD-backed storage. It can be accessed by many AWS services including EC2, ECS, VMware Cloud on AWS, WorkSpaces and AppStream.
FSx can also be accessed by on-premises machines over AWS VPN or AWS Direct Connect. All Windows desktop and server versions are supported from Windows 7 and Windows Server 2008 onward.
Much like EFS, on-premises file systems can be easily migrated to FSx in minutes using AWS DataSync. And like EFS, FSx file system data can be encrypted in transit and at rest with TLS and KMS.
FSx for Windows File Server is backed up daily to Amazon S3 using the Volume Shadow Copy Service (VSS). For full specs and pricing visit the Amazon FSx for Windows pages.
Amazon FSx for Lustre
Amazon FSx for Lustre is a fully managed high-performance file system used for High-Performance Computing (HPC), machine learning and video rendering applications. FSx for Lustre offers millions of IOPS, sub-millisecond latencies and up to hundreds of GB/s of throughput.
FSx for Lustre supports concurrent access to the same file or directory from thousands of compute instances. SSD and HDD options are available, and all FSx for Lustre file systems are supported by an SSD-backed metadata server ensuring all metadata operations are delivered with sub-millisecond latencies.
FSx for Lustre can be accessed by the most popular Linux AMIs—Red Hat, CentOS, Ubuntu and SUSE Linux. Data can easily be imported from and exported to Amazon S3 via native integration.
For full details and pricing visit the FSx for Lustre page on the AWS website.
AWS Object Storage
Amazon Simple Storage Service (S3)
Amazon Simple Storage Service (S3) is the AWS object storage offering. S3 provides secure, durable and highly scalable object storage as a service for IT teams and developers.
Amazon S3 is very ‘simple’ to use, offering a web services interface that enables customers to store and retrieve their data from anywhere on the web. I’ve already written a full Amazon S3 guide, so I’m just going to summarize it here in table format for reference. For full specifications and pricing visit the S3 pages on the AWS site.
Storage Class | Designed for Availability | Guaranteed Availability | Availability Zones | Storage Duration Charge | Retrieval Fee |
---|---|---|---|---|---|
S3 Standard | 99.99% | 99.99% | ≥3 | N/A | N/A |
S3 Intelligent Tiering | 99.99% | 99% | ≥3 | 30 Days | N/A |
S3 Standard-IA | 99.99% | 99% | ≥3 | 30 Days | per GB retrieved |
S3 One Zone IA | 99.5% | 99% | 1 | 30 Days | per GB retrieved |
S3 Glacier | 99.99% | 99.9% | ≥3 | 90 Days | per GB retrieved |
S3 Glacier Deep Archive | 99.99% | 99.9% | ≥3 | 180 Days | per GB retrieved |
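The "Storage Duration Charge" column is easy to overlook. The sketch below reads it as a minimum billing period: if an object is deleted or moved sooner, you are still billed for the full period. The values are copied from the table above; the function name is invented for illustration.

```python
# Minimum storage duration in days, from the "Storage Duration Charge" column above.
MIN_DURATION_DAYS = {
    "S3 Standard": 0,
    "S3 Intelligent Tiering": 30,
    "S3 Standard-IA": 30,
    "S3 One Zone IA": 30,
    "S3 Glacier": 90,
    "S3 Glacier Deep Archive": 180,
}


def billable_days(storage_class: str, days_stored: int) -> int:
    """Days billed for an object deleted after days_stored days."""
    return max(days_stored, MIN_DURATION_DAYS[storage_class])


print(billable_days("S3 Standard", 10))     # 10
print(billable_days("S3 Standard-IA", 10))  # 30
print(billable_days("S3 Glacier", 120))     # 120
```

Short-lived objects are therefore usually better left in S3 Standard, even though its per-GB price is higher.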
AWS Snow Family
No self-respecting post on AWS storage types would be complete without a mention of the AWS Snow family. However, I’ve also covered that in detail in my AWS Snowball post.
AWS Storage Gateway
AWS Storage Gateway is a hybrid cloud storage service enabling on-premises access to virtually infinite cloud storage. AWS Storage Gateway is available as three different services:
File Gateway
File gateway enables customers to store files as objects in Amazon S3. Files can be accessed via the standard NFS or SMB protocols, or they can be accessed directly as objects in S3. Once files are uploaded to S3, they benefit from S3 features such as cross-region replication and lifecycle management.
Tape Gateway
Tape Gateway presents a Virtual Tape Library (VTL) interface to S3, enabling traditional on-premises tape backup systems to back up to S3 object storage using the standard iSCSI protocol. Tape Gateway is compatible with most industry-leading backup solutions, including Veeam, Commvault and Veritas Backup Exec.
Volume Gateway
Volume Gateway presents block storage volumes over iSCSI, enabling block storage volumes to be backed up as EBS snapshots. Volume Gateway is therefore a cost-effective backup and DR solution.
For more details and pricing on AWS Storage Gateway, check out the AWS Storage Gateway pages.
AWS Backup
Last but not least in my complete AWS Storage guide is AWS Backup. AWS Backup offers centralized, automated backup of other AWS services including:
- EC2 Instances
- EBS Volumes
- RDS Database Instances
- DynamoDB Tables
- EFS File Systems
- FSx for Windows and Lustre file systems
- AWS Storage Gateway Volumes
AWS Backup makes it easy to manage backups for all of the above services via the AWS console, command line (CLI) or the AWS API. Backup plans can be easily created to automate data backup, and services can be backed up by simply tagging them.
The above ensures that your backup plans can be easily implemented across your entire AWS estate. Data can also be backed up to different regions and AWS accounts, making it easy to meet compliance and disaster recovery requirements. For more detailed info and pricing visit the AWS Backup pages.
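As a rough sketch of what a backup plan and a tag-based selection look like, the functions below build the two documents used by the AWS Backup CreateBackupPlan and CreateBackupSelection calls. The plan name, vault name, role ARN and tag are all hypothetical, and the code only builds dictionaries; nothing is sent to AWS.

```python
def daily_backup_plan(name: str, vault: str, retention_days: int = 30) -> dict:
    """Build the BackupPlan document for a single daily rule."""
    return {
        "BackupPlanName": name,
        "Rules": [
            {
                "RuleName": "daily",
                "TargetBackupVaultName": vault,
                "ScheduleExpression": "cron(0 5 ? * * *)",  # every day at 05:00 UTC
                "Lifecycle": {"DeleteAfterDays": retention_days},
            }
        ],
    }


def tag_selection(name: str, role_arn: str, tag_key: str, tag_value: str) -> dict:
    """Build a BackupSelection that picks up every resource carrying the tag."""
    return {
        "SelectionName": name,
        "IamRoleArn": role_arn,
        "ListOfTags": [
            {"ConditionType": "STRINGEQUALS", "ConditionKey": tag_key, "ConditionValue": tag_value}
        ],
    }


# Hypothetical names, role ARN and tag.
plan = daily_backup_plan("daily-30d", vault="Default")
selection = tag_selection(
    "tagged-resources",
    role_arn="arn:aws:iam::123456789012:role/ExampleBackupRole",
    tag_key="backup",
    tag_value="daily",
)
print(plan["Rules"][0]["Lifecycle"])  # {'DeleteAfterDays': 30}

# With boto3 installed and credentials configured, these could be sent as:
#   import boto3
#   client = boto3.client("backup")
#   plan_id = client.create_backup_plan(BackupPlan=plan)["BackupPlanId"]
#   client.create_backup_selection(BackupPlanId=plan_id, BackupSelection=selection)
```

With a selection like this in place, protecting a new resource is a matter of adding the tag to it.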
WRAPPING UP
So, there you have it—my complete guide to AWS cloud storage services. Okay, I’ve cheated a little and pointed you to some earlier posts, but I hope I’ve given you a useful reference point for all things AWS Storage related.
If you want to keep up with AWS news, why not sign up for my weekly AWS News Roundup email? I promise not to send any marketing spam, just a once-a-week curated list of AWS news with all other vendor pitches weeded out by yours truly!
Feel free to get in touch if you have any questions about AWS Storage or simply want to know more about Logicata’s services. Thanks for reading.