S3 (Simple Storage Service)

S3

Stores data as objects with metadata, accessible via an API.

Key Features

Object storage
limitless storage
- No limit on the number of files you can have
Files can be uploaded/deleted/read with the console, CLI, APK, SDK, etc
Unlike EBS or EFS, it doesn’t have to be attached to a service first
Can be used by any workload, including serverless applications, data lakes, and backups
automatic scaling and low cost
- bucket storage size grows automatically as you add more objects into it, and there is virtually no limit on the bucket total storage capacity
No file system
- Even though you’re not managing a file system, there is organization
- you can and must create buckets
Storage classes
Resources
- EBS vs EFS vs S3
- S3 Transfer Acceleration

diagram
Version
- when you update a file, the new version is stored and the older version is not deleted
- This option costs extra money
Lifecycle management
- What if you have certain files that are accessed frequently first month but accessed less later?
- You can set certain rules which AWS will automatically transition files between classes
Inventory & Analytics
- Sometimes you wanna get an overview/summary
- Various features on top of S3 (advanced) that gives you more insights in your bucket
Compliance & object lock
- Achieving compliance if required in your company
- Telling certain files shouldn’t be deletable
Replication
- Cross bucket replication - you can set certain rules to AWS to copy different files to different buckets automatically
- Backup files — make sure your data is saved in case a bucket /file gets deleted/compromised
- Available in a single region (between buckets in same region), or cross-region (outside of current region)
Data Protection & Encryption
- File level or bucket level encryption
- Your files can be encrypted automatically when you upload them & decrypted automatically when you’re accessing them
Static website hosting
- You can put website files in there if it doesn’t have server-side code
- consists of only HTML, CSS, and JS
- Properties > enable Static Website hosting
- You can use a custom domain instead of the AWS generated one

diagram
“folders”
nested buckets are NOT allowed
Scalibility
- You can create multiple sibling buckets, and every bucket can store infinite number of files
- No need to choose a bucket size in advance
Buckets are regional
- You can distribute multiple buckets across multiple regions
Unique name
- Must have a unique name
- For ALL AWS BUCKETS in the ENTIRE WORLD
Flat structure
- Has no directories
- Can assign prefixes to your files (so you can create very long file names), then you can organize your files by prefix
Detailed permissions system
- You can control a bucket/file level access, who has access to which
- bucket policies or access control lists (more detailed but X recommended)
- S3 Block Public Access
  - It can forcibly block new or existing public ACL/policy changes at the bucket or account level
During configuration
- You can make your bucket public, but by default it’s set as private
  - You need to make extra configurations after you uncheck that
  - Bucket policy → (read more) → you can see different examples (ex. granting read-only permission to an anonymous user)
- When you upload a file, go through the settings first (scroll down) and upload
  - you can choose a Storage Class

Bucket policies are IAM policies that define permissions, but NOT attached to users or roles, but to buckets
They control who/what is allowed to certain buckets, and which kinds of actions are allowed on a given bucket

diagram
Key feature of S3, a feature that can help you save money
Different storage classes for different file access patterns
- Frequently
  - Person/application needs access very frequently (every min/sec)
  - Instant access to frequently used data
  - Highest flexibility with no/little cost savings
  - Standard, Reduced redundancy
- Infrequent Access
  - time to time (ex. once a month)
  - You still want instant access → retrieval cost
  - cost savings → if you don’t access your files, you’re paying less for storing them
  - One Zone-IA
- Archive - Glacier
  - access almost never again or very rare access
  - Maybe you just have to store it for legal reasons
  - Instant access not always possible (some needs waiting for couple hours)
  - High cost savings but less flexibility
  - Glacier series
S3 Intelligent Tiering
- AWS analyzes the access patterns and automatically move it to a fitting category
- you save less than doing manually but better than doing nothing

Main classes
- S3 Glacier Instant Retrieval
  - data must be stored long-term and access is highly unlikely but instant access is needed, if data must be accessed
- S3 Glacier Flexible Retrieval (formerly just “S3 Glacier”)
- Amazon S3 Glacier Deep Archive
“Glacier” is sometimes referenced as a standalone service, but it’s always referred to the S3 Glaciers
it’s always these three storage classes which are optimized for long-term file storage with low-frequency file access patterns