Amazon S3

Amazon S3 Basics
● S3 is a global service; buckets are region-specific, but bucket names are globally
unique.
● Amazon S3 provides unlimited storage and stores data as objects within buckets.
An object consists of a file and optionally any metadata that describes that file.
● To store an object in Amazon S3, you upload the file you want to store into a bucket.
When you upload a file, you can set permissions on the object as well as any
metadata.
● Buckets are the containers for objects. You can have one or more buckets. For each
bucket, you can control access to it (who can create, delete, and list objects in the
bucket).
● Amazon S3 creates buckets in the AWS Region that you specify. You can choose
any AWS Region that is geographically close to you to optimize latency, minimize
costs, or address regulatory requirements.
● You are not charged for creating a bucket. You are only charged for storing objects
in the bucket and for transferring objects out of the bucket.
● Objects can range from 0 bytes to 5 TB in size.
● Largest object that can be uploaded in a single put: 5 GB
● You can use a multipart upload for objects from 5 MB to 5 TB in size.
● When you upload a file to S3, you will receive an HTTP 200 status code if the upload
was successful.
● Data consistency model for S3:
○ Read after write consistency for PUTs of new objects.
○ Eventual consistency for overwrite PUTs and DELETEs.
● S3 URL format: https://s3-<region>.amazonaws.com/<bucket>/<object>
● Core fundamentals of S3 object:
○ Key (Name)
○ Value (Data)
○ Version ID
○ Metadata
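The URL format and size limits above can be captured in a small Python sketch (the region, bucket, and key values are placeholders):

```python
def s3_object_url(region, bucket, key):
    # Path-style URL following the format noted above.
    return f"https://s3-{region}.amazonaws.com/{bucket}/{key}"

# Size limits noted above, in bytes.
MAX_OBJECT_SIZE = 5 * 1024**4      # 5 TB (requires multipart upload)
MAX_SINGLE_PUT_SIZE = 5 * 1024**3  # 5 GB (largest single PUT)
```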

S3 storage classes
● General purpose
○ S3 standard
● Unknown or changing access patterns
○ S3 intelligent tiering
● Infrequently accessed objects
○ S3 Standard IA
○ S3 One Zone-IA (replaces Reduced Redundancy)
● Archive
○ S3 Glacier
○ S3 Glacier Deep Archive

Storage recommendations

● Standard - The default storage class, suitable for frequently accessed data.
● Intelligent-Tiering - This storage class is ideal if you want to optimize storage costs
automatically for long-lived data when access patterns are unknown or
unpredictable.
● Standard-IA - Use for long-lived, infrequently accessed data that is your primary or
only copy and cannot be recreated.
● One Zone-IA - Use for long-lived, infrequently accessed, non-critical data where you
can recreate the data if the Availability Zone fails, and for object replicas when
setting cross-region replication.
● Glacier - This storage class is suitable for archiving data where data access is
infrequent with retrieval times ranging from minutes to hours.
● Glacier Deep Archive - Archive data that rarely, if ever, needs to be accessed with
retrieval times in hours.
● Reduced Redundancy (Not recommended) - Frequently accessed, non-critical data
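The recommendations above can be summarized as a lookup from use case to the StorageClass value the S3 API accepts (the category labels on the left are our own shorthand, not AWS terms):

```python
# Simplified mapping from the storage recommendations above to the
# StorageClass strings used by the S3 API.
RECOMMENDED_CLASS = {
    "frequent": "STANDARD",
    "unknown_access_pattern": "INTELLIGENT_TIERING",
    "infrequent_critical": "STANDARD_IA",
    "infrequent_recreatable": "ONEZONE_IA",
    "archive": "GLACIER",
    "rare_archive": "DEEP_ARCHIVE",
}
```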

Comparing the Amazon S3 Storage Classes:

● https://aws.amazon.com/s3/storage-classes/

Note:

● All of the storage classes except for One Zone-IA are designed to be resilient to
simultaneous complete data loss in a single Availability Zone and partial loss in
another Availability Zone.
● Because S3 One Zone-IA stores data in a single AWS Availability Zone, data stored in
this storage class will be lost in the event of Availability Zone destruction.
● S3 Intelligent-Tiering charges a small tiering fee and has a minimum eligible object
size of 128KB for auto-tiering. Smaller objects may be stored but will always be
charged at the Frequent Access tier rates.

Versioning
● Versioning lets you keep multiple versions of an object in the same bucket.
● Once enabled, versioning cannot be disabled, only suspended.
● Therefore, buckets can be in one of three states: unversioned (the default),
versioning-enabled, or versioning-suspended.
● Permissions and metadata are not inherited between versions, so you must set
permissions on each version explicitly (e.g., granting everyone read access to a
specific version).
● With versioning enabled, deleting an object creates a delete marker. Deleting the
delete marker restores the object.
● You can see all the available versions for a particular object and are able to delete
any specific version of that object.
● Versioning supports MFA Delete.
● Versioning can be integrated with Lifecycle management rules.

Cross region replication


● CRR replicates objects between two buckets in two different regions.
● Versioning must be enabled on both the source and destination bucket.
● You cannot replicate to multiple buckets.
● Replication applies only to objects uploaded after it is enabled; existing objects
are not replicated retroactively.
● Versions and their permissions will be replicated (Active permission replication).
● Delete markers used to be replicated, but they are no longer replicated under the
latest version of the replication configuration.

● Deletion of individual versions or delete markers will not be replicated.

S3 Lifecycle Management
● Lifecycle rules define how S3 manages objects during their lifetime.
● They can automatically transition objects to different storage classes so that
objects are stored cost-effectively throughout their lifecycle.
● They also allow you to configure object expiration; S3 automatically deletes expired
objects on your behalf. Expiration can be enabled for current and previous versions.
○ Expire current version of object - Adds a delete marker and moves the current
version to a previous version.
○ Permanently delete previous versions - Deletes a previous version permanently
once it becomes eligible for expiration.
● Lifecycle rules can be used with or without versioning.
● Rules can be applied to the current version or previous versions of objects.
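As a sketch, a lifecycle configuration combining transitions and expiration might look like this (the rule ID, prefix, and day counts are illustrative; the structure follows S3's lifecycle configuration format):

```python
# Illustrative lifecycle configuration: transition keys under logs/ to
# cheaper classes over time, then expire them.
lifecycle_config = {
    "Rules": [
        {
            "ID": "archive-then-expire",    # hypothetical rule name
            "Filter": {"Prefix": "logs/"},  # applies to keys under logs/
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},    # expire current versions
            "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
        }
    ]
}
```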

S3 Security and Encryption


Security

● By default, all newly created buckets are private, including all objects within those
buckets.
● Buckets and objects are Amazon S3 resources and you can grant access
permissions to them by using resource based access policies.
○ Bucket policies - You can add a bucket policy to a bucket to grant access
permissions for the bucket and the objects in it. Object permissions apply only to
the objects that the bucket owner creates.
○ Access Control List (ACL) - Grants access permissions to buckets and objects.
ACLs can be applied at the individual object level.

Encryption

● In transit: SSL/TLS
● At rest:

○ Server Side Encryption (SSE)


■ SSE - S3: S3 managed keys
■ SSE - KMS: AWS Key Management Service (KMS) managed keys.
■ SSE - C: SSE with customer provided keys.
○ Client Side Encryption

Read: https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingEncryption.html
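As a rough sketch, the three SSE modes map to request parameters like these (parameter names follow the boto3 put_object convention; the key alias is hypothetical):

```python
# Server-side encryption request parameters by mode (boto3-style names).
SSE_S3 = {"ServerSideEncryption": "AES256"}  # S3-managed keys
SSE_KMS = {
    "ServerSideEncryption": "aws:kms",       # KMS-managed key
    "SSEKMSKeyId": "alias/my-app-key",       # hypothetical; omit for default aws/s3 key
}
# SSE-C: the customer supplies the key, and must re-supply it on every read:
SSE_C = {"SSECustomerAlgorithm": "AES256"}   # plus SSECustomerKey=<your key bytes>
```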

Server access logging


● Log requests for access to your bucket.

Object-level logging
● Records object-level API activity using AWS CloudTrail, for an additional cost.
● Captures Amazon S3 object-level API activity (for example, GetObject, DeleteObject,
and PutObject operations).

S3 Transfer Acceleration
● S3 Transfer Acceleration utilizes the CloudFront edge network to accelerate your
uploads to S3.
● Instead of uploading directly to your S3 bucket, you use a distinct URL to upload to
an edge location, which then transfers the file to S3 across the AWS backbone
network.
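The distinct acceleration URL can be sketched as a small helper (bucket and key are placeholders; Transfer Acceleration must be enabled on the bucket for this endpoint to work):

```python
def accelerated_upload_url(bucket, key):
    # Transfer Acceleration uses the s3-accelerate bucket endpoint
    # instead of the regular regional endpoint.
    return f"https://{bucket}.s3-accelerate.amazonaws.com/{key}"
```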

S3 Static web hosting


URL format: https://<bucket>.s3-website-<region>.amazonaws.com

Exercise
Create bucket

● Set name and region


○ Bucket is region based.
○ Bucket names are unique across all existing bucket names in Amazon S3, and you
cannot change the name after the bucket is created.
○ After a bucket is deleted, the name becomes available to reuse, but it might
take some time.
● Configure options - Keep default
● Set permissions - Keep default
● Review and create bucket

Upload to bucket

● Select file
● Set permissions - Keep default
● Set properties - Set Tag
○ Key - Name
○ Value - mudi-aws-s3-obj-1
● Review & upload

Object properties
https://docs.aws.amazon.com/AmazonS3/latest/user-guide/view-object-properties.html

● Object > properties > encryption


○ Encryption at rest (while it is stored on disks in Amazon S3 data centers).
○ Keys can be created from: IAM > encryption keys
○ Selecting AWS-KMS with the default [aws/s3] key uses KMS-managed encryption
(SSE-KMS), while AES-256 uses Amazon S3 server-side encryption with
S3-managed keys (SSE-S3).
○ Custom KMS ARN allows us to use keys from different accounts.

Object permissions
● Object > permission > public access > everyone
○ Read object permission allows everyone to download the file using the URL.

Bucket properties
https://docs.aws.amazon.com/AmazonS3/latest/user-guide/view-bucket-properties.html

Bucket permissions (Access control mechanisms)


https://docs.aws.amazon.com/AmazonS3/latest/user-guide/block-public-access.html

● Bucket > permissions > access control list (basic permission)


○ Selecting Everyone grants one or more of the following: list objects, write
objects, and read/write bucket permissions.
● Bucket > permissions > bucket policy (advanced permissions)
○ Generate JSON using policy generator
■ Policy type - S3 bucket policy
■ Effect - Allow
■ Principal - *
■ AWS services - Keep default
■ Actions - *
■ ARN - Use the ARN shown at the top of the bucket policy editor
■ Click - Add statement
■ Generate policy
○ Paste it to bucket policy editor & save
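A policy equivalent to these generator steps can be built in Python; this sketch narrows Actions from "*" to s3:GetObject, which is all that public downloads require (the bucket name is a placeholder):

```python
import json

def public_read_policy(bucket_name):
    # Bucket policy granting everyone read-only access to objects,
    # mirroring the policy-generator steps above but with a narrower Action.
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{bucket_name}/*",
        }],
    }, indent=2)
```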

IAM, Bucket policy and ACL:


● IAM policies define what principals can do in your AWS environment.

● Bucket policies specify what actions are allowed or denied for which principals on
the bucket that the policy is attached to.
● ACLs are a legacy access control mechanism that predates IAM. As a general rule,
AWS therefore recommends using S3 bucket policies or IAM policies for access
control. An ACL is a subresource attached to every bucket and object; it defines
which AWS accounts or groups are granted access and the type of access.

How does authorization work with multiple access control mechanisms?

● The effective permissions are the union of all IAM policies, S3 bucket policies, and
S3 ACLs that apply (with an explicit deny overriding any allow).

https://aws.amazon.com/blogs/security/iam-policies-and-bucket-policies-and-acls-oh-my-controlling-access-to-s3-resources/

Bucket management
● Bucket > management > replication
○ Cross-region replication
■ Require versioning in both buckets
■ Add rule
● Source - Default
● Destination - Set other bucket
● Permissions - Select existing if any or create new IAM role
● Review & save

Amazon S3 Data Consistency Model


Amazon S3 provides read-after-write consistency for PUTS of new objects in your S3
bucket in all regions with one caveat. The caveat is that if you make a HEAD or GET
request to the key name (to find if the object exists) before creating the object, Amazon
S3 provides eventual consistency for read-after-write. Amazon S3 offers eventual
consistency for overwrite PUTS and DELETES in all regions.

S3 provides read-after-write consistency for PUTS of new objects. To understand
what that means, it is important to understand that S3 achieves High Availability (HA) by
replicating the data across multiple servers that could even span multiple data centers.
So until you get back a 200 OK response to the PUT call, you cannot be sure that the
new Object was created successfully and any immediate GET or HEAD call (like listing
the keys within the bucket) for the same object might result in not showing the object.
On the other hand, once the previous call has returned with a 200 OK, any subsequent
GET call for the new object is guaranteed to return the object, as 200 OK signifies that
the data is stored safely in S3.

S3 provides eventual consistency for overwrite PUTS and DELETES for existing
objects. This follows from the premise established above: until the change
(PUT or DELETE) has been propagated to all copies of the data in S3, anyone else
requesting the same object might get the previous data or the deleted object.

Accessing S3 using CLI


https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-welcome.html

https://docs.aws.amazon.com/cli/latest/userguide/cli-services-s3-commands.html

Working with Amazon S3 Objects


https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingObjects.html

S3 important FAQ
1. How much data can I store in Amazon S3?
2. How reliable is Amazon S3?
3. How secure is my data in Amazon S3?
4. How can I control access to my data stored on Amazon S3?
5. Does Amazon S3 support data access auditing?
6. What options do I have for encrypting data stored on Amazon S3?
7. How durable is Amazon S3?

8. How are Amazon S3 and Amazon S3 Glacier designed to achieve 99.999999999%
durability?
9. How long will it take to restore my objects archived in S3 Glacier and can I upgrade
an in-progress request to a faster restore speed?
10. How am I charged for deleting objects from Amazon S3 Glacier that are less than 90
days old?
