Securing Your Data Lake Using S3 Access Points
IAM policies, Access Control Lists, bucket policies, KMS policies... and just when you thought S3 security couldn’t get any harder, AWS introduces a new way to manage access control for your buckets called “access points”.
What’s their use case?
Released at re:Invent 2019, access points are the newest way of managing access to multi-tenant S3 buckets at scale, and they make it easier to implement fine-grained access control for each application accessing a shared bucket.
Before access points, the S3 bucket policy had to be updated with every change in the scope of a particular application’s permissions. This made bucket policies large and cumbersome to manage.
Access points give you the ability to implement policies specific to your application without having to update your bucket policy with each change. They can also be restricted to only allow traffic from a specific VPC which helps protect your objects from accidental exposure.
And in contrast to S3’s global namespace, access point names only need to be unique within your account and region. This makes it easier to maintain consistent names across your environments without having to change your application’s code.
How do we use S3 Access Points?
Let’s explore how access points work and any gotchas with an example.
In this example, we’ll create an S3 bucket and roles with different access requirements, each accessing the bucket through its own access point. The architecture looks like this:
We will have three AWS IAM roles accessing the yummy-food S3 bucket. The first is a “chef” role that only has permission to create objects in the bucket. The keto-dieter role has access to any objects prefixed with “keto/”. The omnivore role has access to get any object in the bucket. Each role is limited to accessing the bucket only through its corresponding access point.
At the time of this writing, Terraform doesn’t support S3 access points, so we will use CloudFormation instead. If you want to view the full template, it’s available in this repository.
Let’s walk through the important parts of the template.
The bucket policy
When using access points, the S3 bucket policy must allow at least the same level of access you intend to grant through the access point.
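The full resource is in the repository linked above; a minimal sketch of the bucket policy, with hypothetical logical names like YummyFoodBucket, might look like this:

```yaml
# CloudFormation fragment (sketch): delegate access control for this
# bucket to access points owned by the same account.
YummyFoodBucketPolicy:
  Type: AWS::S3::BucketPolicy
  Properties:
    Bucket: !Ref YummyFoodBucket
    PolicyDocument:
      Version: "2012-10-17"
      Statement:
        - Effect: Allow
          Principal: "*"
          Action: "s3:*"
          Resource:
            - !GetAtt YummyFoodBucket.Arn
            - !Sub "${YummyFoodBucket.Arn}/*"
          Condition:
            StringEquals:
              # Only requests made through an access point owned
              # by this account are allowed
              "s3:DataAccessPointAccount": !Ref "AWS::AccountId"
```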
As you can see, we’re granting any principal access to perform any action on the bucket (s3:*) as long as the request comes through an S3 access point in the same AWS account where we’re provisioning the CloudFormation template. Since access points are bucket-specific, we’re keeping this policy broad to avoid touching it each time we grant access to a new principal (e.g. an IAM role) or change its permissions.
The IAM Roles
Something that might surprise you is that we don’t need to attach any IAM policies to the roles themselves. As long as access is not being explicitly denied, AWS will grant access to the objects in S3 based on the allow statements in the bucket policy and the S3 access point policy.
Here’s what the CloudFormation template to create the roles looks like:
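The repository has the exact version; here’s a sketch of one of the roles, assuming a hypothetical TrustedUserArn parameter that holds the ARN of the IAM user allowed to assume it (the omnivore and keto-dieter roles follow the same pattern):

```yaml
# CloudFormation fragment (sketch): a role with no attached policies.
# TrustedUserArn is a hypothetical template parameter.
ChefRole:
  Type: AWS::IAM::Role
  Properties:
    RoleName: chef
    AssumeRolePolicyDocument:
      Version: "2012-10-17"
      Statement:
        - Effect: Allow
          Principal:
            AWS: !Ref TrustedUserArn
          Action: "sts:AssumeRole"
```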
As you can see, we’re not attaching any policies, and the trust policy allows an IAM user to assume these roles. I plan on testing access to the S3 bucket both from inside an EC2 instance in a VPC that has an S3 VPC endpoint and from my laptop.
The Access Points
The access points will have IAM policies that allow the minimum access needed for each of the roles. We’ll create an access point per role.
The chef-access-point will allow the chef role the s3:PutObject action on the bucket.
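A sketch of what that might look like (the logical names are assumptions, not the repository’s exact values):

```yaml
# CloudFormation fragment (sketch): write-only access point for the chef role
ChefAccessPoint:
  Type: AWS::S3::AccessPoint
  Properties:
    Bucket: !Ref YummyFoodBucket
    Name: chef-access-point
    Policy:
      Version: "2012-10-17"
      Statement:
        - Effect: Allow
          Principal:
            AWS: !GetAtt ChefRole.Arn
          Action: "s3:PutObject"
          # Object operations through an access point use the
          # accesspoint/<name>/object/<key> ARN format
          Resource: !Sub "arn:aws:s3:${AWS::Region}:${AWS::AccountId}:accesspoint/chef-access-point/object/*"
```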
The omnivore-access-point allows the omnivore role the s3:GetObject action on any objects through the access point.
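Sketched the same way, with the same caveats about names:

```yaml
# CloudFormation fragment (sketch): read-only access point for the omnivore role
OmnivoreAccessPoint:
  Type: AWS::S3::AccessPoint
  Properties:
    Bucket: !Ref YummyFoodBucket
    Name: omnivore-access-point
    Policy:
      Version: "2012-10-17"
      Statement:
        - Effect: Allow
          Principal:
            AWS: !GetAtt OmnivoreRole.Arn
          Action: "s3:GetObject"
          Resource: !Sub "arn:aws:s3:${AWS::Region}:${AWS::AccountId}:accesspoint/omnivore-access-point/object/*"
```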
Finally, the keto-dieter-access-point allows the keto-dieter role the s3:GetObject action, but only on objects prefixed with “keto/”.
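The only difference in this sketch is that the resource ARN is scoped to the keto/ prefix:

```yaml
# CloudFormation fragment (sketch): reads restricted to the keto/ prefix
KetoDieterAccessPoint:
  Type: AWS::S3::AccessPoint
  Properties:
    Bucket: !Ref YummyFoodBucket
    Name: keto-dieter-access-point
    Policy:
      Version: "2012-10-17"
      Statement:
        - Effect: Allow
          Principal:
            AWS: !GetAtt KetoDieterRole.Arn
          Action: "s3:GetObject"
          Resource: !Sub "arn:aws:s3:${AWS::Region}:${AWS::AccountId}:accesspoint/keto-dieter-access-point/object/keto/*"
```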
Testing the stack
Let’s test if everything is working as expected. We can start by creating a CloudFormation stack from our template. You can use the command below with the AWS CLI, changing any parameters as appropriate for your environment.
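Something along these lines (the stack name, template path, and parameter name are assumptions, not the repository’s exact values):

```bash
# CAPABILITY_NAMED_IAM is required because the template creates named IAM roles
aws cloudformation create-stack \
  --stack-name yummy-food-access-points \
  --template-body file://template.yaml \
  --capabilities CAPABILITY_NAMED_IAM \
  --parameters ParameterKey=TrustedUserArn,ParameterValue=arn:aws:iam::123456789012:user/alice
```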
You can view the stack’s status by describing it. The creation will be finished when "StackStatus" shows "CREATE_COMPLETE".
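For example (stack name as assumed above):

```bash
aws cloudformation describe-stacks \
  --stack-name yummy-food-access-points \
  --query "Stacks[0].StackStatus"
```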
Once the creation has completed, let’s capture the outputs as variables so we can use them in subsequent steps.
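Assuming the template exposes the role and access point ARNs as stack outputs (the output keys below are hypothetical):

```bash
# Helper to read a single stack output by key
stack_output() {
  aws cloudformation describe-stacks \
    --stack-name yummy-food-access-points \
    --query "Stacks[0].Outputs[?OutputKey=='$1'].OutputValue" \
    --output text
}

CHEF_ROLE_ARN=$(stack_output ChefRoleArn)
OMNIVORE_ROLE_ARN=$(stack_output OmnivoreRoleArn)
KETO_ROLE_ARN=$(stack_output KetoDieterRoleArn)
CHEF_AP_ARN=$(stack_output ChefAccessPointArn)
OMNIVORE_AP_ARN=$(stack_output OmnivoreAccessPointArn)
KETO_AP_ARN=$(stack_output KetoDieterAccessPointArn)
```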
Testing the Chef Role
Now let’s assume the chef role to start testing:
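One way to do it, using jq to parse the temporary credentials (an assumption; any JSON tooling works):

```bash
# Assume the chef role and export the temporary credentials
CREDS=$(aws sts assume-role \
  --role-arn "$CHEF_ROLE_ARN" \
  --role-session-name chef-test)
export AWS_ACCESS_KEY_ID=$(echo "$CREDS" | jq -r '.Credentials.AccessKeyId')
export AWS_SECRET_ACCESS_KEY=$(echo "$CREDS" | jq -r '.Credentials.SecretAccessKey')
export AWS_SESSION_TOKEN=$(echo "$CREDS" | jq -r '.Credentials.SessionToken')
```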
To confirm your shell is currently configured to use the chef role you can use the following command:
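For example:

```bash
# The Arn field in the output should show an assumed-role session for "chef"
aws sts get-caller-identity
```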
Now, let’s put a few objects into the bucket. This is the functionality we’ve allowed for this role, so you shouldn’t get any permission errors.
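Recent versions of the AWS CLI accept an access point ARN anywhere a bucket name is expected; the object keys and contents here are just examples:

```bash
echo "cauliflower salad" > salad.txt
echo "pepperoni pizza" > pizza.txt

# Writes through the chef access point should succeed
aws s3api put-object --bucket "$CHEF_AP_ARN" --key keto/salad.txt --body salad.txt
aws s3api put-object --bucket "$CHEF_AP_ARN" --key standard/pizza.txt --body pizza.txt
```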
If you try retrieving an object, you will get an error message.
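For example:

```bash
# The chef access point only allows s3:PutObject, so expect something
# like: An error occurred (AccessDenied) when calling the GetObject
# operation: Access Denied
aws s3api get-object --bucket "$CHEF_AP_ARN" --key keto/salad.txt out.txt
```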
Testing the Omnivore Role
The first step to start testing the omnivore role is to reconfigure your shell to use it. Let’s unset the environment variables and assume the role.
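Same pattern as before, just with the omnivore role’s ARN:

```bash
# Drop the chef credentials, then assume the omnivore role
unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN
CREDS=$(aws sts assume-role \
  --role-arn "$OMNIVORE_ROLE_ARN" \
  --role-session-name omnivore-test)
export AWS_ACCESS_KEY_ID=$(echo "$CREDS" | jq -r '.Credentials.AccessKeyId')
export AWS_SECRET_ACCESS_KEY=$(echo "$CREDS" | jq -r '.Credentials.SecretAccessKey')
export AWS_SESSION_TOKEN=$(echo "$CREDS" | jq -r '.Credentials.SessionToken')
```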
Now let’s try getting objects saved under both the “keto/” and “standard/” prefixes. As configured in the CloudFormation template, the omnivore role should have access to retrieve any object through its access point.
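For example, using the objects the chef created earlier:

```bash
# Both reads should succeed through the omnivore access point
aws s3api get-object --bucket "$OMNIVORE_AP_ARN" --key keto/salad.txt salad-copy.txt
aws s3api get-object --bucket "$OMNIVORE_AP_ARN" --key standard/pizza.txt pizza-copy.txt
```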
If you try to put an object through the omnivore access point, or try to get an object from a different access point, you will get an access error.
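For example, both of these should fail with AccessDenied:

```bash
# The omnivore access point only allows s3:GetObject...
aws s3api put-object --bucket "$OMNIVORE_AP_ARN" --key standard/burger.txt --body pizza.txt
# ...and the chef access point only grants access to the chef role
aws s3api get-object --bucket "$CHEF_AP_ARN" --key keto/salad.txt out.txt
```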
Testing the Keto Dieter Role
Let’s unset the environment variables and assume the keto dieter role.
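Again, the same pattern:

```bash
unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN
CREDS=$(aws sts assume-role \
  --role-arn "$KETO_ROLE_ARN" \
  --role-session-name keto-test)
export AWS_ACCESS_KEY_ID=$(echo "$CREDS" | jq -r '.Credentials.AccessKeyId')
export AWS_SECRET_ACCESS_KEY=$(echo "$CREDS" | jq -r '.Credentials.SecretAccessKey')
export AWS_SESSION_TOKEN=$(echo "$CREDS" | jq -r '.Credentials.SessionToken')
```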
As configured in the CloudFormation template, you should be able to get objects under the “keto/” prefix without permission issues.
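For example:

```bash
# Reads under the keto/ prefix should succeed
aws s3api get-object --bucket "$KETO_AP_ARN" --key keto/salad.txt salad-keto.txt
```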
If you try to get an object prefixed with “standard/”, you will get a permissions error.
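For example:

```bash
# Expect AccessDenied: the keto-dieter access point policy only
# covers objects under keto/
aws s3api get-object --bucket "$KETO_AP_ARN" --key standard/pizza.txt out.txt
```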
Conclusion & Gotchas
Access points allow for managing S3 bucket permissions at an application level. They facilitate managing access control for S3 buckets shared across multiple applications, and they can help ensure these buckets are only accessed from within a specific VPC.
At the same time, this additional access control layer makes it harder to audit your permissions: you now have to consider AWS Organizations SCPs, IAM policies, S3 bucket policies, ACLs, and access point policies when determining whether a particular principal has access to objects in your S3 buckets. If you want to learn more about S3 security, I suggest watching the Deep dive on Amazon S3 security and management (STG301-R3) session from re:Invent 2019.