Cloud Security Musings


8 Things to Look For When Securely Introducing AWS Services Into Your Environment

You know the feeling. Every year after re:Invent, developers (and technologists in general) are excited about the announcements of new features and services that AWS released and are thinking of all the possibilities.

They see how taking advantage of the newest services can improve their applications and services, make their lives easier, and make the business happier. It’s a win-win!

They want to use the latest and greatest features and they want to do it now. So everyone starts calling you to enable these services in their AWS accounts.

But what should you do? You know that newly released AWS services are usually minimum viable products (MVPs) that may lack security and operational features, and that after the initial Generally Available (GA) release, AWS keeps iterating on them, making them more secure and safer to use.

You need to have a service certification process

It should be clear what the expectations and the organization’s risk tolerance are for every AWS service you use, based on the shared responsibility model for that service.

The service certification process should include researching the state of all security and operational features that have been determined to be important to your organization and verifying if the service supports them. 

Here are some of the things to look for as part of your service certification process.

1. What’s the shared responsibility model for this service?

The shared responsibility model changes depending on the type of service (e.g. infrastructure, container, or abstracted service). Make sure you understand which category the requested service falls into, as this will help inform the requirements for turning on the service in your environment.

If you’re not familiar with the shared responsibility model or need a refresher, the AWS Security Best Practices white paper includes a good overview.

2. Can the service be publicly exposed? 

Many AWS services can be exposed to the public internet, and some of them can be exposed with no authentication or authorization at all. Understanding how the service can be publicly exposed will help you determine whether you need to develop automated security controls to prevent accidental exposure before enabling the service.

The way the service gets exposed depends on the service type. For example, to expose an infrastructure service such as an EC2 instance to the public internet, you would need to explicitly place the instance in a public subnet, request that it’s provisioned with a public IP, and attach a security group allowing ingress from 0.0.0.0/0.
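
That last step boils down to an ingress rule like the following, shown in the JSON shape the EC2 API uses for security group rules (the port here is arbitrary):

    [
      {
        "IpProtocol": "tcp",
        "FromPort": 443,
        "ToPort": 443,
        "IpRanges": [
          {
            "CidrIp": "0.0.0.0/0",
            "Description": "Open to the entire internet"
          }
        ]
      }
    ]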

Similarly, a container service like RDS would require configuring the RDS instance within your VPC to be publicly exposed. 

In contrast, for abstracted services such as SQS, a single misconfigured Identity and Access Management (IAM) resource policy attached to the SQS queue is enough to allow unauthenticated access to the queue.
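
As a sketch, a queue policy like the one below (the queue ARN is made up) would let anyone on the internet send messages to the queue with no authentication at all:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "PublicSendMessage",
          "Effect": "Allow",
          "Principal": "*",
          "Action": "sqs:SendMessage",
          "Resource": "arn:aws:sqs:us-east-1:111122223333:helloworld-queue"
        }
      ]
    }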

3. Does the service support logging to CloudTrail?

CloudTrail is an AWS service that captures a trail of supported API activity for auditing and governance purposes. CloudTrail is also useful when troubleshooting permission errors when developing restrictive AWS IAM policies. 
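
As a rough illustration, each CloudTrail event records who called which API and from where. An abbreviated record (values made up) looks something like this:

    {
      "eventTime": "2020-01-15T12:34:56Z",
      "eventSource": "s3.amazonaws.com",
      "eventName": "CreateBucket",
      "awsRegion": "us-east-1",
      "sourceIPAddress": "203.0.113.10",
      "userIdentity": {
        "type": "AssumedRole",
        "arn": "arn:aws:sts::111122223333:assumed-role/helloworld-app-role/session"
      },
      "requestParameters": {
        "bucketName": "helloworld-data"
      }
    }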

You might think that with the amount of emphasis AWS puts on enabling and using CloudTrail, all AWS API calls are logged to CloudTrail, but unfortunately that’s not the case. 

For example, let’s say you have a requirement to log whenever there are attempts to access sensitive data in your environment and you choose to use DynamoDB as your data store. You would expect that the GetItem API call is logged in CloudTrail, but if you look at the list of DynamoDB API calls supported by CloudTrail, GetItem is missing as of this writing.

You can find the list of AWS services supported by CloudTrail in the CloudTrail documentation.

4. Does the service provide logs?

CloudTrail only logs activity against AWS APIs. Some services also provide logs for requests made against the resource itself (for example, S3 server access logs or load balancer access logs). These logs might be required depending on your compliance and security requirements, and they are also helpful for general troubleshooting.

If you have a requirement to enable logs for a particular service, you’ll probably need automation to ensure they are enabled by default when provisioning new resources. 
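
For example, if S3 server access logging is one of those requirements, your provisioning automation would need to apply a logging configuration to every new bucket. The request body for the S3 PutBucketLogging call would look roughly like this (bucket names are made up):

    {
      "LoggingEnabled": {
        "TargetBucket": "helloworld-access-logs",
        "TargetPrefix": "helloworld-data/"
      }
    }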

5. Are resource-level permissions or tag-based restrictions supported?

Having fine-grained IAM policies is important to avoid privilege escalation and to limit the impact of any security vulnerability in your applications. This is even more important when you have multi-tenant accounts with multiple applications deployed into a single AWS account.

Resource-level permissions give you the ability to use naming conventions to enforce access control. For example, if you have an application named “hello world”, you could write an IAM policy that restricts developers working on this app to only creating or managing S3 buckets whose names are prefixed with “helloworld”. Here’s a rough sketch of what a policy statement like that might look like (the specific S3 actions below are just illustrative):

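    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "HelloWorldBucketsOnly",
          "Effect": "Allow",
          "Action": [
            "s3:CreateBucket",
            "s3:PutBucketPolicy",
            "s3:PutBucketTagging"
          ],
          "Resource": "arn:aws:s3:::helloworld*"
        }
      ]
    }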

Some services support using tags to enforce access control. For example, we could say that a developer can only terminate EC2 instances that have been tagged with the developer’s application name. A condition-based statement along those lines might look something like this (the tag key and value below are just illustrative):

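    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "TerminateOwnApplicationInstancesOnly",
          "Effect": "Allow",
          "Action": "ec2:TerminateInstances",
          "Resource": "arn:aws:ec2:*:*:instance/*",
          "Condition": {
            "StringEquals": {
              "ec2:ResourceTag/application": "helloworld"
            }
          }
        }
      ]
    }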

Support for tagging is also important if you’re using tags for cost allocation within your environment.

For services that don’t support resource-level or tag-based permissions, you can only restrict which API actions are allowed, and those actions could then be used against any resource, even resources not owned by the role the policy is attached to. You’ll need to determine if this is acceptable within your environment.

You can find the full reference of IAM supported features per service in the Actions, Resources, and Condition Keys for AWS Services page.

6. Is your region supported?

Many AWS services do not support all AWS regions at launch. You should verify that the regions you operate in support the service; otherwise, using the service would require you to open access to an additional region.

The AWS Regional Table page is a good resource to research which services are supported in which regions. 

7. Does the service support encryption?

You’ll need to research if the service you’re looking to enable supports encryption at-rest and in-transit. 

Although I’ve always seen HTTPS endpoints released at launch for new services, for AWS APIs there’s usually no information in the documentation on whether in-transit encryption is enforced.

This means that if you have an IAM policy that enforces in-transit encryption, you’ll sometimes notice by trial and error that the policy breaks access to a new service through the AWS console. If you have such a policy, you’ll have to add testing for this as part of your certification process. A policy that enforces in-transit encryption for AWS services looks something like this (a deny statement using the aws:SecureTransport condition key):

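    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "DenyUnencryptedTransport",
          "Effect": "Deny",
          "Action": "*",
          "Resource": "*",
          "Condition": {
            "Bool": {
              "aws:SecureTransport": "false"
            }
          }
        }
      ]
    }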

For at-rest encryption the best scenario from a security perspective would be to support encryption using AWS Key Management Service (KMS) with Customer Managed Keys (CMKs). This means that your organization owns the access control and management of the KMS keys. 

This is important as it provides an additional layer of authorization for resources encrypted with KMS. Principals accessing a resource would need both access to the resource and to the KMS key in order to access the resource. This serves as an additional protection in case the resource has been exposed outside of your AWS account.
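
As a sketch, a key policy statement like the one below (the account ID and role name are made up) grants use of the key only to a specific application role, so even a principal who can reach the encrypted resource still can’t read it without also being allowed to use the key:

    {
      "Sid": "AllowHelloWorldRoleToUseKey",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111122223333:role/helloworld-app-role"
      },
      "Action": [
        "kms:Decrypt",
        "kms:GenerateDataKey"
      ],
      "Resource": "*"
    }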

8. Does this service solve my need in a unique way?

You should rationalize how the new service fits into your overall technology portfolio. If you’re already using a competing technology, document how this new service compares to the technology you’re currently using in terms of price, cost of operations, and capability. 

Keep in mind that AWS evolves its technology rapidly, and that the comparison you perform today may look very different in a few months as AWS releases new features.

Who should perform the certification work?

Now that you have a baseline of things to look for as part of your certification process, it’s important to be upfront about who’s responsible for doing the actual work. 

The best approach is to have the requesting team perform the initial work, while your Cloud Center of Excellence (COE) team reviews it and takes it to the finish line.

The artifacts generated by the process should include: 

  1. Service certification document that captures how the service meets your organizational requirements and expectations.

  2. Least privilege AWS IAM policies. 

  3. Infrastructure as Code reference implementations. This will help you test IAM policies while giving you a baseline of security and operational expectations for the service.

  4. AWS Organizations Service Control Policies (SCPs) for any controls that should be enforced organization-wide (see the sketch after this list).

  5. Reactive security controls for anything that can't be enforced proactively with AWS IAM policies and SCPs. These could include Cloud Custodian rules, AWS Config rules, etc.
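
For example, while a service is still going through certification, an SCP like the following could block its use across the organization (the "newservice" prefix is just a placeholder for the real service prefix):

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "DenyUncertifiedService",
          "Effect": "Deny",
          "Action": "newservice:*",
          "Resource": "*"
        }
      ]
    }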

To avoid risk, experimentation with new services should happen in sandbox accounts that are isolated from your environment. This would give your team the greatest flexibility to fully explore the service without restrictions, while limiting the risk to your organization in case things go wrong.  

Next Steps

Having a service certification process can help you quickly adopt new AWS services, while ensuring the security and operational controls your organization requires are enforced before enabling the service. 

Federating the work helps grow expertise in the new service within your organization while enabling requesting teams to assess for themselves whether the new service is a good fit. It also avoids bottlenecks, since it doesn’t rely on a small group of experts to complete the process.

As with anything cloud related, the best solution today might not be the best solution tomorrow. Having a documented process will help you evolve your architecture as new services become feasible once AWS addresses any barriers or red flags.