<< Back to all Blogs
The Hitchhiker's Guide To S3 Access

The Hitchhiker's Guide To S3 Access

Matt Tyler

image

Introduction

S3 was the first service to become generally available (GA) in AWS, debuting in 2006. It would be fair to say that since then it has become an essential building block of the internet. It is the foundation for both services internal to AWS and externally by service providers. No service lasts for 14 years (as of now, it's 2020) and doesn't accumulate some level of baggage, and there are probably a few developers out there who would liken dealing with the myriad of S3 access mechanisms as being akin to dealing with a moody teenager. I deal with S3 on an almost daily basis, and have a boatload of relevant AWS certifications - and I still regularly get tripped up when dealing with S3. With that in mind, I've decided to take the time write out how S3 policies work, with extra attention to my method of determining the access level.

TLDR

  • If you don't need to grant access to a bucket outside identities within your own account then you probably don't need anything beyond IAM Policies
  • ACLs, Bucket Policies, and Access Points are useful when you need to grant access to identities outside your own account (though they do work for internal identities as well)
  • Bucket Policies and Access Points can be far more granular than ACL's, so they are typically more useful. Avoid ACLs wherever possible.
  • Access Points are useful if you want to create named aliases for buckets for specific bucket consumers. This is helpful if your bucket policies become too large to fit within the default limits. This is often the case for analytics workloads, e.g. publishing large data-sets for many consumers (universities, private sector etc.)
  • Buckets and Objects have a concept of 'owners'; this is part of the ACL system, and critical to understand even if you hardly ever touch ACLs.
  • The owner of either entity is always the identity that created it.
  • The bucket owner can never be changed.
  • Only the bucket owner can view the bucket in the console.
  • Object owners can only be changed by overwriting the object.
  • By default, only the owner of an entity can make administrative changes to the entity. This can cause confusion if another entity has put an object into a bucket owned by someone else, without setting the 'bucket-owner-full-control' ACL when the object was created.
  • Checks on the owner sometimes occur in strange places - e.g. a CloudFront Origin Identity can only serve an object from a bucket if that object is owned by the account that owns the bucket.

Terminology

Lets get some terminology out of the way first.

  • ACL

    Access Control List - to be honest, all types of the access mechanisms used by S3 are access control lists but most of the time when we are talking S3, we specifically mean the S3 ACL mechanism. This is the original access mechanism in S3. I tried to avoid it most of the time - the other access mechanisms are far more flexible and easy to work with (IMHO).

  • Policy

    A policy is the 'rule' for determining access - it is the sum of the statements in the policy (eg. Resource, Action, Effect, Principal and Conditions)

  • Resource

    In S3, this generally refers to things; buckets, and the objects that are inside those buckets. Wildcards work in resource policies for specifying multiple of something. e.g.

    {
        "Resource": "arn:aws:::s3:mybucket/*"
    }
    

    The above will match all items in the bucket. However, if you are giving bucket level permissions this won't work as the statement does not match "arn:aws:::s3:mybucket". Because of this, it's common to see the following in a resource policy.

    {
        "Resource": [
            "arn:aws:::s3:mybucket",
            "arn:aws:::s3:mybucket/*"
        ]
    }
    

    Note: A bucket policy can only have statements in the resource statement that refer to itself. It doesn't make sense to have a bucket policy on a bucket refer to the contents of another bucket. This might cause you to wonder why they include the resource statement at all - why not remove it like principal statements in IAM policies? The big difference is the ability to refer to specific objects within the bucket, so I believe they stuck with specifying ARNs for consistencies sake.

  • Action

    Actions are the things we are allowed to do with resources. This will usually be things like PuObject, GetObject, ListBuckets - etc. Some actions work on buckets, whilst others work on the objects in those buckets.

    It's usually reasonably obvious what is going on, except for the two things - "ListAllBuckets" and "ListBucket". The former lists all the buckets in the account, the latter lists all of the objects in a particular bucket. Yes, it's easy to mix up.

  • Effect

    The effect is the result of various conditions matching the rest of the policy. We can do two things - we can either ALLOW things to happen or DENY them.

  • Principal

    The principal is the identity of the 'thing' performing the action. For example, this lets us to block Mark from reading an object but allows Alice to read it. Users, roles, services, accounts, all of these things are 'principals'. Here are some examples.

    Give access to the AWS Config service.

    {
    
        "Principal": {
            "Service": "config.amazonaws.com"
        }
    }
    

    Give access to all accounts within AWS.

    {
        "Principal": {
            "AWS": "*"
        }
    }
    

    Give access to two accounts - the following two examples are equivalent. Important! Note the usage of 'root' doesn't mean root user, or root account. It just means the account is free to delegate permission as it sees fit.

    {
        "Principal": {
           "AWS": [ "0123456789", "123456790" ]
        }
    }
    
    {
        "Principal": {
           "AWS": [ "arn:aws:iam::0123456789:root", "arn:aws:iam::123456790:root" ]
        }
    }
    

    Give access to everyone in the world

    {
        "Principal": "*"
    }
    

    Give access to a specific role within an account

    {
        "Principal": {
            "AWS": "arn:aws:iam::1234567890:role/my-cool-role"
        }
    }
    

    Note that the only time wildcards work in a policy is when they are specified as the only thing in a value - "" will work, but "arn:aws:iam::1234567890:role/" will not.

  • Condition

    Condition statements allow us to further restrict access based on various circumstantial attributes. Maybe we block Mark all the time, except for when Mark is accessing the bucket from a particular office.

    There are a lot of conditions - some of which can be found here in the cloudonaut reference, and more specific condition keys can be found in the official reference which is available here.

    The following is an bucket policy (you can tell because there is a principal property in the policy) that blocks access to objects within 'examplebucket' that will only allow access to objects within a specific IP range (192.0.2.0/24), unless the address is 192.0.2.188.

    Note that this does not specifically deny accessing the bucket from 192.0.2.188 - so it could be still granted access by some other policy.

      {
          "Version": "2012-10-17",
          "Id": "S3PolicyId1",
          "Statement": [
              {
                  "Sid": "statement1",
                  "Effect": "Allow",
                  "Principal": "*",
                  "Action":["s3:GetObject"]  ,
                  "Resource": "arn:aws:s3:::examplebucket/*",
                  "Condition" : {
                      "IpAddress" : {
                          "aws:SourceIp": "192.0.2.0/24" 
                      },
                      "NotIpAddress" : {
                          "aws:SourceIp": "192.0.2.188/32" 
                      } 
                  } 
              } 
          ]
      } 
    
  • Canonical ID

    Canonical ID's pop up in a few places but you are more likely to see them in ACLs than anywhere else. They usually identify what accounts may have access to certain buckets. More information can be found here.d

  • Resource Policy & IAM Policies

    Resource policies are access policies that apply to 'resources' as opposed to 'identities' (like IAM Policies). Resource policies are most notable in that they include a 'principal' statement to narrow down the identity accessing the resource. Identity policies do not have a principal statement because they attached to identities - the principal is therefore implied.

    Note: As dicussed before, principal statements can not include wildcard ('*') statements.

  • Owner

    All resources in S3 have a concept of an 'owner'. The owner of the bucket is always the account that created it, and so far as I know can't be changed. However, objects in that bucket may be owned by other accounts. The bucket owner pays the bills, and can therefore deny access to objects and delete objects that are owned by others in the bucket. The bucket owner only has full access to objects that they own, unless the object owner grants them full access.

    The object owner can only be changed by overwriting the object - there is no way to change ownership of an existing object.

    This admittedly doesn't come up too often, but it is good to be aware of it.

With that out of the way, let's get a bit more practical by looking at most of the access mechanisms in S3.

What is Public Access?

A public bucket is one that grants access to its objects without requiring any proof of identity. Because of how confusing S3 policies can be, AWS now has methods to deny public access to buckets, regardless of whether a policy will technically allow it. If you are trying to either allow or deny public access to a bucket, you should check those settings first - of which we will detail now.

Public Access Restrictions

Public access restrictions are easiest way to ensure that a bucket can't ever be made public. These settings can be set at both an account level (active across all buckets within the account) or at an individual bucket level.

  • Block public access to buckets and objects granted through new access control lists (ACLs)
  • Block public access to buckets and objects granted through any access control lists (ACLs)
  • Block public access to buckets and objects granted through new public bucket policies
  • Block public access to buckets and objects granted through any public bucket policies
  • Block all public access (union of the above four)

Basically, the two mechanisms for allowing access to an entity that isn't in your account are bucket policies and ACLs. You have two settings that force them to be private ('any' policies'), and another two settings that will allow any buckets that are already public to remain public - but no others!

More information can be found in the official documentation.

S3 ACLs

ACLs are the oldest mechanism and as such are very 'spray-and-pray'. Access is more or less limited to read and/or write - no other calls to change ACL's or enabled object-locks or other advanced features. Access is granted via 'Canonical ID's' which is just another identifier for an AWS account (think of it as a stand-in for an account ARN). ACL's can be used at both the bucket and object level.

Note that ACL's at the bucket level do not give access to objects within the bucket - just bucket level operations. E.g. A read ACL on a bucket won't let you read objects inside the bucket - it'll let you list things inside the bucket.

If you are only ever using the bucket within your account, and have no need to make the bucket public, you will likely never touch one. But if you need to grant access to one or more principals outside the account that owns the bucket, ACL's are one mechanism to do so (though there are better ways to do it).

There are also 'canned' ACLs which are ACL's which a special name that grants access for certain kinds of scenarios e.g. PublicRead, PublicReadWrite. These are not user-configurable, and are the only ACL's that can applied via CloudFormation. For me, that's enough reason alone to prefer bucket policies over ACLs which are capable of doing the same thing but more.

Resource Policies

Resource policies are similar into ACL's in that they are away to grant access to resources that are outside the account boundary. Where they differ is that they are in a similar format to IAM polices so they have a lot more flexibility. 99.99999999 percent-of-the-time, if you are dealing with cross-account bucket access, what you want to use is a bucket policy. Additionally, if you need to give bucket access to an AWS service, bucket policies are how you will do this.

One important thing to mention - they can provide access to resources within your account boundary, provided that the principals within your account do not have an policy attached to them explicitly denying them access. The absence of an IAM policy, and the presence of an 'Allow' bucket policy will grant the privileges encoded within the bucket policy.

IAM Policies

As said before, these are more less the same as bucket policies, but are scoped against the IAM identities that they are attached to. As such, These are exactly like the bucket policies, but will not have a 'principal' statement in the policy. These are only used to grant and revoke privileges for identities that are within the account that owns the bucket. If all you want to do is manage access between bucket and IAM principals with your account, IAM policies are the only policies you need.

IAM Policies are sometimes referred to as 'User policies' in AWS documentation for S3.

Access Points

Access points are the newest way to grant access to S3 Buckets. They have come about as a way to manage access at a larger scale; there is a limit to how large the bucket policy can be, and only one bucket policy document can be attached to a bucket. Using bucket access points creates an 'alias' for a bucket, and all the access through that alias is managed by a policy document designed specifically for that alias.

If you find that confusing, think of like this; you may have a bank account (bucket) and multiple debit cards (access points) for different family members. You might have different daily spending limits on those cards, but they all fundamentally draw from the same source.

Access points are still relatively new, but I imagine they may become more popular than bucket policies due to the ability to have multiple of them, and to have bucket aliases that are scoped to a particular account (e.g. no issues with global naming collisions like normal bucket names).

Evaluating Multiple Policies

It's a little confusing to understand how all of these policies apply

  1. Are block-public-access restrictions enabled?

    If you're accessing the bucket publicly, and these are enabled, stop now. You can't access it. If only a couple of them are, you'll need to scrub out the associated policy (assuming it is an 'Allow' policy) and evaluated the remaining policies.

    Note: If you are blocking all public ACLs, this doesn't mean that the bucket is not public - You could have a '*' in the principal of a bucket policy which may be granting public access.

  2. Check if anything has an explicit deny.

    If even /one/ thing, no matter what it is, matches a deny condition for the principal, they will be denied access to the bucket.

  3. Check if anything has an explicit allow.

    As long as there are no matching 'deny' statements, and at least one allow statement, the principal will be allowed to access the bucket.

  4. No explicit allow anywhere?

    The principal will be denied access to the bucket.

A bucket that is created via the console (without changing any default settings), will tend to only allow access via IAM to the owning account.

Conclusions

This was a whirlwind tour of S3 access controls. We had a look at some common terminology that is both specific and common to all S3 access mechanisms. We also learnt a little about public access, and we can prevent public access by using the public access restriction controls. We had a look at each mechanism which includes ACLs, Bucket Policies, IAM Policies and Access Points. We learnt about the problems that each access mechanisms solves, and how to recognise the appropriate scenarios to use them in. Finally, we learn how S3 evaluates whether to allow access in the prescence of multiple policies.

Need help avoiding the S3 Bucket Negligence Award? Contact Mechanical Rock to Get Started!