AWS Networking & Identity

IAM Roles & Policies in Depth

AWS Identity and Access Management is the authorization backbone of every production system on AWS. Most engineers understand the basics — users, groups, policies — but production-grade security depends on a deeper model: role assumption, trust policies, permission boundaries, and policy conditions. Getting these wrong is how privilege escalation, data exfiltration, and compliance failures happen. This lesson closes that gap.

How Role Assumption Works

An IAM Role is not an identity you log in as — it is a set of permissions that any trusted principal can assume temporarily. When an EC2 instance, a Lambda function, a CI/CD pipeline, or a human assumes a role, AWS STS (Security Token Service) issues short-lived credentials: an AccessKeyId, SecretAccessKey, and a SessionToken that expire (default 1 hour, maximum configurable per role up to 12 hours).

The assumption flow has two gates. Gate 1 is the trust policy — who is allowed to call sts:AssumeRole. Gate 2 is the permission policy attached to the role — what the resulting session can do. Both gates must pass. This dual-gate model is what makes roles fundamentally safer than long-lived access keys.

IAM role assumption: two authorization gates plus an optional permission boundary cap on the session.
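The two gates can be modelled as two independent checks that must both pass. The sketch below is a deliberately simplified toy, not how AWS evaluates policies internally: the function names, the example policies, and the use of shell-style globbing for wildcards are all assumptions made for illustration, and conditions are ignored entirely.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM accepts a single string or a list; normalise to a list."""
    return value if isinstance(value, list) else [value]


def trust_allows(trust_policy, principal_arn):
    """Gate 1: does an Allow statement in the trust policy name this principal?"""
    for stmt in trust_policy["Statement"]:
        principals = _as_list(stmt.get("Principal", {}).get("AWS", []))
        if stmt["Effect"] == "Allow" and principal_arn in principals:
            return True
    return False


def permissions_allow(permission_policy, action, resource):
    """Gate 2: does an Allow statement cover this action on this resource?"""
    for stmt in permission_policy["Statement"]:
        if stmt["Effect"] != "Allow":
            continue
        action_ok = any(fnmatchcase(action, p) for p in _as_list(stmt["Action"]))
        resource_ok = any(fnmatchcase(resource, p) for p in _as_list(stmt["Resource"]))
        if action_ok and resource_ok:
            return True
    return False


def session_can(trust_policy, permission_policy, principal_arn, action, resource):
    """A request succeeds only if both gates pass."""
    return trust_allows(trust_policy, principal_arn) and permissions_allow(
        permission_policy, action, resource
    )


# Hypothetical example data, modelled on the ARNs used in this lesson.
PIPELINE = "arn:aws:iam::222222222222:role/PipelineOrchestrator"
TRUST = {"Statement": [
    {"Effect": "Allow", "Principal": {"AWS": PIPELINE}, "Action": "sts:AssumeRole"}
]}
PERMISSIONS = {"Statement": [
    {"Effect": "Allow", "Action": ["ecs:UpdateService"],
     "Resource": ["arn:aws:ecs:us-east-1:111111111111:service/prod/*"]}
]}
```

In this model a trusted principal with no matching permission statement is still refused, and so is an untrusted principal no matter how broad the permission policy is.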

Trust Policies — The First Gate

A trust policy is a JSON resource-based policy attached to the role itself. It answers: which principals are allowed to call sts:AssumeRole on this role? The Principal element can reference AWS accounts, specific IAM users/roles, AWS services (like ec2.amazonaws.com for instance profiles), or OIDC/SAML federated identities.

# Trust policy for a cross-account CI pipeline assuming a deploy role
# This grants GitHub Actions (OIDC) and a specific role in account 222222222222
# the right to assume this deploy role — with session conditions enforced.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowGitHubOIDC",
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::111111111111:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com",
          "token.actions.githubusercontent.com:sub": "repo:myorg/myrepo:ref:refs/heads/main"
        }
      }
    },
    {
      "Sid": "AllowCrossAccountPipeline",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::222222222222:role/PipelineOrchestrator"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "prod-deploy-external-id-abc123"
        }
      }
    }
  ]
}

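Confused Deputy attack: when a third-party SaaS assumes roles in many customers' accounts and accepts a role ARN as the only proof of ownership, a malicious customer can hand the SaaS your role ARN and trick it into operating on your account on their behalf. Always require an sts:ExternalId in the trust policy when granting access to external services. The external ID should be unique per customer and generated by the vendor rather than chosen by you. AWS does not treat the external ID as a secret (it is visible in the trust policy); the protection comes from the vendor always sending the ID bound to the customer who registered the role.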
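As a rough illustration of the ExternalId requirement, the sketch below builds a trust policy for a vendor and checks a presented external ID against it. The helper names, the vendor account 333333333333, and the role name VendorIntegration are hypothetical; only the policy shape and the sts:ExternalId condition key come from IAM itself.

```python
import uuid


def new_external_id():
    """Generate a hard-to-guess ID; one fresh value per customer."""
    return str(uuid.uuid4())


def third_party_trust_policy(vendor_principal_arn, external_id):
    """Build a trust policy that only admits the vendor when it sends external_id."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowVendorWithExternalId",
                "Effect": "Allow",
                "Principal": {"AWS": vendor_principal_arn},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


def external_id_matches(trust_policy, presented_external_id):
    """Toy check: would the StringEquals condition accept this external ID?"""
    for stmt in trust_policy["Statement"]:
        expected = stmt.get("Condition", {}).get("StringEquals", {}).get("sts:ExternalId")
        if stmt["Effect"] == "Allow" and expected == presented_external_id:
            return True
    return False


# Hypothetical vendor principal, for illustration only.
VENDOR = "arn:aws:iam::333333333333:role/VendorIntegration"
POLICY = third_party_trust_policy(VENDOR, "customer-a-id")
```

A request that the vendor makes on behalf of another customer carries that customer's ID, so it fails the condition on your role even though the vendor principal itself is trusted.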

Permission Policies — The Second Gate

Permission policies define what the session can do. AWS evaluates them with an explicit-deny-first model: any matching Deny statement in any policy — identity policy, resource policy, SCP, or permission boundary — overrides every Allow. Know the evaluation order: SCPs → Resource-based policies → Identity-based policies → Permission boundaries → Session policies.
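The explicit-deny-first rule can be sketched in a few lines. This is a simplified model under stated assumptions: it flattens every policy into one list of statements, ignores conditions, and treats wildcards as case-sensitive globs, so it shows only the precedence of Deny over Allow and the default implicit deny, not the full cross-policy evaluation. The function and constant names are illustrative.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    return value if isinstance(value, list) else [value]


def _matches(stmt, action, resource):
    """True when the statement's Action and Resource patterns both cover the request."""
    action_ok = any(fnmatchcase(action, p) for p in _as_list(stmt["Action"]))
    resource_ok = any(fnmatchcase(resource, p) for p in _as_list(stmt["Resource"]))
    return action_ok and resource_ok


def evaluate(policies, action, resource):
    """Return 'ExplicitDeny', 'Allow', or 'ImplicitDeny' for one request."""
    matching = [
        stmt
        for policy in policies
        for stmt in policy["Statement"]
        if _matches(stmt, action, resource)
    ]
    # Any matching Deny wins, regardless of how many Allows also match.
    if any(stmt["Effect"] == "Deny" for stmt in matching):
        return "ExplicitDeny"
    if any(stmt["Effect"] == "Allow" for stmt in matching):
        return "Allow"
    # Nothing matched: the request is denied by default.
    return "ImplicitDeny"


# Hypothetical policies: a broad Allow plus a guardrail Deny.
BROAD_ALLOW = {"Statement": [{"Effect": "Allow", "Action": "ecs:*", "Resource": "*"}]}
GUARDRAIL = {"Statement": [{"Effect": "Deny", "Action": "ecs:DeleteService", "Resource": "*"}]}
```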

Production roles should follow least-privilege religiously. Use Resource ARNs instead of *, scope conditions to specific VPCs or tags, and never grant iam:* or sts:AssumeRole on * to workload roles.

# Minimal deploy role: allows ECS service updates only in the prod cluster.
# Scoped by resource ARN wherever the action supports it; actions that do not
# support resource-level permissions sit in their own statement with "Resource": "*".

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ECSDeployProd",
      "Effect": "Allow",
      "Action": [
        "ecs:UpdateService",
        "ecs:DescribeServices",
        "ecs:RegisterTaskDefinition"
      ],
      "Resource": [
        "arn:aws:ecs:us-east-1:111111111111:cluster/prod",
        "arn:aws:ecs:us-east-1:111111111111:service/prod/*",
        "arn:aws:ecs:us-east-1:111111111111:task-definition/prod-*"
      ]
    },
    {
      "Sid": "ECRPull",
      "Effect": "Allow",
      "Action": [
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage"
      ],
      "Resource": "arn:aws:ecr:us-east-1:111111111111:repository/prod-*"
    },
    {
      "Sid": "ActionsWithoutResourceLevelSupport",
      "Effect": "Allow",
      "Action": [
        "ecs:DescribeTaskDefinition",
        "ecr:GetAuthorizationToken"
      ],
      "Resource": "*"
    },
    {
      "Sid": "DenyDeleteResources",
      "Effect": "Deny",
      "Action": [
        "ecs:DeleteCluster",
        "ecs:DeleteService",
        "ecr:DeleteRepository"
      ],
      "Resource": "*"
    }
  ]
}

Permission Boundaries — Capping Delegation

A permission boundary is a managed policy you attach to an IAM role (or user) that acts as a ceiling on what that identity can ever do — even if more permissive policies are attached later. The effective permissions are the intersection of the identity's permission policies and the boundary.
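The intersection rule can be shown with a small sketch. It assumes a heavily reduced model in which a policy is just a list of allowed action patterns; the function names and the sample pattern lists are illustrative, and resources, conditions, and explicit denies are left out.

```python
from fnmatch import fnmatchcase


def allowed_by(action_patterns, action):
    """True when any pattern in the list covers the action."""
    return any(fnmatchcase(action, pattern) for pattern in action_patterns)


def effective_allow(identity_patterns, boundary_patterns, action):
    """An action is effective only if BOTH the identity policy and the boundary allow it."""
    return allowed_by(identity_patterns, action) and allowed_by(boundary_patterns, action)


# Hypothetical role: administrator-style identity policy, narrow boundary.
ADMIN_IDENTITY = ["*"]
SERVICE_BOUNDARY = ["s3:Get*", "dynamodb:*"]

# Hypothetical role: narrow identity policy, wide boundary.
READER_IDENTITY = ["s3:GetObject"]
WIDE_BOUNDARY = ["s3:*", "ec2:*"]
```

The second pair illustrates that a boundary never grants anything by itself: ec2:RunInstances is inside WIDE_BOUNDARY but absent from READER_IDENTITY, so it is not effective.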

The canonical production use case: you want a CI/CD pipeline to be able to create IAM roles for microservices, but you never want those pipeline-created roles to exceed the permissions the pipeline itself has. You enforce this by requiring that any role the pipeline creates must have the same boundary applied.

# Attach a permission boundary when creating a role — via AWS CLI
aws iam create-role \
  --role-name microservice-reader \
  --assume-role-policy-document file://trust.json \
  --permissions-boundary arn:aws:iam::111111111111:policy/ServiceBoundary

# The pipeline's own permission policy restricts CreateRole + PutRolePermissionsBoundary:
{
  "Sid": "AllowCreateRoleWithBoundary",
  "Effect": "Allow",
  "Action": [
    "iam:CreateRole",
    "iam:AttachRolePolicy",
    "iam:PutRolePermissionsBoundary"
  ],
  "Resource": "arn:aws:iam::111111111111:role/microservice-*",
  "Condition": {
    "StringEquals": {
      "iam:PermissionsBoundary": "arn:aws:iam::111111111111:policy/ServiceBoundary"
    }
  }
}

Without the Condition check on iam:PermissionsBoundary, a pipeline with iam:CreateRole can create a role with no boundary and full AdministratorAccess — a classic privilege escalation path catalogued in AWS security research. Always pair iam:CreateRole with this condition.
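A check like this can be automated before a policy is deployed. The following is a minimal lint sketch, assuming policies are available as parsed JSON dictionaries; the function name and the sample policies are hypothetical, and it only recognises the exact StringEquals form shown in this lesson, so treat a clean result as a hint rather than proof.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    return value if isinstance(value, list) else [value]


def unbounded_role_statements(policy, boundary_arn):
    """Return labels of Allow statements that grant iam:CreateRole
    without pinning iam:PermissionsBoundary to boundary_arn."""
    findings = []
    for index, stmt in enumerate(policy["Statement"]):
        if stmt.get("Effect") != "Allow":
            continue
        # IAM action names are case-insensitive, so compare in lower case.
        patterns = [a.lower() for a in _as_list(stmt.get("Action", []))]
        grants_create = any(fnmatchcase("iam:createrole", p) for p in patterns)
        pinned = (
            stmt.get("Condition", {})
            .get("StringEquals", {})
            .get("iam:PermissionsBoundary")
        )
        if grants_create and pinned != boundary_arn:
            findings.append(stmt.get("Sid", f"statement[{index}]"))
    return findings


BOUNDARY = "arn:aws:iam::111111111111:policy/ServiceBoundary"

SAFE_POLICY = {"Statement": [{
    "Sid": "AllowCreateRoleWithBoundary",
    "Effect": "Allow",
    "Action": ["iam:CreateRole", "iam:AttachRolePolicy", "iam:PutRolePermissionsBoundary"],
    "Resource": "arn:aws:iam::111111111111:role/microservice-*",
    "Condition": {"StringEquals": {"iam:PermissionsBoundary": BOUNDARY}},
}]}

# Hypothetical risky policy: wildcard IAM access with no boundary condition.
RISKY_POLICY = {"Statement": [
    {"Sid": "TooBroad", "Effect": "Allow", "Action": "iam:*", "Resource": "*"}
]}
```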

Policy Conditions — Precision Control

Conditions are the most underused IAM feature. They let you make permissions context-sensitive: enforce MFA, restrict to specific source IPs or VPCs, require encryption, or gate on resource tags. Condition operators are typed: StringEquals, ArnLike, IpAddress, Bool, NumericLessThan, DateGreaterThan, etc.

Key condition keys for production hardening:

aws:MultiFactorAuthPresent: require MFA for sensitive actions
aws:SourceVpc / aws:SourceVpce: restrict S3 access to requests that arrive through your VPC endpoints
aws:RequestedRegion: deny actions outside approved regions (often paired with SCPs)
aws:PrincipalTag/<key> / aws:ResourceTag/<key>: attribute-based access control (ABAC)
s3:x-amz-server-side-encryption: deny S3 puts without encryption
ec2:InstanceType: cap the instance types developers can launch (ec2:Region restricts where EC2 actions apply)
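Because condition operators are typed, each one compares values differently. The sketch below models three of them against a request context. It is a simplification under stated assumptions: one expected value per key, no IfExists or set operators, and helper names that are invented for this example.

```python
import ipaddress


def string_equals(actual, expected):
    """Exact, case-sensitive string comparison."""
    return actual == expected


def bool_equals(actual, expected):
    """Boolean comparison on the textual values 'true' and 'false'."""
    return str(actual).lower() == str(expected).lower()


def ip_in_cidr(actual, expected_cidr):
    """True when the source address falls inside the CIDR range."""
    return ipaddress.ip_address(actual) in ipaddress.ip_network(expected_cidr)


OPERATORS = {
    "StringEquals": string_equals,
    "Bool": bool_equals,
    "IpAddress": ip_in_cidr,
}


def condition_holds(condition, context):
    """Every operator and every key must match (logical AND).
    A key missing from the request context counts as no match."""
    for operator, pairs in condition.items():
        check = OPERATORS[operator]
        for key, expected in pairs.items():
            if key not in context or not check(context[key], expected):
                return False
    return True


# Hypothetical condition block: MFA required, and only from one office range.
CONDITION = {
    "Bool": {"aws:MultiFactorAuthPresent": "true"},
    "IpAddress": {"aws:SourceIp": "203.0.113.0/24"},
}
```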

ABAC at scale: at large organizations (Netflix, Airbnb scale), RBAC (one role per team/service) explodes into thousands of roles. Attribute-Based Access Control (ABAC) via aws:PrincipalTag and aws:ResourceTag compresses that: one role, many principals, access scoped dynamically by tags like team=payments. Tag every resource consistently and use a Service Control Policy to enforce tagging at creation time.
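A single ABAC statement typically compares a resource tag with the caller's own tag through a policy variable. The sketch below shows one such statement as a Python dictionary, plus a toy function that mimics the comparison. The Sid, the choice of secretsmanager:GetSecretValue as the action, and the function name are assumptions made for illustration.

```python
# One statement serves every team: the variable resolves to the caller's own tag value.
ABAC_STATEMENT = {
    "Sid": "TeamScopedAccess",
    "Effect": "Allow",
    "Action": ["secretsmanager:GetSecretValue"],
    "Resource": "*",
    "Condition": {
        "StringEquals": {"aws:ResourceTag/team": "${aws:PrincipalTag/team}"}
    },
}


def abac_allows(principal_tags, resource_tags, tag_key="team"):
    """Toy model of the condition above: access only when both sides
    carry the tag and the values are identical."""
    principal_value = principal_tags.get(tag_key)
    if principal_value is None:
        return False
    return resource_tags.get(tag_key) == principal_value
```

An untagged principal or an untagged resource gets no access in this model, which is why consistent tagging has to be enforced before ABAC can be relied on.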

Operational Patterns

Instance profiles are how EC2 gets a role — you attach a role to an instance profile, and the EC2 metadata service (http://169.254.169.254/latest/meta-data/iam/security-credentials/ or IMDSv2 equivalent) vends rotating credentials automatically. Always enable IMDSv2 (token-required mode) to block SSRF-based credential theft.
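The IMDSv2 flow is a two-step exchange: a PUT request obtains a session token, and every metadata read then presents that token in a header. The sketch below uses only the Python standard library. The function names are invented for this example, and fetch_role_name can only succeed when run on an EC2 instance with an instance profile attached.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254/latest"


def token_request(ttl_seconds=21600):
    """Step 1: a PUT request asking IMDSv2 for a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def credentials_request(token, role_name=""):
    """Step 2: a GET request that presents the token.
    With an empty role_name the endpoint lists the attached role."""
    return urllib.request.Request(
        f"{IMDS_BASE}/meta-data/iam/security-credentials/{role_name}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_role_name(timeout=2.0):
    """Run both steps against the live metadata service (EC2 only)."""
    with urllib.request.urlopen(token_request(), timeout=timeout) as response:
        token = response.read().decode()
    with urllib.request.urlopen(credentials_request(token), timeout=timeout) as response:
        return response.read().decode().strip()
```

A typical SSRF bug lets an attacker trigger a plain GET but not a PUT with a custom header, which is why requiring the token step blocks that class of credential theft.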

Service-linked roles are predefined by AWS services (e.g., AWSServiceRoleForECS) with trust policies locked to the service. You cannot modify their trust policies or their permission policies, because both are defined and managed by the owning service. At most you can edit the role description, and you can delete the role only once the service no longer depends on it.

Use aws sts get-caller-identity to verify which identity your current credentials represent. Use aws iam simulate-principal-policy to test policy logic before deploying — critical for debugging "Access Denied" in complex multi-policy environments.

# Verify current identity (works for roles too — shows assumed-role ARN)
aws sts get-caller-identity

# Simulate whether a role can perform an action on a resource
aws iam simulate-principal-policy \
  --policy-source-arn arn:aws:iam::111111111111:role/microservice-reader \
  --action-names s3:GetObject \
  --resource-arns 'arn:aws:s3:::prod-artifacts/builds/*'   # quoted so the shell does not expand *

# Check role's trust policy
aws iam get-role --role-name my-deploy-role \
  --query 'Role.AssumeRolePolicyDocument'

# List all policies attached to a role
aws iam list-attached-role-policies --role-name my-deploy-role
aws iam list-role-policies --role-name my-deploy-role   # inline policies
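When get-caller-identity is run with role credentials, the Arn field has the form arn:aws:sts::ACCOUNT:assumed-role/ROLE/SESSION. A small parser makes that structure explicit; this is a sketch whose function name and returned dictionary keys are chosen for illustration.

```python
def parse_assumed_role_arn(arn):
    """Split an assumed-role ARN into account ID, role name, and session name.
    Raises ValueError for any other kind of ARN."""
    prefix, marker, resource = arn.partition(":assumed-role/")
    parts = prefix.split(":")
    if not marker or len(parts) != 5 or parts[2] != "sts":
        raise ValueError(f"not an assumed-role ARN: {arn}")
    role_name, slash, session_name = resource.partition("/")
    if not slash or not role_name or not session_name:
        raise ValueError(f"missing role or session name: {arn}")
    return {
        "account_id": parts[4],
        "role_name": role_name,
        "session_name": session_name,
    }
```

For example, parsing arn:aws:sts::111111111111:assumed-role/my-deploy-role/ci-pipeline-run-12345 yields the role my-deploy-role and the session ci-pipeline-run-12345, while an IAM user ARN is rejected.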

Session traceability in CloudTrail: every AssumeRole call is recorded in CloudTrail (event source sts.amazonaws.com, event name AssumeRole) together with the role session name, and that session name then appears in the assumed-role ARN of every API call the session makes. Always set a meaningful --role-session-name (e.g. ci-pipeline-run-12345) so security teams can trace which pipeline run generated which API calls. Never use generic names like session1.
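Session names are limited to 2 to 64 characters drawn from letters, digits, underscore, and the symbols + = , . @ -, so names built from branch or pipeline identifiers usually need sanitising first. The helper below is a sketch with an invented name and naming scheme; its result is the kind of value you would pass as --role-session-name on the CLI or as RoleSessionName in an SDK call.

```python
import re

MAX_SESSION_NAME_LENGTH = 64
# Anything outside the allowed character set gets replaced.
_INVALID_CHARS = re.compile(r"[^\w+=,.@-]", re.ASCII)


def make_session_name(pipeline, run_id):
    """Build a traceable session name such as 'ci-pipeline-run-12345'.
    The run ID is kept intact; the pipeline part is truncated if needed."""
    suffix = _INVALID_CHARS.sub("-", f"-run-{run_id}")
    room_for_prefix = max(0, MAX_SESSION_NAME_LENGTH - len(suffix))
    prefix = _INVALID_CHARS.sub("-", pipeline)[:room_for_prefix]
    return (prefix + suffix)[:MAX_SESSION_NAME_LENGTH]
```

Keeping the run ID at the end, and truncating the pipeline part instead, means a long pipeline name never cuts off the part that identifies the individual run.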