Eliminate Static Keys for Cloud Resources Access Using OpenID Connect

Get up to speed with OIDC in a real-world use case

Gonzalo Peci
Trade Republic Engineering


Old Keys — Photo by Richard Payette

Intro

Managing cloud infrastructure (AWS, GCP, Azure, DigitalOcean, etc.) through automated workflows has become the standard, but most teams struggle to manage the identities, keys and permissions required to do it securely.

To make things more challenging, organizations frequently want to delegate to teams the ability to manage their own cloud resources. Permission management, access management and secret management become even more complex as they spread across teams, tools and needs.

The traditional way of providing access has been through static service accounts. These accounts are granted the required permissions, and their credentials are then shared or injected into different tools.

In this article we would like to explore a modern approach for accessing cloud resources which leverages dynamic short-lived credentials to mitigate many of the problems and risks involved with static key management.

Static keys-based access management

Access management to service accounts is commonly done through static keys. These keys are created in your cloud provider and then shared or injected into your CI/CD systems or other automation tools. This means that teams must create, distribute, and manage the lifecycle of static keys (update, delete, rotate, etc.).

More importantly, these keys often become long-lived, which makes them inherently risky, especially when combined with elevated permissions. An attacker who got hold of the keys by any means could use them to cause havoc indefinitely, until the misuse is detected and the keys are revoked.

It’s also hard to identify where these keys are used, how to distribute them, and how to manage their lifecycle later.

In many cases, the operational burden of managing these keys correctly inadvertently encourages other bad practices: multiple repositories sharing credentials (which in turn forces wider-scoped permissions), secrets not being rotated for fear of breaking automations, insecure sharing or storage of the keys, and so on.

As you can see, managing static keys has several challenges, so what other options do we have?

OIDC based access management

Diagram of the OIDC Flow

Enter OpenID Connect (OIDC for short). OIDC is an identity layer built on top of OAuth 2.0 that allows third-party applications to verify the identity of end users. In practical terms, OIDC lets a cloud provider verify the identity of an automation job as if it were an authenticated user.

Check this article from Okta to learn more about OIDC
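To make the idea of a verifiable identity more concrete: an OIDC identity token is a JSON Web Token (JWT), and its middle segment is a base64url-encoded JSON object of claims. The sketch below decodes those claims for inspection only. It does not verify the signature (the cloud provider does that), and the token and claim values are hypothetical examples built locally, not output captured from a real job.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT WITHOUT verifying its signature.

    Useful for inspection and debugging only; never trust unverified claims.
    """
    try:
        _header, payload_b64, _signature = token.split(".")
    except ValueError:
        raise ValueError("a JWT has exactly three dot-separated segments") from None
    # JWT segments are base64url-encoded with padding stripped; restore it.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _b64url(data: dict) -> str:
    """Helper for this example: encode a dict the way a JWT segment is encoded."""
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Hypothetical claims, shaped like those a CI provider might issue.
example_claims = {
    "iss": "https://token.actions.githubusercontent.com",
    "aud": "sts.amazonaws.com",
    "sub": "repo:octo-org/octo-repo:ref:refs/heads/main",
}
# A locally built token with a fake signature, for illustration only.
example_token = ".".join(
    [_b64url({"alg": "RS256", "typ": "JWT"}), _b64url(example_claims), "fake-signature"]
)

claims = decode_jwt_payload(example_token)
print(claims["iss"], claims["sub"])
```

The issuer (`iss`), audience (`aud`) and subject (`sub`) claims are the pieces a cloud provider can check before deciding whether to trust the job.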

You might already see how OIDC can make our access management problem easier. Several CI/CD providers, such as GitHub Actions and CircleCI, have announced support for this integration mechanism, and I hope most providers will soon follow.

GitHub Actions announcement of OIDC

With OIDC, our jobs can authenticate to our cloud provider to perform activities for us (like deploying a website) without static keys; instead, they present their job context. This not only eliminates the static key problems (like sharing and rotating) but also gives us more control over what we allow.

For example, as we will show later with GitHub Actions, you can require that a certain AWS Role can only be assumed from a job that is triggered from your main branch.

Likewise, because these roles are easy to set up, we can create multiple scoped-down roles for each repository, team, or job. We don’t want complexity to explode, but having a role restricted to a single repository and branch does help us sleep better at night.

Other technologies are also adopting OIDC. For example, if you are using Kubernetes in AWS through EKS, you might already be using OIDC.

Demo Time

We have arrived at the demo of this article but before we get started, let’s look at a brief diagram of how our workflow will work:

  1. GitHub Actions workflow provides a JWT token to the Job
  2. In the job we exchange the JWT token for AWS temporary credentials for a Role
  3. We use the temporary Role credentials to work with AWS resources
A diagram of OIDC, GitHub Actions and AWS interaction
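The first two steps above can be sketched in a few lines of Python, to show what happens under the hood. In the demo itself the aws-actions/configure-aws-credentials action performs this exchange, so this is an illustration under stated assumptions rather than code you need to run. It assumes the job runs with the `id-token: write` permission, in which case GitHub exposes the `ACTIONS_ID_TOKEN_REQUEST_URL` and `ACTIONS_ID_TOKEN_REQUEST_TOKEN` environment variables, and it uses the global STS endpoint. The function names are hypothetical.

```python
import json
import os
import urllib.parse
import urllib.request

STS_ENDPOINT = "https://sts.amazonaws.com/"  # global STS endpoint


def build_github_token_request(
    request_url: str, request_token: str, audience: str
) -> urllib.request.Request:
    """Step 1: build the request that asks the runner for a JWT for `audience`."""
    separator = "&" if "?" in request_url else "?"
    url = f"{request_url}{separator}audience={urllib.parse.quote(audience, safe='')}"
    return urllib.request.Request(url, headers={"Authorization": f"bearer {request_token}"})


def fetch_github_jwt(audience: str = "sts.amazonaws.com") -> str:
    """Step 1, executed: only works inside a GitHub Actions job with id-token: write."""
    request = build_github_token_request(
        os.environ["ACTIONS_ID_TOKEN_REQUEST_URL"],
        os.environ["ACTIONS_ID_TOKEN_REQUEST_TOKEN"],
        audience,
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["value"]


def build_sts_url(role_arn: str, session_name: str, jwt: str) -> str:
    """Step 2: AssumeRoleWithWebIdentity needs no AWS signature; the JWT is the proof."""
    query = urllib.parse.urlencode(
        {
            "Action": "AssumeRoleWithWebIdentity",
            "Version": "2011-06-15",
            "RoleArn": role_arn,
            "RoleSessionName": session_name,
            "WebIdentityToken": jwt,
        }
    )
    return f"{STS_ENDPOINT}?{query}"
```

The STS response carries an access key ID, a secret access key, a session token and an expiration time. Step 3 is simply using those temporary credentials like any other AWS credentials until they expire.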

For this demo we are going to need a few things:

  • A GitHub account with GitHub Actions
  • An AWS account with permissions to create a Role and OIDC trust

We will be showing some resources as Terraform definitions. You can create the same resources using the CLI, Pulumi, CloudFormation or any other method.

We are also going to need a pipeline definition:

# .github/workflows/list-buckets.yaml
name: "List Buckets"
on:
  push:
permissions:
  id-token: write # This is required for GitHub OpenID Connect in AWS
  contents: read # This is required for actions/checkout
env:
  AWS_REGION: "<example-aws-region>"
jobs:
  list-buckets:
    # Assumed runner; a job needs runs-on to be valid
    runs-on: ubuntu-latest
    steps:
      - name: configure aws credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          role-to-assume: arn:aws:iam::1234567890:role/example-role
          role-session-name: samplerolesession
          aws-region: ${{ env.AWS_REGION }}
      - name: List all buckets we have access to
        run: aws s3 ls

Configuring AWS

OIDC Integration

# oidc.tf
resource "aws_iam_openid_connect_provider" "github" {
  url             = "https://token.actions.githubusercontent.com"
  thumbprint_list = ["6938fd4d98bab03faadb97b34396831e3780aea1"]
  client_id_list  = ["sts.amazonaws.com"]
}

output "principal" {
  value = aws_iam_openid_connect_provider.github.arn
}

IAM Role

We now need to configure an IAM Role that the CI job will be able to use to perform actions.

# role.tf
resource "aws_iam_role" "this" {
  name = local.name

  # Who can assume the role?
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Principal = {
        # The ARN of the aws_iam_openid_connect_provider created before
        # as this is the Federated Principal that we trust
        Federated = "arn:aws:iam::123456123456:oidc-provider/token.actions.githubusercontent.com"
      }
      Action = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        # A set of conditions to limit assuming the role to our repository
        StringLike = {
          "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com"
          # Filter for particular subjects like branch, environment, etc
          # in this case, we allow all in the octo-org/octo-repo
          "token.actions.githubusercontent.com:sub" = "repo:octo-org/octo-repo:*"
        }
      }
    }]
  })

  # What is the role allowed to do?
  # Note: these actions are scoped to a single bucket. A bare `aws s3 ls`
  # (listing every bucket in the account) additionally needs
  # s3:ListAllMyBuckets on all resources.
  inline_policy {
    name = "my-s3-permissions"
    policy = jsonencode({
      Version = "2012-10-17"
      Statement = [{
        Effect = "Allow"
        Action = [
          "s3:ListBucket",
          "s3:GetBucketLocation",
        ]
        Resource = [
          "arn:aws:s3:::my-super-bucket-2022-11"
        ]
      }]
    })
  }
}

For more information on the OIDC token and the possible values for our Condition, see “About security hardening with OpenID Connect” in the GitHub documentation.
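To build intuition for how the Condition narrows who may assume the role, here is a small local simulation of the `StringLike` operator applied to the token's `aud` and `sub` claims. It is an approximation written for illustration, not AWS's evaluation engine: it only models the `*` and `?` wildcards, and the helper names are hypothetical. The branch-specific subject below follows the `repo:<org>/<repo>:ref:refs/heads/<branch>` shape that GitHub documents for branch-triggered jobs.

```python
import re


def string_like(pattern: str, value: str) -> bool:
    """Approximate IAM StringLike: '*' matches any run of characters,
    '?' matches exactly one character; matching is case-sensitive."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


def trust_policy_allows(conditions: dict, claims: dict) -> bool:
    """All condition keys must be present in the claims and match their pattern."""
    return all(
        key in claims and string_like(pattern, claims[key])
        for key, pattern in conditions.items()
    )


# The condition from role.tf: any job in octo-org/octo-repo.
any_ref = {"aud": "sts.amazonaws.com", "sub": "repo:octo-org/octo-repo:*"}
# A stricter variant: only jobs triggered from the main branch.
main_only = {"aud": "sts.amazonaws.com", "sub": "repo:octo-org/octo-repo:ref:refs/heads/main"}

# Hypothetical claim sets for three different jobs.
main_job = {"aud": "sts.amazonaws.com", "sub": "repo:octo-org/octo-repo:ref:refs/heads/main"}
feature_job = {"aud": "sts.amazonaws.com", "sub": "repo:octo-org/octo-repo:ref:refs/heads/feature-x"}
other_repo_job = {"aud": "sts.amazonaws.com", "sub": "repo:octo-org/other-repo:ref:refs/heads/main"}

for name, job in [("main", main_job), ("feature", feature_job), ("other repo", other_repo_job)]:
    print(name, trust_policy_allows(any_ref, job), trust_policy_allows(main_only, job))
```

Swapping the wildcard subject for the exact main-branch subject is all it takes to lock the role down to a single branch, at the cost of needing a separate role (or condition) for anything else.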

Configure GitHub Actions

There is nothing to configure on the GitHub Actions side; we just need a pipeline that correctly defines the permissions for getting tokens.

# .github/workflows/list-buckets.yaml
# ...
permissions:
  id-token: write # This is required for GitHub OpenID Connect in AWS
  contents: read # This is required for actions/checkout
# ...

When you specify workflow permissions, all unspecified permissions are set to no access

Conclusion

In this article we explored a modern alternative for managing cloud resources from our automated jobs, one that simplifies key management while increasing security by removing static keys.

Cloud providers and CI/CD providers are adopting open standards like OpenID Connect, which make it possible to use concepts from Zero-Trust security in our environments, leveraging the context of a request (a job in a trusted CI/CD partner, its branch, etc.) to determine what actions are allowed.

With these secure, easy-to-manage and flexible foundations set, next we would like to explore other tools we have at our disposal to manage security and delegation, integrating AWS Permissions Boundaries and GitHub Actions job context into our workflow.
