26.3.2021

AWS multi-account CI/CD with Gitlab runners


Miguel Fontanilla, senior DevOps engineer, and Jan Carsten Lohmüller, platform engineer, explain how the platform team re-architected sennder’s entire cloud infrastructure into a new cloud setup called SennCloud. With multiple AWS accounts, we are able to boost our teams’ productivity levels while scaling and paving the way towards self-service infrastructure.

SennCloud, based on AWS, is designed with cloud computing best practices in mind. It breaks with sennder’s previous setup, in which a single AWS account aggregated all environments and resources. Instead, we now have individual, separate AWS accounts, paving the way for new security features and, equally important, new ways of working.

What are we using (and why)?

sennder started off its journey with one AWS account. The account hosted everything:

  • Our services

  • Our CI/CD infrastructure

  • Access management

  • Business intelligence applications

Moving away from that single account, we rely on AWS Control Tower, which lets us:

  • Roll out AWS accounts on-demand

  • Scale fast

  • Seamlessly apply policies, enrolling access management

  • Maintain centralized control across the whole organization

Control Tower is enabled in our root account, the first entry point to our cloud world. It creates two AWS accounts: one for audits and another for log archives. These two accounts are associated with an organizational unit (OU) called the core OU.

After those accounts are enrolled, we can choose how to structure our overall cloud setup. That raises a few questions:

  • Do AWS accounts represent environments?

  • Does my team get its own AWS account?

  • How do we monitor the overall setup?

  • Who is responsible for maintaining the development environments?

  • And the trickiest one: What is an environment?

How did we connect the different parts?

Since we use GitLab as a SaaS git repository manager, we decided to use GitLab CI to set up our pipelines. This approach allows us to store the pipeline definitions alongside the code.

Runner types

Because the pieces of software we build and their pipeline configurations differ, not all jobs require the same amount of resources from the cluster. So we introduced three types of runners, deployed using the GitLab Runner Helm chart with different values:

  • Small runners: Intended for lightweight pipelines, mostly based on command-line interfaces (CLIs): terraform, kubectl, helm, lint processes.

  • Medium runners: Used for intensive pipelines such as Docker image building.

  • Big runners: Dedicated to extremely intensive workloads such as compilation, webpage rendering, or performance testing.

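# Base image pulled from our private ECR registry
image: xxxx.dkr.ecr.eu-central-1.amazonaws.com/platform/docker-base-images/docker-19.03-tf-13:latest

# Hidden job (leading dot): a template that other jobs can reuse through `extends`
.prepare:
 variables:
   # Plan file names are derived from the environment the job targets
   PLAN: ${CI_ENVIRONMENT_NAME}.plan.tfplan
   PLAN_JSON: ${CI_ENVIRONMENT_NAME}.tfplan.json
   TF_ROOT: ${CI_PROJECT_DIR}/iac/terraform/app
 before_script:
   - cd ${TF_ROOT}
   - terraform init
 # Route the job to the small runners
 tags:
   - small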
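
The `tags` entry is what routes a job to a runner size: GitLab only assigns a tagged job to a runner that carries every tag the job lists. The following is a minimal Python sketch of that matching rule, not GitLab's own implementation. The `medium-gitlab-runner` and `big-gitlab-runner` names and all tag sets are hypothetical, and untagged jobs are not modelled.

```python
from typing import Dict, FrozenSet, Iterable, List

# Hypothetical runner fleet, one entry per runner size described above.
# Only "small-gitlab-runner" appears in the article; the rest is assumed.
RUNNERS: Dict[str, FrozenSet[str]] = {
    "small-gitlab-runner": frozenset({"small"}),
    "medium-gitlab-runner": frozenset({"medium"}),
    "big-gitlab-runner": frozenset({"big"}),
}


def eligible_runners(
    job_tags: Iterable[str],
    runners: Dict[str, FrozenSet[str]] = RUNNERS,
) -> List[str]:
    """Return the runners that carry every tag requested by the job."""
    wanted = frozenset(job_tags)
    if not wanted:
        # Untagged jobs follow a separate runner setting that is not modelled here.
        raise ValueError("this sketch only handles tagged jobs")
    return sorted(name for name, tags in runners.items() if wanted <= tags)


if __name__ == "__main__":
    print(eligible_runners(["small"]))
    print(eligible_runners(["small", "big"]))
```

A job tagged `small` can therefore only land on the small runners, while a job that lists two sizes at once would find no runner in this fleet.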

Delegated roles

Once the runners were in place, the next step was to ensure that they could deploy applications and infrastructure into the different accounts that make up SennCloud. This was done by means of AWS Identity and Access Management (IAM) roles. In each account that the GitLab CI pipelines deploy to, we added a delegated role that grants the permissions required for the resources deployed there.

The next step was to determine which entity would assume those delegated roles. The simplest solution would have been to authorize the role attached to the EKS nodes’ instance profiles to assume the delegated roles. With that approach, however, any pod running in the cluster could assume the delegated roles and deploy infrastructure into the accounts. To avoid that risk, we used kube2iam, a solution that associates AWS IAM roles with Kubernetes pods, to make sure that only the GitLab Runner pods can assume the roles and deploy.

apiVersion: v1
kind: Pod
metadata:
  annotations:
    iam.amazonaws.com/role: arn:aws:iam::xxxx:role/cicd-kube2iam-gitlab_runner_role
    kubernetes.io/psp: eks.privileged
  labels:
    app: small-gitlab-runner-gitlab-runner
    chart: gitlab-runner-0.25.0
    heritage: Helm
    pod-template-hash: 6b6cf4b866
    release: small-gitlab-runner
  name: small-gitlab-runner-gitlab-runner-6b6cf4b866-nkrw4
  namespace: default
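
For the runner role to assume a delegated role, the delegated role has to trust it. The article does not show that policy, so the following Python sketch is only an assumption about how such a trust relationship could be expressed. The account ID is a hypothetical placeholder and the helper name is made up for illustration; the runner role name is taken from the pod annotation above.

```python
import json
from typing import Any, Dict

# Hypothetical 12-digit ID of the account that hosts the runners (masked in the article).
CICD_ACCOUNT_ID = "111111111111"
# Role name taken from the kube2iam pod annotation.
RUNNER_ROLE_NAME = "cicd-kube2iam-gitlab_runner_role"


def delegated_role_trust_policy(account_id: str, role_name: str) -> Dict[str, Any]:
    """Build a trust policy that lets a single role assume the delegated role."""
    runner_role_arn = f"arn:aws:iam::{account_id}:role/{role_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Only the runner role is trusted, not the node instance role.
                "Principal": {"AWS": runner_role_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    policy = delegated_role_trust_policy(CICD_ACCOUNT_ID, RUNNER_ROLE_NAME)
    print(json.dumps(policy, indent=2))
```

The runner role would additionally need an identity policy that allows `sts:AssumeRole` on the delegated role ARNs; that part is omitted here.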

AWS profiles & Terraform

As we said before, the infrastructure is provisioned by means of Terraform, and it needs to know which IAM Role to use when deploying to a specific account. Terraform handles account switching by means of AWS profiles that can be specified as environment variables or within the Terraform code itself. The following snippet shows a provider definition that will use the admin profile. Profiles are also specified within the backend configuration so that Terraform can access the remote state buckets where it will persist its data.

provider "aws" {
 region              = var.region
 allowed_account_ids = var.allowed_account_ids
 profile             = "admin"
}
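
The environment-variable route mentioned above can be sketched as follows. This is a minimal Python illustration, assuming that each pipeline environment name matches an AWS profile name (`admin`, `cicd`, `dev`, `prod`); that mapping and the `plan_invocation` helper are hypothetical. The plan file name follows the `${CI_ENVIRONMENT_NAME}.plan.tfplan` pattern used in our pipeline variables.

```python
import os
from typing import Dict, List, Mapping, Optional, Tuple

# Assumed one-to-one mapping between environment names and AWS profile names.
KNOWN_PROFILES = ("admin", "cicd", "dev", "prod")


def plan_invocation(
    environment: str,
    base_env: Optional[Mapping[str, str]] = None,
) -> Tuple[Dict[str, str], List[str]]:
    """Return the process environment and argv for a `terraform plan` run."""
    if environment not in KNOWN_PROFILES:
        raise ValueError(f"no AWS profile known for environment {environment!r}")
    env = dict(os.environ if base_env is None else base_env)
    # AWS_PROFILE selects the profile from ~/.aws/config for this process.
    env["AWS_PROFILE"] = environment
    argv = ["terraform", "plan", f"-out={environment}.plan.tfplan"]
    return env, argv


if __name__ == "__main__":
    env, argv = plan_invocation("dev")
    print(env["AWS_PROFILE"], argv)
    # To actually run it: subprocess.run(argv, env=env, check=True)
```

An explicit `profile` argument in the provider block would normally take precedence over `AWS_PROFILE`, so this only applies where the provider leaves the profile unset.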

In order for Terraform to assume the role associated with a specific profile, the profile configuration needs to be placed in the ~/.aws/config file within the system where Terraform CLI is executed. In our case, it is executed inside a CI/CD pipeline, which runs in a Docker container within a Kubernetes cluster. To make the profiles available within the pipeline container, we use tailored runner images in which the config file is added during the image build process.

[profile admin]
role_arn = arn:aws:iam::xxxxxxxxxxxx:role/cicd/delegated-role
credential_source = Ec2InstanceMetadata
[profile cicd]
role_arn = arn:aws:iam::xxxxxxxxxxxx:role/cicd/delegated-role
credential_source = Ec2InstanceMetadata
[profile dev]
role_arn = arn:aws:iam::xxxxxxxxxxxx:role/cicd/delegated-role
credential_source = Ec2InstanceMetadata
[profile prod]
role_arn = arn:aws:iam::xxxxxxxxxxxx:role/cicd/delegated-role
credential_source = Ec2InstanceMetadata
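
Since the four profiles differ only in the account ID, a file like this could be generated during the image build instead of being maintained by hand. The sketch below uses Python's standard `configparser`; the account IDs are hypothetical placeholders (the real ones are masked above), and `render_aws_config` is an illustrative helper, not part of our tooling.

```python
import configparser
import io
from typing import Mapping

# Hypothetical account IDs; the real ones are masked in the article.
ACCOUNTS = {
    "admin": "111111111111",
    "cicd": "222222222222",
    "dev": "333333333333",
    "prod": "444444444444",
}


def render_aws_config(
    accounts: Mapping[str, str],
    role_path: str = "cicd/delegated-role",
) -> str:
    """Render the contents of an AWS config file with one profile per account."""
    parser = configparser.ConfigParser()
    for profile, account_id in accounts.items():
        parser[f"profile {profile}"] = {
            "role_arn": f"arn:aws:iam::{account_id}:role/{role_path}",
            # Base credentials come from the instance metadata service.
            "credential_source": "Ec2InstanceMetadata",
        }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_aws_config(ACCOUNTS))
```

The rendered text can then be copied to `~/.aws/config` in the runner image.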

How does SennCloud boost our business?

After reading our story of launching a new cloud setup called SennCloud for sennder, you might ask yourself, “But why? I am running on a single AWS account and everything works fine.” We understand that there are a hundred ways of architecting a platform and there is definitely no right or wrong way. The key to enabling teams to use the platform as successfully as possible is to architect the platform in close collaboration with the teams.

We improved the software development life cycle (SDLC) in our organization by segregating everything else from the production environments and enabling every team member to work on their infrastructure. It might seem risky, but nowadays it is very possible with a combination of consolidated billing and guardrails in place. Be aware that you still need to monitor what is happening on your platform. Teams will forget to clean up after themselves and will need some guidance with their initial setup. We see this as an investment into our future, boosting the capabilities of our teams to new levels.

In order to gain experience and confidence in the new platform, you need rapid feedback loops. Our new CI/CD flow supports this because our pipelines can build software in a fast and reliable manner, making it easy for the teams to maneuver. Enhanced observability also supports this approach. Again, the key is to work closely with the teams. If the teams don’t know what is failing, you will have a hard time explaining it to them. But if they already know upfront where and how to dig deeper into their setup, real DevOps happens.

One topic that’s still on our plate is our production environments. Isolating production and development environments from each other seems right at first glance, but can definitely be improved. In the coming months, we will be working on new solutions to enable teams to maintain their own production software, paving the way for self-service infrastructure.

If you’re interested in learning more about sennder, our culture, and open positions, visit our career site.

You can also find our tech blog on Medium and follow us to stay updated on new posts.


About the authors

Miguel Fontanilla (he/him)

Hello, hello, my name is Miguel and I joined sennder in October of 2020 to become a member of the platform team. Apart from being an amateur kite-surfer and snowboarder, I am a senior DevOps engineer and I’m always eager to investigate and try new technologies and tools within the modern software engineering landscape. I also love to help people understand these technologies, and that’s why I run my own blog and participate in open source projects and communities.

Jan Carsten Lohmüller (he/him)

Hi, I am Jan Carsten, platform engineer, software developer, coffee enthusiast, and passionate cyclist. I have been involved in both startups as well as established global organizations and research. I joined sennder in August 2020 as a consultant from Netlight. My passion is sharing knowledge, enabling teams, and tweaking my .zshrc. My current scope of work is building a new cloud infrastructure framework for sennder’s engineering teams.
Also, check out Jan Carsten’s Medium page.