Using Amazon ECS with AWS Fargate to automate Azure DevOps Hosted Agents | Microsoft Workloads on AWS

In this blog post, we will show you how to use Amazon Elastic Container Service (Amazon ECS) with AWS Fargate as hosted agents to deploy applications to Amazon Web Services (AWS) using Microsoft Azure Pipelines. This is a continuation of a previous post about using Amazon Elastic Compute Cloud (Amazon EC2) Auto Scaling with your self-hosted Amazon EC2-based agents to deploy to AWS.

AWS offers various services to build and deploy applications, including AWS CodeBuild, AWS CodePipeline, and Amazon CodeCatalyst. If you’re using Microsoft’s Azure DevOps, you can also use Azure Pipelines to build and release applications on AWS. Azure Pipelines works with both the cloud-based Azure DevOps Services and the on-premises Azure DevOps Server.

AWS customers using Azure DevOps (referred to as ADO from here onward) for their CI/CD pipelines can use self-hosted agents to build, test, and deploy AWS applications. Self-hosted agents provide more control and customization.

With self-hosted Azure Pipelines static agents on Amazon EC2 instances, there is no built-in dynamic scaling capability for agent pools. Providing too few agents can lead to long build times due to insufficient capacity. On the other hand, provisioning too many agents will result in paying for excess unused capacity when they are idle.

In this blog post, we will demonstrate how to use Amazon ECS with AWS Fargate to orchestrate container-based, dynamic, on-demand, self-hosted agents, which will provide a simple, secure, and automated solution for your ADO agent pools. Note: Because the agents run as containers themselves, this solution is not suitable for pipeline jobs that include building container images.

We will break up this solution into two parts. First, we will show you how you can use AWS developer tools to build and push a customized agent image to Amazon Elastic Container Registry (Amazon ECR). Then we’ll show you how to provision self-hosted agents on demand by using the ADO agent pool’s Approvals and Checks.

Figure 1 includes a Terraform stack that is designed to simplify the deployment process. When you deploy this Terraform stack, it creates an AWS CodeCommit repository, AWS CodePipeline, Amazon ECR repository, Amazon ECS cluster with a task definition, AWS Lambda functions and Amazon API Gateway (item 1).

The task definition is configured with CPU and memory at 0.25 vCPU and 512 MiB, respectively, along with environment variables for ADO-related parameters. While the provided Terraform stack and Dockerfile help deploy the stack with a custom image to be built and pushed to Amazon ECR, you could alternatively choose to make use of your own existing tools and container images. The AWS CodeCommit repository contains a Dockerfile with instructions on how to build the ADO agent Docker image, along with any application-specific build tooling. The AWS CodePipeline is configured to source from AWS CodeCommit, and it has an AWS CodeBuild stage that builds and pushes a container image, including the ADO agent software, to an Amazon ECR repository.
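To make the task sizing concrete, here is a minimal sketch of what such a task definition could look like when expressed as keyword arguments for the Amazon ECS RegisterTaskDefinition API. The family name, container name, and the AZP_* variable names are assumptions for illustration (the AZP_* names follow the convention of Microsoft's containerized agent samples); the Terraform stack may use different names, and injecting the PAT through the `secrets` field is likewise an assumption.

```python
def build_task_definition(image_uri, execution_role_arn, task_role_arn,
                          ado_org_url, agent_pool, pat_secret_arn):
    """Return keyword arguments shaped for ECS RegisterTaskDefinition.

    All names below (family, container name, AZP_* variables) are
    hypothetical; adjust them to match your own stack.
    """
    return {
        "family": "ado-agent",                    # hypothetical family name
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",                  # required for Fargate tasks
        "cpu": "256",                             # 256 CPU units = 0.25 vCPU
        "memory": "512",                          # MiB
        "executionRoleArn": execution_role_arn,   # pulls the image, reads the secret
        "taskRoleArn": task_role_arn,             # used by pipeline steps at runtime
        "containerDefinitions": [
            {
                "name": "ado-agent",
                "image": image_uri,
                "essential": True,
                # ADO-related parameters passed as environment variables
                "environment": [
                    {"name": "AZP_URL", "value": ado_org_url},
                    {"name": "AZP_POOL", "value": agent_pool},
                ],
                # Assumption: the PAT is injected from AWS Secrets Manager
                "secrets": [
                    {"name": "AZP_TOKEN", "valueFrom": pat_secret_arn},
                ],
            }
        ],
    }
```

With boto3 available, the returned dictionary could be passed as `ecs_client.register_task_definition(**kwargs)`. Keeping the builder free of AWS calls makes it easy to inspect or unit test the sizing before anything is deployed.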

When an Azure Pipelines job is triggered in ADO (item 2), it invokes the Amazon API Gateway endpoint, which is configured in the ADO agent pool’s settings – Approvals and Checks option.

Amazon API Gateway invokes the integrated Lambda function (create_ecs_task), which, in turn, triggers the Amazon ECS task (item 3). A response is sent back to ADO, and the pipeline waits for the agent to be provisioned. The Amazon ECS cluster uses the container image retrieved from the Amazon ECR repository to run an Amazon ECS task. Simultaneously, the create_ecs_task function invokes the get_ecs_task Lambda function, passing details from the initial API call, including job specifics, in the Lambda event. The get_ecs_task function then begins polling the status of the Amazon ECS task. When it detects that the task is in the RUNNING state, it sends a callback to the ADO agent pool’s Approvals and Checks process so that the pipeline can proceed. The Lambda functions, therefore, act as intermediaries between ADO and the Amazon ECS task by creating an agent and reporting its availability to the ADO agent pool.
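The polling behavior described for get_ecs_task can be sketched as follows. This is not the code shipped with the solution; it is a minimal illustration, assuming the task status is read through an injected `get_status` callable so the loop can be exercised without AWS access. The function name, attempt count, and delay are hypothetical.

```python
import time

# ECS task states that mean the task will never reach RUNNING again
STOPPING_STATES = {"DEACTIVATING", "STOPPING", "DEPROVISIONING", "STOPPED"}


def wait_until_running(get_status, max_attempts=30, delay_seconds=5.0,
                       sleep=time.sleep):
    """Poll get_status() until it returns RUNNING.

    Returns True when the task reaches RUNNING, and False if the task
    is stopping or stopped, or if max_attempts is exhausted first.
    """
    for attempt in range(max_attempts):
        status = get_status()
        if status == "RUNNING":
            return True
        if status in STOPPING_STATES:
            return False
        # Wait between polls, but not after the final attempt
        if attempt < max_attempts - 1:
            sleep(delay_seconds)
    return False


# In a Lambda function, get_status would typically wrap the ECS
# DescribeTasks API, for example (requires boto3, not run here):
#   ecs.describe_tasks(cluster=cluster, tasks=[task_arn])["tasks"][0]["lastStatus"]
```

Only when this returns True would the function send the success callback to ADO. Bounding the loop matters in practice because a Lambda invocation has a maximum duration, so the attempt count and delay should fit inside the function's configured timeout.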

When the Amazon ECS task container instance transitions to the RUNNING state, it gets registered in the ADO agent pool. Azure Pipelines can then use the Amazon ECS task to run the pipeline. It’s important to note that the lifespan of the Amazon ECS task is directly tied to the duration of the corresponding pipeline job within ADO.

The Amazon ECS container instance authenticates with ADO to enroll in the agent pool by using a personal access token (PAT). Access to AWS resources is handled through the Amazon ECS task role and task execution role in AWS Identity and Access Management (IAM), which eliminates the need to configure AWS credentials in ADO.

The sequence diagram in Figure 2 illustrates this agent provisioning process from the initial point of the pipeline being triggered in ADO, to agent provisioning, to registering with the pool, to completing the pipeline job.
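One step in this sequence, the callback that lets the pipeline proceed, can be sketched as below. The sketch assumes the check forwards the default headers of the Azure Pipelines Invoke REST API task (PlanUrl, ProjectId, HubName, PlanId, JobId, TaskInstanceId, AuthToken) and that completion is signaled with a TaskCompleted event. The function name is hypothetical, and the solution's Lambda code may build the request differently.

```python
def build_ado_callback(headers, succeeded=True):
    """Build the URL and JSON body for an ADO task-completion callback.

    `headers` is assumed to hold the values ADO sent with the original
    Invoke REST API call. Nothing is sent here; the caller must POST
    the body to the URL, authenticated with headers["AuthToken"].
    """
    plan_url = headers["PlanUrl"].rstrip("/")
    url = (
        f"{plan_url}/{headers['ProjectId']}/_apis/distributedtask/hubs/"
        f"{headers['HubName']}/plans/{headers['PlanId']}/events"
        "?api-version=2.0-preview.1"
    )
    body = {
        "name": "TaskCompleted",
        "taskId": headers["TaskInstanceId"],
        "jobId": headers["JobId"],
        "result": "succeeded" if succeeded else "failed",
    }
    return url, body
```

Separating request construction from sending keeps the identifiers easy to verify in isolation, which helps when a check stays pending because one of the IDs in the callback does not match the waiting job.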

Here are the prerequisites to use this solution for your Azure Pipelines agents:

The first procedure is to create the AWS components needed to deploy the Azure Pipelines agents.

With the Terraform stack deployed, you will need to prepare a few more resources to ensure the solution functions as expected. At this stage, there will be an AWS CodeCommit repository created as listed in the output clone_url_http_grc.

1. Navigate locally to the ado_agent_repo subdirectory of amazon-ecs-for-azure-devops-hosted-agents.

2. Run the following commands to push to your AWS CodeCommit repository. Replace with the region you are currently using:

This will automatically trigger the AWS CodePipeline ado-ecs-runner-pipeline, which will use AWS CodeBuild to build and push the container image for the Azure Pipelines agent. You can navigate to AWS CodePipeline and AWS CodeBuild in the AWS Management Console to ensure this process completes.

3. Once complete, validate that the image is published by navigating to Amazon ECR in the AWS Management Console and checking for the ado-ecs-ecr repository, which should contain an image with the tag ado-ecs. This is the container image that will be used by the Amazon ECS task later in the process.

4. Refer to the guide on updating a secret value in AWS Secrets Manager, and update the ADO PAT value (generated as part of the prerequisite steps) as plaintext in the AWS Secrets Manager console. The secret to update is the one created during the Terraform deployment: ecs-ado-pat-secret.

You are now ready to configure ADO to use Amazon ECS by following these steps.

1. In ADO, configure a Service Connection of type Generic with the Server URL value set to the API Gateway endpoint listed as ecs_ado_api_invoke_url in the Terraform output. Ensure that Grant access permissions to all pipelines is checked (Figure 4).

2. Configure the agent pool’s Approvals and Checks by navigating to the ADO project settings and choosing Agent Pools. Select the agent pool that the dynamic agents need to be assigned to. Choose Approvals and checks, and then choose the “+” sign to add an “Invoke REST API” check (Figure 5).

3. Select the service connection created in the previous step. The Headers field will be populated automatically, as shown in Figure 6.

Figure 6: Configuring Approvals and Checks using Invoke REST API for the agent pool

4. Next, you will create an empty repository to test the agent running in Amazon ECS as a short-lived task. Create the repository in your project using the provided instructions.

5. Create a new file in the repository with the name ecs-ado-pipeline.yaml. This will open an inline editor in the console.

Add the following content and commit changes into the main branch. Replace [agent-pool-name] with the agent pool you previously configured.

6. Create a new pipeline for testing by navigating to Pipelines, and choose New Pipeline.

7. Select “Azure Repos Git” as the source for the pipeline. Choose the repository you just created.

8. Choose “Existing Azure Pipelines YAML file”.

9. Select ecs-ado-pipeline.yaml from the drop-down, and choose Continue (Figure 7).

1. Once you have created the Azure Pipeline, choose Run. This will display a prompt for permissions on the agent pool (Figure 8). Choose Permit to proceed.

2. This will trigger Approvals and Checks, which will, in turn, invoke Amazon ECS to provision an ADO agent. At this stage, ADO makes an API call with the payload and waits for a callback from the create-ecs-task Lambda function (Figure 9).

3. Navigate to the Amazon ECS console, and choose your Amazon ECS cluster.

4. Choose the Tasks tab to access active tasks.

5. There should be a single task in this cluster at this stage, with a Last status of `Running`, as shown in Figure 10.

6. In the ADO agent pool, once the Approvals and Checks has passed, the job is queued to run (Figure 11).

Figure 11: Screenshot indicating successful completion of `Approvals and Checks`

7. The job will run on an agent that is running as an Amazon ECS task. Figure 12 displays example output.

8. If you choose the ‘Test aws cli’ step, it will show (Figure 13) that the task running in Amazon ECS was able to successfully assume the task IAM role and run the example command to fetch caller identity information.

Figure 13: Screenshot indicating agent task running with configured IAM Role
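When checking the caller identity output from step 8, the value to look at is the ARN of the assumed role. As a small illustration, the sketch below parses an AWS STS assumed-role ARN so you can confirm that the identity belongs to the expected task IAM role. The function name and the example role name are hypothetical; the ARN layout (`arn:aws:sts::<account>:assumed-role/<role-name>/<session-name>`) is the standard STS format.

```python
def parse_assumed_role_arn(arn):
    """Split an STS assumed-role ARN into account, role name, and session name.

    Raises ValueError if the ARN is not an assumed-role ARN.
    """
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "sts":
        raise ValueError(f"not an STS ARN: {arn!r}")
    resource = parts[5].split("/")
    if len(resource) != 3 or resource[0] != "assumed-role":
        raise ValueError(f"not an assumed-role ARN: {arn!r}")
    return {
        "account": parts[4],
        "role_name": resource[1],
        "session_name": resource[2],
    }


# Example with a hypothetical task role name
identity = parse_assumed_role_arn(
    "arn:aws:sts::123456789012:assumed-role/ado-ecs-task-role/abc123"
)
print(identity["role_name"])
```

If the role name in the output is the task role rather than the task execution role, the pipeline steps are running with the permissions you granted to the agent and not with the permissions Amazon ECS uses to start the container.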

Deploying this solution will provision resources and incur costs. Once you have completed testing and you no longer need the setup, remove provisioned resources to avoid unintended costs:

In this blog post, we demonstrated a solution that uses Amazon ECS, AWS Fargate, and AWS Lambda to provision ephemeral, container-based Azure Pipelines agents on demand. Once a pipeline is triggered, the agent, running as an Amazon ECS task, is automatically registered. When the pipeline job is completed, the agent gets de-registered from the ADO agent pool and the Amazon ECS task is de-provisioned, stopped, and deleted.

This solution will help you build cost-optimized, secure, hosted agents for running Azure Pipelines to deploy applications to your AWS account. When deploying your cloud-based applications to AWS using Azure DevOps, you can also use the AWS Toolkit for Azure DevOps to provide you with specific functionality you need to deploy any workload to AWS.

AWS has significantly more services, and more features within those services, than any other cloud provider, making it faster, easier, and more cost effective to move your existing applications to the cloud and build nearly anything you can imagine. Give your Microsoft applications the infrastructure they need to drive the business outcomes you want. Visit our .NET on AWS and AWS Database blogs for additional guidance and options for your Microsoft workloads. Contact us to start your migration and modernization journey today.

Nagaraju is a seasoned DevOps Architect at AWS, UKI. He specializes in assisting customers in designing and implementing secure, scalable, and resilient hybrid and cloud-native solutions with DevOps methodologies. With a profound passion for cloud infrastructure, observability and automation, Nagaraju is also an avid contributor to Open-Source projects related to Terraform and AWS CDK.

Debojit is a DevOps consultant who specializes in helping customers deliver secure and reliable solutions using AWS services. He concentrates on infrastructure development and building serverless solutions with AWS and DevOps. Apart from work, Debojit enjoys watching movies and spending time with his family.

Devashish is a dedicated DevOps consultant specialising in scalable, reliable, and secure architectures by leveraging AWS services and DevOps methodologies. Devashish enjoys working with AWS customers to help solve their unique problems. Beyond work, Devashish finds joy in playing tennis and exploring new destinations through travel, cherishing moments with his family along the way.