AWS Lambda began supporting container images as the execution environment about a year ago. That opened the door to using other Linux distros. I dug into this because of a use case I had a few months ago, and I found that creating a container image is actually much smoother than trying to achieve a similar goal with a Lambda layer or a deployment package. You can use this article as a starting point for bootstrapping your own image.

Background

AWS Lambda, as we used to know it, was based on the original Amazon Linux or Amazon Linux 2. This is great for a lot of use cases if your code does not require any special binaries from the operating system. Even if it does, you can often get away with layers or deployment packages, combined with tweaked environment variables and customized configure and make commands. For one use case I had, this trick became very hard to manage. A package I needed to include in my function required a newer version of gcc. Then I ran into other issues when trying to install a newer gcc, because make was also a very old version. Because of this chain of dependencies, I decided to give container images a try. This time, I wanted to use a different Linux distro, just to see if Lambda would play nicely with it.

For testing purposes, I’m going to use Ubuntu 20.04. It has packages that are new enough for my use case, so it was a no-brainer choice.

Bootstrapping a Docker Image

To begin, I need to have Docker installed. I use two Macs, one with an Intel chip and one with Apple Silicon. Besides the command-line tools, I also have the Docker Desktop application running in the background.

Next, I need a Dockerfile. I started from the example provided in the AWS documentation, but soon realized it didn’t meet my requirements. I do need the additional binaries installed in the image itself, so a multi-stage build would simply not work. Therefore, I added and removed some commands. It now has these features:

The image is based off Ubuntu 20.04 (Focal Fossa).
Interactive questions are disabled, so apt will not prompt and stall during installation.
To keep the article simple, I only install Python 3 from apt here.
For Python libraries, I have awslambdaric, boto3, and botocore.

Here is the Dockerfile I have:

FROM public.ecr.aws/ubuntu/ubuntu:focal

# Define function directory
ARG FUNCTION_DIR="/function"

# Remove interactive questions
ARG DEBIAN_FRONTEND="noninteractive"

# Install build dependencies
RUN apt update && \
  apt install -y \
    python3 \
    python3-pip

# Create function directory
RUN mkdir -p ${FUNCTION_DIR}

# Copy function code
COPY app/* ${FUNCTION_DIR}/

# Install the runtime dependencies
RUN pip3 install --target ${FUNCTION_DIR} \
  awslambdaric \
  boto3 \
  botocore

# Set working directory to function root directory
WORKDIR ${FUNCTION_DIR}

# Set up the entry point and command parameters
ENTRYPOINT [ "/usr/bin/python3", "-m", "awslambdaric" ]
CMD [ "lambda_function.lambda_handler" ]

For Python, only the awslambdaric library is mandatory, as it is the Runtime Interface Client for AWS Lambda. An equivalent also exists for Node.js. There is a list of supported programming languages here.

I am also using the Amazon ECR Public Gallery for the base image. Pulling it from Docker Hub would work in a similar way.

Next, based on the Dockerfile, I also need a directory called app/ for my Lambda function source files. In this example, there is just one, lambda_function.py:

.
├── Dockerfile
└── app
    └── lambda_function.py

The Lambda function in this example just echoes back the request, along with CORS headers for API Gateway proxy integration:

import json
import logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
  logger.info('Event: {}'.format(json.dumps(event)))
  response = {
    'statusCode': 200,
    'body': json.dumps(event),
    'headers': {
      'Access-Control-Allow-Origin': '*',
      'Access-Control-Allow-Credentials': '*'
    }
  }
  logger.info('Response: {}'.format(json.dumps(response)))
  return response
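
As a quick sanity check before building the image, the handler can also be called directly with a hand-written event. This is only a minimal sketch: the handler is repeated so the snippet runs on its own, and the sample event and the None context are placeholders I made up, not what API Gateway would actually send.

```python
import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def lambda_handler(event, context):
    """Same echo handler as above, repeated so this snippet is self-contained."""
    logger.info('Event: {}'.format(json.dumps(event)))
    response = {
        'statusCode': 200,
        'body': json.dumps(event),
        'headers': {
            'Access-Control-Allow-Origin': '*',
            'Access-Control-Allow-Credentials': '*'
        }
    }
    logger.info('Response: {}'.format(json.dumps(response)))
    return response


# Hypothetical test event; a real API Gateway proxy event has many more fields.
sample_event = {'Message': 'Hello World!'}

# The handler never touches the context object, so None is enough here.
result = lambda_handler(sample_event, None)

# The body is a JSON string, so decode it before comparing with the input.
echoed = json.loads(result['body'])
print(result['statusCode'], echoed)
```

If the decoded body matches the event that went in, the function code itself is fine, and any later failure is more likely to come from the image or the runtime setup.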

Next, there are some caveats I ran into when building the image. My Intel and M1 Macs have different CPU architectures, x86_64 (a.k.a. x64 or amd64) and ARM64 respectively. Since Lambda now supports the ARM64 architecture as well, this needs some attention.

By default, Docker builds the image for the CPU architecture of the machine you run it on. Thus, you need to add a --platform parameter to build the image for the architecture you plan to use for your Lambda function. The first of the following commands builds the image for x86_64, and the second builds it for ARM64:

$ docker build --platform=linux/amd64 -t ubuntu-hello-world-x86 .
$ docker build --platform=linux/arm64 -t ubuntu-hello-world-arm64 .

Here, I gave the two images different tags. I will use the x86_64 one as the example below. The tag will be used later when pushing the image to ECR (Elastic Container Registry).
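
If the builds are scripted, the mapping between the Lambda architecture names and Docker platform strings can be kept in one place. The following is a small sketch under my own assumptions: the helper name docker_build_command and the dictionary are hypothetical, and only the two architectures mentioned in this article are covered.

```python
import shlex

# Lambda architecture name -> Docker --platform value (only the two used here).
LAMBDA_TO_DOCKER_PLATFORM = {
    "x86_64": "linux/amd64",
    "arm64": "linux/arm64",
}


def docker_build_command(lambda_arch, tag, context="."):
    """Return the `docker build` argument list for a given Lambda architecture."""
    try:
        platform = LAMBDA_TO_DOCKER_PLATFORM[lambda_arch]
    except KeyError:
        raise ValueError("unsupported Lambda architecture: {}".format(lambda_arch))
    return ["docker", "build", "--platform={}".format(platform), "-t", tag, context]


# Print the commands instead of running them, so nothing is built by accident.
for arch, tag in [("x86_64", "ubuntu-hello-world-x86"),
                  ("arm64", "ubuntu-hello-world-arm64")]:
    print(shlex.join(docker_build_command(arch, tag)))
```

To actually run a build, the returned list could be passed to subprocess.run(..., check=True). Keeping it as a list avoids shell quoting issues.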

On a Mac with Apple Silicon, installing Rosetta 2 may also be required to translate x86_64 machine code to ARM64:

$ softwareupdate --install-rosetta

Testing the Image Locally

Now, if everything completed successfully, I can run the image locally in a Docker container. To do that, I need a binary called the Lambda RIE (Runtime Interface Emulator). It can either be installed locally or embedded in the image. I opted for the former. The AWS documentation describes how to install it on the local computer. Basically, for the x86_64 emulator on macOS or Linux, run the following command:

$ mkdir -p ~/.aws-lambda-rie && curl -Lo ~/.aws-lambda-rie/aws-lambda-rie \
https://github.com/aws/aws-lambda-runtime-interface-emulator/releases/latest/download/aws-lambda-rie \
&& chmod +x ~/.aws-lambda-rie/aws-lambda-rie

This creates a directory called .aws-lambda-rie in the current user’s home directory, downloads the latest Lambda RIE binary from the GitHub repository into it, and makes it executable. Similar instructions are available for the ARM64 emulator and for Windows.

Next, I spin up a Docker container from the image I just built:

$ docker run -d -v ~/.aws-lambda-rie:/aws-lambda -p 9000:8080 --entrypoint /aws-lambda/aws-lambda-rie ubuntu-hello-world-x86 /usr/bin/python3 -m awslambdaric lambda_function.lambda_handler

This command bind-mounts the directory where I installed the Lambda RIE to /aws-lambda in the container, then maps local port 9000 to container port 8080. The --entrypoint parameter replaces the default entrypoint from the Dockerfile with the emulator, since the container is not running in the real Lambda runtime. The image tag follows, and the remaining arguments start the Lambda RIC, which listens for incoming requests.

Next, we can invoke the function with a simple cURL command:

$ curl -vsL -XPOST 'http://localhost:9000/2015-03-31/functions/function/invocations' -d '{"Message":"Hello World!"}'

If you are familiar with the Lambda API, you will notice the path here follows the same convention. The function name in this case is simply “function”. It happens to match the directory I created through the FUNCTION_DIR argument in the Dockerfile, but as far as I can tell the emulator uses this fixed name regardless of where the code lives.

The command should respond with something like this:

{"statusCode": 200, "body": "{\"Message\": \"Hello World!\"}", "headers": {"Access-Control-Allow-Origin": "*", "Access-Control-Allow-Credentials": "*"}}
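
The same local invocation can be done from Python using only the standard library, which is convenient in a test script. This is a sketch, assuming the container from the previous step is listening on localhost:9000. The helper names (build_invoke_request, invoke_local, echoed_event) are my own and not part of any AWS tooling.

```python
import json
import urllib.request

# Path served by the Runtime Interface Emulator, as in the cURL command above.
RIE_PATH = "/2015-03-31/functions/function/invocations"


def build_invoke_request(payload, host="localhost", port=9000):
    """Build a POST request carrying the JSON-encoded event."""
    url = "http://{}:{}{}".format(host, port, RIE_PATH)
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=data, method="POST")


def invoke_local(payload, host="localhost", port=9000, timeout=10):
    """Send the event to the local container and return the decoded response."""
    request = build_invoke_request(payload, host, port)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def echoed_event(response):
    """Extract the echoed event from the handler's proxy-style response."""
    return json.loads(response["body"])
```

With the container running, echoed_event(invoke_local({"Message": "Hello World!"})) should give back the same dictionary that was sent, since the handler only echoes the event.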

The Docker Desktop app comes in handy for local testing. You can check the function execution log by clicking the individual container, or you can open a shell in the container to inspect the artifacts within.

AWS Credentials

The container running locally does not contain any AWS credentials, unlike the real Lambda runtime, which assumes the execution role upon invocation. You can add environment variables to the Dockerfile for local testing, just before the ENTRYPOINT instruction. But do remember to remove them before building the image for the real Lambda environment:

# Set AWS Credentials
ENV AWS_ACCESS_KEY_ID=
ENV AWS_SECRET_ACCESS_KEY=
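
An alternative that avoids touching the Dockerfile is to hand the credentials to the container at docker run time. The sketch below is not the setup used in this article. It assumes the credentials are already exported in the host shell, and the helper names are hypothetical. It relies on the Docker behavior that -e NAME without a value copies that variable from the host environment.

```python
import os

# Variables the AWS SDK reads; the session token only exists for temporary credentials.
CREDENTIAL_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN")


def credential_flags(environ):
    """Return `-e NAME` flags for each credential variable set in `environ`.

    Only the name is passed, so the secret value never appears in the
    command line or in the image.
    """
    flags = []
    for name in CREDENTIAL_VARS:
        if environ.get(name):
            flags.extend(["-e", name])
    return flags


def docker_run_command(image, environ, rie_dir="~/.aws-lambda-rie"):
    """Assemble the local `docker run` command from the testing section."""
    return (
        ["docker", "run", "-d",
         "-v", "{}:/aws-lambda".format(os.path.expanduser(rie_dir)),
         "-p", "9000:8080"]
        + credential_flags(environ)
        + ["--entrypoint", "/aws-lambda/aws-lambda-rie", image,
           "/usr/bin/python3", "-m", "awslambdaric",
           "lambda_function.lambda_handler"]
    )
```

For example, subprocess.run(docker_run_command("ubuntu-hello-world-x86", os.environ), check=True) would start the container with whatever credentials the current shell has. The same image can then be pushed unchanged.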

Creating the Function

Next, I need to push the image to ECR before referencing it in Lambda. Pushing to ECR is rather straightforward. Simply follow steps 7 to 9 in this documentation:

$ aws ecr get-login-password --region eu-west-1 | docker login --username AWS --password-stdin 123456789012.dkr.ecr.eu-west-1.amazonaws.com
$ aws ecr create-repository --region eu-west-1 --repository-name ubuntu-hello-world-x86 --image-scanning-configuration scanOnPush=true --image-tag-mutability MUTABLE
$ docker tag ubuntu-hello-world-x86:latest 123456789012.dkr.ecr.eu-west-1.amazonaws.com/ubuntu-hello-world-x86:latest
$ docker push 123456789012.dkr.ecr.eu-west-1.amazonaws.com/ubuntu-hello-world-x86:latest
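
The image URI used in the tag and push commands follows a fixed pattern, so it can be composed rather than typed by hand. This is a minimal sketch: the function names are hypothetical, and the amazonaws.com suffix is an assumption that holds for standard commercial regions such as eu-west-1.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Compose the private ECR image URI used by `docker tag` and `docker push`."""
    registry = "{}.dkr.ecr.{}.amazonaws.com".format(account_id, region)
    return "{}/{}:{}".format(registry, repository, tag)


def tag_and_push_commands(local_image, account_id, region, tag="latest"):
    """Return the `docker tag` and `docker push` argument lists for an image.

    Assumes the ECR repository has the same name as the local image.
    """
    uri = ecr_image_uri(account_id, region, local_image, tag)
    return [
        ["docker", "tag", "{}:{}".format(local_image, tag), uri],
        ["docker", "push", uri],
    ]


# The account ID below is the placeholder used throughout this article.
for command in tag_and_push_commands("ubuntu-hello-world-x86", "123456789012", "eu-west-1"):
    print(" ".join(command))
```

The same URI is what the ImageUri value in the create-function command expects, so composing it once helps keep the two steps consistent.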

On a side note, the image can be shared with other accounts through the repository’s permission policy.

Deploying the container image to Lambda is a bit different from the usual. There is no “switch” to change a code-based Lambda function into an image-based one. You have to choose the package type when creating the function:

$ aws lambda create-function --region eu-west-1 --function-name ubuntuHelloWorldx86 --architectures x86_64 --code ImageUri=123456789012.dkr.ecr.eu-west-1.amazonaws.com/ubuntu-hello-world-x86:latest --role arn:aws:iam::123456789012:role/LambdaExecutionRole --package-type Image --memory-size 128

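Once the function state is Active, you can invoke the function. With AWS CLI v2, you may also need to add --cli-binary-format raw-in-base64-out so that the raw JSON payload is accepted: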

$ aws lambda invoke --region eu-west-1 --function-name ubuntuHelloWorldx86 --payload '{"Message":"Hello World!"}' /dev/stdout

Here it is. Now the function is running in the real Lambda environment.
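
Waiting for the Active state mentioned above can be scripted instead of checked by hand. The following is a sketch of a simple polling loop. The function names are my own, the timeout and interval are arbitrary, and the boto3 part assumes boto3 is installed and credentials are configured. It reads the State field returned by get_function_configuration.

```python
import time


def wait_until_active(get_state, timeout=300.0, interval=2.0, sleep=time.sleep):
    """Poll `get_state()` until it returns "Active".

    Raises RuntimeError if the state becomes "Failed", and TimeoutError
    if the function is still not active when the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state == "Active":
            return state
        if state == "Failed":
            raise RuntimeError("function creation failed")
        if time.monotonic() >= deadline:
            raise TimeoutError("function still {} after {}s".format(state, timeout))
        sleep(interval)


def lambda_state_getter(function_name, region):
    """Return a callable that reads the function's State through boto3."""
    import boto3  # imported lazily so the polling logic works without boto3

    client = boto3.client("lambda", region_name=region)
    return lambda: client.get_function_configuration(
        FunctionName=function_name)["State"]
```

A call such as wait_until_active(lambda_state_getter("ubuntuHelloWorldx86", "eu-west-1")) would then block until the function is ready to be invoked. Passing the state reader in as a callable keeps the loop easy to test without touching AWS.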

Conclusion

Container image support has enabled many new use cases for AWS Lambda. Although I began with the specific problem of a chain of dependencies, I didn’t focus on that topic in this article. Instead, that problem motivated me to set all this up and share it here. This is a rather minimal Dockerfile for readers who want to use a base image of their choice and install binaries via common package managers. From there, you can build something more sophisticated.

I mentioned at the beginning that I didn’t use a multi-stage build for this article, because the dependencies are installed via the apt and pip commands. However, if you are going to compile code in your container, you may want to use a multi-stage build to keep only the compiled binaries instead of having everything else jammed into your image. The concept is to have a clean base image for each compile job and copy only the final artifacts to the final image.

The reason to care about image size is the time it takes to invoke a function. The larger the image, the more time the function’s “cold start” takes.

On another side note, after going through the build process on both the Intel and the M1 Mac, I strangely feel it runs much better on the M1, even when building for the x86_64 architecture. One major reason for me to upgrade to an M1 MacBook was that the last time I tried to build an image on my old MacBook, not only did other software stop responding, but the laptop also tried to take off from my desk…

P.S. I am comparing the following MacBooks:

MacBook Pro 13" (2017, 8 GB RAM, 3.1 GHz 2-core Intel Core i5)
MacBook Pro 13" (2020, 16 GB RAM, Apple M1)