AWS CDK with Poetry 2.x and pre-commit hooks

With the Poetry 2.x announcement earlier this month, it's the perfect time to modernize your existing CDK projects. The plan is to stay current while minimizing technical debt.

By the way, if you're new to Poetry: it's a Python dependency manager, much like npm for Node.js. It essentially replaces global pip installs with a project-specific virtual environment and isolated packages.

Some highlights include:

pyproject.toml specification change
changes to Poetry CLI commands such as poetry check and poetry lock
new poetry plugins
removal of some CLI commands from the core: poetry shell and poetry export now live in plugins

And more! You can read the full announcement here.

All of my current CDK projects rely heavily on the poetry export and poetry shell commands, so here is a short step-by-step guide to how I'm migrating CDK projects to Poetry 2.x and working around the changes.

First, update Poetry to version 2.x (I use Homebrew for all macOS dependencies) and verify the version.

brew upgrade poetry
poetry --version

If you haven't got brew or Poetry installed, check out my macOS Setup Guide.

Next, remove the virtual environment (if you're migrating an existing project):

rm -rf ./.venv

Now we can update the pyproject.toml file to the new specification. Unfortunately, there isn't a --migration command to automate this process, so I recommend temporarily renaming your old file to something like pyproject.toml.bak (so you can compare the old and new configs side by side) and running poetry init to generate a new one.

To keep dependency resolution working when you install aws-cdk-lib, give the project a bounded Python range by adding requires-python = ">=3.13,<4.0" in the new pyproject.toml.

Now add CDK dependencies to the new pyproject.toml file:

poetry add aws-cdk-lib constructs

and development dependencies (if you have any):

poetry add --group dev moto pytest pytest-cov boto3 boto3-stubs

If your CDK Lambda functions have their own dependencies, add them now, each in its own group:

poetry add --group email_processor requests

If you wonder why boto3 is a dev dependency: I only need it locally for testing with moto, because boto3 already ships with the Lambda runtime by default.

At this point your pyproject.toml file should be pretty much ready. Run poetry check to ensure everything is "All set!", and regenerate the lock file with poetry lock.

Dynamic dependencies

Usually I'd run a Makefile command with poetry export to install all the lambda dependencies, but since poetry export has been moved into Poetry plugins, I've started using a new CDK approach: a reusable LambdaFunction construct that dynamically creates a requirements.txt file for each lambda's dependencies.

Here is the LambdaFunction construct:

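...

import tomllib
from pathlib import Path

import aws_cdk as cdk
from aws_cdk import Aws, aws_lambda as lambda_
from aws_cdk.aws_lambda import Runtime
from aws_cdk.aws_logs import RetentionDays
from constructs import Construct

# `config` used below is the project's own settings module (its import is elided here)

class LambdaFunction(lambda_.Function):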
    """
    Lambda function with Powertools layer and optional requirements bundling
    """
    def __init__(
        self,
        scope: Construct,
        id: str,
        path: str,
        handler: str,
        bundle: bool,
        **kwargs,
    ) -> None:
        
        location: Path = Path(path)
        # Check if handler file exists
        if not (location / (handler.split(".")[0] + ".py")).exists():
            raise FileNotFoundError(f"Handler file {handler} not found in path {path}")

        kwargs["code"] = lambda_.Code.from_asset(path)
        kwargs["handler"] = handler
        if "architecture" not in kwargs:
            # Default to ARM if not specified to save costs
            kwargs["architecture"] = lambda_.Architecture.ARM_64

        # Enable active tracing for all lambdas
        kwargs["tracing"] = lambda_.Tracing.ACTIVE

        # Powertools layer to improve logging and o11y
        power_tools_layer_name = (
            "AWSLambdaPowertoolsPythonV3-python312-x86_64:5"
            if kwargs["architecture"] == lambda_.Architecture.X86_64
            else "AWSLambdaPowertoolsPythonV3-python312-arm64:5"
        )

        kwargs["layers"] = [
            lambda_.LayerVersion.from_layer_version_arn(
                scope,
                f"{id}PowerToolsPython",
                layer_version_arn=f"arn:aws:lambda:{Aws.REGION}:017000801446:layer:{power_tools_layer_name}",
            ),
        ]
        
        if bundle:
            # Dynamically create a requirements.txt file for each lambda, using the
            # lambda's directory name as the Poetry dependency group name
            with open("pyproject.toml", "rb") as project_file:
                poetry_config = tomllib.load(project_file)
            groups = poetry_config["tool"]["poetry"]["group"]
            group = location.name

            if group not in groups:
                existing_groups = "', '".join(groups.keys())
                raise ValueError(f"Poetry dependency group '{group}' not found in pyproject.toml. Existing groups: '{existing_groups}'")

            # Pin each dependency by stripping the caret from constraints such as "^2.32.3"
            requirements = "\n".join(
                f"{dep}=={version.lstrip('^')}" for dep, version in groups[group]["dependencies"].items()
            )
            with open(location / "requirements.txt", "w") as f:
                f.write(requirements)

        if bundle and (location / "requirements.txt").exists():
            kwargs["layers"].append(
                lambda_.LayerVersion(
                    scope,
                    f"{id}RequirementsLayer",
                    description=f"Requirements file for {id}",
                    compatible_runtimes=[
                        config.lambda_runtime,
                    ],
                    compatible_architectures=[
                        kwargs["architecture"],
                    ],
                    layer_version_name=f"LayerFor{id}",
                    code=lambda_.Code.from_asset(
                        path,
                        bundling=cdk.BundlingOptions(
                            image=config.lambda_runtime.bundling_image,
                            command=[
                                "bash",
                                "-c",
                                "pip install -r requirements.txt -t /asset-output/python && cp -au . /asset-output",
                            ],
                            user="root",
                        ),
                    ),
                ),
            )
        
        # Set default log retention and log group
        if "log_retention" not in kwargs and "log_group" not in kwargs:
            kwargs["log_retention"] = RetentionDays.ONE_WEEK

        if "environment" not in kwargs:
            kwargs["environment"] = {}

        # Set default log level to DEBUG
        kwargs["environment"]["POWERTOOLS_LOG_LEVEL"] = "DEBUG"

        super().__init__(
            scope,
            id,
            runtime=Runtime.PYTHON_3_12,
            **kwargs,
        )

The class above creates a reusable Lambda construct with a Powertools layer for observability and optional bundling of requirements for each lambda. It sets the log level to DEBUG, log retention to one week, and tracing to active, and it standardizes the Lambda runtime on Python 3.12.

You can read more about Powertools for AWS Lambda here.

Let's move on to how to actually use the CDK construct. Note the bundle boolean parameter, which is responsible for creating a requirements.txt file and bundling it with the lambda code:

  email_processor = LambdaFunction(
      self,
      id="EmailProcessor",
      handler="lambda_handler.handler",
      path="path/to/email_processor",
      bundle=True,
  )
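
The construct relies on two naming conventions: the handler string must point at a file inside path, and the last directory name in path must match a Poetry dependency group. Here is a small sketch of how those two values are derived; handler_file and dependency_group are hypothetical helper names that mirror the checks inside the construct.

```python
from pathlib import Path


def handler_file(path: str, handler: str) -> Path:
    """Map a handler such as 'lambda_handler.handler' to <path>/lambda_handler.py."""
    module = handler.split(".")[0]
    return Path(path) / f"{module}.py"


def dependency_group(path: str) -> str:
    """The Poetry group name is the last component of the lambda's directory path."""
    return Path(path).name


print(handler_file("path/to/email_processor", "lambda_handler.handler"))
print(dependency_group("path/to/email_processor"))
```

So for the usage example above, the construct expects a lambda_handler.py file inside path/to/email_processor and an email_processor group in pyproject.toml.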

With Poetry 1.8.x, this is where you would run poetry shell to enter your environment and then run your cdk diff and cdk synth commands, but that command is no longer available out of the box in 2.x.

This is due to the number of issues with the subshell, from launching within different shell types to keyboard shortcuts misbehaving. Take a look at the issue tracker here.

Instead, run poetry run cdk diff and poetry run cdk synth. Both run within the virtual environment and create the dynamic requirements.txt files for each lambda.

Pre-commit hooks

With the new year, I've also decided to add pre-commit hooks to my CDK projects to ensure code quality and consistency. Here is how you can add them to your project.

Install pre-commit with brew install pre-commit (on macOS) and add a .pre-commit-config.yaml file to the root of your project (for example with touch .pre-commit-config.yaml), with the following content:

# See https://pre-commit.com for more information
# See https://pre-commit.com/hooks.html for more hooks
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v5.0.0
    hooks:
      - id: trailing-whitespace # checks for trailing whitespace.
      - id: end-of-file-fixer # fixes missing new line at the end of files.
      - id: check-docstring-first # checks for code placed before the module docstring.
      - id: check-added-large-files # prevents giant files from being committed.
      - id: check-case-conflict # checks for files that would conflict in case-insensitive filesystems.
      - id: check-merge-conflict # checks for files that contain merge conflict strings.
      - id: check-yaml # checks yaml files for parsable syntax.
      - id: check-json # checks json files for parsable syntax.
      - id: detect-private-key # detects the presence of private keys.
      - id: fix-byte-order-marker # removes utf-8 byte order marker.
      - id: mixed-line-ending # replaces or checks mixed line ending.

Et voila! Now run pre-commit install to install the git hooks. When you commit your changes, pre-commit will run the hooks and ensure your code is clean and consistent.
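
If you're curious what the first two hooks do to a file, here is a rough illustration in Python. This is not the hooks' actual implementation, just a sketch of the effect; fix_text is a hypothetical name.

```python
def fix_text(text: str) -> str:
    """Strip trailing whitespace from each line and end the file with one newline."""
    if not text:
        return text  # leave empty files untouched
    lines = [line.rstrip() for line in text.splitlines()]
    return "\n".join(lines).rstrip("\n") + "\n"


before = "import aws_cdk as cdk   \napp = cdk.App()\t\n\n\n"
after = fix_text(before)
print(repr(after))
```

The real hooks rewrite the offending files and fail the commit, so you review and stage the fixes before committing again.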

I hope this guide helps you to modernize your CDK projects with Poetry 2.x and pre-commit hooks. If you have any questions or suggestions, feel free to reach out to me on X.