Optimising AWS SnapStart and Spring Boot Java Lambdas

This article looks at optimising a Java Spring Boot application (Cloud Function style) with AWS SnapStart, and covers advanced optimisation using lifecycle management of the application image before SnapStart takes its snapshot and after it restores it. We cover optimising a lambda for conversational resources that rely on persistent network connections, such as an RDBMS (SQL), legacy messaging framework, etc.

How SnapStart Works

To improve start-up times for a cold start, SnapStart snapshots a virtual machine and restores that snapshot rather than paying the whole JVM + library startup time. For Java applications built on frameworks such as Spring Boot, this provides order of magnitude reductions in cold start time. For a comparison of raw, SnapStart and Graal Native performance see our article here.

What frameworks do we use with Spring Boot?

For our Java Lambdas we use Spring Cloud Function with the AWS Lambda Adapter. For an example of how we set this up, and links to our development frameworks and code, see our article AWS SnapStart for Faster Java Lambdas.

Default SnapStart: Simple Optimisation of the Lambda INIT phase

When the lambda version is published, SnapStart will run up the Java application to the point that the lambda is initialised. For a Spring Cloud Function application, this will complete the Spring Boot lifecycle to the Container Started phase. In short, all your beans will be constructed, injected and started from a Spring Container perspective.

AWS: Lambda Execution Lifecycle

SnapStart will then snapshot the virtual machine with all the loaded information. When the image is restored, the exact memory layout of all classes and data in the JVM is restored. Thus any data loaded in this phase by a Spring bean constructor, @PostConstruct annotated methods or ContextRefreshedEvent handlers is already in memory after the restore and does not need to be loaded again.

Issues with persistent network connections

Where this breaks down is if you wish to use a “persistent” network connection style resource, such as an RDBMS connection. Usually in a Spring Boot application a DataSource is configured and its network connections are initialised before container start. This can cause significant slow downs when restoring an image, perhaps weeks after its creation, as all the network connections will be broken.

For a self healing data source, when a connection is requested the pool will check the connection, time out, and then have to reconnect and potentially start a new transaction, for each of the configured connections in the pool. Even if you smartly set the pool size to one, given the single threaded lambda execution model, that connection timeout and reconnect may take significant time depending on network and database settings.

Advanced Java SnapStart: CRaC Lifecycle Management

Project CRaC, Coordinated Restore at Checkpoint, is a JVM project that lets an application respond to a checkpoint notification from the host before a snapshot is taken, and to the signal that a restore has occurred. The AWS Java Runtime supports integration with CRaC so that you can optimise your cold starts even under SnapStart.

At the time of our integration, we used the CRaC library to create a base class for support classes that handle “manual” tailoring of preSnapshot and postRestore events. Newer versions of Spring Boot are integrating CRaC support; see here for details.

We have created a base class, SnapStartOptimizer, that can be used to create a spring bean that can respond to preSnapshot and postRestore events. This gives us two hooks into the lifecycle:

  1. Load more data into memory before the snapshot occurs.
  2. Restore data and connections after we are running again.
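
As a rough illustration of the two hooks, the sketch below uses a stand-in interface, CheckpointAware, that mirrors the shape of the CRaC callbacks (beforeCheckpoint and afterRestore). The interface, the ReferenceDataCache class and its loader are hypothetical and are not part of our SnapStartOptimizer. In a real lambda you would implement org.crac.Resource and register the instance with the CRaC global context instead.

```java
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Stand-in for the CRaC callback pair (hypothetical, for illustration only). */
interface CheckpointAware {
    void beforeCheckpoint() throws Exception;

    void afterRestore() throws Exception;
}

/** Hypothetical cache: static data is captured in the snapshot, time-sensitive state is rebuilt on restore. */
final class ReferenceDataCache implements CheckpointAware {
    private final Supplier<Map<String, String>> loader;
    private final Map<String, String> data = new ConcurrentHashMap<>();
    private volatile Instant restoredAt;

    ReferenceDataCache(Supplier<Map<String, String>> loader) {
        this.loader = loader;
    }

    @Override
    public void beforeCheckpoint() {
        // Hook 1: pull data into memory so that it becomes part of the snapshot image.
        data.putAll(loader.get());
    }

    @Override
    public void afterRestore() {
        // Hook 2: anything tied to wall-clock time or the network is refreshed here.
        restoredAt = Instant.now();
    }

    String lookup(String key) {
        return data.get(key);
    }

    Instant restoredAt() {
        return restoredAt;
    }
}
```

The split matters because the snapshot may be restored long after it was taken: reference data is safe to freeze, while timestamps, tokens and sockets are not.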

Optimising pre snapshot

In this example we have a simple Spring Component that we use to exercise some functionality (HTTP based) to load any lazy classes, data, etc. We also exercise the lookup of our Spring Cloud Function definition bean.

@Component
@RequiredArgsConstructor
public class SnapStartOptimisation extends SnapStartOptimizer {

    private final UserManager userManager;
    private final TradingAccountManager accountManager;
    private final TransactionManager transactionManager;

    @Override
    protected void performBeforeCheckpoint() {
        // Dummy inputs: failures are swallowed, the aim is to load the classes and clients on each code path.
        swallowError(() -> userManager.fetchUser("thisisnotatoken"));
        swallowError(() -> accountManager.accountsFor(new TradingUser("bob", "sub")));
        final int previous = 30;
        final int pageSize = 10;
        swallowError(() -> transactionManager.query("435345345",
                                                    Instant.now().minusSeconds(previous),
                                                    Instant.now(),
                                                    PaginatedRequest.of(pageSize)));
        // Resolve the function definition bean before the snapshot is taken.
        checkSpringCloudFunctionDefinitionBean();
    }
}
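
The warm-up calls above are wrapped because they use dummy data and may well fail. A minimal sketch of what a helper in the spirit of swallowError could look like is below. The WarmUp class is hypothetical and is not the SnapStartOptimizer source, which may log or handle errors differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper showing the idea behind swallowError; not the SnapStartOptimizer source. */
final class WarmUp {
    private final List<String> failures = new ArrayList<>();

    /** Runs a warm-up call, recording rather than propagating any runtime failure. */
    void swallowError(Runnable call) {
        try {
            call.run();
        } catch (RuntimeException e) {
            // The call failed, but the code path has been exercised and its classes are now loaded.
            failures.add(e.getClass().getSimpleName() + ": " + e.getMessage());
        }
    }

    /** Failures seen so far, useful for a single summary log line before the snapshot. */
    List<String> failures() {
        return List.copyOf(failures);
    }
}
```

Swallowing the error keeps a failed warm-up call from aborting the checkpoint, which would otherwise stop the lambda version from publishing cleanly.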

Optimising post restore – LambdaSqlConnection class.

In this example we highlight our LambdaSqlConnection class, which is already optimised for SnapStart. This class exercises a delegated java.sql.Connection instance preSnapshot to confirm connectivity, but replaces the connection on postRestore. This class is used to implement a bean of type java.sql.Connection, allowing you to write raw JDBC in lambdas using a single RDBMS connection for the lambda instance.

Note: Do not use default Spring Boot JDBC templates, JPA, Hibernate, etc. in lambdas. The overhead of the default multi connection pools, etc. is inappropriate for lambda use. For heavy batch processing a “Run Task” ECS image is more appropriate, and does not have the 15 minute timeout constraint of a lambda.

So how does it work?

Instances and interfaces managed by LambdaSqlConnection
  1. The LambdaSqlConnection class manages the Connection bean instance.
  2. When preSnapshot occurs, LambdaSqlConnection closes the Connection instance.
  3. When postRestore occurs, LambdaSqlConnection reconnects the Connection instance.

Because LambdaSqlConnection creates a dynamic proxy as the Connection instance, it can manage the delegated connection “behind” the proxy without your injected Connection instance changing.
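
The proxy idea can be sketched with the JDK's java.lang.reflect.Proxy. This is an illustration under our own assumptions, not the LambdaSqlConnection source: the SwappableConnection class, its method names and the connection factory are all hypothetical.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;
import java.util.function.Supplier;

/** Hypothetical holder: callers keep one Connection proxy while the real connection is swapped behind it. */
final class SwappableConnection {
    private final Supplier<Connection> factory;
    private final Connection proxy;
    private volatile Connection delegate;

    SwappableConnection(Supplier<Connection> factory) {
        this.factory = factory;
        this.delegate = factory.get();
        this.proxy = (Connection) Proxy.newProxyInstance(
                SwappableConnection.class.getClassLoader(),
                new Class<?>[] {Connection.class},
                (self, method, args) -> {
                    try {
                        // Every call is forwarded to whichever delegate is current.
                        return method.invoke(delegate, args);
                    } catch (InvocationTargetException e) {
                        // Rethrow the real cause, for example an SQLException.
                        throw e.getCause();
                    }
                });
    }

    /** The instance to expose as the injected Connection bean. */
    Connection proxy() {
        return proxy;
    }

    /** Before the snapshot: drop the socket so a dead connection is not frozen into the image. */
    void beforeCheckpoint() {
        try {
            delegate.close();
        } catch (SQLException e) {
            throw new IllegalStateException("Could not close connection before snapshot", e);
        }
    }

    /** After the restore: open a fresh connection behind the same proxy. */
    void afterRestore() {
        delegate = factory.get();
    }
}
```

Code that was injected with proxy() never sees the swap, so no bean needs to be re-created after a restore.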

Using Our SQL Connection replacement in Spring Boot

See the code at https://github.com/LimeMojito/oss-maven-standards/tree/master/utilities/aws-utilities/lambda-sql.

Maven dependency:

<dependency>
   <groupId>com.limemojito.oss.standards.aws</groupId>
   <artifactId>lambda-sql</artifactId>
   <version>15.0.2</version>
</dependency>

Importing our java.sql.Connection interceptor

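// Registers the SnapStart aware java.sql.Connection bean with the application context.
@Import(LambdaSqlConnection.class)
@SpringBootApplication
public class MySpringBootApplication {
    // ... your application code
}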

You can now remove any code that is creating a java.sql.Connection and simply use a standard java.sql.Connection instance injected as a dependency in your code. This configuration creates a java.sql.Connection compatible bean that is optimised with SnapStart and delegates to a real SQL connection.

Configuring your (real) DB connection

Example with Postgres driver.

lime:
  jdbc:
    driver:
      classname: org.postgresql.Driver
    url: 'jdbc:postgresql://localhost:5432/postgres'
    username: postgres
    password: postgres

Example spring bean using SQL

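import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

@Service
@RequiredArgsConstructor
public class MyService {
    // Injected proxy; LambdaSqlConnection manages the real connection behind it.
    private final Connection connection;

    @SneakyThrows
    public int fetchCount() {
        try (Statement statement = connection.createStatement();
             ResultSet results = statement.executeQuery("select count(1) from some_table")) {
            results.next();
            return results.getInt(1);
        }
    }
}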

References

Deploying Java Lambda with Localstack

We deploy and debug our Java Lambda on development machines using Localstack to emulate an Amazon Web Services (AWS) account. This article walks through the architecture, deployment to Localstack using our open source Java framework, and enabling a debug mode for remote debugging using any Java integrated development environment (IDE).

These capabilities live in our test-utilities module, LambdaSupport.java.

Localstack development architecture

Our build framework uses Docker to deploy a Localstack image, then we use AWS API calls to deploy a zip of our lambda Java classes to the Localstack lambda engine. Due to the size of the zip files, we need to deploy the lambda using an S3 URL. We use Localstack’s S3 implementation to emulate the process.

When the lambda is deployed, the Localstack Lambda engine will pull the AWS Lambda Runtime image from public ECR and then perform the deployment steps. Using the Localstack endpoint for lambda we now have a full environment where we can perform a lambda.invoke to test the deployed function.

Figure 1: Development architecture using Localstack for lambda deployment

Viewing lambda logs

With the appropriate Localstack configuration we can view lambda logs for both startup and run of the lambda. Note these logs appear in the docker logs for the AWS Lambda Runtime Container. This container spins up when the lambda is deployed.

The easiest method we use to see the logs is to:

  1. Run the JUnit test in debug, with a breakpoint after the lambda invoke.
  2. When the breakpoint is hit, use docker ps and docker logs to see the output of the Lambda Runtime.
  3. In IntelliJ Ultimate, you can see the containers deployed via the Services pane after connecting to your docker daemon.

Using the architecture in debug mode

We can use this architecture to remote debug the deployed lambda. Our LambdaSupport class includes configuration on deploy to enable debug mode as per the Localstack documentation https://docs.localstack.cloud/user-guide/lambda-tools/debugging/. With our support class you simply switch from java() to javaDebug() and the deploy will configure the runtime for debug mode (port 5050 by default).

In your docker-compose.yml, set the environment variable LAMBDA_DOCKER_FLAGS=-p 127.0.0.1:5050:5050 -e LS_LOG=debug.

This enables port passthrough for the Java debugger from localhost to port 5050 of the container (assuming that is the port the JVM debugger is configured to listen on).

Do not commit this code as it will BLOCK test threads until a debugger is connected (port 5050 by default).
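
The blocking comes from the JVM debug agent. As an illustration only, the Localstack guide linked above passes JDWP options to the lambda through the _JAVA_OPTIONS environment variable. The helper below is hypothetical, and we are not claiming it is how javaDebug() is implemented; it simply builds such an environment map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; the option string follows the Localstack JVM debugging guide. */
final class DebugEnvironment {
    /** Returns a copy of the lambda environment with JDWP remote debugging enabled on the given port. */
    static Map<String, String> withJvmDebug(Map<String, String> environment, int port) {
        Map<String, String> result = new LinkedHashMap<>(environment);
        // suspend=y makes the JVM wait for a debugger before running any code,
        // which is why an invoke blocks until your IDE attaches.
        result.put("_JAVA_OPTIONS",
                   "-Xshare:off -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=0.0.0.0:" + port);
        return result;
    }
}
```

The port passed here has to match the one published through LAMBDA_DOCKER_FLAGS.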

Figure 2: Localstack Java Lambda debug architecture

References:

Code examples

See https://github.com/LimeMojito/oss-maven-standards/blob/master/development-test/jar-lambda-poc/src/test/java/ApplicationIT.java for a full example.

Adding test-utilities to your maven project

These are included by default if you use our jar-lambda-development parent POM.

See our post about using our build system for maven.

Otherwise you can manually add the support as below (version omitted):

<dependency>
    <groupId>com.limemojito.oss.test</groupId>
    <artifactId>test-utilities</artifactId>
    <scope>test</scope>
</dependency>
<dependency>
    <!-- Access for LambdaSupport -->
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>lambda</artifactId>
    <scope>test</scope>
</dependency>
<dependency>
    <!-- Access for LambdaSupport -->
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>s3</artifactId>
    <scope>test</scope>
</dependency>

Loading the lambda as a static variable in a unit test.

We recommend a static variable that is initialised once in a JUnit setup function, due to the time it takes to deploy the lambda.

The LambdaSupport.java method performs deployment of the supplied module zip to Localstack S3, then invokes the AWS Lambda API to confirm that the lambda has started cleanly (state == Active).
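
The "wait until Active" step is a simple polling loop. The sketch below shows the idea with a hypothetical StatePoller class that takes the state lookup as a Supplier, so it is independent of the AWS SDK. LambdaSupport's own implementation may differ.

```java
import java.time.Duration;
import java.util.function.Supplier;

/** Hypothetical polling helper; LambdaSupport's own implementation may differ. */
final class StatePoller {
    /**
     * Polls until the supplier reports the wanted state.
     *
     * @return the number of polls taken
     * @throws IllegalStateException if the state is not reached within maxAttempts
     */
    static int awaitState(Supplier<String> currentState, String wanted, int maxAttempts, Duration pause) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (wanted.equals(currentState.get())) {
                return attempt;
            }
            try {
                Thread.sleep(pause.toMillis());
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for " + wanted, e);
            }
        }
        throw new IllegalStateException("State " + wanted + " not reached after " + maxAttempts + " polls");
    }
}
```

With the AWS SDK v2 the supplier would typically wrap a getFunction call and read the state from the returned function configuration. A bounded number of attempts stops a broken deployment from hanging the test run.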

private static Lambda LAMBDA;
...
// environment variables for the lambda configuration
final Map<String, String> environment = Map.of(
                    "SPRING_PROFILES_ACTIVE", "integration-test",
                    "SPRING_CLOUD_FUNCTION_DEFINITION", "get"
            );
// using the lambda zip that was built in module ../jar-lambda-poc
LAMBDA = lambdaSupport.java("../jar-lambda-poc",
                            LimeAwsLambdaConfiguration.LAMBDA_HANDLER,
                            environment);

Invoking the lambda for black box testing

This example uses a static variable for the Lambda, JUnit 5 and AssertJ. An AWS API Gateway event JSON is loaded and sent to the deployed lambda. The result is asserted.

The full example is in our oss-maven-standards repository as an integration test (IT, run by failsafe).

@Test
public void shouldCallTransactionPostOkApiGatewayEvent() {
    final APIGatewayV2HTTPEvent event = json.loadLambdaEvent("/events/postApiEvent.json",
                                                             APIGatewayV2HTTPEvent.class);

    final APIGatewayV2HTTPResponse response = lambdaSupport.invokeLambdaEvent(LAMBDA,
                                                                              event,
                                                                              APIGatewayV2HTTPResponse.class);

    assertThat(response.getStatusCode()).isEqualTo(200);
    String output = json.parse(response.getBody(), String.class);
    assertThat(output).isEqualTo("world");
}

Localstack lambda deployment debug example

We alter the setup to use the deprecated javaDebug function. Do not commit this code as it will BLOCK test threads until a debugger is connected (port 5050 by default).

For a clean setup in IntelliJ that waits for the lambda to start in debug mode, see the excellent article on Localstack https://docs.localstack.cloud/user-guide/lambda-tools/debugging/ “Configuring IntelliJ IDEA for remote JVM debugging”.

// using the lambda zip that was built in module ../jar-lambda-poc
LAMBDA = lambdaSupport.javaDebug("../jar-lambda-poc",
                                 LimeAwsLambdaConfiguration.LAMBDA_HANDLER,
                                 environment);

Maintainable builds – with Maven!

Maven is known to be a verbose, opinionated framework for building applications, primarily for a Java stack. In this article we discuss Lime Mojito’s view on Maven, and how we use it to produce maintainable, repeatable builds using modern features such as automated testing, AWS stubbing (LocalStack) and deployment. We have OSS standards you can use in your own Maven builds at https://bitbucket.org/limemojito/oss-maven-standards/src/master/ and POMs on Maven Central.

Before we look at our standards, we set the context of what drives our build design by looking at our technology choices. We’ll cover why our developer builds are set up this way, but not how our Agile Continuous Integration works in this post.

Lime Mojito’s Technology Choices

Lime Mojito uses a Java based technology stack with Spring, provisioned on AWS. We use AWS CDK (Java) for provisioning, and our lone exception is for web based user interfaces (UI), where we use TypeScript and React with Material UI and AWS Amplify.

Our build system is developer machine first focused, using Maven as the main build system for all components other than the UI.

Build Charter

  • The build enforces our development standards to reduce the code review load.
  • The build must have a simple developer interface – mvn clean install.
  • If the clean install passes – we can move to source Pull Request (PR).
    • PR is important, as when a PR is merged we may automatically deploy to production.
  • Creating a new project or module must not require a lot of configuration (“xml hell”).
  • A module must not depend on another running Lime Mojito module for testing.
  • Any stub resources for testing must be a docker image.
  • Stubs will be managed by the build process for integration test phase.
  • The build will handle style and code metric checks (CheckStyle, Maven Enforcer, etc) so that we do not waste time in PR reviews.
  • For open source, we will post to Maven Central on a Release Build.

Open Source Standards For Our Maven Builds

Our very “top” level of build standards is open source and available for others to use or be inspired by:

Bitbucket: https://bitbucket.org/limemojito/oss-maven-standards/src/master/

The base POM files are also available on the Maven Central Repository if you want to use our approach in your own builds.

https://repo.maven.apache.org/maven2/com/limemojito/oss/standards/

Maven Example pom.xml for building a JAR library

This example will do all of the below with only 6 lines of extra XML in your maven pom.xml file:

  • Enforce that your dependencies are a single Java version
  • Resolve dependencies via the Bill of Materials Library that we use to smooth out our Spring + Spring Boot + Spring Cloud + Spring Function + AWS SDK(s) dependency web
  • Enable Project Lombok for easier Java development with less boilerplate at compile time
  • Configure code signing
  • Configure maven repository deployment locations (I suggest overriding these for your own deployments!)
  • Configure CheckStyle for code style checking against our standards at http://standards.limemojito.com/oss-checkstyle.xml
  • Configure optional support for docker images loading before integration-test phase
  • Configure logging support with SLF4J
  • Build a jar with completed MANIFEST.MF information including version numbers
  • Build javadoc and source jars on a release build

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>my.dns.reversed.project</groupId>
    <artifactId>my-library</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>

    <parent>
        <groupId>com.limemojito.oss.standards</groupId>
        <artifactId>jar-development</artifactId>
        <version>13.0.4</version>
        <relativePath/>
    </parent>
</project>

When you add dependencies, common ones that are in or resolved via our library pom.xml do not need version numbers as they are managed by our modern Bill of Materials (BOM) style dependency setup.

Example using the AWS SNS sdk as part of the jar:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>my.dns.reversed.project</groupId>
    <artifactId>my-library</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <packaging>jar</packaging>

    <parent>
        <groupId>com.limemojito.oss.standards</groupId>
        <artifactId>jar-development</artifactId>
        <version>13.0.4</version>
        <relativePath/>
    </parent>

    <dependencies>
        <dependency>
            <groupId>software.amazon.awssdk</groupId>
            <artifactId>sns</artifactId>
        </dependency>
    </dependencies>
</project>

Our Open Source Standards library supports the following module types (archetypes) out of the box:

Type                      Description
java-development          Base POM used to configure deployment locations, checkstyle, enforcer, docker, plugin versions, profiles, etc. Designed to be extended for different archetypes (JAR, WAR, etc.).
jar-development           Build a jar file with test and docker support.
jar-lambda-development    Build a Spring Boot Cloud Function jar suitable for lambda use (Java 17 runtime) with AWS dependencies added by default. Jar is shaded for simple upload.
spring-boot-development   Spring Boot jar constructed with the base spring-boot-starter and Lime Mojito aws-utilities for Localstack support.
Available Module Development Types

We hope that you might find these standards interesting to try out.

Native Java AWS Lambda with Graal VM

Update: 20/8/2023: After the CDK announcement that node 16 is no longer supported after September 2023 we realised that we can’t run CDK and node on Amazon Linux 2 for our build agents. We upgraded our agents to AL2023 and found out the native build produces incompatible binaries due to GLIBC upgrades, and Lambda does not support AL2023 runtimes.
We have given up on this native approach due to the fragility of the platform and are investigating AWS SnapStart, which now has Java 17 support.

Update: 02/9/2023: We have switched to AWS SnapStart as it appears to be a better trade off for application portability. Short builds and no more binary compatibility issues.

Native Java AWS Lambda refers to a Java program that has been compiled down to native instructions so we can get faster “cold start” times on AWS Lambda deployments.

Cold start is the initial time spent in a Lambda Function when it is first deployed by AWS and run up to respond to a request. These cold start times are visible to a caller as higher latency on the first lambda request. Java applications are known for their high cold start times due to the time taken to spin up the Java Virtual Machine and the loading of various Java libraries.

We built a small framework that can assemble either an AWS Lambda Java runtime zip, or a provided container implementation of a hello world function. The container provided version is an Amazon Linux 2 Lambda Runtime with a bootstrap shell script that runs our Native Java implementation.

These example lambdas are available (open source) at https://bitbucket.org/limemojito/spring-boot-framework/src/master/development-test/

Note that these timings were against the raw hello java lambda (not the spring cloud function version).

@Slf4j
public class MethodHandler {
    public String handleRequest(String input, Context context) {
        log.info("Input: " + input);
        return "Hello World - " + input;
    }
}

Native Java AWS Lambda timings

We open with a “Cold Start” – the time taken to provision the Lambda Function and run the first request. Then a single request to the hot lambda to get the pre-JIT (Just-In-Time compiler) latency. Then ten requests to warm the lambda further so we have some JIT activity. Max Memory use is also shown to get a feel for system usage. We run up to 1GB memory sizing to approach 1vCPU as per various discussions online.

Note that we run the lambda at various AWS lambda memory settings as there is a direct proportional link between vCPU allocation and the amount of memory allocated to a lambda (see AWS documentation).

This first set of timings is for a Java 17 Lambda Runtime container running a zip of the hello world function. Times are in milliseconds.

Java Container (memory, MB)    128    256    512   1024
Cold Start                    6464   5066   4054   3514
1                               90     52     16      5
10X                             60     30      5      4
Max Mem                        126    152    150    150
Java Container Results

Native Java (memory, MB)       128    256    512   1024
Cold                          1427   1002    773    670
1                               10      4      4      5
10X                              4      4      3      3
Max Mem                        111    119    119    119
Native Java Results

The comparison of the times below shows the large performance gains for cold start.

Conclusion

From our results we have a roughly 5X improvement in cold start time (about 4.5X at 128MB rising to about 5.2X at 1024MB), with sub second performance for the initial request at 512MB and above.
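
The cold start ratio can be checked directly from the two results tables. The sketch below takes the cold start rows as transcribed there (times in milliseconds) and divides the Java runtime figure by the native figure for each memory size. On those numbers the ratio comes out between roughly 4.5 and 5.2. The ColdStartSpeedup class is ours, for illustration only.

```java
/** Cold start figures (ms) as read from the two results tables; memory sizes in MB. */
final class ColdStartSpeedup {
    static final int[] MEMORY_MB = {128, 256, 512, 1024};
    static final int[] JAVA_COLD_MS = {6464, 5066, 4054, 3514};
    static final int[] NATIVE_COLD_MS = {1427, 1002, 773, 670};

    /** Ratio of Java runtime cold start to native cold start for the given table column. */
    static double speedup(int column) {
        return (double) JAVA_COLD_MS[column] / NATIVE_COLD_MS[column];
    }

    /** One line per memory size, for example for logging alongside a test run. */
    static String report() {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < MEMORY_MB.length; i++) {
            out.append(String.format("%d MB: %.1fx%n", MEMORY_MB[i], speedup(i)));
        }
        return out.toString();
    }
}
```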

The native version shows a more consistent warm lambda behaviour due to the native lambda compilation process. Note that the execution times seem to trend for both Java and native down to sub 10ms response times.

While there is a reduction in memory usage this is of no realisable benefit as we configure a larger memory size to get more of a vCPU allocation.

However, be aware that build times increased markedly due to the compilation phase (from 2 minutes to 8 for a hello world application). This compilation phase is very CPU and memory intensive, so we had to increase our build agents to 6vCPU and 8GB for compiles to work.