© Copyright 2022 Michael Simons.

Preface and introduction

This is a small ebook about maintaining a medium-sized Java project in 2022. The project in question is Neo4j-Migrations, available at github.com/michael-simons/neo4j-migrations. Things addressed here are, among others, the choice of Java 8 as baseline, the build system, build plugins considered to be useful, recommendations about structure, but also concepts of certain libraries, GraalVM native image and how to make all these things work with each other.

Neo4j-Migrations was conceived in early 2020 and has developed from basically a single Spring InitializingBean with a bit of logic attached to it into a multi-module project with a core API integrated into several ecosystems. The domain of the project is not too complex to understand: Its goal is to run one or more migrations or database refactorings against a Neo4j graph database instance in a reliable fashion, making sure each refactoring is applied only once. Those migrations can come in two flavors: either as a script (a text file containing Cypher code) or as a class runnable on the JVM.

This will give a couple of interesting things to discuss:

  • Which Java version to pick (taking into account it was 2020 when that decision was made)?

    • How to benefit from modern Java nevertheless?

    • How to prepare for or even embrace the Java module system?

  • Which build tool did I choose and why?

    • What are my approaches to make the build as readable as possible?

    • How do I tackle programmatic tasks in the build in contrast to declarative ones?

    • How to automate testing?

    • How to automate releasing and where to release?

  • Can Java be a good choice for CLI tooling?

    • What is possible with GraalVM native image and what isn’t?

  • How to provide uniform access to this library in people’s favorite application frameworks?

    • Revisiting Spring Boot starter

    • Learning about Quarkus Extensions

This publication is of course highly subjective. Actually, like most books, posts and what have you out there: It works for me and this project. It doesn’t necessarily work for your setup. Some things might even be harmful in your case; you may find some of them utterly and outright stupid.

What’s the size of the project?

"Sloc Cloc and Code" [scc] gives me the following output for version 1.3.0, along with some cost estimation based on the COCOMO [COCOMO] model:

scc --exclude-dir docs/book
───────────────────────────────────────────────────────────────────────────────
Language                 Files     Lines   Blanks  Comments     Code Complexity
───────────────────────────────────────────────────────────────────────────────
Java                       136     12581     1935      3764     6882        273
XML                         29      3430      151       116     3163          0
AsciiDoc                     8      1507      329         0     1178          0
Properties File              7        36        3        13       20          0
Shell                        5       373       44        63      266         37
YAML                         5       305       28         0      277          0
Smarty Template              3        44        7         0       37          2
Groovy                       2        71        7        32       32          1
JSON                         2        23        0         0       23          0
Plain Text                   2       219       33         0      186          0
Batch                        1       182       35         0      147         30
Markdown                     1       128       35         0       93          0
gitignore                    1        33        5         4       24          0
───────────────────────────────────────────────────────────────────────────────
Total                      202     18932     2612      3992    12328        343
───────────────────────────────────────────────────────────────────────────────
Estimated Cost to Develop (organic) $377,600
Estimated Schedule Effort (organic) 9.499187 months
Estimated People Required (organic) 3.531523
───────────────────────────────────────────────────────────────────────────────
Processed 644610 bytes, 0.645 megabytes (SI)
───────────────────────────────────────────────────────────────────────────────

The estimated schedule effort is actually not completely wrong.

1. Which Java version to pick in early 2020?

Shiny new things are always neat, but not every target environment supports the latest bits and pieces. While an application that brings its own runtime, deployed in a container of its own or similar, has a lot of freedom to just pick the latest and greatest, a library may not.

1.1. Use the current LTS version or the most widely adopted one?

JDK 11 was published as an LTS version in September 2018. It incorporates most of the features from 9 and 10, with a couple of things I think would have been useful to have in the core API of this project, such as:

  • actually and without joking, the module system

  • a way nicer Optional with methods such as or, flatMap and ifPresentOrElse

  • finally, factory methods for collections

  • effectively final variables as resources in the try-with-resources statement (var resource = new CypherResource(); try (resource) {} or coming in as a method argument)

  • some nice additions to the stream API (dropWhile and takeWhile)

  • the reserved type name var defined by [jep286] in a couple of places

  • the new not-predicate method

  • the Flight Recorder from [jep328]
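
A few of these can be seen together in a short, self-contained sketch (a hypothetical class, not part of the project, illustrating the collection factories, var, Predicate.not, takeWhile and Optional.or):

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class Jdk11Features {

    // takeWhile (JDK 9) combined with the static Predicate.not method (JDK 11),
    // both unavailable on JDK 8
    static List<String> firstNonEmpty(List<String> names) {
        return names.stream()
                .takeWhile(Predicate.not(String::isEmpty))
                .collect(Collectors.toList());
    }

    // Optional.or (JDK 9) lazily supplies an alternative Optional
    static String valueOrFallback(Optional<String> value) {
        return value.or(() -> Optional.of("fallback")).orElseThrow();
    }

    public static void main(String[] args) {
        // Factory methods for collections (JDK 9) and the reserved type name var (JDK 10)
        var names = List.of("alpha", "beta", "", "gamma");
        System.out.println(firstNonEmpty(names));
        System.out.println(valueOrFallback(Optional.empty()));
    }
}
```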

As this project is intended as a library and incorporated into other people’s programs, I am not that much interested in the addition of new garbage collectors. If this had been an application, those additions would have had a much bigger influence.

Neo4j-Migrations is a library, however. I want as many people as possible to be able to use it in their applications if they have the need for such functionality. In 2020, JDK 8 was still the correct choice: It’s a given fact that enterprise users are slow to migrate their runtime. Of course, it is valid to argue that a nudge might help them to migrate, but the reality seems to be different. And as a matter of fact, it’s not only enterprises, or at least not only enterprises directly. One of the biggest drivers of the Java ecosystem, the Spring Framework, still targets JDK 8, and so do value-adding libraries on top of it, like Spring Data, JHipster and many more. I wanted Neo4j-Migrations to benefit from that ecosystem without forcing users to bump their JDK if they use my library.

So looking at the above list and seeing a couple of things that would have been nice to have in contrast to actual value, the decision was made for JDK 8, and I am happy with it.

At the time of writing, JDK 17, the current LTS, has been out for 4 months already. JDK 17 brings many more things to the table, the most prominent features being text blocks, records, sealed classes and pattern matching. Those, together with the fact that some runtimes like Helidon [helidon] and Quarkus [quarkus] have already made the jump to 11 since 2020 and that Spring Framework 6 will even have JDK 17 as baseline and has just dropped the first beta[1], would probably have led to a different decision.

1.2. Caveats and how to address them

The caveats with rooting for JDK 8 come in the form of a couple of "But what if…".

  • What if it isn’t the worst idea to actually be prepared for [jep261], the Java module system delivered in JDK 9 and used in JDK 17 via [jep396] to encapsulate JDK internals by default?

  • What if some features are so useful that I still want them? Like at least being able to put up a bigger sign than just naming a package internal to prevent people from using things they should not use, because I plan to change them as I see fit?

  • What if I myself want to use a library that already made the jump beyond 8?

Those can be addressed by a couple of things:

1.2.1. Don’t rely on a derived automatic module name

What is an automatic Java module name? An automatic module name basically indicates the absence of a module descriptor (module-info.java). If the JAR file containing the classes in question has the attribute Automatic-Module-Name in its main manifest, its value is the module name. Otherwise, the module name is derived from the name of the JAR file.

First and foremost, be a good citizen and define an automatic module name right from the start; don’t rely on a name derived from your library’s JAR file. This spares users of your library who are running on JDK 9+ on the module path from having to deal with a derived name in their module-info.java when referring to your library. Why would that be an issue? Because you will break them as soon as you want to benefit from modules yourself, at which point you need to declare a non-automatic name that will most likely be different from the one derived from the JAR file. Christian Stein has a couple of suggestions in [stein-maven-coordinates-and-module-names] which I followed loosely when choosing the module names of Neo4j-Migrations and other current projects.
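
In a Maven build, setting the attribute is a one-liner in the maven-jar-plugin configuration; a sketch with a hypothetical module name:

```xml
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-jar-plugin</artifactId>
    <configuration>
        <archive>
            <manifestEntries>
                <!-- hypothetical name; should be stable and typically matches your root package -->
                <Automatic-Module-Name>com.acme.mylibrary</Automatic-Module-Name>
            </manifestEntries>
        </archive>
    </configuration>
</plugin>
```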

1.2.2. Use the latest and greatest JDK

Use the latest and greatest JDK to compile, and know about the --release flag available for javac[2] since JDK 9. The --release flag is different from --source and --target: In contrast to the combination of the latter two, --release will not only make sure that the target matches the source, but also that only API is being used that is available in the targeted release. In other words: With --source and --target alone you would be able to compile Java 8 syntax into Java 8 byte code while still using API only available in higher JDKs when not compiling on JDK 8 itself. The --release flag prevents this.
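
With the maven-compiler-plugin, the flag can be set through the standard maven.compiler.release user property; a minimal sketch:

```xml
<properties>
    <!-- use --release instead of -source/-target:
         compile on a current JDK, but target the Java 8 API and byte code -->
    <maven.compiler.release>8</maven.compiler.release>
</properties>
```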

1.2.3. Learn about Multi-Release Jars

Multi-Release Jars, defined in [jep238], extend the JAR file format to allow multiple, Java-release-specific versions of class files to coexist in a single archive. This allows for a couple of things:

  • You are not restricted to an automatic module name for your library targeting primarily Java 8. A valid module-info.java can be part of the JAR for JDK 9+.

  • In case you would benefit so much from a new API that it is worth the effort of maintaining two versions of a class, you can do that.

  • One thing I did for JDK 17 was making use of sealed classes, so that I can provide more guidance in the API about which interfaces I introduced only for the API itself, and which ones, and through what hierarchy, people are supposed to implement.
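
Conceptually, the layout of such an archive can be sketched like this (hypothetical jar content; the Multi-Release: true manifest attribute is required):

```
META-INF/MANIFEST.MF                          (contains "Multi-Release: true")
com/example/Foo.class                         (Java 8 byte code, used by default)
META-INF/versions/9/module-info.class         (only visible to JDK 9+)
META-INF/versions/17/com/example/Foo.class    (JDK 17+ prefers this variant)
```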

The downside is IDE support, in my experience. I haven’t found a good way to work with the source structure of a Multi-Release Jar, but I will write more about it in Chapter 2.

1.2.4. What about JDK 9+ dependencies?

This is a problem if you need such dependencies and still want to support JDK 8. The only solution here is not to have them.
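
If you want the build to fail fast when a JDK 9+ dependency sneaks in transitively, the enforceBytecodeVersion rule from the MojoHaus extra-enforcer-rules can help; a sketch (the version property is an assumption):

```xml
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-enforcer-plugin</artifactId>
    <dependencies>
        <dependency>
            <groupId>org.codehaus.mojo</groupId>
            <artifactId>extra-enforcer-rules</artifactId>
            <version>${extra-enforcer-rules.version}</version>
        </dependency>
    </dependencies>
    <executions>
        <execution>
            <id>enforce-bytecode-version</id>
            <goals>
                <goal>enforce</goal>
            </goals>
            <configuration>
                <rules>
                    <enforceBytecodeVersion>
                        <!-- fail if any dependency ships byte code newer than Java 8 -->
                        <maxJdkVersion>1.8</maxJdkVersion>
                    </enforceBytecodeVersion>
                </rules>
            </configuration>
        </execution>
    </executions>
</plugin>
```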

2. Building things

In this part, the choice of the build tool, plugins for it and externalized programmatic additions will be discussed.

2.1. Welcome to Burning Geek Con: The choice of a build tool

We have suggested the "Insult Con" back in 2016 already. Choosing a build tool is probably worth a whole track alone. Let’s discuss the options:

We might add Ant and a couple of others to it as well, plus every homegrown solution out there. Of course, a homegrown solution can be totally fine, too. Neo4j-Migrations can be seen as homegrown as well: It solved one issue and evolved later. Just because a thing is on GitHub doesn’t mean it’s automatically better than what you would write inside your company.

To cut it short: I opted for good, old, boring Maven. Why? Because I am not only familiar with it, but it’s boring in a positive way: If you don’t escalate in a pom file, most projects look more or less the same and most of the time, they behave the same.

2.1.1. Why not Gradle?

I never felt like I needed the promoted flexibility it brings. I don’t need the Groovy (or Kotlin) these days for the vast majority of tasks I have in my builds. Also, I observed several times that using a newer version of Java than the one that was latest when a specific version of Gradle was released causes problems. This is because Gradle depends on [ASM]. ASM is an all-purpose Java byte code manipulation and analysis framework. It can be used to modify existing classes or to dynamically generate classes, directly in binary form. As such, it is tailored to specific Java versions.

While this is not inherently a bad thing, I prefer not having to deal with it, especially as there wasn’t anything on the table I couldn’t get from Maven.

Of course, there is the valid argument of incremental builds: Gradle seems to be a lot better in that regard than Maven. Also, there is the build cache, which has done great things for Spring Boot, as reported back in 2020[3]:

The Spring Boot team’s primary reason for considering a switch to Gradle was to reduce the time that it takes to build the project. We were becoming frustrated with the length of the feedback loop when making and testing changes.
— Andy Wilkinson in "Migrating Spring Boot’s Build to Gradle"

In this project here, I would benefit much more from shaving more than just seconds off the lengthy restart time of the database containers being used to verify migrations against.

2.1.2. Why not Bazel?

Bazel’s focus seems to be a lot on incremental builds and local or distributed caching. While it was intriguing to try out something completely new, it would have added a steep learning curve for me, and I was more focused on solving a business use case.

2.1.3. Why not Bach?

Bach is an interesting contender in that it uses Java and Java modules to build Java modules. I worked a bit with the author, Christian Stein, on it, and we migrated another pet project of mine to use it. I like the "Java only" approach a lot, and I think it has potential. However, it is Java modules only. While I am going to speak about modules in this publication, Neo4j-Migrations is not only Java modules.

In addition, Bach is very much a work in progress. Of course, this is not bad, and someone has to solve the chicken-and-egg problem here, but I am just not up for that call.

2.1.4. Maven it is

Before I jump into how I configure my projects these days with Maven, just a quick recap: The word maven is a Yiddish word for "expert", derived from the Hebrew word mayvin (מבין), which means "one who understands". A maven is an expert who understands the skill or subject at hand.

Maven approaches a couple of things with "convention over configuration": It tries to tackle dependency management, the build process and the deployment. It all revolves around a standard cycle of validate, compile, test, package, integration test, install and deploy.

One of the biggest criticisms here is having the "Project Object Model descriptor" - POM for short - materialized as pom.xml doing both dependency management and build description, and eventually being a representation of a deployed artifact.

I have decided that I can happily live with that.

Read the results of the Maven Dependencies Pop Quiz[4] by Andres Almiray, especially before you post stuff like this thing here to the internet. When in doubt, Andres is right.

2.2. My Maven best practices

2.2.1. Use properties and dependency management for versions

Even in a single-module project (Neo4j-Migrations is a multi-module project), I put every version of every dependency and every plugin I use into a property in the <properties /> element. For dependencies, I use the <dependencyManagement /> element. The latter might seem like overkill in a single module, but it will be a lifesaver when you encounter non-converging dependencies: Those are transitive dependencies of your direct dependencies with mismatching versions: Your direct dependency A depends on X:1, your direct dependency B on X:2.

When you declare a dependency on A before B, Maven will resolve X:1. When B is declared before A, it will resolve X:2. Why? Maven builds a graph of dependencies. For transitive dependencies, it will pick the nearest dependency, that is, the one that has the fewest number of hops from the direct dependency. This severely threatens the stability of your build.

Instead of excluding the transitive dependency from one direct dependency, you should use dependency management to specify exactly which version you want (and which you hope is compatible with both direct dependencies).
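
A minimal sketch with hypothetical coordinates: X is pulled in transitively by both A and B, and the managed version wins regardless of declaration order:

```xml
<dependencyManagement>
    <dependencies>
        <!-- hypothetical artifact X, a transitive dependency of both A and B;
             this pin overrides Maven's "nearest wins" resolution -->
        <dependency>
            <groupId>com.example</groupId>
            <artifactId>X</artifactId>
            <version>2</version>
        </dependency>
    </dependencies>
</dependencyManagement>
```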

The list of properties containing the versions helps me to keep an overview of what I have. The dependency management allows me in submodules to omit the version altogether. Have a look at my pom.xml, in the properties section[5] and the dependency management[6].

If you don’t want to read all the following, you might consider having a look at the OSS Quickstart archetype [oss-quickstart] from Gunnar Morling.

2.2.2. Make use of the enforcer plugin

The self-titled "The Loving Iron Fist of Maven™", the [maven-enforcer-plugin], helps you to keep things straight. I use it to

  • enforce the Java version required to build this project

  • enforce the Maven version being used

  • enforce that all dependencies converge

See my configuration in Listing 1.

Listing 1. My enforcer config
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-enforcer-plugin</artifactId>
    <version>${maven-enforcer-plugin.version}</version>
    <executions>
        <execution>
            <id>enforce</id>
            <goals>
                <goal>enforce</goal>
            </goals>
            <phase>validate</phase>
            <configuration>
                <rules>
                    <requireJavaVersion>
                        <version>17</version>
                    </requireJavaVersion>
                    <DependencyConvergence />
                    <requireMavenVersion>
                        <version>${maven.version}</version>
                    </requireMavenVersion>
                </rules>
            </configuration>
        </execution>
    </executions>
</plugin>

It is so good, it brings tears to people’s faces[7].

2.2.3. More conventions

What is better than conventions? Well, more conventions[8]. The first person to bring the order of elements in a POM file to my attention was longtime Java User Group contributor and friend Franz van Betteraey. The second nudge came from the Spring Boot team: I remember discussing the order of dependencies in one PR.

All of that got me thinking: I don’t want to think too much and just get it done. Ever since then, I add the "Sortpom Maven Plugin" [sortpom] to new projects like this:

Listing 2. Apply automatic sorting to pom files
<plugin>
    <groupId>com.github.ekryd.sortpom</groupId>
    <artifactId>sortpom-maven-plugin</artifactId>
    <version>${sortpom-maven-plugin.version}</version>
    <configuration>
        <encoding>${project.build.sourceEncoding}</encoding>
        <keepBlankLines>true</keepBlankLines>
        <nrOfIndentSpace>-1</nrOfIndentSpace>
        <sortProperties>true</sortProperties>
        <sortDependencies>scope,groupId,artifactId</sortDependencies>
        <createBackupFile>false</createBackupFile>
        <expandEmptyElements>false</expandEmptyElements>
    </configuration>
    <executions>
        <execution>
            <goals>
                <goal>sort</goal>
            </goals>
            <phase>verify</phase>
        </execution>
    </executions>
</plugin>

Either it automatically sorts the poms in the verify phase, or you can call it independently: ./mvnw sortpom:verify@sort. It orders the elements of my pom in a stable, consistent order, and also the dependencies, grouped by scope. Remember what has been written about resolving transitive versions? When you just add a dependency at the top of the dependency list, it might have the same transitive dependency as another direct dependency further below. Suddenly the new transitive dependency will be picked up. Sorting does not completely solve this problem, especially not without dependency management, but in many cases the results might be less surprising.

2.2.4. Escalate away from conventions

The moment I would have to jump through too many hoops to get a thing done via Maven configuration, I escalate in two steps. Both are done via the Exec-Maven-Plugin [exec-maven-plugin]. The plugin is able to execute Java in the same JVM or, alternatively, external processes, either other programs or Java-based calls. The latter can be configured to use the same class path as the build.

So here are my three approaches to go from convention to a programmatic approach. The farther away one gets from Maven and convention, the more issues might arise on "other people’s" machines:

Execute a Java program with your current classpath

Maybe everything you need for a task is already inside your project; that is the first step of escalation. For the CLI module, I use [picocli], and I added the AutoComplete.GenerateCompletion subcommand in MigrationsCli[9]. Thus, my CLI learned how to generate a shell completion script. There’s a lot about that topic in the "Autocomplete for Java Command Line Applications"[10] manual. I want to distribute the script as a resource, too, but I don’t want to add it as a resource manually. Therefore, my build should call the classes being built. This is an excellent use case for [exec-maven-plugin] calling Java with the current classpath, as shown in Listing 3:

Listing 3. Using Maven exec plugin to call single program (neo4j-migrations-cli/pom.xml)
<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>exec-maven-plugin</artifactId>
    <executions>
        <execution>
            <id>generate-cli-completion</id>
            <goals>
                <goal>exec</goal>
            </goals>
            <phase>package</phase>
            <configuration combine.self="override">
                <executable>java</executable>
                <arguments>
                    <argument>-classpath</argument>
                    <classpath />
                    <argument>${name-of-main-class}</argument>
                    <argument>generate-completion</argument>
                </arguments>
                <outputFile>target/neo4j-migrations_completion</outputFile>
            </configuration>
        </execution>
    </executions>
</plugin>
  1. Note that we can use the <classpath /> element here that passes the whole classpath to the java executable

  2. Of course there is a property in the pom.xml holding the name of my main class, which is used in several places

  3. The rest are a couple of args to the actual class being called

Call a single program

This is the next step of escalation. If I am dealing with a single thing I need to do outside Maven or a plugin, I check if it is a single call that is simple to parameterize. Compressing my GraalVM native binaries with UPX is such a thing, as shown in Listing 4:

Listing 4. Using Maven exec plugin to call a single program (neo4j-migrations-cli/pom.xml)
<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>exec-maven-plugin</artifactId>
    <executions>
        <execution>
            <id>compress-binary</id>
            <goals>
                <goal>exec</goal>
            </goals>
            <phase>package</phase>
            <configuration combine.self="override">
                <executable>upx</executable>
                <skip>${skipCompress}</skip>
                <arguments>
                    <argument>${project.build.directory}/neo4j-migrations${executable-suffix}</argument>
                </arguments>
            </configuration>
        </execution>
    </executions>
</plugin>
Scripting things

If there is neither a class in my project solving my needs nor a single executable, I resort to scripting. A [bash] script seems reasonable these days, as bash or something compatible should be available in most places. If you want to stick to what you most likely have in your project, use a Java script with [JBang].
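
A minimal JBang script is just a Java file with a special comment header; a hypothetical example (say hello.java, run via jbang hello.java):

```java
///usr/bin/env jbang "$0" "$@" ; exit $?

// A JBang "script" is plain Java; the first line lets the file be executed
// directly on Unix-like systems, while remaining a valid Java comment.
// Third-party dependencies could be declared in additional //DEPS comment lines.
public class hello {

    static String greeting() {
        return "Hello from a JBang script";
    }

    public static void main(String... args) {
        System.out.println(greeting());
    }
}
```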

I use [japicmd] to compare previous and current versions of the project, checking for changes that would break semantic versioning. For this, I need the current version number (given of course via the <version> tag) and the previous version (given via a property named neo4j-migrations.previous.version).

The previous version number needs to be updated when a release has been prepared, just before the release plugin commits and tags the changes. I added a corresponding exec-maven configuration in the parent pom.xml:

Listing 5. Using Maven exec plugin to call a script (pom.xml)
<plugin>
    <groupId>org.codehaus.mojo</groupId>
    <artifactId>exec-maven-plugin</artifactId>
    <version>${exec-maven-plugin.version}</version>
    <configuration>
        <skip>true</skip>
    </configuration>
    <executions>
        <execution>
            <id>release-prepared</id>
            <goals>
                <goal>exec</goal>
            </goals>
            <configuration>
                <executable>bin/update-previous-version.sh</executable>
                <skip>true</skip>
            </configuration>
        </execution>
    </executions>
</plugin>

You will notice the execution has an identifier. It is referred to in the release-plugin section:

Listing 6. Calling a release completion goal (pom.xml)
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-release-plugin</artifactId>
    <version>${maven-release-plugin.version}</version>
    <configuration>
        <autoVersionSubmodules>true</autoVersionSubmodules>
        <useReleaseProfile>false</useReleaseProfile>
        <releaseProfiles>release</releaseProfiles>
        <tagNameFormat>@{project.version}</tagNameFormat>
        <goals>deploy</goals>
        <pushChanges>false</pushChanges>
        <localCheckout>true</localCheckout>
        <arguments>-Drelease -DisDryRun=${dryRun}</arguments>
        <preparationGoals>clean exec:exec@prepare-release verify</preparationGoals>
        <completionGoals>compile exec:exec@release-prepared</completionGoals>
    </configuration>
</plugin>

The script is rather simple (you’ll find it in the bin folder):

Listing 7. Extracting a prepared version from Maven release files and updating a property (update-previous-version.sh)
#!/usr/bin/env bash

set -euo pipefail
DIR="$(dirname "$(realpath "$0")")"

NEW_OLD_VERSION=$(sed -n 's/project\.rel\.eu\.michael-simons\.neo4j\\:neo4j-migrations-parent=\(.*\)/\1/p' $DIR/../release.properties)
$DIR/../mvnw versions:set-property -DgenerateBackupPoms=false -Dproperty=neo4j-migrations.previous.version -DnewVersion=$NEW_OLD_VERSION -pl :neo4j-migrations-parent

While a true Maven maven would maybe have solved this with pure XML declaration, I couldn’t and wouldn’t go that far. This is easier to read.

These days, you are not restricted to shell scripts; you can just stick to Java if you want. Have a look at my test_native_cli.java script in the same folder. It is a Java program, orchestrating a [Testcontainer] and testing the native build. Right now, it is called from a GitHub Action, but it would be easy enough to call it from Maven. All the power of Java right at your hands, as a script via [JBang].

Resources


1. https://spring.io/blog/2021/12/16/spring-framework-6-0-m1-released
2. https://docs.oracle.com/en/java/javase/17/docs/specs/man/javac.html
3. https://spring.io/blog/2020/06/08/migrating-spring-boot-s-build-to-gradle
4. https://andresalmiray.com/maven-dependencies-pop-quiz-results/
5. https://github.com/michael-simons/neo4j-migrations/blob/662018028e1ffc89bf4558eed23a463fe11e2ccf/pom.xml#L66
6. https://github.com/michael-simons/neo4j-migrations/blob/662018028e1ffc89bf4558eed23a463fe11e2ccf/pom.xml#L135
7. https://twitter.com/aalmiray/status/1478036541238923267
8. https://maven.apache.org/developers/conventions/code.html
9. https://github.com/michael-simons/neo4j-migrations/blob/662018028e1ffc89bf4558eed23a463fe11e2ccf/neo4j-migrations-cli/src/main/java/ac/simons/neo4j/migrations/cli/MigrationsCli.java#L62
10. https://picocli.info/autocomplete.html