
A solid checklist for configuring new Scala projects

Krzysztof Ciesielski

24 Feb 2023 · 9 minutes read


Setting up a new Scala project often starts with copy-pasting existing build configurations, because a long-lived build configuration accumulates many useful settings and plugins. Organizations sometimes extract default settings into internal plugins or use features like GitHub templates. These templates are rarely universal outside of their original context, so it's pointless to aim for a single, ultimate project scaffolding for all purposes.

Instead, I'd like to propose a checklist where I gathered important settings and practices worth considering to enrich your builds or enhance the development process. Some of these points are a good fit for templates. Some are just vital steps I want to set up before anyone starts contributing.

Checklist overview (TLDR)

Here's the entire checklist if you'd like to review it without scrolling through boring paragraphs:

  1. .gitignore
  2. sbt-dotenv and direnv
  3. local.conf
  4. .sbtopts
  5. Java version
  6. Compiler flags
  7. .scalafmt.conf
  8. .scala-steward.conf or renovate
  9. publish settings
  10. .mergify.conf or automerge
  11. GitHub actions and CODEOWNERS
  12. Essential libraries
  13. TTLs for SNAPSHOTs
  14. Bonus: nix

.gitignore

Avoid polluting pull requests with unneeded local files by preparing a comprehensive .gitignore file up front. Check out this example for a solid .gitignore for Scala, which includes typical files used by IntelliJ, VS Code, bloop, metals, direnv, and sbt.
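If you're starting completely from scratch, a minimal excerpt could look like the following (the exact entries depend on your editors and build tooling, so treat it as a starting point, not a complete file):

target/
.bsp/
.bloop/
.metals/
.idea/
.vscode/
.envrc
local.conf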

sbt-dotenv and direnv

Environment variables are frequently used to set custom project configuration values. For example, it's common practice in Scala projects to refer to them in application.conf and then configure concrete values in local, test, and production environments. direnv is a popular tool that automatically loads such variables into your current shell when you enter a project directory containing a .envrc file with content like:

export KAFKA_HOST=localhost
export KAFKA_PORT=9131

It is very convenient to integrate such a file with the sbt console, especially if you're running it from an IDE. For this purpose, you may want to check out the sbt-dotenv plugin. It is worth configuring as a global plugin so that it works for all your local projects.

First, create ~/.sbt/1.0/plugins/dotenv.sbt with the following content:

addSbtPlugin("nl.gn0s1s" % "sbt-dotenv" % "3.0.0")

Additionally, I recommend changing the plugin's default filename from .env to .envrc. It's the standard name used by direnv, which is a more widely adopted tool that can be used for other purposes. You can do this by adding a ~/.sbt/1.0/envrc.sbt with:

ThisBuild / envFileName := ".envrc"

From now on, each time you open an sbt console, it will pick environment variables from the .envrc file in the workspace root directory. This file should be local-only, so ensure git ignores it; see section .gitignore.

(Note: sbt-dotenv uses reflection to modify the environment. Since Java 17, such reflective access to JDK internals is blocked by default, so sbt-dotenv crashes on startup. See the next section about .sbtopts for how to resolve this issue.)

local.conf

A complementary practice to setting environment variables is overriding configuration entries from application.conf. Include local.conf in .gitignore so developers can create a src/main/resources/local.conf file that includes application.conf, like:

include "application.conf"

akka {
  loggers = ["akka.event.slf4j.Slf4jLogger"]
  loglevel = "DEBUG"
  stdout-loglevel = "DEBUG"
}

The file will be used when you run the app with -Dconfig.file=path/to/src/main/resources/local.conf. This parameter can be passed to sbt in a few ways:

  • for raw sbt console executions: sbt 'set javaOptions+="-Dconfig.file=path/to/src/main/resources/local.conf"; test:runMain com.myapp.MyMain'
  • for IntelliJ sbt console: just add -Dconfig.file=... to "VM parameters" in "sbt settings"
  • for IntelliJ "Run" configurations: similarly to sbt console, use the "VM parameters" field
  • and analogously for other tools (see also the build.sbt sketch below)
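If you fork the JVM for run, you can also wire this up in build.sbt so that local.conf is picked up automatically whenever it exists. A minimal sketch, assuming the file location used above:

// fork the run task so that javaOptions are actually applied
run / fork := true
// pass local.conf to forked runs only when the developer has created it
run / javaOptions ++= {
  val localConf = (Compile / resourceDirectory).value / "local.conf"
  if (localConf.exists) Seq(s"-Dconfig.file=$localConf") else Seq.empty
}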

.sbtopts

A convenient place to define sbt or Java options for builds and the sbt console. If you want to share a standard set for all developers and CI, put them in .sbtopts. A few examples:

Allowing reflection in newer Java versions (required for sbt-dotenv on Java 17+):

-J--add-opens=java.base/java.util=ALL-UNNAMED
-J--add-opens=java.base/java.lang=ALL-UNNAMED

Setting specific memory requirements:

-J-Xmx5G
-J-Xms1G
-J-Xss2M

Setting specific sbt options (full list here):

-Dsbt.task.timings=true

Java version

Sometimes a long investigation of an issue ends with the discovery that your local Java version differs from the one used by CI, not to mention that an entirely different version may be deployed with the app. Unifying the Java version used by your project on all levels is possible, but tricky.

For local development, developers quite often use sdkman or jenv. The first tool is more universal and can be combined with .envrc, but the only solution I could find is far from elegant. Otherwise, use either .sdkmanrc or .java-version, the dedicated files meant for these tools.
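For example, sdkman can pin the project JDK via an .sdkmanrc file, which sdk env then picks up (the version string below is purely illustrative):

# .sdkmanrc
java=17.0.8-tem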

However, your Continuous Integration pipeline probably uses its own configuration for the JDK (see this GitHub Action which loads from a file), and there's also sbt-native-packager, which needs a base image definition in build.sbt.

I hadn't seen all these JDK versions unified in a single location until I stumbled upon nix, and I will cover this subject in detail in a separate blog post. Meanwhile, check https://github.com/gvolpe/sbt-nix.g8 if you'd like to explore this subject on your own.

Compiler flags

Question: what's wrong with this code?

def doStuff(data: Data): F[Unit] = {
  Logger[F].info(s"Logging important metadata: ${data.metadata}")
  processor.process(data).void
}

Answer: the code is missing a >> chaining the F[Unit] returned by our logger with the other F[Unit] returned by the processor call, which is our final result; the logging effect is constructed but never executed. Is it really that obvious? Probably yes, if you're asked to check for issues in two lines of code, but it can easily be overlooked in a wider context.
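For comparison, the fixed version simply sequences the two effects (assuming cats-style syntax for >>):

def doStuff(data: Data): F[Unit] =
  Logger[F].info(s"Logging important metadata: ${data.metadata}") >>
    processor.process(data).void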

The problem is that the Scala 2.13 compiler, with default settings, won't complain at all, and you can easily miss such errors in code reviews! The default compiler settings almost always need to be enhanced, for example, to warn about unused and discarded values and other potential issues. You can find a great set of compiler flags in the sbt-tpolecat plugin. Combine it with selective warning suppression, and you're golden. For example, add:

scalacOptions += "-Wconf:src=src_managed/.*:s"

to your build configuration to silence warnings for classes generated by ScalaPB from protobuf files.
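For reference, enabling sbt-tpolecat is a one-liner in project/plugins.sbt. The version below is just an illustration; check the plugin's README for the current release, as its coordinates have changed over time:

addSbtPlugin("io.github.davidgregory084" % "sbt-tpolecat" % "0.4.2")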

.scalafmt.conf

Defining formatting rules later in the project lifecycle results in painful updates, so deciding on a nice set of rules in the beginning is crucial. Here are some rules I really like, but please remember that code style conventions are very subjective:

maxColumn                                 = 140
// breaks long class/def definitions into multiline
verticalMultiline.atDefnSite              = true 
// forces newline before and after implicit params
newlines.implicitParamListModifierForce   = [before, after] 
// only format files tracked by git
project.git                               = true 
// PreferCurlyFors: Replaces parentheses into curly braces in for comprehensions that contain multiple enumerator generators
// RedundantBraces, RedundantParens: Remove redundant braces, parens
rewrite.rules                             = [PreferCurlyFors, RedundantBraces, RedundantParens, SortImports]
// Add spaces next to curly braces in imports
spaces.inImportCurlyBraces                = true 
// more than standard align rules https://scalameta.org/scalafmt/docs/configuration.html#alignpresetmore
style                                     = defaultWithAlign 
// all infix operators can be exempted from applying continuation indentation
indentOperator.exemptScope                = all 
rewriteTokens {
  "⇒" = "=>"
  "→" = "->"
  "←" = "<-"
}
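To actually apply these rules, you'll typically also add the sbt-scalafmt plugin to project/plugins.sbt (version below is illustrative), which gives you tasks like scalafmtAll for local formatting and scalafmtCheckAll for CI:

addSbtPlugin("org.scalameta" % "sbt-scalafmt" % "2.5.0")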

.scala-steward.conf or renovate

Managing dependency updates is a complex process that requires regular attention and often follows project-specific rules. Nowadays, it can be automated with tools like Scala Steward or renovate, complemented by dependabot. Sometimes you may want to ignore particular libraries, configure titles for PRs, or limit the number or frequency of updates. Tweak your Scala Steward rules (see this reference) and prepare a well-defined config.
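A minimal .scala-steward.conf illustrating a few of these knobs (the values are just examples; see the reference above for all available options):

# open update PRs at most once a week
pullRequests.frequency = "7 days"

# don't touch the Scala compiler itself
updates.ignore = [ { groupId = "org.scala-lang" } ]

# cap how many PRs Scala Steward creates or updates per run
updates.limit = 5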

publish settings

In multi-module sbt projects, it's often very project-specific which modules should be:

  • Published as artifacts
  • Published as docker images
  • Published with scaladoc
  • Not published

My starting build.sbt almost always benefits from:

  • A base list of resolvers
  • A path to ivy local credentials
  • Publishing configuration
  • Initial docker settings, like the base image or package name (see the example at the end of this section)
  • Compile / packageDoc / publishArtifact := false in commonSettings for all modules, to skip scaladoc, which saves a significant amount of build time.
  • A set of "no publish" settings:

lazy val noPublishSettings = Seq(
  publish := {},
  publishLocal := {}
)

lazy val dockerNoPublishSettings = Seq(
  Docker / publishLocal := {},
  Docker / publish := {}
)

which can be later used as

.settings(noPublishSettings)
.settings(dockerNoPublishSettings)

on modules that are not supposed to be published.
Prepare your publishing setup in advance so that minimum additional work is needed in future repositories following your template.
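As an example of the docker-related settings mentioned above, with sbt-native-packager the defaults could be set once in commonSettings or a shared val (the values here are purely illustrative):

lazy val dockerSettings = Seq(
  // name of the published image
  Docker / packageName := "my-service",
  // base image with the JDK/JRE you deploy with
  dockerBaseImage := "eclipse-temurin:17-jre",
  // also tag the image as "latest" on publish
  dockerUpdateLatest := true
)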

.mergify.conf or automerge

There are various cases where you may want to merge your PRs automatically. For example:

  • automatically merge approved PRs published by a dependency bumping bot
  • automatically merge PRs affecting certain files

See an example .mergify.conf with some rules we use in softwaremill-sbt. Keep in mind that mergify is free only for open source code, so GitHub automerge or renovate may be a better choice for your project. Here's an example GitHub Actions automerge workflow:

# Enables automatic merge of master -> PR for Scala Steward PRs
# created by user 'github-actions', when all checks
# pass (including review approvals)
name: automerge
on:
  pull_request:
    types:
      - labeled
      - unlabeled
      - synchronize
      - edited
      - ready_for_review
      - reopened
      - unlocked
  pull_request_review:
    types:
      - submitted
  check_suite:
    types:
    - completed
jobs:
  automerge:
    runs-on: ubuntu-latest
    if: github.event.pull_request.user.login == 'github-actions'
    steps:
      - name: automerge
        uses: "pascalgn/automerge-action@v0.15.3"
        env:
          GITHUB_TOKEN: "${{ secrets.GITHUB_TOKEN }}"
          MERGE_FILTER_AUTHOR: "github-actions"
          MERGE_METHOD: "squash"
          MERGE_LABELS: ""
          MERGE_RETRIES: "15"
          MERGE_RETRY_SLEEP: "20000"
          UPDATE_LABELS: ""
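For comparison, if you do go with Mergify, an equivalent rule in its configuration file might look roughly like this (the author name and the CI check name are assumptions that depend on your setup):

pull_request_rules:
  - name: automatically merge Scala Steward PRs when CI passes
    conditions:
      - author=scala-steward
      - check-success=ci
    actions:
      merge:
        method: squash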

GitHub actions and CODEOWNERS

I mentioned one example of a GitHub action that is a good candidate for a project template - automerge. Are there any other actions you consider crucial to add before contributors start to commit?

And speaking of contributors, I like to define a .github/CODEOWNERS file so that pull request reviewers can be automatically assigned based on this information. You don't need to specify individual people; the file can handle teams, like

* @my-organization/alpha-team

Essential libraries

Libraries aren't meant for project templates, but they are worth mentioning here as a checklist point when creating a new codebase. Many of them can be considered essential utils, and we often would like to add them to our project in advance. For example, in this video Vlad shows a great list: cats, refined, iron, chimney, ducktape, scala-newtype, monocle, quicklens, enumeratum, derevo, macwire, jam, PPrint, sbt-thank-you-stars.

Your starter pack may also include libraries for logging, metrics, tests, and other aspects common across your organization.

Caution! I'm not suggesting that you should have a plugin with "common libraries" reused across projects. Quite the opposite: I avoid such an approach, as it leads to terrible dependency issues, as well as a bloated module with a hundred megabytes of possibly unnecessary things. It's much better to maintain a checklist of suggestions that teams go through when initiating a new codebase.

TTLs for SNAPSHOTs

Developing using SNAPSHOT dependencies is justifiable in some circumstances. We can imagine two teams frequently exchanging an "api" artifact in the early prototype stage of establishing a communication contract. If that's your case, keep in mind that Coursier's default TTL is 24 hours, so your SNAPSHOT artifacts won't get re-fetched even if a new version is published. To shorten this period, pick one of two solutions:

  1. Put TTL configuration in your template (build.sbt):

     csrConfiguration := csrConfiguration.value.withTtl(Some(1.minute))

  2. Use the aforementioned .envrc file and add export COURSIER_TTL=60s (or any desired value), then restart your sbt session.

Please note that even if sbt picks up these settings and reloads SNAPSHOTs properly, IntelliJ seems to have problems noticing new artifacts. It may require a restart each time a new SNAPSHOT is fetched.

Bonus: nix

I briefly mentioned nix as a tool that can potentially enhance sbt-native-packager to create docker images with precisely the same jdk used for development and CI. Nix gives other possibilities, like defining entire environments of sdks, libraries, and system packages, which are loaded only when you are inside your project directory (nix shells). I will prepare a dedicated article about all this but don't hesitate to check "Nix all the things" by Paweł Szulc, who smoothly introduces nix to newcomers.

Summary

I hope this checklist will help you to prepare solid build configurations in the early stages of your projects. For additional ideas, look at sbt-softwaremill and sbt-template, which we maintain at SoftwareMill for our open-source projects. Still, I am sure there may be numerous points I don't know about, so if you have any suggestions on what I could add or update, please comment!

Reviewed by: Adam Bartosik, Łukasz Lenart, Michał Matłoka, Michał Ostruszka, Adam Warski
