Structuring ZIO 2 applications

07 Jun 2022, 17 minutes read

I've been interested in various approaches to dependency injection for a long time: first as a user of annotation- and reflection-based approaches, then as their critic, and finally exploring simpler alternatives, such as "just using constructors and parameters", optionally automated at compile time with libraries such as macwire.

ZIO, a library that aims to enable writing large-scale business applications, with large teams, using functional programming and Scala, offers its own unique approach, so I've been following its development closely. In ZIO 1, the "best practice" for managing and passing dependencies, as advocated by the library authors and the library code itself, was making extensive use of concepts known as "layers" and "environment". I've never been a particular fan of this approach, which I've written about in a separate article.

However, the ZIO 2 release brings major changes both to the "environment" and "layer" features and to the recommended way of structuring ZIO 2 applications. Although I've already covered a lot of these changes, the approach has evolved further since then, making a code-first case study worthwhile and interesting to explore.

In particular, since I still didn't understand the benefits that "environment" and "layers" might bring to the problem at hand, I took my doubts to Twitter. John de Goes (ZIO project lead) and other community members were quick to address them. In the end, I got a set of "rules of thumb" which, although not yet codified in the ZIO 2 docs and scaladocs (which, I think, hadn't yet been updated ahead of the final release), served me as guiding principles in the short exercise that we'll explore below. Here's the 1-tweet summary, but of course the whole Twitter thread might be informative as well:

Use constructors to require dependencies, use layers to make dependencies, use environment to access context (scopes, transactions, etc.) that is eliminated with layers (since layers are also algebraic effect handlers).
— John A De Goes (@jdegoes) May 31, 2022

Note that what follows is only my interpretation of how ZIO 2 applications can be structured. It might not be the best way, and it might differ from the "official" recipe, if there ever is one.

Example ZIO 2 application

I think it's best to explore and discuss such topics having some code at hand, which, even if the example is small, might give you some intuition as to how the ZIO 2 approach to managing dependencies feels in real life.

Our example will be quite minimalistic: we'll write an application that allows us to register cars, keeping the invariant that only a single car with a given license plate may exist in the database at any given time. There will be a couple of components, which try to mimic how a business application might be structured:

api: parses user requests and passes control to the service
service: implements the business logic of registering a car: checking in the repository if there's already an entry for the given license plate, and if not, inserting a new one
repository: implements the logic of querying and writing to the database. Our example code will only pretend to implement access to a relational database, though
database: contains methods that allow running a set of database-interacting code in a transaction, using a connection obtained from the connection pool
connection pool: contains methods to obtain & release a connection

The control flow is quite linear, without many conditionals and error scenarios (still, there are some, so the example isn't completely trivial), which should make the code easy enough to explore. Let's start!

TL;DR

If you'd like to jump straight to the code, it's available in two compact files on GitHub. Try running this locally and let me know (through comments / PRs) if you think this "template" can be somehow improved!

ZIO refresher

We won't introduce ZIO, the central data structure of the ZIO library, in much detail here, as there are numerous excellent articles on that subject (see e.g. the ZIO docs). If you're familiar with other functional programming libraries: the ZIO datatype is a lazily evaluated description of a potentially asynchronous and side-effecting computation (like the IO monad), with three type parameters. A good intuition for a ZIO[R, E, A] value is a function R => async Either[E, A]. The type parameters stand for:

R - the environment, which is needed to run the computation
E - the type of the errors that might be reported by the computation
A - the type of the result of the computation
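
To make the R => async Either[E, A] intuition a bit more tangible, here's a small sketch in plain Scala, with Future standing in for "async". This is only a mental model and not how ZIO is implemented; the Effect alias, Limits and parseNumber are names made up for this illustration.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.*

// A simplified stand-in for ZIO[R, E, A] (an assumption for illustration only).
type Effect[R, E, A] = R => Future[Either[E, A]]

// The "environment" needed to run the computation.
final case class Limits(maxDigits: Int)

// R = Limits, E = String, A = Int
def parseNumber(input: String): Effect[Limits, String, Int] =
  limits =>
    Future {
      input.toIntOption match
        case None                                       => Left(s"Not a number: $input")
        case Some(_) if input.length > limits.maxDigits => Left(s"Too long: $input")
        case Some(n)                                    => Right(n)
    }

@main def effectModelDemo(): Unit =
  // Only a description so far: nothing runs until the environment is supplied.
  val effect = parseNumber("42")
  println(Await.result(effect(Limits(maxDigits = 3)), 1.second))
```

Supplying the Limits value plays the role of providing the environment; the Left and Right cases correspond to the error and result channels.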

The connection pool

We'll start from the bottom, adding one component at a time and explaining the concepts as we go. As we are only pretending to use a real relational database, our connections will be pretend ones as well, each just a case class with an id:

case class Connection(id: String)

The pool will contain a mutable vector of available connections. To protect access to our mutable state, we'll use Ref from ZIO. One of its operations is .modify, which allows performing an atomic update to the current state; the result is a purely functional ZIO value: a description of the modification operation. As we are simulating a real-world connection pool, exceptions might be thrown in the process, hence we declare that this might end with an error of type Throwable:

class ConnectionPool(r: Ref[Vector[Connection]]):
  // Task is a type alias for ZIO: 
  // type Task[+A] = ZIO[Any, Throwable, A]
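  def obtain: Task[Connection] = r
    .modify[Task[Connection]] {
      case h +: t => (ZIO.succeed(h), t)
      case cs     =>
        // Fail in the typed error channel: an exception thrown inside
        // modify would surface as a defect rather than as a Throwable error
        (ZIO.fail(new IllegalStateException("No connection available!")), cs)
    }
    .flatten
    .tap(c => ZIO.logInfo(s"Obtained connection: ${c.id}"))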

  def release(c: Connection): Task[Unit] =
    r.modify(cs => ((), cs :+ c))
     .tap(_ => ZIO.logInfo(s"Released connection: ${c.id}"))

To see what's happening in our example application, we'll be doing quite a lot of logging. Here we are using the logging integration that's built into ZIO. Moreover, we're glossing over the situation where we run out of connections. In reality, rather than failing immediately, we'd want to back-pressure the application in some way (at least initially).
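
One way to get such back-pressure, sketched here as an assumption rather than as part of the example application, is to keep the idle connections in a ZIO Queue: take suspends the calling fiber (without blocking a thread) until a connection is offered back. BlockingConnectionPool and its make method are hypothetical names.

```scala
import zio.*

case class Connection(id: String)

// Hypothetical alternative to the Ref-based pool: the queue holds idle connections.
class BlockingConnectionPool(idle: Queue[Connection]):
  // Suspends the calling fiber until a connection is available.
  def obtain: UIO[Connection] = idle.take
  def release(c: Connection): UIO[Unit] = idle.offer(c).unit

object BlockingConnectionPool:
  def make(connections: Connection*): UIO[BlockingConnectionPool] =
    for
      q <- Queue.unbounded[Connection]
      _ <- q.offerAll(connections)
    yield BlockingConnectionPool(q)

// A pool with a single connection: the second obtain has to wait for the release.
val queuePoolDemo: UIO[String] =
  for
    pool  <- BlockingConnectionPool.make(Connection("conn1"))
    c1    <- pool.obtain
    fiber <- pool.obtain.fork // no connection left, so this fiber waits
    _     <- pool.release(c1)
    c2    <- fiber.join
  yield c2.id

object QueuePoolDemo extends ZIOAppDefault:
  def run = queuePoolDemo.flatMap(id => Console.printLine(s"Second obtain got: $id"))
```

A real pool would probably also want a timeout on obtain, so that callers don't wait forever when connections leak.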

How do we create an instance of our ConnectionPool? Here's where ZIO offers its unique approach. There's a dedicated datatype, ZLayer[RIn, E, ROut], which describes how to create ROut objects, given RIn dependencies, with E errors that might occur. This construction might take the form of simple object instantiation, or it might have side-effects, be asynchronous, or use some kind of resources that need to be released after the construction is done.
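
To illustrate the resourceful case, here's a minimal sketch of a layer whose construction acquires a resource and releases it once the layer is no longer needed. HttpClient and the events log are made up for this illustration; ZLayer.scoped and ZIO.acquireRelease are the ZIO 2 building blocks.

```scala
import zio.*

// Hypothetical resource, used only in this illustration.
class HttpClient(val name: String)

// The layer records when the client is opened and closed.
def clientLayer(events: Ref[Vector[String]]): ZLayer[Any, Nothing, HttpClient] =
  ZLayer.scoped {
    ZIO.acquireRelease(
      events.update(_ :+ "open").as(HttpClient("client"))
    )(_ => events.update(_ :+ "close"))
  }

val layerDemo: UIO[Vector[String]] =
  for
    events <- Ref.make(Vector.empty[String])
    use     = ZIO.service[HttpClient].flatMap(c => events.update(_ :+ s"use ${c.name}"))
    // The layer is built before `use` runs and torn down right after it completes.
    _      <- use.provideLayer(clientLayer(events))
    log    <- events.get
  yield log

object LayerDemo extends ZIOAppDefault:
  def run = layerDemo.flatMap(log => Console.printLine(log.mkString(", ")))
```

The recorded order is open, use client, close: the release action runs when the effect that the layer was provided to finishes.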

As in ZIO 2 we're supposed to "use layers to construct dependencies", we'll couple each component with a layer describing how to construct it, placed in the companion object. In the case of ConnectionPool, this is non-trivial, as allocating a Ref is an effectful operation. We'll also prime our connection pool with three connections:

object ConnectionPool:
  lazy val live: ZLayer[Any, Nothing, ConnectionPool] =
    ZLayer(
      Ref
        .make(Vector(
            Connection("conn1"), 
            Connection("conn2"), 
            Connection("conn3"))
        )
        .map(ConnectionPool(_))
    )

This layer will be used to create the dependency graph at "the end of the world", that is in the main method that will be used to run our application.

The database

The DB object will focus on another interesting aspect of the ZIO datatype: the environment. While in ZIO 1, it was commonly used to pass around any kind of dependencies, in ZIO 2, we'll be using the environment to access "context". Of course, this is still quite vague, but we'll try to be more precise about what "context" means later. In this particular example, this will be the transactional context.

Transactions might seem a mere technicality, but they implement important business logic. The fact that some operations are guaranteed to be written atomically, leaving the database in a consistent state, might have important business implications. Hence tracking this explicitly, instead of implicitly as is often the case in Java frameworks, is an important and valuable feature.

In our example, the transactional context will be represented by the Connection object. That is, any ZIO description of a computation, which needs to be run within a transaction, will contain Connection as part of its environment. The intuition here is that such a computation might access the connection, which has an ongoing transaction opened, to perform some operations on the database.

The sole public method of the DB object will be a way of running some computations inside a transaction:

def transact[R, E, A](
    dbProgram: ZIO[Connection & R, E, A]): ZIO[R, E | Throwable, A]

The description of these computations—the dbProgram parameter—has the transactional context present in its environment. On the other hand, the returned computation description eliminates this context. It should contain the following steps, required to run a single transaction:

obtaining a connection from the connection pool
starting the transaction
running the dbProgram computations inside the transaction
committing, or rolling back the transaction
releasing the connection back to the pool

Note that we accept computations that require any additional context beyond Connection, as well as ones that have arbitrary error and result types. Since we are interacting with the ConnectionPool, errors of type Throwable might occur, hence we add it to the error type.

You might have noticed the usage of & and | types, that is intersection and union types. These are new additions in Scala 3 and behave as you would expect them to: in case of context, we have access both to Connection and the R environment. In case of errors, the computation might end with E or Throwable errors.
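
If you haven't used them before, here's a tiny, ZIO-free sketch of both kinds of types. All of the names (HasConnection, HasUser, NotFound) are hypothetical and exist only for this illustration.

```scala
trait HasConnection:
  def connectionId: String

trait HasUser:
  def userName: String

// Intersection: the argument has to provide both capabilities.
def describe(ctx: HasConnection & HasUser): String =
  s"${ctx.userName} on ${ctx.connectionId}"

final case class NotFound(key: String)

// Union: the value is one of the two alternatives, told apart by pattern matching.
def render(error: NotFound | Throwable): String = error match
  case NotFound(key) => s"not found: $key"
  case t: Throwable  => s"exception: ${t.getMessage}"

@main def typesDemo(): Unit =
  val ctx = new HasConnection with HasUser:
    def connectionId = "conn1"
    def userName     = "adam"
  println(describe(ctx))
  println(render(NotFound("WN12345")))
  println(render(new RuntimeException("boom")))
```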

In the implementation of the transact method, a connection will become a resource, which needs to be acquired before use. Its release will be guaranteed by the library, whatever the outcome of the logic that uses it. In ZIO 2, implementation of resources also takes advantage of the environment, by tracking the scope in which a resource can be used with a Scope value in the environment. Here's the full implementation:

class DB(connectionPool: ConnectionPool):
  private def connection: ZIO[Scope, Throwable, Connection] =
    ZIO.acquireRelease(connectionPool.obtain)(c =>
      connectionPool
        .release(c)
        .catchAll(t => ZIO.logErrorCause(
            "Exception when releasing a connection", Cause.fail(t)))
    )

  def transact[R, E, A](
      dbProgram: ZIO[Connection & R, E, A]): ZIO[R, E | Throwable, A] =
    ZIO.scoped {
      connection.flatMap { c =>
        dbProgram.provideSomeLayer(ZLayer.succeed(c))
      }
    }

The connection method defines the resource, which can be created using acquire & release functions, here coming from the ConnectionPool that we defined earlier. transact uses ZIO.scoped to eliminate the Scope from the environment. This method guarantees that the appropriate release logic will be run, as part of the computation description that is being returned.
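
The release guarantee can be observed in isolation with a small sketch that records events in a Ref. The resource here is just a string and the log is made up for the illustration; the point is that the release action runs even though the logic using the resource fails.

```scala
import zio.*

// A pretend resource: acquiring and releasing only append to the log.
def resource(log: Ref[Vector[String]]): ZIO[Scope, Nothing, String] =
  ZIO.acquireRelease(log.update(_ :+ "acquire").as("conn1"))(_ =>
    log.update(_ :+ "release"))

// Uses the resource and then fails; ZIO.scoped closes the scope regardless.
def useAndFail(log: Ref[Vector[String]]): IO[String, Nothing] =
  ZIO.scoped {
    resource(log).flatMap(c => log.update(_ :+ s"use $c") *> ZIO.fail("boom"))
  }

val scopedDemo: UIO[Vector[String]] =
  for
    log <- Ref.make(Vector.empty[String])
    _   <- useAndFail(log).either // turn the failure into a value so we can continue
    out <- log.get
  yield out

object ScopedDemo extends ZIOAppDefault:
  def run = scopedDemo.flatMap(log => Console.printLine(log.mkString(", ")))
```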

Finally, we're eliminating the usage of Connection in the passed dbProgram by providing a simple layer with the obtained connection. In reality, we would have to start the transaction before passing in the layer to dbProgram, and commit or roll back after it completes.
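
As a rough idea of what that missing part could look like, here's a sketch that wraps the program with begin, commit and rollback steps. The Tx class and its three operations are hypothetical stand-ins for real driver calls and only record what happened; a production version would also have to guard the gap between begin and the program against interruption.

```scala
import zio.*

case class Connection(id: String)

// Hypothetical transaction operations; they only append to a log.
class Tx(log: Ref[Vector[String]]):
  def begin(c: Connection): UIO[Unit]    = log.update(_ :+ s"begin ${c.id}")
  def commit(c: Connection): UIO[Unit]   = log.update(_ :+ s"commit ${c.id}")
  def rollback(c: Connection): UIO[Unit] = log.update(_ :+ s"rollback ${c.id}")

  def inTransaction[R, E, A](c: Connection)(
      dbProgram: ZIO[Connection & R, E, A]): ZIO[R, E, A] =
    begin(c) *>
      dbProgram
        .provideSomeLayer[R](ZLayer.succeed(c))
        .onExit {
          case Exit.Success(_) => commit(c)   // the program succeeded
          case Exit.Failure(_) => rollback(c) // failure, defect or interruption
        }

val txDemo: UIO[Vector[String]] =
  for
    log <- Ref.make(Vector.empty[String])
    tx   = Tx(log)
    ok   = ZIO.service[Connection].unit
    bad  = ZIO.service[Connection] *> ZIO.fail("boom")
    _   <- tx.inTransaction[Any, Nothing, Unit](Connection("c1"))(ok)
    _   <- tx.inTransaction[Any, String, Nothing](Connection("c2"))(bad).either
    out <- log.get
  yield out
```

The first program ends with a commit and the second with a rollback, which is the behaviour we'd expect transact to provide around dbProgram.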

One more thing that we need to implement for DB is an accompanying description of how to create an instance of that class. As we have a single dependency, without any effectful allocation logic necessary, we'll simply "lift" the constructor to a layer (note that in Scala 3, the new keyword is no longer necessary):

object DB:
  lazy val live: ZLayer[ConnectionPool, Nothing, DB] = 
    ZLayer.fromFunction(DB(_))

The repository

The next component implements interactions with the database, for the storage unit in which we persist the car data. Again, we will only pretend to implement this functionality using some dummy logic. However, in both of the exists and insert methods, we'll extract the Connection from the environment (the current transactional context):

class CarRepository():
  def exists(licensePlate: String): ZIO[Connection, Nothing, Boolean] =
    ZIO
      .service[Connection]
      .map(_ => /* perform the check */ licensePlate.startsWith("WN"))
      .tap(_ => ZIO.logInfo(s"Checking if exists: $licensePlate"))

  def insert(car: Car): ZIO[Connection, Nothing, Unit] =
    ZIO
      .service[Connection]
      .map(_ => /* perform the insert */ ())
      .tap(_ => ZIO.logInfo(s"Inserting car: $car"))

object CarRepository:
  lazy val live: ZLayer[Any, Nothing, CarRepository] = 
    ZLayer.succeed(CarRepository())

ZIO.service[Connection] creates a description of a computation that depends on a Connection being in the environment (this requirement is then propagated to the final type). The layer describing the creation process of a repository is trivial, as this component has no dependencies.
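
In isolation, the mechanism looks as follows. This is a minimal sketch; currentConnectionId and provided are names invented for the illustration.

```scala
import zio.*

case class Connection(id: String)

// Requires a Connection in the environment; nothing is looked up until the effect runs.
val currentConnectionId: ZIO[Connection, Nothing, String] =
  ZIO.service[Connection].map(_.id)

// Providing a value eliminates the requirement, which is visible in the type.
val provided: ZIO[Any, Nothing, String] =
  currentConnectionId.provideLayer(ZLayer.succeed(Connection("conn1")))

object ServiceDemo extends ZIOAppDefault:
  def run = provided.flatMap(id => Console.printLine(id))
```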

Note that we don't need to depend on DB here, as we are not running any transactions. Instead, we create transaction fragments, which is expressed by the Connection environmental requirement.

The service

In the CarService, we'll implement the main business logic: checking if a car with the given license plate exists and either reporting a custom error (represented with an instance of LicensePlateExistsError), or performing an insert. We'll also run both exists and insert descriptions in a single transaction by eliminating the Connection context on the combined program description:

class CarService(carRepository: CarRepository, db: DB):
  def register(
      car: Car): ZIO[Any, Throwable | LicensePlateExistsError, Unit] =
    db.transact {
      carRepository.exists(car.licensePlate).flatMap {
        case true  => ZIO.fail(LicensePlateExistsError(car.licensePlate))
        case false => carRepository.insert(car)
      }
    }

object CarService:
  lazy val live: ZLayer[CarRepository & DB, Nothing, CarService] =
    ZLayer.fromFunction(CarService(_, _))

The register method might report two kinds of errors: either exceptions coming from interactions with the database or the custom LicensePlateExistsError error. The CarService layer is once again quite trivial, simply lifting the 2-argument constructor.

The API

Finally, we've got the API layer, which will parse incoming user requests and call the service if possible. For simplicity, we'll use simple text-based inputs. We'll also handle any errors that might occur during the invocation of the service, logging them and returning sanitized output to the user:

class CarApi(carService: CarService):
  def register(input: String): ZIO[Any, Nothing, String] =
    input.split(" ", 3).toList match
      case List(f1, f2, f3) =>
        val car = Car(f1, f2, f3)
        carService.register(car)
          .as("OK: Car registered").catchAll {
          case _: LicensePlateExistsError =>
            ZIO
              .logError(s"Duplicate register: $car")
              .as("Bad request: duplicate")
          case t =>
            ZIO
              .logErrorCause(s"Cannot register: $car", Cause.fail(t))
              .as("Internal server error")
        }
      case _ => ZIO.logError(s"Bad request: $input")
        .as("Bad Request")

object CarApi:
  lazy val live: ZLayer[CarService, Nothing, CarApi] = 
    ZLayer.fromFunction(CarApi(_))
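
The snippets above use Car and LicensePlateExistsError without showing their definitions. A minimal sketch that is consistent with how they are used might look as follows; the field names make and model are assumptions (only licensePlate appears in the code above), and parse is a hypothetical helper that isolates the input-splitting step.

```scala
// `make` and `model` are assumed names; only `licensePlate` is used in the article's code.
case class Car(make: String, model: String, licensePlate: String)

// A plain domain error (not an exception), so it is a separate member of the error union.
case class LicensePlateExistsError(licensePlate: String)

// Splits the input into at most three space-separated fields.
def parse(input: String): Option[Car] =
  input.split(" ", 3).toList match
    case List(make, model, plate) => Some(Car(make, model, plate))
    case _                        => None

@main def parseDemo(): Unit =
  println(parse("Toyota Corolla WE98765")) // three fields: a Car
  println(parse("Tesla"))                  // too few fields: None
```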

Bringing it all together

The final piece of the puzzle is to compose all the layers to obtain a CarApi instance with which we can interact. That's the "end of the world", where we create the full object graph. If you're coming from Java, that's what Spring is doing when the application starts up!

We could do this by hand, combining the live layers that we have defined in the companion objects. Or we could use ZLayer.make, which is a macro doing this for us (hence everything happens at compile-time, generating the appropriate code). It takes the list of layers that it might use, and tries to create the given target instance using them. If that's not possible, or if there are some unused layers, a compile-time error/warning is reported (try for yourself!). Here's the Main object implementation, along with some test code:

object Main extends ZIOAppDefault:
  override def run: ZIO[Scope, Any, Any] =
    def program(api: CarApi): ZIO[Any, IOException, Unit] = for {
      _ <- api.register("Toyota Corolla WE98765")
        .flatMap(Console.printLine(_))
      _ <- api.register("VW Golf WN12345")
        .flatMap(Console.printLine(_))
      _ <- api.register("Tesla")
        .flatMap(Console.printLine(_))
    } yield ()

    ZLayer
      .make[CarApi](
        CarApi.live,
        CarService.live,
        CarRepository.live,
        DB.live,
        ConnectionPool.live
      )
      .build
      .map(_.get[CarApi])
      .flatMap(program)
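
For comparison, here's a sketch of what composing the layers by hand could look like, next to the ZLayer.make version. The classes are stubs that only mirror the dependency shape of our components, and the path method is a hypothetical helper that makes the resulting wiring visible.

```scala
import zio.*

// Stub components: no logic, only constructor dependencies.
class ConnectionPool:
  def path: String = "pool"
class DB(pool: ConnectionPool):
  def path: String = s"db(${pool.path})"
class CarRepository:
  def path: String = "repo"
class CarService(repo: CarRepository, db: DB):
  def path: String = s"service(${repo.path}, ${db.path})"
class CarApi(service: CarService):
  def path: String = s"api(${service.path})"

val poolLayer: ULayer[ConnectionPool]                     = ZLayer.succeed(ConnectionPool())
val dbLayer: URLayer[ConnectionPool, DB]                  = ZLayer.fromFunction(DB(_))
val repoLayer: ULayer[CarRepository]                      = ZLayer.succeed(CarRepository())
val serviceLayer: URLayer[CarRepository & DB, CarService] = ZLayer.fromFunction(CarService(_, _))
val apiLayer: URLayer[CarService, CarApi]                 = ZLayer.fromFunction(CarApi(_))

// By hand: >>> feeds the output of one layer into the next, ++ combines outputs.
val manual: ULayer[CarApi] =
  ((poolLayer >>> dbLayer) ++ repoLayer) >>> serviceLayer >>> apiLayer

// The same graph, assembled by the macro at compile time.
val derived: ULayer[CarApi] =
  ZLayer.make[CarApi](apiLayer, serviceLayer, repoLayer, dbLayer, poolLayer)

val wiringDemo: UIO[(String, String)] =
  ZIO.serviceWith[CarApi](_.path).provideLayer(manual) <*>
    ZIO.serviceWith[CarApi](_.path).provideLayer(derived)
```

Both variants produce the same object graph; the manual one just forces us to spell out the order and shape of the composition.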

Layers as a dependency injection mechanism

Now that our example is complete, we can try to take a closer look at some of the novel mechanisms that it introduces. Firstly, we've got layers to implement dependency injection. This approach has three main characteristics:

to express dependencies, we use constructors and constructor parameters
to describe the creation process of a dependency (possibly asynchronous/side-effecting), we create a layer in the companion object of the dependency
at the end of the world, we compose the layers to obtain the entry point to our system

Compared to the "only constructors" approach, we do get the additional requirement to create a layer instance for each dependency in our system. On one hand, this might be viewed as boilerplate. Especially for the services where construction is trivial (such as the repository, service, DB, and API in our example), this is simply some mechanical work that needs to be done for each component. On the other hand, the creation logic for a component is nicely encapsulated alongside the component's implementation. It might also involve allocating and releasing resources or side-effects if needed, which is not possible with ordinary constructors.

ZLayer.make plays a role similar to macwire: it automates the boring process of assembling the dependency graph from the layers. Using only constructors, without any libraries, we'd need to create all the instances by hand, in the proper order, possibly handling side-effecting creation logic within the Main object. This might end up messy.
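
Here's a sketch of that "constructors only" alternative, with stub classes mirroring our components and a pretend pool that needs to be closed. Everything in it (the events buffer, close, withApi) is hypothetical and serves only to show the manual ordering and cleanup.

```scala
import scala.collection.mutable.ListBuffer

// Records side effects so that the ordering is visible.
val events = ListBuffer.empty[String]

class ConnectionPool:
  events += "pool opened" // side-effecting construction
  def close(): Unit =
    events += "pool closed"
    ()

class DB(pool: ConnectionPool)
class CarRepository
class CarService(repo: CarRepository, db: DB)
class CarApi(service: CarService):
  def register(input: String): String = s"registered: $input"

// Manual wiring: instances created in dependency order, cleanup done by hand.
def withApi[A](use: CarApi => A): A =
  val pool = ConnectionPool()
  try
    val db      = DB(pool)
    val repo    = CarRepository()
    val service = CarService(repo, db)
    use(CarApi(service))
  finally pool.close()

@main def manualWiringDemo(): Unit =
  println(withApi(_.register("Toyota Corolla WE98765")))
  println(events.mkString(", "))
```

With a single resource this is still manageable; with several resources that depend on each other, the nesting of try/finally blocks grows quickly.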

However, I suspect that in many applications, most of the components will be stateless, for which the layers are trivial. Only a few will require side-effecting acquire/release logic (such as DB connection pools or HTTP clients). Will the additional boilerplate required by the need to create a layer for every single dependency pull its weight? This remains to be seen by gathering experiences of developers using ZIO 2 for production apps.

One possible improvement is to extend ZLayer.make in the direction that autowire explores, that is, automatically creating the "trivial" instances, where a simple constructor call is needed, without the necessity to explicitly pass it to the macro. Maybe a similar feature of automatically inferring "trivial" layers would be useful here?

Overall, ZIO 2 provides a simple and clean recipe as to how to handle dependencies.

Environment as a means to provide context

In ZIO 2, the environment is used to access the context in which a computation runs. We've seen two examples of this:

scoping of resource usage, eliminated with ZIO.scoped, which guarantees that the release logic for all resources is run
running computations in the context of an open transaction, eliminated using DB.transact, which manages the transaction's lifecycle

The latter approach is not novel: both the DBIOAction data type from Slick and ConnectionIO from Doobie implement transaction management in a similar way. There is one crucial difference, however: they needed to introduce a dedicated data type. In ZIO, the same is achieved using the same ZIO as everywhere else, just with a different environment. That's a huge difference, both in the way new contexts can be introduced and in the wealth of operators that are available out of the box in the ZIO data type.

An important distinction to make is recognizing when something is a dependency, when something is a contextual value, and when it's a method invocation parameter. The first case is the simplest, we probably all know it intuitively; dependencies, wired through constructors, are usually global and often singletons. The decision when to use a contextual value or parameter, which are both local and request-dependent, is not as clear-cut. In general, context is for cases when it's useful for the value to be automatically propagated across method calls, much like Scala's implicits. Contextual values are also usually not directly created or manipulated by application code. Instead, they are often introduced and eliminated using library code.

Finally, it's also worth noting that not all context should be explicitly tracked. Things such as correlation ids or traces are very useful for observability, but the smaller their surface area in code, the better. While it is useful to explicitly track the usages of Scope and Connection, as these directly influence the implementation of the business requirements, logging & tracing is best left to be propagated implicitly.

And ZIO has a mechanism for that: fiber-locals (the FiberRef data type), which are the thread-local equivalent for fibers. The simplest example of their usage is creating logging spans using ZIO.logSpan. This will automatically add the span name (with timings) to all logs produced by the wrapped effect.
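
Here's a short sketch of both mechanisms. The correlation id is a made-up example of implicitly propagated context; FiberRef.make, locally and ZIO.logSpan are the ZIO 2 operations used.

```scala
import zio.*

val fiberLocalDemo: UIO[(String, String)] =
  ZIO.scoped {
    for
      // A fiber-local value with a default; creating it requires a Scope.
      correlationId <- FiberRef.make("none")
      // Inside `locally` the value is overridden; logs additionally carry the span.
      inside <- correlationId.locally("req-42") {
                  ZIO.logSpan("register") {
                    ZIO.logInfo("Registering car") *> correlationId.get
                  }
                }
      // Outside, the default is visible again.
      outside <- correlationId.get
    yield (inside, outside)
  }

object FiberLocalDemo extends ZIOAppDefault:
  def run = fiberLocalDemo.flatMap { case (inside, outside) =>
    Console.printLine(s"inside: $inside, outside: $outside")
  }
```

Note that nothing in the type of the wrapped effect changes: unlike Scope or Connection, this kind of context is not tracked in the environment.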

Summing up

Take a look at the code: does this way of structuring applications appeal to you? Can you imagine writing a larger app using this approach? Or maybe you see a better way of using ZIO 2 to achieve the same goal? Create a PR, an issue or comment :)

My subjective take would be that ZIO 2 strikes a good balance of code readability, developer ergonomics, and flexibility, all in a purely functional programming setting. I'm not yet entirely sold on layers (vs "just" using constructors), but the evolution of both layers and environments shows the desire to simplify the way ZIO apps are written, which is the right goal for such a library.

Finally, things will probably never get perfect, and what we've seen so far should definitely be good enough to build your next microservice using FP & Scala. Now we just need to build up the ecosystem. tapir is already there!
