Java records vs Kotlin data classes

Records and data classes are two very similar kinds of classes in Java and Kotlin. Despite having almost the same look and feel, they differ in a couple of ways. Recently, I had a chance to try out records after writing exclusively in Kotlin for almost a year. In this article I would like to share my thoughts and show how the two relate to each other.

tl;dr;

Records in Java have extra limitations that Kotlin data classes do not. Java has no named arguments, so large records are inconvenient to construct. Records cannot contain extra instance fields or mutable properties. Data classes also offer a copy() function.

Overview

Let’s take a closer look at what records and data classes look like.

Kotlin data classes

Data classes have been a part of Kotlin since its first stable release in 2016. Their main purpose is to represent structured data of any size:

data class Author(
   val id: AuthorId,
   val firstName: String,
   val lastName: String,
   val website: String,
   val birthDate: LocalDate
)

With this compact syntax, we get a lot of things for free:

a constructor that accepts all the fields as arguments,
a copy() function for making a full or partial copy of an object,
a useful toString() implementation,
useful equals() and hashCode() implementations,
getters (and, for var properties, setters) for all fields.

Java records

Records are still a very fresh feature of Java. They first appeared as a preview feature in Java 14 (March 2020). The final release followed a year later, with Java 16. At first sight, their syntax is very similar to Kotlin's:

public record Author(
   AuthorId id,
   String firstName,
   String lastName,
   String website,
   LocalDate birthDate
) { }

The Java compiler also takes care of creating all the typical elements that we would like to have in such a class. This includes the constructor, accessor methods, equals(), hashCode() and toString().
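To make the generated members more concrete, here is a minimal sketch. It uses a trimmed-down Author with a plain long id instead of AuthorId (an assumption made only to keep the example self-contained), and RecordDemo is a hypothetical name. It needs Java 16 or later.

```java
import java.time.LocalDate;

// Simplified stand-in for the Author record; the long id is an assumption.
record Author(long id, String firstName, String lastName, LocalDate birthDate) { }

class RecordDemo {
    public static void main(String[] args) {
        Author a = new Author(1L, "Jane", "Doe", LocalDate.of(1980, 1, 15));
        Author b = new Author(1L, "Jane", "Doe", LocalDate.of(1980, 1, 15));

        // Accessors are named after the components: firstName(), not getFirstName().
        System.out.println(a.firstName());

        // equals() and hashCode() take every component into account.
        System.out.println(a.equals(b));

        // toString() lists the component names and values.
        System.out.println(a);
    }
}
```

Note the accessor naming: records do not follow the JavaBeans getXxx() convention, which matters for tools that still expect it.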

Use case

In a typical application, there is a large number of classes which merely hold data. We use them to represent larger data structures. We often need temporary bags for two or three values inside algorithms. For a long time, creating such classes in Java required a lot of ceremony. Many of us still work with long listings of getters and setters. It was very easy to add a field and forget to include it in toString() or equals(). This is what the example presented above would look like in Java 8:

public class Author {
   private final AuthorId id;
   private final String firstName;
   private final String lastName;
   private final String website;
   private final LocalDate birthDate;

   public Author(
      AuthorId id,
      String firstName,
      String lastName,
      String website,
      LocalDate birthDate
   ) {
      this.id = id;
      this.firstName = firstName;
      this.lastName = lastName;
      this.website = website;
      this.birthDate = birthDate;
   }

   public AuthorId getId() {
      return id;
   }

   public String getFirstName() {
      return firstName;
   }

   public String getLastName() {
      return lastName;
   }

   public String getWebsite() {
      return website;
   }

   public LocalDate getBirthDate() {
      return birthDate;
   }

   public String toString() {
      return "Author(id = " + id + ", firstName = " + firstName + 
         ", lastName = " + lastName + ", website = " + website +
         ", birthDate = " + birthDate + ")";
   }

   public boolean equals(Object other) {
      if (this == other) {
         return true;
      }
      if (!(other instanceof Author)) {
         return false;
      }
      // Cast is required: fields are not accessible through an Object reference
      Author that = (Author) other;
      return Objects.equals(this.id, that.id) &&
         Objects.equals(this.firstName, that.firstName) &&
         Objects.equals(this.lastName, that.lastName) &&
         Objects.equals(this.website, that.website) &&
         Objects.equals(this.birthDate, that.birthDate);
   }

   public int hashCode() {
      // Since Java 7, there is Objects.hash() but it uses variadic args
      // and allocates a temporary array on the heap
      int hash = 7;
      hash = 31 * hash + (id == null ? 0 : id.hashCode());
      hash = 31 * hash + (firstName == null ? 0 : firstName.hashCode());
      hash = 31 * hash + (lastName == null ? 0 : lastName.hashCode());
      hash = 31 * hash + (website == null ? 0 : website.hashCode());
      hash = 31 * hash + (birthDate == null ? 0 : birthDate.hashCode());
      return hash;
   }
}

More than 70 lines to create a bag for 5 values. Both data classes and records try to shorten that, but their approaches are slightly different. Let’s see how!

Practical usage

Large structured data

So far, we have seen a small DTO with only 5 fields. However, many applications have to work with larger data structures that contain 10 or even 20 fields. Here we can see the first difference between data classes and records. Kotlin has named arguments, so creating an instance of even a large data class remains readable:

val instance = Article(
   id = id,
   name = template.name,
   publishDate = Instant.now(),
   author = author,
   slug = computeSlug(template.name),
   content = content,
   permalink = permalink,
   categories = categories,
   tags = tags
)

Sadly, Java does not have such a feature. For records, we get an all-args constructor, but we have to remember the position of each value. Although the IDE can display hints for argument names, they are not visible in, for example, a pull request preview. The usual workaround is to write an extra builder class by hand. Note also that, thanks to named arguments, there is rarely a need to write a builder in Kotlin. Here is the same example in Java for comparison:

var instance = new Article(
    id,
    template.name,
    Instant.now(),
    author,
    computeSlug(template.name),
    content,
    permalink,
    categories,
    tags
);
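The builder workaround could look roughly like the sketch below. The shortened Article record, its field types, and the names Builder, builder() and BuilderDemo are assumptions made for illustration, not code from a real project.

```java
import java.time.Instant;
import java.util.List;

// Hypothetical, shortened Article record; the field set is an assumption.
record Article(String id, String name, Instant publishDate, String author, List<String> tags) {

    // Hand-written builder: every call names the value it sets.
    static final class Builder {
        private String id;
        private String name;
        private Instant publishDate;
        private String author;
        private List<String> tags = List.of();

        Builder id(String id) { this.id = id; return this; }
        Builder name(String name) { this.name = name; return this; }
        Builder publishDate(Instant publishDate) { this.publishDate = publishDate; return this; }
        Builder author(String author) { this.author = author; return this; }
        Builder tags(List<String> tags) { this.tags = tags; return this; }

        // The positional constructor call now lives in exactly one place.
        Article build() {
            return new Article(id, name, publishDate, author, tags);
        }
    }

    static Builder builder() {
        return new Builder();
    }
}

class BuilderDemo {
    public static void main(String[] args) {
        Article article = Article.builder()
            .id("a-1")
            .name("Records vs data classes")
            .publishDate(Instant.now())
            .author("Jane Doe")
            .tags(List.of("java", "kotlin"))
            .build();
        System.out.println(article.name());
    }
}
```

The call site becomes readable again, but at the cost of boilerplate that has to be kept in sync with the record by hand.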

Internal fields

Kotlin data classes allow us to create extra internal fields. We do not do that too often, but there are some justified situations. For example, let’s say that we have a long-lived data structure with a large collection inside. We want to create two additional internal maps for quicker access:

data class LargeDataStructure(val collection: Collection<Item>) {
   private val byName: Map<String, Item> = collection.associateBy { it.name }
   private val byId: Map<String, Item> = collection.associateBy { it.id }
}

This is not possible in Java records. While we can declare static fields inside a record, we cannot add extra instance fields.
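A minimal Java sketch of what a record accepts and what it rejects. Item, MAX_ITEMS, findByName and InternalFieldsDemo are hypothetical names introduced for illustration.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical Item type, introduced only for this example.
record Item(String id, String name) { }

record LargeDataStructure(List<Item> collection) {
    // Allowed: static fields belong to the class, not to an instance.
    static final int MAX_ITEMS = 10_000;

    // Not allowed: uncommenting the next line is a compile-time error,
    // because records cannot declare instance fields.
    // private final Map<String, Item> byName = new HashMap<>();

    // Allowed: instance methods. Without a cached map, this lookup
    // scans the collection on every call.
    Optional<Item> findByName(String name) {
        return collection.stream()
            .filter(item -> item.name().equals(name))
            .findFirst();
    }
}

class InternalFieldsDemo {
    public static void main(String[] args) {
        LargeDataStructure data = new LargeDataStructure(
            List.of(new Item("1", "first"), new Item("2", "second")));
        System.out.println(data.findByName("second").isPresent());
    }
}
```

If the cached maps really matter, a regular class is probably a better fit than a record.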

Mutable data

Another difference is related to immutability. In Kotlin data classes, we can use both mutable and read-only fields, depending on whether we declare them with the val or var keyword. Java records offer no such choice: all record fields are implicitly final. Of course, this applies only to the object references stored in a record. The actual objects can be mutable or immutable.
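The difference between a final reference and an immutable object can be shown with a short sketch. Shallow, Defensive and MutabilityDemo are hypothetical names; copying in a compact constructor with List.copyOf() is one possible way to guard the record, not the only one.

```java
import java.util.ArrayList;
import java.util.List;

// Shallow immutability: the reference is final, the list behind it is not.
record Shallow(List<String> tags) { }

// The compact constructor replaces the argument with an unmodifiable copy.
record Defensive(List<String> tags) {
    Defensive {
        tags = List.copyOf(tags);
    }
}

class MutabilityDemo {
    public static void main(String[] args) {
        List<String> source = new ArrayList<>(List.of("java"));

        Shallow shallow = new Shallow(source);
        source.add("kotlin");                 // visible through the record
        System.out.println(shallow.tags());   // [java, kotlin]

        Defensive defensive = new Defensive(source);
        source.add("scala");                  // not visible: the record holds its own copy
        System.out.println(defensive.tags()); // [java, kotlin]
    }
}
```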

Copy function

Another feature of data classes that records lack is the copy() function. Kotlin generates it for us automatically. It helps us make a full or partial copy of the original object:

val copied = author.copy(website = "https://www.example.com")

It is very useful for working with immutable data structures, but it has a small side effect. Because of copy(), making the constructor of a data class private makes little sense. We can do it, but the constructor remains indirectly exposed through copy(). This undermines some defensive programming techniques. For example, we may want to enforce the use of truly immutable collections from Guava internally:

data class MyDto private constructor(val items: Collection<Item>) {
   companion object {
      fun create(items: Collection<Item>) = MyDto(ImmutableList.copyOf(items))
   }
}

val good = MyDto.create(listOf(Item()))
val hacked = good.copy(items = ArrayList())
val hackedCollection = hacked.items as MutableList<Item>
hackedCollection.add(Item()) // say 'bye' to immutability!

In this example we use a factory function to make sure that the underlying collection in MyDto is immutable. To do so, we make the data class constructor private. However, the constructor leaks out through the copy() function. We can use it to create an instance backed by a mutable ArrayList. Later we can cast items back to MutableList<Item> and add new elements to a collection that should not be touched.

Records, in turn, do not have a copy() function at all. Unfortunately, there is also a restriction that the canonical constructor of a record must be at least as accessible as the record itself, so a public record always has a public constructor. Therefore, if we want to make sure that the collection is immutable, we have to say so in the declared type:

data class MyDto(val items: ImmutableList<Item>)
public record MyDto(ImmutableList<Item> items) { }

Note that this is actually how Guava authors recommend using immutable collections.
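Since records come without copy(), a partial copy has to be written by hand. One common approach is a so-called wither method, sketched below. The simplified Author record and the names withWebsite and WitherDemo are assumptions made for illustration; the compiler does not generate anything like this.

```java
// Simplified Author record; the String id is an assumption.
record Author(String id, String firstName, String lastName, String website) {

    // Hand-written "wither": returns a copy with one component replaced.
    Author withWebsite(String newWebsite) {
        return new Author(id, firstName, lastName, newWebsite);
    }
}

class WitherDemo {
    public static void main(String[] args) {
        Author author = new Author("1", "Jane", "Doe", "https://old.example.com");
        Author copied = author.withWebsite("https://www.example.com");

        System.out.println(author.website()); // the original is unchanged
        System.out.println(copied.website());
    }
}
```

Unlike copy(), each wither covers a single component, so a record with many fields needs many such methods.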

In short…

Remember that the Kotlin standard library does not have truly immutable collections. Both records and data classes limit how far we can go with defensive programming techniques.

Conclusion

When it comes to functionality, Kotlin data classes are superior to Java records. On the other hand, Java is the lingua franca of the JVM ecosystem. Records offer interoperability between different JVM languages and tools. In fact, if our data class meets some conditions, we can tell the Kotlin compiler to turn it into a record:

@JvmRecord
data class Point(val x: Int, val y: Int)

My suggestions:

Java projects: we can use records, and we might still need to write regular beans in some cases,
Kotlin applications: we can use the full functionality of data classes,
Kotlin libraries: we should consider using @JvmRecord annotation on data classes for interoperability.
