musings

Handling default values in JSON with uPickle and tapir

This is a follow-up to my previous post that covered the basics of uPickle.

Choosing to build a Scala project with a bunch of libraries I've never used before has led, and continues to lead, me to new and unexpected challenges. As I'm still in the mood to talk JSON, I'm going to go over a subtle issue I didn't notice until I started generating more test data for this app: the response objects didn't always include fields with default values!

I didn't know whether the culprit was one piece of my stack or a combination of several.

So I decided to create a small script with uPickle on its own and then gradually reintroduce the other components until things broke again. That way I would know whose documentation to start digging through.

Writing JSON with uPickle

To get started, let's:

  1. Define an Artist case class that has an optional field with a default value
  2. Provide a ReadWriter[Artist] pickler to make Artist serializable
  3. Create some Artist instances, one of which will be missing the optional field
  4. Serialize the instances to see how they look as JSON
//> using dependency com.lihaoyi::upickle:4.1.0

import upickle.default.{ReadWriter, write}

// 1, 2
case class Artist(name: String, country: Option[String] = Some("USA")) derives ReadWriter

// 3
val artists = Seq(
  Artist("Wintersun", Some("Finland")),
  Artist("Darkest Hour"),
)

// 4
val artistsJson = write(artists)

As expected, the second Artist instance replaces its missing country with the default value:

artists(0).country // Some(Finland)
artists(1).country // Some(USA)

But the second object in the serialized artistsJson has no country field at all:

[
  {"name":"Wintersun","country":["Finland"]},
  {"name":"Darkest Hour"}
]

NOTE: uPickle serializes Option values as JSON arrays, hence ["Finland"] instead of "Finland"

The field is missing from the JSON output because when serializing objects, uPickle drops fields that are set to their default values. And it doesn't matter whether the field is optional like ours or not. The same thing would happen if we changed it to country: String = "USA".
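To check that claim, here is a minimal sketch that assumes the same uPickle 4.1.0 dependency as above. PlainArtist is a hypothetical variant of Artist that I'm introducing only for illustration; it swaps the optional field for a plain String with a default. The sketch also reads the trimmed JSON back to show that the default is restored on the way in:

```scala
//> using dependency com.lihaoyi::upickle:4.1.0

import upickle.default.{ReadWriter, read, write}

// Hypothetical variant of Artist: the default-valued field is a plain String
case class PlainArtist(name: String, country: String = "USA") derives ReadWriter

// country equals its default, so it is left out of the output
val omitted = write(PlainArtist("Darkest Hour"))
// {"name":"Darkest Hour"}

// country differs from its default, so it is written
val kept = write(PlainArtist("Wintersun", "Finland"))
// {"name":"Wintersun","country":"Finland"}

// Reading fills the missing field from its default again
val restored = read[PlainArtist](omitted)
// PlainArtist(Darkest Hour,USA)
```

So a round trip through uPickle alone loses nothing. The missing field only becomes a problem for consumers that don't know the default.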

To fix this, we need to annotate the case class or the specific field with @upickle.implicits.serializeDefaults(true):

// Option 1: annotate the whole case class
@upickle.implicits.serializeDefaults(true)
case class Artist(name: String, country: Option[String] = Some("USA")) derives ReadWriter

// Option 2: annotate only the field
case class Artist(
    name: String,
    @upickle.implicits.serializeDefaults(true)
    country: Option[String] = Some("USA"),
) derives ReadWriter

With the annotation in place, both objects in artistsJson now have a country field, the second one using the default value:

[
  {"name":"Wintersun","country":["Finland"]},
  {"name":"Darkest Hour","country":["USA"]}
]

Creating a JSON API with tapir

Now that we've seen how to serialize default values with uPickle, let's expose our data as an API.

We'll create one with tapir describing the endpoints and Pekko HTTP powering the server:

//> using dependency com.softwaremill.sttp.tapir::tapir-core:1.11.16
//> using dependency com.softwaremill.sttp.tapir::tapir-json-upickle:1.11.16
//> using dependency com.softwaremill.sttp.tapir::tapir-pekko-http-server:1.11.16

import upickle.default._

case class Artist(name: String, country: Option[String] = Some("USA")) derives ReadWriter

val artists = Seq(
  Artist("Wintersun", Some("Finland")),
  Artist("Darkest Hour"),
)

@main def run(): Unit = {
  import org.apache.pekko.actor.ActorSystem
  import org.apache.pekko.http.scaladsl.Http
  import scala.concurrent.{Await, ExecutionContext, Future}
  import scala.concurrent.duration.DurationInt
  import sttp.tapir._
  import sttp.tapir.generic.auto._
  import sttp.tapir.json.upickle._
  import sttp.tapir.server.ServerEndpoint
  import sttp.tapir.server.pekkohttp.PekkoHttpServerInterpreter

  given system: ActorSystem = ActorSystem()
  given ec: ExecutionContext = system.dispatcher

  val getArtists = endpoint.get
    .in("artists")
    .out(jsonBody[Seq[Artist]])
    .serverLogic(_ => Future(Right(artists)))

  val routes = PekkoHttpServerInterpreter().toRoute(getArtists)

  val port = 8080
  Await.result(Http().newServerAt("localhost", port).bindFlow(routes), 1.minute)
  println(s"Server started on port $port")
}

The getArtists variable defines a GET /artists endpoint that returns the serialized artists data:

$ curl localhost:8080/artists
[{"name":"Wintersun","country":["Finland"]},{"name":"Darkest Hour"}]

But once again, the default country is missing from the second object.

Customizing uPickle

Our first move to fix things is annotating the Artist case class like we did earlier, but that causes the following error:

[error] type serializeDefaults is not a member of upickle.implicits
[error] @upickle.implicits.serializeDefaults(true)
[error]  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

What's changed since then is that we replaced the original upickle dependency with tapir-json-upickle to get access to the sttp.tapir.json.upickle.jsonBody method used to describe the tapir endpoint's response body:

val getArtists = endpoint.get
  .in("artists")
  .out(jsonBody[Seq[Artist]]) // <-------- sttp.tapir.json.upickle.jsonBody
  .serverLogic(_ => Future(Right(artists)))

Since the new dependency doesn't have the annotation we want, we have to:

  1. Create a custom configuration that sets serializeDefaults to true

    object SerializedDefaults extends upickle.AttributeTagged {
      override def serializeDefaults = true
    }
    
  2. Replace ReadWriter with the custom SerializedDefaults.ReadWriter:

    case class Artist(
        name: String,
        country: Option[String] = Some("USA")
    ) derives SerializedDefaults.ReadWriter
    

We can now use SerializedDefaults to serialize Artists without losing fields that are set to their default values:

SerializedDefaults.write(artists)
// [{"name":"Wintersun","country":["Finland"]},{"name":"Darkest Hour","country":["USA"]}]
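As a side-by-side sketch of what the custom configuration changes, the snippet below derives picklers for two otherwise identical case classes, one from upickle.default and one from SerializedDefaults. DefaultArtist and CustomArtist are hypothetical names used only for this comparison, and I'm assuming a direct dependency on uPickle 4.1.0 rather than the version tapir pulls in:

```scala
//> using dependency com.lihaoyi::upickle:4.1.0

// Custom configuration: always write fields, even when they equal their default
object SerializedDefaults extends upickle.AttributeTagged {
  override def serializeDefaults = true
}

// Hypothetical pair of classes that differ only in which configuration derives the pickler
case class DefaultArtist(name: String, country: String = "USA")
    derives upickle.default.ReadWriter
case class CustomArtist(name: String, country: String = "USA")
    derives SerializedDefaults.ReadWriter

// Stock configuration: the default-valued field is dropped
val stock = upickle.default.write(DefaultArtist("Darkest Hour"))
// {"name":"Darkest Hour"}

// Custom configuration: the default-valued field is kept
val custom = SerializedDefaults.write(CustomArtist("Darkest Hour"))
// {"name":"Darkest Hour","country":"USA"}
```

Each pickler only works with the configuration object it was derived from, which becomes important in the next section.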

A closer look at ReadWriter

Serializing Artists with default values is working again, but now we have a new problem.

The tapir endpoint definition gives us this error:

[error] No given instance of type upickle.default.ReadWriter[Seq[Artist]] was found for a context parameter of method jsonBody in trait TapirJsonuPickle.

If you read my previous post, you'll recognize this as the Artist case class not being serializable because it has no pickler available. But we created one:

case class Artist(
    name: String,
    country: Option[String] = Some("USA")
) derives SerializedDefaults.ReadWriter

So what's going on?

A careful read of the error message reveals that jsonBody is looking for a upickle.default.ReadWriter, but we replaced it with our custom configuration's SerializedDefaults.ReadWriter to include default values in our JSON output.

Someone ran into this problem back in 2022 and filed an issue on tapir's GitHub repo, to which a maintainer replied:

The signature requires that any ReadWriter is in the implicit scope. You can provide a custom one by defining e.g. a custom implicit val.

But reading the tapir and uPickle sources confirms that:

  1. jsonBody needs a upickle.default.ReadWriter
  2. upickle.default extends upickle.AttributeTagged, just like SerializedDefaults does

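Since upickle.default and SerializedDefaults are separate objects, upickle.default.ReadWriter and SerializedDefaults.ReadWriter are distinct types (Point 2). And since jsonBody only accepts the former (Point 1), it rejects all custom picklers; therefore, the maintainer's answer was wrong. [1]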
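The mechanics are easier to see without either library. The sketch below uses simplified stand-ins, all of them hypothetical names of my own (Api, DefaultApi, CustomApi, and a toy jsonBody), on the assumption that uPickle's ReadWriter is nested inside the API trait in a similar way. It uses scala.compiletime.testing.typeChecks from the Scala 3 standard library to ask the compiler whether a call would compile:

```scala
import scala.compiletime.testing.typeChecks

// Simplified, hypothetical stand-in for uPickle's API trait (not the real definition)
trait Api {
  trait ReadWriter[T] {
    def write(value: T): String
  }
}

object DefaultApi extends Api // plays the role of upickle.default
object CustomApi extends Api  // plays the role of SerializedDefaults

// Like tapir's jsonBody, this toy version only accepts picklers from DefaultApi
def jsonBody[T](value: T)(using rw: DefaultApi.ReadWriter[T]): String =
  rw.write(value)

given defaultString: DefaultApi.ReadWriter[String] =
  new DefaultApi.ReadWriter[String] {
    def write(value: String): String = "\"" + value + "\""
  }

given customInt: CustomApi.ReadWriter[Int] =
  new CustomApi.ReadWriter[Int] {
    def write(value: Int): String = value.toString
  }

// Compiles: a DefaultApi pickler for String is in scope
val fromDefault: String = jsonBody("Wintersun")

// false: customInt is a CustomApi.ReadWriter, which jsonBody does not accept
val customAccepted: Boolean = typeChecks("jsonBody(1)")
```

Because ReadWriter is a member of each Api instance, the compiler treats DefaultApi.ReadWriter[Int] and CustomApi.ReadWriter[Int] as unrelated types, so no implicit val defined against the custom object can satisfy the toy jsonBody.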

Now that we know the cause, how do we fix it?

Reconciling uPickle and jsonBody

The reporter ended up closing the issue after finding a workaround by "re-implementing the TapirJsonPickle [sic] methods". In other words, we have to create a new version of the jsonBody method that works with SerializedDefaults.ReadWriter instead of upickle.default.ReadWriter.

First, add the missing methods in question to SerializedDefaults:

object SerializedDefaults extends upickle.AttributeTagged {
  override def serializeDefaults = true

  // New code below

  import scala.util.{Failure, Success, Try}
  import sttp.tapir._
  import sttp.tapir.Codec.JsonCodec

  def jsonBody[T: ReadWriter: Schema]: EndpointIO.Body[String, T] =
    stringBodyUtf8AnyFormat(readWriterCodec[T])

  implicit def readWriterCodec[T: ReadWriter: Schema]: JsonCodec[T] =
    Codec.json[T] { s =>
      Try(read[T](s)) match {
        case Success(v) => DecodeResult.Value(v)
        case Failure(e) =>
          DecodeResult.Error(s, DecodeResult.Error.JsonDecodeException(errors = List.empty, e))
      }
    }(t => write(t))
}

Then, replace jsonBody in the endpoint definition with SerializedDefaults.jsonBody:

val getArtists = endpoint.get
  .in("artists")
  .out(SerializedDefaults.jsonBody[Seq[Artist]])
  .serverLogic(_ => Future(Right(artists)))

And now we can finally call the endpoint and see default values in the JSON response:

$ curl localhost:8080/artists
[{"name":"Wintersun","country":["Finland"]},{"name":"Darkest Hour","country":["USA"]}]
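The GET endpoint only exercises the writing half of readWriterCodec. To get a feel for the reading half, here is a rough sketch of the same Try-based pattern using uPickle alone. decodeArtist is a hypothetical helper, and Either stands in for tapir's DecodeResult so the snippet doesn't need tapir on the classpath:

```scala
//> using dependency com.lihaoyi::upickle:4.1.0

import scala.util.{Failure, Success, Try}

object SerializedDefaults extends upickle.AttributeTagged {
  override def serializeDefaults = true
}

case class Artist(name: String, country: Option[String] = Some("USA"))
    derives SerializedDefaults.ReadWriter

// Hypothetical helper: mirrors the Try-based decoding in readWriterCodec,
// with Either standing in for tapir's DecodeResult
def decodeArtist(json: String): Either[String, Artist] =
  Try(SerializedDefaults.read[Artist](json)) match {
    case Success(artist) => Right(artist)
    case Failure(error)  => Left(s"Could not decode input: ${error.getMessage}")
  }

// A missing country is filled in from the default
val decoded = decodeArtist("""{"name":"Darkest Hour"}""")

// Invalid JSON makes read throw, which Try turns into a Left
val rejected = decodeArtist("not json")
```

Wrapping read in Try matters because uPickle signals bad input by throwing, while a codec is expected to return a decode failure as a value.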

Takeaways

I hadn't considered this problem possible (in the likely, not literal, sense), but JSON libraries don't all make the same serialization choices. [2]

Sorting things out in uPickle was quick thanks to the detailed documentation. The tapir implementation, on the other hand, introduced a deeper problem that wasn't covered by either library's documentation. But I stayed the course, and digging into the library sources to verify the comments on the GitHub issue really paid off.

It would've been nice to not have to work through this in the first place, but on the bright side, I learned more about uPickle in the process.

Source Code

This is a self-contained example with all of the work above.

If you have Scala CLI, you can start the server with scala-cli <FILENAME>.scala.

//> using dependency ch.qos.logback:logback-classic:1.5.17
//> using dependency com.softwaremill.sttp.tapir::tapir-core:1.11.16
//> using dependency com.softwaremill.sttp.tapir::tapir-json-upickle:1.11.16
//> using dependency com.softwaremill.sttp.tapir::tapir-pekko-http-server:1.11.16

case class Artist(
    name: String,
    country: Option[String] = Some("USA"),
) derives SerializedDefaults.ReadWriter

val artists = Seq(
  Artist("Wintersun", Some("Finland")),
  Artist("Darkest Hour"),
)

object SerializedDefaults extends upickle.AttributeTagged {
  // uPickle code
  override def serializeDefaults = true

  // tapir code
  import scala.util.{Failure, Success, Try}
  import sttp.tapir._
  import sttp.tapir.Codec.JsonCodec

  def jsonBody[T: ReadWriter: Schema]: EndpointIO.Body[String, T] =
    stringBodyUtf8AnyFormat(readWriterCodec[T])

  implicit def readWriterCodec[T: ReadWriter: Schema]: JsonCodec[T] =
    Codec.json[T] { s =>
      Try(read[T](s)) match {
        case Success(v) => DecodeResult.Value(v)
        case Failure(e) =>
          DecodeResult.Error(s, DecodeResult.Error.JsonDecodeException(errors = List.empty, e))
      }
    }(t => write(t))
}

@main def run(): Unit = {
  import org.apache.pekko.actor.ActorSystem
  import org.apache.pekko.http.scaladsl.Http
  import scala.concurrent.{Await, ExecutionContext, Future}
  import scala.concurrent.duration.DurationInt
  import sttp.tapir._
  import sttp.tapir.generic.auto._
  import sttp.tapir.server.ServerEndpoint
  import sttp.tapir.server.pekkohttp.PekkoHttpServerInterpreter

  val logger = org.slf4j.LoggerFactory.getLogger(this.getClass().getName)
  given system: ActorSystem = ActorSystem()
  given ec: ExecutionContext = system.dispatcher

  // GET localhost:8080/artists
  val getArtists = endpoint.get
    .in("artists")
    .out(SerializedDefaults.jsonBody[Seq[Artist]])
    .serverLogic(_ => Future(Right(artists)))

  val routes = PekkoHttpServerInterpreter().toRoute(getArtists)

  val port = 8080
  Await.result(Http().newServerAt("localhost", port).bindFlow(routes), 1.minute)
  logger.info(s"Server started on port $port")
}

  1. Granted, the maintainers didn't have much experience with uPickle at the time and have been busy adding support for at least 7 other JSON libraries to tapir, so I'm not surprised by this edge case.

  2. See this comparison of how play-json, uPickle, and weepickle handle default values.

#debugging #json #scala #scala:tapir #scala:upickle