
Another Serialization API Axis

February 22, 2012

Johan Tibell and I have been working on a Haskell serialization API recently, and he wrote an excellent post outlining the general issues one runs into again and again. This is a follow-up that adds another.

Please read Designing a Serialization API first

How many ways can the Haskell types map

In some serialization formats, there is really only one useful way to map the basic Haskell types. For example, Bool’s True and False clearly map to and from JSON’s true and false. Any other conversion, while conceivable, probably isn’t realistic. On the other hand, there is no standard way for Int to map onto a stream of bytes. Even within a given protocol, an Int may be serialized in different ways at different points in the data stream.

This influences the design of the serialization type classes: Aeson’s ToJSON type class can be parameterised by type, because Int, Bool, String, and Text are always going to serialize the same way.
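As a small illustration of that point, the sketch below relies only on the ToJSON instances that ship with aeson and calls its `encode` function. The name `examples` is made up for this illustration. Nothing at the call site chooses an encoding, because the type alone decides it.

```haskell
import Data.Aeson (encode)
import qualified Data.ByteString.Lazy.Char8 as BL

-- Each value is rendered by the ToJSON instance for its type.
-- The caller has no per-call choice of representation.
-- (`examples` is a hypothetical name used only in this sketch.)
examples :: [BL.ByteString]
examples =
  [ encode True            -- JSON boolean
  , encode (42 :: Int)     -- JSON number
  , encode [True, False]   -- JSON array of booleans
  ]
```

Running `mapM_ BL.putStrLn examples` prints `true`, `42`, and `[true,false]`, one per line.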

On the other hand, Binary’s Put monad lets you, at the cost of some extra programmer work, serialize an Int in various different ways, via putWord8, putWord16le, putWord16be, etc.
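To make that concrete, here is a minimal sketch that writes the same Int three ways using the Put functions named above. The helper names (`asByte`, `asLE16`, `asBE16`, `encodings`) are invented for this example, and the `fromIntegral` conversions silently truncate, which a real protocol implementation would want to guard against.

```haskell
import Data.Binary.Put (Put, runPut, putWord8, putWord16le, putWord16be)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word8)

-- Three wire encodings of the same Int (hypothetical helper names).
-- fromIntegral truncates values that do not fit in the target width.
asByte, asLE16, asBE16 :: Int -> Put
asByte = putWord8 . fromIntegral      -- one byte
asLE16 = putWord16le . fromIntegral   -- two bytes, little-endian
asBE16 = putWord16be . fromIntegral   -- two bytes, big-endian

-- Run each encoder and unpack the result into raw bytes.
encodings :: Int -> [[Word8]]
encodings n = [BL.unpack (runPut (f n)) | f <- [asByte, asLE16, asBE16]]
```

For example, `encodings 258` (258 is 0x0102) evaluates to `[[2],[2,1],[1,2]]`: the single-byte form keeps only the low byte, and the two 16-bit forms differ only in byte order. The choice is made by the programmer at each call site, not by the type.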

It turns out that Binary also provides the Binary type class, parameterised on type, much like Aeson’s ToJSON and FromJSON type classes. However, it isn’t used as much for serializing to known protocols, precisely because it doesn’t give you enough control over how the basic Haskell types are represented.
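The sketch below shows the lack of control in practice. The built-in Binary instance for Int always produces the same fixed layout (8 bytes, big-endian, in the binary package). One way to get a different layout through the class is a newtype with its own instance. The `LE16` wrapper here is a hypothetical name and is not part of the binary package.

```haskell
import Data.Binary (Binary (..), encode, decode)
import Data.Binary.Get (getWord16le)
import Data.Binary.Put (putWord16le)
import qualified Data.ByteString.Lazy as BL
import Data.Word (Word8)

-- Hypothetical wrapper: an Int that is written as a 16-bit
-- little-endian word. Values outside 0..65535 are truncated.
newtype LE16 = LE16 Int deriving (Eq, Show)

instance Binary LE16 where
  put (LE16 n) = putWord16le (fromIntegral n)
  get = LE16 . fromIntegral <$> getWord16le

-- The stock instance for Int: the layout is fixed by the class.
classBytes :: [Word8]
classBytes = BL.unpack (encode (258 :: Int))

-- The wrapper's instance: a different layout for the same number.
wrappedBytes :: [Word8]
wrappedBytes = BL.unpack (encode (LE16 258))
```

Here `classBytes` is `[0,0,0,0,0,0,1,2]` and `wrappedBytes` is `[2,1]`, and `decode (encode (LE16 258))` gives back `LE16 258`. Needing a new type for every wire layout is the extra friction that makes the Put and Get functions the more common choice for fixed protocols.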



  1. Bardur Arantsson permalink

    This is a big deal.

    My particular pet peeve where this is an issue is with the current Haskell offerings for RDBMS access. For example, for PostgreSQL a ByteString may map to several different types: a) a “Byte Array”, raw binary data which is stored “inline” in a row, b) the BLOB, which is raw binary data which is not stored directly “inline” in a row, or c) UTF-8 encoded character data (for example). ByteString doesn’t say anything about those.

    Without the ability to specify (somehow) exactly which native database column type (BYTEA, BLOB, UTF_8) you want to map a specific value to, you tend to have the choice taken away from you… and that sucks even more than having to specify explicit mappings for all queries.

  2. Bardur, you may be interested in looking at postgresql-simple, which aims to fix a lot of the problems associated with HDBC. I don’t recall all the issues surrounding bytea, blob, and text, but there are many issues that aren’t relevant to the access libraries, and I do have support for bytea serialization and deserialization to and from bytestrings; see the “Binary” type.

    And there is nothing about Aeson that prevents you from serializing a boolean as, say, 0 and 1, or defining your own toJSON function for a record that renders one of its boolean fields as 0 and 1. (Maybe the application is particularly concerned about the size of the JSON file… but then why is it using JSON?)
