Content Types and Character Encodings

NOTE! Updated for FHIR 0.9

If you’re building a Fhir client or server, you will sooner or later encounter the issue of Fhir’s (or rather Http’s) use of content types and character sets. They are easy to get wrong, and the information about them is currently spread out over (at least) two places in the Fhir spec, which does not really help.

The notions of “format”, “content type”, “content encoding”, “character encoding” and “character set” are by themselves already a source of confusion, so, at least for this blog, I’ll use “content type” to mean MIME-types like “application/pdf” and “image/jpeg”, and “character encoding” for character-to-byte encodings like “utf8” and “US-ASCII”.

Let’s take this from a Fhir client’s perspective: you are trying to get information from a Fhir server. If you are not sending any information to the server yourself (like when POSTing a new resource), you are just responsible for indicating whether you’d like to receive the server’s answer in Json or in Xml. You do this by sending, in your request to the server, an “Accept” header, which can contain the following content types:

            Xml                               Json
Resource    application/fhir+xml (changed!)   application/fhir+json (changed!)
Bundle      application/atom+xml              application/fhir+json (changed!)

As you can see, you have to choose a different content type based on whether the REST operation you invoke returns a single resource (for operations like read and conformance) or a bundle (for operations like history and search). This header is optional: if you don’t send it, the server will assume the default, which is Xml.
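
The table can be written down as a small lookup. This is only a sketch in Python: the name `accept_content_type` and its arguments are made up for illustration and do not come from any Fhir library.

```python
# Content types from the table above (FHIR 0.9).
# Keys are (format, is_bundle); all names here are illustrative only.
CONTENT_TYPES = {
    ("xml", False): "application/fhir+xml",
    ("json", False): "application/fhir+json",
    ("xml", True): "application/atom+xml",
    ("json", True): "application/fhir+json",
}


def accept_content_type(fmt: str, is_bundle: bool) -> str:
    """Return the content type to put in the Accept header."""
    return CONTENT_TYPES[(fmt.lower(), is_bundle)]


# A read returns a single resource, a search returns a bundle.
print(accept_content_type("xml", is_bundle=False))
print(accept_content_type("xml", is_bundle=True))
```

An unknown format raises a KeyError here, which is probably what you want in a client: better to fail early than to send a header the server cannot match.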

Now, before you run off and code this into your client, there’s also the issue of character encodings to consider. To make life easy, FHIR has chosen to make UTF-8 its only acceptable character encoding. Period. If you feel you don’t need to worry about “all that”, please do read Joel Spolsky’s excellent blog on character encodings. Of course, there’s a “but”: Http’s default character encoding is defined to be ISO-8859-1, so to make this work, both the client and the server must always indicate they are using UTF-8. In our case, this means we have to add a “charset=utf-8” parameter to the Accept header like this:

Accept: application/fhir+xml;charset=utf-8
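
To see why the explicit charset matters, here is a small Python sketch of what happens when a receiver falls back to ISO-8859-1 for bytes that were written as UTF-8. The sample name is invented.

```python
# A name with a non-ASCII character (made-up sample data).
name = "José"

# The sender writes the text as UTF-8 bytes.
payload = name.encode("utf-8")

# A receiver that assumes Http's default (ISO-8859-1) garbles the name...
wrong = payload.decode("iso-8859-1")

# ...while a receiver that honours charset=utf-8 gets it back intact.
right = payload.decode("utf-8")

print(wrong)
print(right)
```

Note that the wrong decoding does not raise an error, it just silently produces different text. That is exactly why this kind of bug tends to go unnoticed until real patient names show up.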

I assume most servers will be quite lenient about this. For example, my server accepts other common xml content types (like application/xml, text/xml) as well and will completely ignore whatever you put in the charset parameter (but will always speak to you in utf-8). Others might be strict and return you a Http 406 (Not acceptable) if you stray from the path.
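
A lenient server along these lines could be sketched as follows in Python. This is not the code of any actual server: the function name, the lists of accepted content types, and the choice to ignore q-values and wildcards are all assumptions made to keep the example short.

```python
from typing import Optional

# Content types a lenient server might treat as Xml or Json.
# These lists are an assumption for illustration, not taken from the spec.
XML_TYPES = {"application/fhir+xml", "application/atom+xml",
             "application/xml", "text/xml"}
JSON_TYPES = {"application/fhir+json", "application/json"}


def negotiate_format(accept: Optional[str]) -> Optional[str]:
    """Return "xml" or "json", or None when the answer should be a 406."""
    if not accept:
        return "xml"  # no Accept header: fall back to the default, Xml
    for item in accept.split(","):
        # Drop parameters such as ";charset=utf-8" without looking at them.
        media_type = item.split(";")[0].strip().lower()
        if media_type in XML_TYPES:
            return "xml"
        if media_type in JSON_TYPES:
            return "json"
    return None


print(negotiate_format("text/xml;charset=iso-8859-1"))
print(negotiate_format("image/jpeg"))
```

A strict server would shrink the two sets down to the content types from the table and would also check the charset parameter before answering.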

If you wish, you can forget about these headers completely, and use the _format parameter in your Url. This will override whatever is in the Http headers. Acceptable values for the _format parameter are “xml”, “json” and even “application/json” or “application/fhir+xml”, but be sure to encode the “+” (as %2B) in your Url if you do that. We have added this to the specification explicitly so you can show Fhir in json and xml to all your friends using just a browser when you find yourself at a geek party.
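
If you build such a Url in code, let a library do the encoding for you. A minimal Python sketch, where the server address and resource path are hypothetical:

```python
from urllib.parse import urlencode

# Hypothetical server address and resource path, for illustration only.
base_url = "http://example.org/fhir/patient/1"

# urlencode percent-encodes the "+" (and the "/") in the value, so the
# server does not read the "+" as a space.
query = urlencode({"_format": "application/fhir+xml"})

print(base_url + "?" + query)
```

The short values “xml” and “json” need no encoding at all, which makes them the easier choice when typing a Url by hand.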

When the server processes your request, it will try to reply to you in the content type you specified. Servers are not required to implement the Json format, so, even if you stick to the content types I have described above, the server may still return you Xml or even a 406 (Not acceptable) when you ask it for Json. I am afraid it’s up to you what to do about this if you’re a light-weight, json-only script client. Whatever the server chooses, it will return your Bundle or Resource and indicate its type in the Content-Type header (including, again, the character encoding):

Content-Type: application/fhir+xml;charset=utf-8
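
Since the server may answer in a different format than the one you asked for, a client should look at this header before parsing the body. Here is a sketch in Python that uses the standard library’s header parser; the function name `response_format` is invented for this example.

```python
from email.message import Message
from typing import Optional, Tuple


def response_format(content_type_header: str) -> Tuple[str, Optional[str]]:
    """Split a Content-Type value into ("xml" or "json", charset)."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    media_type = msg.get_content_type()   # lower-cased, without parameters
    charset = msg.get_content_charset()   # lower-cased, or None if absent
    if media_type.endswith("json"):
        return "json", charset
    if media_type.endswith("xml"):
        return "xml", charset
    raise ValueError("unexpected content type: " + media_type)


print(response_format("application/fhir+xml;charset=utf-8"))
```

A json-only client could call this on every response and bail out with a clear message when the first element of the result is not “json”.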

There’s one more thing to be aware of as a client: if you are submitting data yourself to the server (e.g. on a create or update operation), you have to tell the server whether you are using Xml or Json using the Content-Type header. This is in addition to the Accept header as described above, so your request will contain two headers with content types. So, yes, it is possible to send the server data in Json and ask for its response in Xml:

Accept: application/fhir+xml;charset=utf-8
Content-Type: application/fhir+json;charset=utf-8
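
In code, a request with both headers could look like the Python sketch below. It only builds the request and does not send it. The endpoint is hypothetical, the body is a placeholder rather than a valid resource, and the content types are the ones from the table earlier in this post.

```python
from urllib.request import Request

# Hypothetical endpoint, for illustration only.
url = "http://example.org/fhir/patient"

# Placeholder payload, NOT a valid resource; always encode as UTF-8.
body = "{}".encode("utf-8")

request = Request(
    url,
    data=body,
    method="POST",
    headers={
        # What we are sending: Json.
        "Content-Type": "application/fhir+json;charset=utf-8",
        # What we would like to get back: Xml.
        "Accept": "application/fhir+xml;charset=utf-8",
    },
)

print(request.get_method())
print(request.header_items())
```

Passing the request to `urllib.request.urlopen` would actually send it. The two headers are independent of each other, so mixing formats this way needs no special handling on the client side.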

Whenever you get confused about which header to send (which I frequently do): we added the summary table at the bottom of the http page of the spec to help you out.


3 thoughts on “Content Types and Character Encodings”

  1. Interested observer

    Interesting

    How does FHIR reconcile with what the CommonWell Alliance is trying to do? Is it similar? Based on the ONC committee update published yesterday, it sounds like some of the underlying protocols would need to be similar to FHIR’s goals.

    Thanks

    1. ewoutkramer Post author

      We have the intention to register it, but were told it was a lengthy and hard process, so we have spent our time on other things instead. In fact we were waiting for someone to come around who knows how to navigate around the pitfalls 😉
