Envelopes in APIs
Tags: Architecture and Technology
Data is king. APIs are everywhere.
The first thing we think of when we build APIs is the data. After all, that’s what fuels our apps. Many MVPs just expose the DB to rich clients, be it mobile apps or SPAs.
But data is not the only part of a good API, and many MVPs come to that realisation fairly quickly. The missing part is what goes around the data: the envelope.
While the data is what makes products valuable, the envelope is what makes an API nice to use. Its main purpose is to provide infrastructure for the communication between the client and the server.
Apart from delivering the data, APIs need to facilitate the delivery, and so might require some additional features. One of the most common needs is error signalling. Servers need a way to indicate that they cannot fulfil a client’s request. Detailed feedback is essential both for debugging during implementation and for showing meaningful messages to the user once the API is in use.
Standardised APIs that might have multiple implementations on both the client and the server side can benefit from structured payloads. The ability to extend the protocol enables experimentation and API evolution while maintaining interoperability.
Let’s take a look at a few examples.
GraphQL
GraphQL is an asymmetric API. Requests are GraphQL documents and responses are JSON documents.
Responses might look like this:
{
  "errors": [{
    "message": "Error message",
    "locations": [{"line": 6, "column": 12}],
    "path": ["whatever"]
  }],
  "data": {
    "key": "value"
  }
}
The application-specific data is in the data key; its structure depends on the
request. The rest is the envelope. Its main function is to signal errors back to
the client, if there are any.
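To make the split between payload and envelope concrete, here is a minimal Python sketch that separates the two in a response like the one above. The function name unwrap_graphql and the way error messages are formatted are assumptions made for illustration, not part of GraphQL itself.

```python
import json
from typing import Any


def unwrap_graphql(raw: str) -> tuple[Any, list[str]]:
    """Split a GraphQL JSON response into application data and error messages."""
    envelope = json.loads(raw)
    data = envelope.get("data")          # application-specific payload, may be None
    errors = envelope.get("errors", [])  # envelope part, absent when all went well
    messages = []
    for err in errors:
        # "path" points at the response field the error relates to, if present.
        path = "/".join(str(p) for p in err.get("path", []))
        messages.append(f"{err['message']} (at {path})" if path else err["message"])
    return data, messages


raw = """
{
  "errors": [{
    "message": "Error message",
    "locations": [{"line": 6, "column": 12}],
    "path": ["whatever"]
  }],
  "data": {"key": "value"}
}
"""
data, problems = unwrap_graphql(raw)
print(data)
print(problems)
```

A client would typically use the data it received and surface the error messages separately, since a response may carry both.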
JSON:API
Another popular option is JSON:API. It’s been around just a little bit longer than GraphQL.
It’s a symmetric protocol. Both requests and responses are represented by the same format and serialised, obviously, into JSON. It looks something like this:
{
  "data": {...},
  "errors": [{
    "id": "42-12",
    "status": "503",
    "code": "full-moon",
    "title": "Full Moon",
    "detail": "During full Moon service is unavailable"
  }],
  "meta": {
    "arbitrary": "data"
  }
}
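data is highly structured. The envelope itself is rather rich too. It provides
error reporting back to the client. It also allows for implementation-specific
data in meta keys in various places in the document. Note that the spec does not
allow data and errors in the same document; the example shows both only to
illustrate the available keys.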
Version 1.1 also adds extensions and profiles to extend the protocol in a
standard and interoperable way. On top of that @-prefixed keys are allowed
anywhere in the document, basically enabling implementation-specific extensions.
Unlike GraphQL, JSON:API expects HTTP as a transport and so uses its capabilities to specify extensions and profiles. It also reuses HTTP for some of its semantics.
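As a rough illustration of how the envelope and HTTP work together, the following Python sketch builds an error response. The helper name error_document and its return shape are hypothetical; the media type and the string-valued status member come from the JSON:API spec.

```python
import json

# Media type registered for JSON:API documents.
JSONAPI_MEDIA_TYPE = "application/vnd.api+json"


def error_document(status: int, code: str, title: str, detail: str) -> tuple[int, dict, str]:
    """Return (HTTP status, headers, body) for a single-error JSON:API response."""
    body = {
        "errors": [{
            "status": str(status),  # JSON:API expresses the status as a string
            "code": code,
            "title": title,
            "detail": detail,
        }],
        "meta": {"arbitrary": "data"},  # implementation-specific extras
    }
    headers = {"Content-Type": JSONAPI_MEDIA_TYPE}
    return status, headers, json.dumps(body)


status, headers, body = error_document(
    503, "full-moon", "Full Moon", "During full Moon service is unavailable"
)
print(status, headers["Content-Type"])
print(body)
```

The same status appears twice on purpose: once for HTTP itself and once inside the envelope, where a client can read it together with the rest of the error details.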
gRPC
gRPC is a fairly new protocol but it’s only one of many RPC-style protocols.
It uses protobufs for serialisation and HTTP/2 for transport, although the docs mention that other serialisation methods and transports can be used. I have not found a specification for gRPC, so it seems that only one reference implementation exists and that it is the spec.
Requests can contain some protocol-level information such as a response deadline.
Messages can also have metadata attached. Metadata is a list of key-value pairs. Keys are strings and values are either strings or binary blobs.
Response messages also include a status and an optional status message.
JSON-RPC
Another example of an RPC-style protocol is JSON-RPC.
It’s transport-agnostic. Its documents are highly structured.
Example from the spec:
// Request
{"jsonrpc": "2.0", "method": "subtract", "params": [42, 23], "id": 1}
// Response
{"jsonrpc": "2.0", "result": 19, "id": 1}
A response message carries either a result or error information, never both.
One of the most prominent uses of this protocol is the Language Server Protocol.
The spec mentions extensions but I couldn’t find any, so I can’t tell how exactly the protocol can be extended.
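A small Python sketch of the client side might look like the following. The names make_request, read_response and RpcError are made up for this example; the message members and the -32601 “Method not found” code are taken from the JSON-RPC 2.0 spec.

```python
import json
from itertools import count

_ids = count(1)  # request ids let the client match responses to requests


class RpcError(Exception):
    """Raised when the response envelope carries an error member."""

    def __init__(self, code: int, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code


def make_request(method: str, params: list) -> str:
    """Serialise a JSON-RPC 2.0 request with a fresh id."""
    return json.dumps(
        {"jsonrpc": "2.0", "method": method, "params": params, "id": next(_ids)}
    )


def read_response(raw: str):
    """Return the result, or raise RpcError if the server reported an error."""
    message = json.loads(raw)
    if "error" in message:
        raise RpcError(message["error"]["code"], message["error"]["message"])
    return message["result"]


print(make_request("subtract", [42, 23]))
print(read_response('{"jsonrpc": "2.0", "result": 19, "id": 1}'))
```

Turning the error member into an exception is just one possible design; returning it as a value would work equally well.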
XML-RPC
XML-RPC is an earlier entry in the series of RPC-style protocols. It probably was an inspiration for JSON-RPC, as the JSON-RPC spec explicitly mentions XML-RPC and reserves some of its error codes.
XML-RPC, obviously, uses XML as its serialisation format. Its messages are highly structured.
Responses can contain error information if there is any.
The spec mentions extensibility as one of its goals but doesn’t provide any further details.
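Python ships an XML-RPC implementation in its standard library, so the envelope can be inspected without a server. The sketch below reuses the subtract call from the JSON-RPC example; the fault code and text are arbitrary values chosen for illustration.

```python
import xmlrpc.client

# Serialise a call: the method name and parameters are wrapped in <methodCall>.
request_xml = xmlrpc.client.dumps((42, 23), methodname="subtract")
print(request_xml)

# Parsing the same document gives back the parameters and the method name.
params, method = xmlrpc.client.loads(request_xml)

# Errors travel as a <fault> inside <methodResponse>.
fault_xml = xmlrpc.client.dumps(
    xmlrpc.client.Fault(4, "Too many parameters."), methodresponse=True
)
try:
    xmlrpc.client.loads(fault_xml)  # loads() raises Fault for fault responses
except xmlrpc.client.Fault as fault:
    fault_info = (fault.faultCode, fault.faultString)

print(params, method, fault_info)
```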
SOAP
XML-RPC never reached RFC status, but after its submission to the IETF it evolved into SOAP.
SOAP uses XML as its data model and default serialisation format. It actually
names its message root element Envelope.
Example message:
<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Header>
  </soap:Header>
  <soap:Body>
    <m:GetStockPrice xmlns:m="http://www.example.org">
      <m:StockName>T</m:StockName>
    </m:GetStockPrice>
  </soap:Body>
</soap:Envelope>
Header contains “header blocks”, which are XML nodes with arbitrary structure.
These headers are intended for the message processing nodes.
Body has an arbitrary structure as well but is intended for the application
receiving the message.
Body can also carry a Fault node for error signalling.
SOAP messages are explicitly extensible. The spec provides some details on how to properly specify and implement extensions.
SOAP is different to the protocols we have examined so far in that it’s a uni-directional protocol. That is, it doesn’t require a response in the same connection. It can leverage the underlying transport for that, but this is not a part of the spec. It also acknowledges that a message might travel through a few intermediate nodes before reaching its destination. The protocol includes some means to influence message processing on those nodes. For example, some header blocks can be addressed specifically to intermediate nodes.
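The division of labour between Header and Body can be sketched with Python’s standard XML parser. The function name split_envelope is an assumption for this example, and the message is the one shown earlier in this section.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 namespace


def split_envelope(xml_text: str):
    """Return (header blocks, body children, fault element or None)."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{SOAP_NS}}}Envelope":
        raise ValueError("not a SOAP 1.2 envelope")
    header = root.find(f"{{{SOAP_NS}}}Header")
    body = root.find(f"{{{SOAP_NS}}}Body")
    # Header blocks are meant for processing nodes along the message path.
    header_blocks = list(header) if header is not None else []
    # Body children are meant for the receiving application.
    payload = list(body) if body is not None else []
    fault = body.find(f"{{{SOAP_NS}}}Fault") if body is not None else None
    return header_blocks, payload, fault


message = """<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Header>
  </soap:Header>
  <soap:Body>
    <m:GetStockPrice xmlns:m="http://www.example.org">
      <m:StockName>T</m:StockName>
    </m:GetStockPrice>
  </soap:Body>
</soap:Envelope>"""

header_blocks, payload, fault = split_envelope(message)
print(len(header_blocks), payload[0].tag, fault)
```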
XMPP
While we’re on the topic of XML, let’s take a look at XMPP. XMPP doesn’t have a reputation as a general-purpose protocol, but it’s very extensible (it’s in the name) and has a few extensions that allow it to be used as such. For example, there are multiple extensions that allow the exchange of arbitrary binary data of various sizes. There’s an extension for uploading files over HTTP. And, XMPP being an XML-based protocol, there’s an extension for exchanging SOAP messages.
XMPP is a session-based protocol. It provides session-based capabilities such as choice of transport, authentication, encryption, etc. with error reporting, of course.
Within a session, messages of different types can be exchanged. iq messages are
basically the request-response pattern we’ve seen in most of the protocols so far.
They also have error reporting scoped to an individual request.
HTTP
Many of the protocols above use HTTP for transport.
HTTP is a fairly simple protocol.
An interesting historical fact is that the very first version of HTTP (released in
1991) had only one type of request: GET. And the response to that was just the
requested document: raw bytes of the document, no headers, no status.
It became apparent fairly quickly that this was not enough. With the 1992 draft of HTTP/1.0 it became the protocol we all know today.
An HTTP request contains a request line (method, path and protocol version), headers, and an optional request body.
A response message contains a status line, headers and a response body.
Both messages borrow heavily from RFC 822; we’ll get to it in a minute.
This simple format gives us error reporting in the status line and extensibility in the headers. Applications can specify their own headers as much as they want.
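Because the header block follows the RFC 822 layout, an e-mail parser can read it. The Python sketch below splits a raw response into the three parts described above; parse_response and the X-Request-Id header are hypothetical names used only for illustration.

```python
from email.parser import Parser


def parse_response(raw: str):
    """Split a raw HTTP/1.x response into status line parts, headers and body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, _, header_block = head.partition("\r\n")
    version, status, reason = status_line.split(" ", 2)
    # The header block is RFC 822 style, so the e-mail parser understands it.
    headers = Parser().parsestr(header_block, headersonly=True)
    return version, int(status), reason, headers, body


raw = (
    "HTTP/1.1 404 Not Found\r\n"
    "Content-Type: text/plain\r\n"
    "X-Request-Id: abc123\r\n"   # application-defined header
    "\r\n"
    "no such document"
)
version, status, reason, headers, body = parse_response(raw)
print(version, status, reason)
print(headers["X-Request-Id"], body)
```

This is a sketch for well-formed text responses only; a real client also has to deal with binary bodies, transfer encodings and malformed input.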
HTTP/1.1 adds chunked encoding: sending a message in chunks. With that it allows for a trailing block of headers after the last chunk, which can contain some of the headers defined in the spec as well as any application-specified headers.
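The wire format of a chunked body is simple enough to sketch in a few lines of Python. The function encode_chunked and the X-Checksum trailer are assumptions made for this example.

```python
from typing import Optional


def encode_chunked(chunks: list, trailers: Optional[dict] = None) -> bytes:
    """Encode byte chunks using HTTP/1.1 chunked transfer coding."""
    out = bytearray()
    for chunk in chunks:
        if not chunk:
            continue  # a zero-length chunk would end the body prematurely
        # Each chunk is prefixed with its size in hexadecimal.
        out += f"{len(chunk):x}\r\n".encode("ascii") + chunk + b"\r\n"
    out += b"0\r\n"  # the last chunk has size zero
    for name, value in (trailers or {}).items():
        out += f"{name}: {value}\r\n".encode("ascii")  # trailing headers
    out += b"\r\n"
    return bytes(out)


encoded = encode_chunked([b"Wiki", b"pedia"], {"X-Checksum": "abc"})
print(encoded)
```

Trailers are useful for values that are only known once the whole body has been produced, such as a checksum of streamed content.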
HTTP/1.0 and HTTP/1.1 are text-based (apart from the body) protocols that run over TCP.
HTTP/2 becomes a binary protocol but preserves the semantics.
HTTP/3 switches to UDP (via QUIC), again preserving the semantics.
As mentioned above, HTTP was inspired by RFC 822. Its title is “Standard for the Format of ARPA Internet Text Messages”. Later incarnations of this RFC are named “Internet Message Format”. It might not be quite obvious, but this is all about e-mail.
RFC 822 and later editions only describe the message format. No protocol is
mentioned there because this message format is used in many protocols. SMTP,
IMAP and POP all use this message format. .eml files also contain messages in
this format.
If you’re familiar with HTTP you’ll find this format familiar too. It’s basically HTTP without the request/status line. Historically and causally it’s the other way around, obviously.
An Internet Message consists of a block of headers and a message body.
You can specify any custom header you want as long as you follow the header
format in the RFC. The first RFC recommends using the X- prefix to avoid potential
name collisions but later editions do not.
There’s no error signalling, because the format describes a single message rather than an exchange.
TCP
On a lower level we have TCP. It’s a transport-level protocol. It’s built around
the principle of confirmation of success. That is, rather than letting the
counterpart know when something went wrong, the protocol lets them know when things
went right. This is done by sending back a packet with the ACK flag set. If a
packet gets lost or corrupted on its way, the receiver doesn’t acknowledge it
and the sender eventually resends it.
A TCP packet consists of a header and data. There’s no structure to the data part. It’s just a sequence of bytes. The header contains a bunch of protocol-related fields. One of them is a checksum that allows corruption detection.
At the end of the header is a list of options. These options are defined in an IANA registry but 2 of them are reserved for experimentation and can contain whatever. Space for options is very limited but there are proposals for extending it in the works.
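The header layout described above can be decoded with Python’s struct module. This is a sketch only: parse_tcp_header is a made-up name, the sample segment is hand-built, and the option bytes use experimental kind 253 with arbitrary content.

```python
import struct

ACK = 0x10  # bit mask of the ACK flag within the flags field


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header and split off options and data."""
    (src_port, dst_port, seq, ack_no,
     offset_and_flags, window, checksum, urgent) = struct.unpack(
        "!HHIIHHHH", segment[:20]
    )
    header_len = (offset_and_flags >> 12) * 4  # data offset counts 32-bit words
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack_no": ack_no,
        "ack": bool(offset_and_flags & ACK),
        "checksum": checksum,
        "options": segment[20:header_len],  # whatever sits between header and data
        "data": segment[header_len:],       # unstructured payload
    }


# Hand-built segment: data offset 6 (24-byte header), ACK set, 4 bytes of options.
segment = (
    struct.pack("!HHIIHHHH", 443, 51000, 1000, 2000, (6 << 12) | ACK, 65535, 0, 0)
    + b"\xfd\x04\xbe\xef"  # experimental option: kind 253, length 4, arbitrary value
    + b"hi"
)
parsed = parse_tcp_header(segment)
print(parsed)
```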
IP
Still lower is IP. IP’s main concern is addressing. It tolerates corruption in the data section; that should be handled by the next-level protocol (e.g. TCP). Yet IPv4 includes a header checksum to make sure that at least the addressing is intact. Packets with corrupted headers are discarded, and higher-level protocols have to handle these “missing” packets. ICMP may be used to report errors at the IP level.
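The IPv4 header checksum is a 16-bit ones’ complement sum, which fits in a few lines of Python. The function name internet_checksum and the sample header bytes are chosen for illustration.

```python
def internet_checksum(header: bytes) -> int:
    """Compute the 16-bit ones' complement checksum used by the IPv4 header."""
    if len(header) % 2:
        header += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF


# Sample 20-byte IPv4 header with the checksum field (bytes 10-11) set to zero.
header = bytes.fromhex("45000073000040004011" "0000" "c0a80001c0a800c7")
checksum = internet_checksum(header)
print(hex(checksum))

# A receiver runs the same sum over the header including the checksum;
# zero means the header arrived intact.
received = header[:10] + checksum.to_bytes(2, "big") + header[12:]
print(internet_checksum(received))
```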
IPv4
The very first proposal for IP back in 1977 (IEN 2) did not allow for any extensibility. But the actual IPv4 specification (RFC 791) has a variable-length options field. A whole 4 option numbers are reserved for experimental use and can contain any data.
IPv6
IPv6 reduces the number of header fields and opts for Extension Headers. One of them is the Options header. It’s similar to IPv4 options in its use. 2 option numbers are reserved for experimentation and can contain any data.
Don’t forget the envelope
Since the dawn of the Internet, protocols and APIs have used the concept of an envelope. It serves different purposes but there are a few clear categories of functionality:
- Operational information. Data required for the protocol/API operation. For example, addressing in IP, sequencing and fragmentation in TCP, typing/encoding in e-mail and HTTP, routing in RPC, etc.
- Status signalling. This might be classified as operational information but it’s specifically concerned with quality of service and error recovery. Examples are acknowledgement of packets in TCP, the response status code in HTTP, and error fields in SOAP, JSON:API and others.
- Extensibility affordances. No protocol/API can cover 100% of use cases. So allowing a bit of flexibility helps a lot in adoption and adaptation to changes.
All these concerns inevitably arise in any protocol or API. Just maybe not right away. So they might not be at the forefront of thought when building an API MVP. It’s fine to just dump some JSON at that stage. But it’s worth keeping in mind that eventually all these concerns will have to be addressed so it’s better to have an envelope that allows for that.
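As a closing illustration, here is a minimal Python sketch of an envelope that leaves room for all three concerns. The Envelope class and its field names are one possible shape loosely modelled on the examples above, not a recommendation of a specific format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass
class Envelope:
    """A minimal response envelope: payload, status signalling, extension room."""
    data: Any = None                            # application-specific payload
    errors: list = field(default_factory=list)  # status signalling
    meta: dict = field(default_factory=dict)    # extensibility affordance

    def to_json(self) -> str:
        return json.dumps(asdict(self))


ok = Envelope(data={"key": "value"})
failed = Envelope(
    errors=[{"code": "full-moon", "detail": "During full Moon service is unavailable"}]
)
print(ok.to_json())
print(failed.to_json())
```

Even if errors and meta stay empty for the first few releases, clients written against this shape will not break when those keys start carrying information.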