APP area                                                      C. Bormann
Internet-Draft                                   Universitaet Bremen TZI
Intended status: Informational                          October 11, 2012
Expires: April 14, 2013


             The binarypack JSON-like representation format
                     draft-bormann-apparea-bpack-00

Abstract

   JSON (RFC 4627) is an extremely successful format for the
   representation of structured information, supporting Boolean values,
   numbers, strings, arrays, and tables. Recently, a number of
   applications have started to look for binary representation formats
   that solve a similar problem. In particular, constrained node
   networks can benefit from such a binary representation format.

   A very successful binary representation that is otherwise comparable
   to JSON is MessagePack. Recently, a slight modification of
   MessagePack has been defined that allows for distinguishing UTF-8
   strings from binary data. This draft, as an independent effort,
   documents the resulting BinaryPack format.

   The current version -00 of this document is an early draft that is
   intended to spark further discussion.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 14, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.
Bormann                   Expires April 14, 2013                [Page 1]

Internet-Draft                 binarypack                   October 2012

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology
     1.2.  Notation
   2.  The binarypack Representation Format
     2.1.  Data Types
     2.2.  Integers
     2.3.  Floating Point Values
     2.4.  Special Values
     2.5.  Opaque Byte Strings
     2.6.  UTF-8 Strings
     2.7.  Arrays
     2.8.  Tables
   3.  Discussion
     3.1.  JSON roundtripping
   4.  IANA Considerations
   5.  Security Considerations
   6.  Acknowledgements
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Author's Address

1.  Introduction

   (To be written - for now please see the Abstract.)

   A description of the MessagePack binary representation format can be
   found in [msgpack]. An implementation of BinaryPack is available in
   [binarypack].

1.1.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   The term "byte" is used in its now customary sense as a synonym for
   "octet". All multi-byte integers in this protocol are interpreted in
   network byte order.

   Where arithmetic is used, this specification uses the notation
   familiar from the programming language C, except that the operator
   "**" stands for exponentiation.

1.2.  Notation

   This specification uses a trivial notation for code bytes and the
   bitfields in them, the meaning of which should be mostly obvious.
   More formally speaking, the meaning of the notation is:

   Potential values for the code bytes themselves are expressed by
   templates that represent 8-bit most-significant-bit-first binary
   numbers (without any special prefix), where 0 stands for 0, 1 for 1,
   and variable segments in these code byte templates are indicated by
   sequences of the same letter such as kkkkkkk or ssss, the length of
   which indicates the length of the variable segment in bits.

   In the notation of values derived from the code bytes, 0b is used as
   a prefix for expressing binary numbers in most-significant-bit-first
   notation (akin to the use of 0x for most-significant-digit-first
   hexadecimal numbers in the C programming language). Where the
   above-mentioned sequences of letters are then referenced in such a
   binary number in the text, the intention is that the value from
   these bitfields in the actual code byte be inserted.
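
   As a rough illustration of how such templates can be read, the
   following Python sketch matches a byte against a template and
   extracts its variable segments. The function name match_template and
   the convention of keying each segment by its single letter are
   assumptions made for this illustration only; they are not part of
   the format. The sketch also assumes that all bits of one segment are
   adjacent, as in the sample template 101nssss.

```python
def match_template(template, byte):
    """Match an 8-character code byte template against a byte value.

    Returns a dict mapping each segment letter to its integer value,
    or None if the fixed 0/1 bits of the template do not match.
    (Hypothetical helper, not defined by the specification.)
    """
    if len(template) != 8:
        raise ValueError("a code byte template describes exactly 8 bits")
    fields = {}
    for position, symbol in enumerate(template):
        # Templates are written most-significant-bit first.
        bit = (byte >> (7 - position)) & 1
        if symbol in "01":
            if bit != int(symbol):
                return None  # a fixed bit differs: no match
        else:
            # Shift earlier bits of the same segment left, append this one.
            fields[symbol] = (fields.get(symbol, 0) << 1) | bit
    return fields


fields = match_template("101nssss", 0xB5)   # 0xB5 is 0b10110101
derived = fields["s"] << 3                  # the value written 0b0ssss000
```

   Here the byte 0xB5 matches the template with n = 1 and ssss = 0b0101,
   so the derived value 0b0ssss000 is 0b00101000; a byte such as 0x95
   does not match because its third bit is 0.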
   Example: The code byte template

      101nssss

   stands for a byte that starts (most-significant-bit-first) with the
   bits 1, 0, and 1, and continues with five variable bits, the first
   of which is referenced as "n" and the next four are referenced as
   "ssss". Based on this code byte template, a reference to

      0b0ssss000

   means a binary number composed from a zero bit, the four bits that
   are in the "ssss" field (for 101nssss, the four least significant
   bits) in the actual byte encountered, kept in the same order, and
   three more zero bits.

   Also, 0xhh stands for the hexadecimal value hh, and 1b, 2b, 4b, 8b,
   nb stand for 1, 2, 4, 8, or n bytes of data following; (1b) etc.
   stand for the numerical value as an integer interpreted in network
   byte order; nd stands for n data objects, each in binarypack
   representation format.

2.  The binarypack Representation Format

2.1.  Data Types

   The binarypack representation format is able to represent the
   following data types:

   o  Integers (represented in signed and unsigned forms)

   o  Floating point values (in IEEE 754 32-bit and 64-bit forms)

   o  Special values nil, false, true

   o  Opaque ("raw") byte strings

   o  UTF-8 strings

   o  Arrays, which can contain any combination of data types

   o  Tables (often called maps, hashes, dictionaries; objects in
      JSON), which contain pairs, key and value, which may in turn be
      of any data type

   This list is mostly faithful to JSON [RFC4627], which however does
   not distinguish integer from floating point number types.

   Based on recent discussions on the use of binary representation
   formats, this specification distinguishes UTF-8 strings from opaque
   binary strings. (Interestingly, this was already done in the
   binaryjs implementation of a "95% MessagePack" format [binarypack],
   so the author of the present specification could just lazily copy
   that.)

2.2.  Integers

   binarypack provides a number of representations for integer values,
   assuming that these occur often. The encoder is free to choose any
   of these representations that is able to represent the desired
   value.

   +----------+--------------+---------------------------+
   | Bits     | Value        | Description               |
   +----------+--------------+---------------------------+
   | 0nnnnnnn | 0bnnnnnnn    | Positive Fixnum (0..127)  |
   |          |              |                           |
   | 111nnnnn | 0bnnnnn - 32 | Negative Fixnum (-32..-1) |
   |          |              |                           |
   | 0xcc 1b  | 1b as uint   | Unsigned Integer          |
   |          |              |                           |
   | 0xcd 2b  | 2b as uint   | Unsigned Integer          |
   |          |              |                           |
   | 0xce 4b  | 4b as uint   | Unsigned Integer          |
   |          |              |                           |
   | 0xcf 8b  | 8b as uint   | Unsigned Integer          |
   |          |              |                           |
   | 0xd0 1b  | 1b as sint   | Signed Integer            |
   |          |              |                           |
   | 0xd1 2b  | 2b as sint   | Signed Integer            |
   |          |              |                           |
   | 0xd2 4b  | 4b as sint   | Signed Integer            |
   |          |              |                           |
   | 0xd3 8b  | 8b as sint   | Signed Integer            |
   +----------+--------------+---------------------------+

2.3.  Floating Point Values

   binarypack provides 32-bit and 64-bit IEEE 754 values.

   +---------+-----------------------+-------------+
   | Bits    | Value                 | Description |
   +---------+-----------------------+-------------+
   | 0xca 4b | 4b as 32-bit IEEE 754 | Float       |
   |         |                       |             |
   | 0xcb 8b | 8b as 64-bit IEEE 754 | Double      |
   +---------+-----------------------+-------------+

2.4.  Special Values

   +------+-------+---------------+
   | Bits | Value | Description   |
   +------+-------+---------------+
   | 0xc0 | nil   | null, nothing |
   |      |       |               |
   | 0xc2 | false | Boolean false |
   |      |       |               |
   | 0xc3 | true  | Boolean true  |
   +------+-------+---------------+

2.5.  Opaque Byte Strings

   +-------------+------------+----------------------------------+
   | Bits        | Value      | Description                      |
   +-------------+------------+----------------------------------+
   | 1010nnnn nb | n = 0bnnnn | Short byte string (0..15 bytes)  |
   |             |            |                                  |
   | 0xda 2b nb  | n = (2b)   | byte string (0..(2**16-1) bytes) |
   |             |            |                                  |
   | 0xdb 4b nb  | n = (4b)   | byte string (0..(2**32-1) bytes) |
   +-------------+------------+----------------------------------+

2.6.  UTF-8 Strings

   +-------------+------------+-----------------------------------+
   | Bits        | Value      | Description                       |
   +-------------+------------+-----------------------------------+
   | 1011nnnn nb | n = 0bnnnn | Short UTF-8 string (0..15 bytes)  |
   |             |            |                                   |
   | 0xd8 2b nb  | n = (2b)   | UTF-8 string (0..(2**16-1) bytes) |
   |             |            |                                   |
   | 0xd9 4b nb  | n = (4b)   | UTF-8 string (0..(2**32-1) bytes) |
   +-------------+------------+-----------------------------------+

   The assumption is that the UTF-8 strings are in Net-Unicode form
   [RFC5198].

2.7.  Arrays

   +-------------+------------+------------------------------------+
   | Bits        | Value      | Description                        |
   +-------------+------------+------------------------------------+
   | 1001nnnn nd | n = 0bnnnn | Short array (0..15 data elements)  |
   |             |            |                                    |
   | 0xdc 2b nd  | n = (2b)   | array (0..(2**16-1) data elements) |
   |             |            |                                    |
   | 0xdd 4b nd  | n = (4b)   | array (0..(2**32-1) data elements) |
   +-------------+------------+------------------------------------+

2.8.  Tables

   +-------------+----------------+---------------------------------+
   | Bits        | Value          | Description                     |
   +-------------+----------------+---------------------------------+
   | 1000nnnn nd | n = 2 * 0bnnnn | Short table (0..15 data pairs)  |
   |             |                |                                 |
   | 0xde 2b nd  | n = 2 * (2b)   | table (0..(2**16-1) data pairs) |
   |             |                |                                 |
   | 0xdf 4b nd  | n = 2 * (4b)   | table (0..(2**32-1) data pairs) |
   +-------------+----------------+---------------------------------+

   The n data elements form a sequence of pairs, each consisting of one
   key followed by its associated value.

3.  Discussion

   This draft tries to be faithful to the successful MessagePack
   [msgpack] format, with the exception of enabling the distinction
   between opaque binary byte strings and UTF-8 byte strings, as
   introduced in the binaryjs project [binarypack].

   Little analysis has been done on whether a slightly different bit
   allocation (e.g., using up fewer of the code combinations for
   fixnums) would be advantageous.

   A short floating point format (e.g., based on the 16-bit IEEE 754
   floating point value) might be useful.

   Adding decimal floating point values probably is not so useful,
   except where high fidelity to JSON is desired.

   Some additional data types might be useful for some protocols, e.g.,
   UUIDs. This would further increase the distance from JSON created
   by distinguishing opaque and UTF-8 strings.

3.1.  JSON roundtripping

   binarypack enables mostly lossless translation to JSON [RFC4627].
   JSON roundtripping is not necessarily the primary design goal of
   binarypack, but it is a consideration.

   In the translation of binarypack to JSON, opaque byte strings SHOULD
   be converted to equivalent base64url [RFC4648] UTF-8 strings.
   Without a schema, it is hard to do the inverse consistently, as
   base64url encoded byte strings are not specially marked up in JSON.
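
   The byte string rule above can be sketched in Python as follows.
   This is a minimal illustration, not a reference implementation: the
   function name to_jsonable is hypothetical, the input is assumed to
   be an already decoded value in which opaque byte strings appear as
   Python bytes and UTF-8 strings as Python str, table keys are assumed
   to be UTF-8 strings, and base64url padding is kept (this document
   does not say whether padding is to be used).

```python
import base64
import json


def to_jsonable(value):
    """Translate a decoded binarypack value into JSON-compatible data.

    Assumptions (not from the specification): opaque byte strings are
    Python bytes, UTF-8 strings are Python str, table keys are str.
    """
    if isinstance(value, (bytes, bytearray)):
        # Opaque byte string -> base64url text (RFC 4648, Section 5).
        return base64.urlsafe_b64encode(bytes(value)).decode("ascii")
    if isinstance(value, list):
        return [to_jsonable(item) for item in value]
    if isinstance(value, dict):
        return {key: to_jsonable(item) for key, item in value.items()}
    # Integers, floats, str, True, False, and None (nil) map directly.
    return value


opaque = to_jsonable({"payload": b"\x00\x01", "tags": [b"\xfb\xff", "x"]})
text = to_jsonable({"payload": "AAE=", "tags": ["-_8=", "x"]})
same_json = json.dumps(opaque) == json.dumps(text)
```

   Both inputs produce the same JSON text, which illustrates why the
   inverse translation cannot tell an opaque byte string from a UTF-8
   string that happens to look like base64url unless a schema says
   which one was intended.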

   When translating binarypack floating point values to JSON, the usual
   problem of converting binary fractions to decimal representation
   arises. In the other direction, the choice of a floating point
   format may be hard to make properly. Clearly, any number that can be
   transformed from a 64-bit IEEE 754 number to a 32-bit IEEE 754
   number without loss of information can be represented as the latter.
   Without schema information, it may be hard to find other cases where
   the precision may not be that important.

4.  IANA Considerations

   Once this has received some discussion, we will understand how
   exactly to register Internet media types for this.

5.  Security Considerations

   (Nothing but generic warnings about correctly implementing protocol
   encoders/decoders so far; this section will certainly grow as
   additional security considerations become known.)

6.  Acknowledgements

   MessagePack was developed and promoted by Sadayuki Furuhashi
   ("frsyuki").

   BinaryPack is a minor derivation of MessagePack that was developed
   by Eric Zhang for the binaryjs project.

   The author of the present specification deserves absolutely no
   credit whatsoever for any of this.

7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
              Encodings", RFC 4648, October 2006.

   [RFC5198]  Klensin, J. and M. Padlipsky, "Unicode Format for Network
              Interchange", RFC 5198, March 2008.

7.2.  Informative References

   [RFC4627]  Crockford, D., "The application/json Media Type for
              JavaScript Object Notation (JSON)", RFC 4627, July 2006.

   [binarypack]
              Zhang, E., "BinaryPack for Javascript browsers", 2012.

   [msgpack]  Ohta, K. and S. Colebourne, "MessagePack format
              specification", 2011.

Author's Address

   Carsten Bormann
   Universitaet Bremen TZI
   Postfach 330440
   Bremen  D-28359
   Germany

   Phone: +49-421-218-63921
   Email: cabo@tzi.org