Module msgpack
The msgpack module decodes raw MsgPack strings by converting them to Lua objects,
and encodes Lua objects by converting them to raw MsgPack strings.
Tarantool makes heavy internal use of MsgPack because tuples in Tarantool
are stored as MsgPack arrays.
In addition, starting from version 2.10.0, the msgpack module can create a dedicated userdata Lua object, the MsgPack object.
A MsgPack object stores arbitrary MsgPack data and can be created from any Lua object, including another MsgPack object,
or from a raw MsgPack string. The MsgPack object has its own set of methods and iterators.
Note
- MsgPack is short for MessagePack.
- A “raw MsgPack string” is a byte array formatted according to the MsgPack specification, including type bytes and sizes.
The type bytes and sizes can be made displayable with string.hex(),
or the raw MsgPack strings can be converted to Lua objects by using the msgpack module methods.
Below is a list of msgpack members and related objects.
Members | |
---|---|
msgpack.encode(lua_value) | Convert a Lua object to a raw MsgPack string |
msgpack.encode(lua_value, ibuf) | Convert a Lua object to a raw MsgPack string in an ibuf |
msgpack.decode(msgpack_string) | Convert a raw MsgPack string to a Lua object |
msgpack.decode(C_style_string_pointer, size) | Convert a raw MsgPack string in an ibuf to a Lua object |
msgpack.decode_unchecked(msgpack_string) | Convert a raw MsgPack string to a Lua object |
msgpack.decode_unchecked(C_style_string_pointer) | Convert a raw MsgPack string to a Lua object |
msgpack.decode_array_header(byte-array, size) | Call MsgPuck’s mp_decode_array function and return the array size and a pointer to the first array component |
msgpack.decode_map_header(byte-array, size) | Call MsgPuck’s mp_decode_map function and return the map size and a pointer to the first map component |
__serialize parameter | Output structure specification |
msgpack.cfg() | Change MsgPack configuration settings |
msgpack.NULL | Analog of Lua’s nil |
msgpack.object(lua_value) | Create a MsgPack object from a Lua object |
msgpack.object_from_raw(msgpack_string) | Create a MsgPack object from a raw MsgPack string |
msgpack.object_from_raw(C_style_string_pointer, size) | Create a MsgPack object from a raw MsgPack string |
msgpack.is_object(some_argument) | Check if an argument is a MsgPack object |

Related objects | |
---|---|
msgpack_object | A MsgPack object |
iterator_object | A MsgPack iterator object |
msgpack.encode(lua_value)
Convert a Lua object to a raw MsgPack string.
Parameters:
- lua_value – either a scalar value or a Lua table value.
Return: the original contents formatted as a raw MsgPack string
Rtype: raw MsgPack string
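A minimal sketch of the single-argument form, assuming it runs inside Tarantool (where the msgpack module and string.hex() are available). It encodes a one-element array, prints the raw bytes in hexadecimal, and decodes them back:

```lua
local msgpack = require('msgpack')

-- Encode a one-element array into a raw MsgPack string.
local raw = msgpack.encode({'a'})

-- 91 = fixarray of size 1, a1 = fixstr of size 1, 61 = 'a'
print(string.hex(raw))   -- 91a161
print(#raw)              -- 3

-- Round trip back to a Lua table.
local decoded = msgpack.decode(raw)
print(decoded[1])        -- a
```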
msgpack.encode(lua_value, ibuf)
Convert a Lua object to a raw MsgPack string in an ibuf, which is a buffer such as buffer.ibuf() creates. As with encode(lua_value), the result is a raw MsgPack string, but it goes to the ibuf output instead of being returned.
Parameters:
- lua_value (lua-object) – either a scalar value or a Lua table value.
- ibuf (buffer) – (output parameter) where the resulting raw MsgPack string goes.
Return: number of bytes in the output
Rtype: number
Example using buffer.ibuf() and ffi.string() and string.hex(): The result will be ‘91a161’ because 91 is the MessagePack encoding of “fixarray size 1”, a1 is the MessagePack encoding of “fixstr size 1”, and 61 is the UTF-8 encoding of ‘a’:
ibuf = require('buffer').ibuf()
msgpack_string_size = require('msgpack').encode({'a'}, ibuf)
msgpack_string = require('ffi').string(ibuf.rpos, msgpack_string_size)
string.hex(msgpack_string)
msgpack.decode(msgpack_string[, start_position])
Convert a raw MsgPack string to a Lua object.
Parameters:
- msgpack_string (string) – a raw MsgPack string.
- start_position (integer) – where to start, minimum = 1, maximum = string length, default = 1.
Return:
- (if msgpack_string is a valid raw MsgPack string) the original contents of msgpack_string, formatted as a Lua object: usually a Lua table, otherwise a scalar value such as a string or a number;
- “next_start_position”. If decode stops after parsing as far as byte N in msgpack_string, then “next_start_position” will equal N + 1, and decode(msgpack_string, next_start_position) will continue parsing from the byte after the one where the previous decode stopped. Normally decode parses all of msgpack_string, so “next_start_position” will equal string.len(msgpack_string) + 1.
Rtype: Lua object and number
Example: The result will be [‘a’] and 4:
msgpack_string = require('msgpack').encode({'a'})
require('msgpack').decode(msgpack_string, 1)
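The second return value makes it possible to walk through several MsgPack values stored back to back in one string. The following is a sketch under the assumption that it runs inside Tarantool; the variable names (stream, values, pos) are illustrative only:

```lua
local msgpack = require('msgpack')

-- Three values encoded one after another in a single string:
-- 01 | a1 61 | 91 c3  (5 bytes in total)
local stream = msgpack.encode(1) .. msgpack.encode('a') .. msgpack.encode({true})

local values = {}
local pos = 1
while pos <= #stream do
    local value
    -- decode returns the value and the position of the next value
    value, pos = msgpack.decode(stream, pos)
    table.insert(values, value)
end

print(values[1], values[2], values[3][1])  -- 1  a  true
print(pos == #stream + 1)                  -- true
```

The loop stops when the returned position moves past the end of the string, which matches the “next_start_position” rule described above.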
msgpack.decode(C_style_string_pointer, size)
Convert a raw MsgPack string, whose address is supplied as a C-style string pointer such as the rpos pointer which is inside an ibuf such as buffer.ibuf() creates, to a Lua object. A C-style string pointer may be described as cdata<char *> or cdata<const char *>.
Parameters:
- C_style_string_pointer (buffer) – a pointer to a raw MsgPack string.
- size (integer) – number of bytes in the raw MsgPack string.
Return:
- (if C_style_string_pointer points to a valid raw MsgPack string) the original contents of the raw MsgPack string, formatted as a Lua object: usually a Lua table, otherwise a scalar value such as a string or a number;
- returned_pointer = a C-style pointer to the byte after what was passed, so that C_style_string_pointer + size = returned_pointer
Rtype: table and C-style pointer to after what was passed
Example using buffer.ibuf and pointer arithmetic: The result will be [‘a’] and 3 and true:
ibuf = require('buffer').ibuf()
msgpack_string_size = require('msgpack').encode({'a'}, ibuf)
a, b = require('msgpack').decode(ibuf.rpos, msgpack_string_size)
a, b - ibuf.rpos, msgpack_string_size == b - ibuf.rpos
msgpack.decode_unchecked(msgpack_string[, start_position])
Input and output are the same as for decode(string).

msgpack.decode_unchecked(C_style_string_pointer)
Input and output are the same as for decode(C_style_string_pointer), except that size is not needed. Some checking is skipped, and decode_unchecked(C_style_string_pointer) can operate with string pointers to buffers which decode(C_style_string_pointer) cannot handle. For an example see the buffer module.
msgpack.decode_array_header(byte-array, size)
Call MsgPuck’s mp_decode_array function and return the array size and a pointer to the first array component. A subsequent call to msgpack.decode can decode the component instead of the whole array.
Parameters:
- byte-array – a pointer to a raw MsgPack string.
- size – a number greater than or equal to the string’s length
Return:
- the size of the array;
- a pointer to after the array header.
Example:
-- Example of decode_array_header
-- Suppose we have the raw data '\x93\x01\x02\x03'.
-- \x93 is MsgPack encoding for a header of a three-item array.
-- We want to skip it and decode the next three items.
msgpack = require('msgpack'); ffi = require('ffi');
x, y = msgpack.decode_array_header(ffi.cast('char*', '\x93\x01\x02\x03'), 4)
a = msgpack.decode(y, 1);
b = msgpack.decode(y + 1, 1);
c = msgpack.decode(y + 2, 1);
a, b, c
-- The result is: 1,2,3.
msgpack.decode_map_header(byte-array, size)
Call MsgPuck’s mp_decode_map function and return the map size and a pointer to the first map component. A subsequent call to msgpack.decode can decode the component instead of the whole map.
Parameters:
- byte-array – a pointer to a raw MsgPack string.
- size – a number greater than or equal to the raw MsgPack string’s length
Return:
- the size of the map;
- a pointer to after the map header.
Example:
-- Example of decode_map_header
-- Suppose we have the raw data '\x81\xa2\x41\x41\xc3'.
-- '\x81' is MsgPack encoding for a header of a one-item map.
-- We want to skip it and decode the next map item.
msgpack = require('msgpack'); ffi = require('ffi')
x, y = msgpack.decode_map_header(ffi.cast('char*', '\x81\xa2\x41\x41\xc3'), 5)
a = msgpack.decode(y, 3); b = msgpack.decode(y + 3, 1)
x, a, b
-- The result is: 1,"AA", true.
__serialize parameter
The MsgPack output structure can be specified with the __serialize parameter:
- ‘seq’, ‘sequence’, ‘array’ – table encoded as an array
- ‘map’, ‘mapping’ – table encoded as a map
- function – the meta-method called to unpack the serializable representation of table, cdata, or userdata objects
Serializing ‘A’ and ‘B’ with different __serialize values brings different
results. To show this, here is a routine which encodes {'A','B'} both as an
array and as a map, then displays each result in hexadecimal.
-- Return the bytes of a string as space-separated two-digit hex values
function hexdump(bytes)
    local result = ''
    for i = 1, #bytes do
        -- %02x keeps a leading zero, so byte 0x01 prints as "01"
        result = result .. string.format("%02x", string.byte(bytes, i)) .. ' '
    end
    return result
end
msgpack = require('msgpack')
m1 = msgpack.encode(setmetatable({'A', 'B'}, {
    __serialize = "seq"
}))
m2 = msgpack.encode(setmetatable({'A', 'B'}, {
    __serialize = "map"
}))
print('array encoding: ', hexdump(m1))
print('map encoding: ', hexdump(m2))
Result:
array encoding: 92 a1 41 a1 42
map encoding: 82 01 a1 41 02 a1 42
The MsgPack Specification page explains that the first encoding means:
fixarray(2), fixstr(1), "A", fixstr(1), "B"
and the second encoding means:
fixmap(2), key(1), fixstr(1), "A", key(2), fixstr(1), "B"
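The third form of __serialize, a function, is not covered by the routine above. A minimal sketch, assuming it runs inside Tarantool with the default encode_load_metatables = true; the point table and its fields are hypothetical and exist only for illustration:

```lua
local msgpack = require('msgpack')

-- Hypothetical object with named fields. The __serialize function
-- returns the value that should be encoded in place of the object.
local point = setmetatable({x = 1, y = 2}, {
    __serialize = function(self)
        return {self.x, self.y}
    end
})

local raw = msgpack.encode(point)
-- 92 = fixarray of size 2, followed by the two positive fixints
print(string.hex(raw))  -- 920102

local decoded = msgpack.decode(raw)
print(decoded[1], decoded[2], decoded.x)  -- 1  2  nil
```

The encoder sees the array returned by the function, so the field names x and y do not appear in the output.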
Here are examples for all the common types, with the Lua-table representation on the left, with the MsgPack format name and encoding on the right.
Common Types and MsgPack Encodings
Lua value | MsgPack format and encoding |
---|---|
{} | ‘fixmap’ = 80 if the metatable specifies ‘map’, otherwise ‘fixarray’ = 90 |
‘a’ | ‘fixstr’ = a1 61 |
false | ‘false’ = c2 |
true | ‘true’ = c3 |
127 | ‘positive fixint’ = 7f |
65535 | ‘uint 16’ = cd ff ff |
4294967295 | ‘uint 32’ = ce ff ff ff ff |
nil | ‘nil’ = c0 |
msgpack.NULL | same as nil |
[0] = 5 | ‘fixmap(1)’ + ‘positive fixint’ (for the key) + ‘positive fixint’ (for the value) = 81 00 05 |
[0] = nil | ‘fixmap(0)’ = 80 – nil is not stored when it is a missing map value |
1.5 | ‘float 64’ = cb 3f f8 00 00 00 00 00 00 |
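Several rows of this table can be checked directly. The sketch below assumes a Tarantool environment; hex is a hypothetical local helper that wraps msgpack.encode() and string.hex():

```lua
local msgpack = require('msgpack')

-- Hypothetical helper: encode a value and return its bytes as hex.
local function hex(value)
    return string.hex(msgpack.encode(value))
end

print(hex('a'))           -- a161
print(hex(false))         -- c2
print(hex(true))          -- c3
print(hex(127))           -- 7f
print(hex(65535))         -- cdffff
print(hex(msgpack.NULL))  -- c0
print(hex({[0] = 5}))     -- 810005
print(hex(1.5))           -- cb3ff8000000000000
print(hex({}))            -- 90
print(hex(setmetatable({}, {__serialize = 'map'})))  -- 80
```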
msgpack.cfg(table)
Change MsgPack configuration settings.
The values are all either integers or boolean true/false.
Option | Default | Use |
---|---|---|
cfg.encode_max_depth | 128 | The maximum recursion depth for encoding |
cfg.encode_deep_as_nil | false | Specify whether to crop tables with nesting level deeper than cfg.encode_max_depth. Not-encoded fields are replaced with one null. If not set, too high nesting is considered an error. |
cfg.encode_invalid_numbers | true | Specify whether to enable encoding of NaN and Inf numbers |
cfg.encode_load_metatables | true | Specify whether the serializer will follow the __serialize metatable field |
cfg.encode_use_tostring | false | Specify whether to use tostring() for unknown types |
cfg.encode_invalid_as_nil | false | Specify whether to use NULL for non-recognized types |
cfg.encode_sparse_convert | true | Specify whether to handle excessively sparse arrays as maps. See detailed description below |
cfg.encode_sparse_ratio | 2 | 1/encode_sparse_ratio is the permissible percentage of missing values in a sparse array |
cfg.encode_sparse_safe | 10 | A limit ensuring that small Lua arrays are always encoded as sparse arrays (instead of generating an error or encoding as a map) |
cfg.encode_error_as_ext | true | Specify how error objects (box.error.new()) are encoded in the MsgPack format: if true, errors are encoded as the MP_ERROR MsgPack extension; if false, the encoding format depends on other configuration options (encode_load_metatables, encode_use_tostring, encode_invalid_as_nil). |
cfg.decode_invalid_numbers | true | Specify whether to enable decoding of NaN and Inf numbers |
cfg.decode_save_metatables | true | Specify whether to set metatables for all arrays and maps |
Sparse arrays features
During encoding, the MsgPack encoder tries to classify tables into one of four kinds:
- map - at least one table index is not an unsigned integer
- regular array - all array indexes are available
- sparse array - at least one array index is missing
- excessively sparse array - the number of values missing exceeds the configured ratio
An array is excessively sparse when all the following conditions are met:
- encode_sparse_ratio > 0
- max(table) > encode_sparse_safe
- max(table) > count(table) * encode_sparse_ratio
The MsgPack encoder never considers an array to be excessively sparse
when encode_sparse_ratio = 0. The encode_sparse_safe limit ensures
that small Lua arrays are always encoded as sparse arrays.
If encode_sparse_convert is set to true (the default listed in the
configuration table), excessively sparse arrays are handled as maps.
If it is set to false, attempting to encode an excessively sparse array
generates an error.
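The three conditions can be written out as a small predicate. This is only an illustration of the rule in plain Lua, not the encoder's actual implementation. The function name is_excessively_sparse is hypothetical, and the sketch assumes every key in the table is a positive integer:

```lua
-- Hypothetical helper, not part of the msgpack module.
-- tbl          : table whose keys are all positive integers (assumption)
-- sparse_ratio : stands in for encode_sparse_ratio
-- sparse_safe  : stands in for encode_sparse_safe
local function is_excessively_sparse(tbl, sparse_ratio, sparse_safe)
    local max_index, count = 0, 0
    for index in pairs(tbl) do
        count = count + 1                 -- count(table)
        if index > max_index then
            max_index = index             -- max(table)
        end
    end
    return sparse_ratio > 0
        and max_index > sparse_safe
        and max_index > count * sparse_ratio
end

-- With the default settings (ratio = 2, safe = 10):
print(is_excessively_sparse({[1] = 'a', [100] = 'b'}, 2, 10))  -- true
print(is_excessively_sparse({[1] = 'a', [8] = 'b'}, 2, 10))    -- false
print(is_excessively_sparse({1, 2, 3}, 2, 10))                 -- false
-- A ratio of 0 disables the check entirely:
print(is_excessively_sparse({[1] = 'a', [100] = 'b'}, 0, 10))  -- false
```

The second call returns false because the largest index, 8, does not exceed the encode_sparse_safe limit of 10, so the small array stays a sparse array.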
msgpack.cfg() example 1:
If msgpack.cfg.encode_invalid_numbers = true (the default),
then NaN and Inf are legal values. If that is not desirable, then
ensure that msgpack.encode() does not accept them, by saying
msgpack.cfg{encode_invalid_numbers = false}, thus:
tarantool> msgpack = require('msgpack'); msgpack.cfg{encode_invalid_numbers = true}
---
...
tarantool> msgpack.decode(msgpack.encode{1, 0 / 0, 1 / 0, false})
---
- [1, -nan, inf, false]
- 22
...
tarantool> msgpack.cfg{encode_invalid_numbers = false}
---
...
tarantool> msgpack.decode(msgpack.encode{1, 0 / 0, 1 / 0, false})
---
- error: ... number must not be NaN or Inf'
...
msgpack.cfg() example 2:
To avoid generating errors on attempts to encode unknown data types such as userdata/cdata, you can use this code:
tarantool> httpc = require('http.client').new()
---
...
tarantool> msgpack.encode(httpc.curl)
---
- error: unsupported Lua type 'userdata'
...
tarantool> msgpack.cfg{encode_use_tostring = true}
---
...
tarantool> msgpack.encode(httpc.curl)
---
- !!binary tnVzZXJkYXRhOiAweDAxMDU5NDQ2Mzg=
...
Note
To achieve the same effect for only one call to msgpack.encode()
(that is, without changing the configuration permanently), you can use
msgpack.new({encode_invalid_numbers = true}).encode({1, 2}).
Similar configuration settings exist for JSON and YAML.
msgpack.NULL
A value comparable to Lua “nil” which may be useful as a placeholder in a tuple.
Example:
tarantool> msgpack = require('msgpack')
---
...
tarantool> y = msgpack.encode({'a',1,'b',2})
---
...
tarantool> z = msgpack.decode(y)
---
...
tarantool> z[1], z[2], z[3], z[4]
---
- a
- 1
- b
- 2
...
tarantool> box.space.tester:insert{20, msgpack.NULL, 20}
---
- [20, null, 20]
...
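The placeholder role of msgpack.NULL can also be seen without a space. A minimal sketch, assuming a Tarantool environment: unlike a plain nil, msgpack.NULL is a cdata value, so it occupies its slot in a Lua array while still comparing equal to nil:

```lua
local msgpack = require('msgpack')

-- msgpack.NULL compares equal to nil ...
print(msgpack.NULL == nil)   -- true
-- ... but it is a cdata value, so it can hold an array position.
print(type(msgpack.NULL))    -- cdata

-- 93 = fixarray of size 3, 01, c0 (nil), 03
local raw = msgpack.encode({1, msgpack.NULL, 3})
print(string.hex(raw))       -- 9301c003

local decoded = msgpack.decode(raw)
print(decoded[2] == nil, decoded[3])  -- true  3
```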
msgpack.object(lua_value)
Since: 2.10.0
Encode an arbitrary Lua object into the MsgPack format.
Parameters:
- lua_value (lua-object) – a Lua object of any type.
Return: encoded MsgPack data encapsulated in a MsgPack object.
Rtype: userdata
Example:
local msgpack = require('msgpack')
-- Create a MsgPack object from a Lua object of any type
local mp_from_number = msgpack.object(123)
local mp_from_string = msgpack.object('hello world')
local mp_from_array = msgpack.object({ 10, 20, 30 })
local mp_from_table = msgpack.object({ band_name = 'The Beatles', year = 1960 })
local mp_from_tuple = msgpack.object(box.tuple.new(1, 'The Beatles', 1960))
msgpack.object_from_raw(msgpack_string)
Since: 2.10.0
Create a MsgPack object from a raw MsgPack string.
Parameters:
- msgpack_string (string) – a raw MsgPack string.
Return: a MsgPack object
Rtype: userdata
Example:
local msgpack = require('msgpack')
-- Create a MsgPack object from a raw MsgPack string
local raw_mp_string = msgpack.encode({ 10, 20, 30 })
local mp_from_mp_string = msgpack.object_from_raw(raw_mp_string)
msgpack.object_from_raw(C_style_string_pointer, size)
Since: 2.10.0
Create a MsgPack object from a raw MsgPack string. The address of the MsgPack string is supplied as a C-style string pointer such as the rpos pointer inside an ibuf that buffer.ibuf() creates. A C-style string pointer may be described as cdata<char *> or cdata<const char *>.
Parameters:
- C_style_string_pointer (buffer) – a pointer to a raw MsgPack string.
- size (integer) – number of bytes in the raw MsgPack string.
Return: a MsgPack object
Rtype: userdata
Example:
local msgpack = require('msgpack')
-- Create a MsgPack object from a raw MsgPack string using buffer
local buffer = require('buffer')
local ibuf = buffer.ibuf()
msgpack.encode({ 10, 20, 30 }, ibuf)
local mp_from_mp_string_pt = msgpack.object_from_raw(ibuf.buf, ibuf:size())
msgpack.is_object(some_argument)
Since: 2.10.0
Check if the given argument is a MsgPack object.
Parameters:
- some_argument – any argument.
Return: true if the argument is a MsgPack object; otherwise, false
Rtype: boolean
Example:
local msgpack = require('msgpack')
local mp_from_string = msgpack.object('hello world')
-- Check if the given argument is a MsgPack object
local mp_is_object = msgpack.is_object(mp_from_string) -- Returns true
local string_is_object = msgpack.is_object('hello world') -- Returns false