Understanding the binary protocol | Tarantool
How-to guides More guides Understanding the binary protocol

Understanding the binary protocol

To communicate with each other, Tarantool instances use a binary protocol called iproto. To learn more, see the Binary protocol section.

In this set of examples, the user will be looking at binary code transferred via iproto. The code is intercepted with tcpdump, a monitoring utility.

To follow the examples in this section, get a single Linux computer and start three command-line shells (“terminals”).

– On terminal #1, Start monitoring port 3302 with tcpdump:

sudo tcpdump -i lo 'port 3302' -X

On terminal #2, start a server with:

box.cfg{listen=3302}
box.schema.space.create('tspace')
box.space.tspace:create_index('I')
box.space.tspace:insert{280}
box.schema.user.grant('guest','read,write,execute,create,drop','universe')

On terminal #3, start another server, which will act as a client, with:

box.cfg{}
net_box = require('net.box')
conn = net_box.connect('localhost:3302')

On terminal #3, run the following:

conn.space.tspace:select(280)

Now look at what tcpdump shows for the job connecting to 3302 – the “request”. After the words “length 32” is a packet that ends with these 32 bytes (we have added indented comments):

ce 00 00 00 1b   MP_UINT = decimal 27 = number of bytes after this
82               MP_MAP, size 2 (we'll call this "Main-Map")
01                 IPROTO_SYNC (Main-Map Item#1)
04                 MP_INT = 4 = number that gets incremented with each request
00                 IPROTO_REQUEST_TYPE (Main-Map Item#2)
01                 IPROTO_SELECT
86                 MP_MAP, size 6 (we'll call this "Select-Map")
10                   IPROTO_SPACE_ID (Select-Map Item#1)
cd 02 00             MP_UINT = decimal 512 = id of tspace (could be larger)
11                   IPROTO_INDEX_ID (Select-Map Item#2)
00                   MP_INT = 0 = id of index within tspace
14                   IPROTO_ITERATOR (Select-Map Item#3)
00                   MP_INT = 0 = Tarantool iterator_type.h constant ITER_EQ
13                   IPROTO_OFFSET (Select-Map Item#4)
00                   MP_INT = 0 = amount to offset
12                   IPROTO_LIMIT (Select-Map Item#5)
ce ff ff ff ff       MP_UINT = 4294967295 = biggest possible limit
20                   IPROTO_KEY (Select-Map Item#6)
91                   MP_ARRAY, size 1 (we'll call this "Key-Array")
cd 01 18               MP_UINT = 280 (Select-Map Item#6, Key-Array Item#1)
                       -- 280 is the key value that we are searching for

Now read the source code file net_box.c and skip to the line netbox_encode_select(lua_State *L). From the comments and from simple function calls like mpstream_encode_uint(&stream, IPROTO_SPACE_ID); you will be able to see how net_box put together the packet contents that you have just observed with tcpdump.

There are libraries for reading and writing MessagePack objects. C programmers sometimes include msgpuck.h.

Now you know how Tarantool itself makes requests with the binary protocol. When in doubt about a detail, consult net_box.c – it has routines for each request. Some connectors have similar code.

For an IPROTO_UPDATE example, suppose a user changes field #2 in tuple #2 in space #256 to 'BBBB'. The body will look like this: (notice that in this case there is an extra map item IPROTO_INDEX_BASE, to emphasize that field numbers start with 1, which is optional and can be omitted):

04               IPROTO_UPDATE
85               IPROTO_MAP, size 5
10                 IPROTO_SPACE_ID, Map Item#1
cd 02 00           MP_UINT 256
11                 IPROTO_INDEX_ID, Map Item#2
00                 MP_INT 0 = primary-key index number
15                 IPROTO_INDEX_BASE, Map Item#3
01                 MP_INT = 1 i.e. field numbers start at 1
21                 IPROTO_TUPLE, Map Item#4
91                 MP_ARRAY, size 1, for array of operations
93                   MP_ARRAY, size 3
a1 3d                   MP_STR = OPERATOR = '='
02                      MP_INT = FIELD_NO = 2
a5 42 42 42 42 42       MP_STR = VALUE = 'BBBB'
20                 IPROTO_KEY, Map Item#5
91                 MP_ARRAY, size 1, for array of key values
02                   MP_UINT = primary-key value = 2

Byte codes for the IPROTO_EXECUTE example:

0b               IPROTO_EXECUTE
83               MP_MAP, size 3
43                 IPROTO_STMT_ID Map Item#1
ce d7 aa 74 1b     MP_UINT value of n.stmt_id
41                 IPROTO_SQL_BIND Map Item#2
92                 MP_ARRAY, size 2
01                   MP_INT = 1 = value for first parameter
a1 61                MP_STR = 'a' = value for second parameter
2b                 IPROTO_OPTIONS Map Item#3
90                 MP_ARRAY, size 0 (there are no options)

Byte codes for the response to the box.space.space-name:insert{6} example:

ce 00 00 00 20                MP_UINT = HEADER AND BODY SIZE
83                            MP_MAP, size 3
00                              IPROTO_REQUEST_TYPE
ce 00 00 00 00                  MP_UINT = IPROTO_OK
01                              IPROTO_SYNC
cf 00 00 00 00 00 00 00 53      MP_UINT = sync value
05                              IPROTO_SCHEMA_VERSION
ce 00 00 00 68                  MP_UINT = schema version
81                            MP_MAP, size 1
30                              IPROTO_DATA
dd 00 00 00 01                  MP_ARRAY, size 1 (row count)
91                              MP_ARRAY, size 1 (field count)
06                              MP_INT = 6 = the value that was inserted

Byte codes for the response to the conn:eval([[box.schema.space.create('_space');]]) example:

ce 00 00 00 3b                  MP_UINT = HEADER AND BODY SIZE
83                              MP_MAP, size 3 (i.e. 3 items in header)
   00                              IPROTO_REQUEST_TYPE
   ce 00 00 80 0a                  MP_UINT = hexadecimal 800a
   01                              IPROTO_SYNC
   cf 00 00 00 00 00 00 00 26      MP_UINT = sync value
   05                              IPROTO_SCHEMA_VERSION
   ce 00 00 00 78                  MP_UINT = schema version value
   81                              MP_MAP, size 1
     31                              IPROTO_ERROR_24
     db 00 00 00 1d 53 70 61 63 etc. MP_STR = "Space '_space' already exists"

Byte codes, if we use the same net.box connection that we used in the beginning and we say
conn:execute([[CREATE TABLE t1 (dd INT PRIMARY KEY AUTOINCREMENT, дд STRING COLLATE "unicode");]])
conn:execute([[INSERT INTO t1 VALUES (NULL, 'a'), (NULL, 'b');]])
and we watch what tcpdump displays, we will see two noticeable things: (1) the CREATE statement caused a schema change so the response has a new IPROTO_SCHEMA_VERSION value and the body includes the new contents of some system tables (caused by requests from net.box which users will not see); (2) the final bytes of the response to the INSERT will be:

81   MP_MAP, size 1
42     IPROTO_SQL_INFO
82     MP_MAP, size 2
00       Tarantool constant (not in iproto_constants.h) = SQL_INFO_ROW_COUNT
02       1 = row count
01       Tarantool constant (not in iproto_constants.h) = SQL_INFO_AUTOINCREMENT_ID
92       MP_ARRAY, size 2
01         first autoincrement number
02         second autoincrement number

Byte codes for the SQL SELECT example, if we ask for full metadata by saying
conn.space._session_settings:update('sql_full_metadata', {{'=', 'value', true}})
and we select the two rows from the table that we just created
conn:execute([[SELECT dd, дд AS д FROM t1;]])
then tcpdump will show this response, after the header:

82                       MP_MAP, size 2 (i.e. metadata and rows)
32                         IPROTO_METADATA
92                         MP_ARRAY, size 2 (i.e. 2 columns)
85                           MP_MAP, size 5 (i.e. 5 items for column#1)
00 a2 44 44                    IPROTO_FIELD_NAME and 'DD'
01 a7 69 6e 74 65 67 65 72     IPROTO_FIELD_TYPE and 'integer'
03 c2                          IPROTO_FIELD_IS_NULLABLE and false
04 c3                          IPROTO_FIELD_IS_AUTOINCREMENT and true
05 c0                          PROTO_FIELD_SPAN and nil
85                           MP_MAP, size 5 (i.e. 5 items for column#2)
00 a2 d0 94                    IPROTO_FIELD_NAME and 'Д' upper case
01 a6 73 74 72 69 6e 67        IPROTO_FIELD_TYPE and 'string'
02 a7 75 6e 69 63 6f 64 65     IPROTO_FIELD_COLL and 'unicode'
03 c3                          IPROTO_FIELD_IS_NULLABLE and true
05 a4 d0 b4 d0 b4              IPROTO_FIELD_SPAN and 'дд' lower case
30                         IPROTO_DATA
92                         MP_ARRAY, size 2
92                           MP_ARRAY, size 2
01                             MP_INT = 1 i.e. contents of row#1 column#1
a1 61                          MP_STR = 'a' i.e. contents of row#1 column#2
92                           MP_ARRAY, size 2
02                             MP_INT = 2 i.e. contents of row#2 column#1
a1 62                          MP_STR = 'b' i.e. contents of row#2 column#2

Byte code for the SQL PREPARE example. If we said
conn:prepare([[SELECT dd, дд AS д FROM t1;]])
then tcpdump would show almost the same response, but there would be no IPROTO_DATA. Instead, additional items will appear:

34                       IPROTO_BIND_COUNT
00                       MP_UINT = 0

33                       IPROTO_BIND_METADATA
90                       MP_ARRAY, size 0

MP_UINT = 0 and MP_ARRAY has size 0 because there are no parameters to bind. Full output:

84                       MP_MAP, size 4
43                         IPROTO_STMT_ID
ce c2 3c 2c 1e             MP_UINT = statement id
34                         IPROTO_BIND_COUNT
00                         MP_INT = 0 = number of parameters to bind
33                         IPROTO_BIND_METADATA
90                         MP_ARRAY, size 0 = there are no parameters to bind
32                         IPROTO_METADATA
92                         MP_ARRAY, size 2 (i.e. 2 columns)
85                           MP_MAP, size 5 (i.e. 5 items for column#1)
00 a2 44 44                    IPROTO_FIELD_NAME and 'DD'
01 a7 69 6e 74 65 67 65 72     IPROTO_FIELD_TYPE and 'integer'
03 c2                          IPROTO_FIELD_IS_NULLABLE and false
04 c3                          IPROTO_FIELD_IS_AUTOINCREMENT and true
05 c0                          PROTO_FIELD_SPAN and nil
85                           MP_MAP, size 5 (i.e. 5 items for column#2)
00 a2 d0 94                    IPROTO_FIELD_NAME and 'Д' upper case
01 a6 73 74 72 69 6e 67        IPROTO_FIELD_TYPE and 'string'
02 a7 75 6e 69 63 6f 64 65     IPROTO_FIELD_COLL and 'unicode'
03 c3                          IPROTO_FIELD_IS_NULLABLE and true
05 a4 d0 b4 d0 b4              IPROTO_FIELD_SPAN and 'дд' lower case

Byte code for the heartbeat example. The master might send this body:

83                      MP_MAP, size 3
00                        Main-Map Item #1 IPROTO_REQUEST_TYPE
00                          MP_UINT = 0
02                        Main-Map Item #2 IPROTO_REPLICA_ID
02                          MP_UINT = 2 = id
04                        Main-Map Item #3 IPROTO_TIMESTAMP
cb                          MP_DOUBLE (MessagePack "Float 64")
41 d7 ba 06 7b 3a 03 21     8-byte timestamp
81                      MP_MAP (body), size 1
5a                      Body Map Item #1 IPROTO_VCLOCK_SYNC
14                        MP_UINT = 20 (vclock sync value)

Byte code for the heartbeat example. The replica might send back this body:

81                       MP_MAP, size 1
00                         Main-Map Item #1 IPROTO_REQUEST_TYPE
00                         MP_UINT = 0 = IPROTO_OK
83                         MP_MAP (body), size 3
26                           Body Map Item #1 IPROTO_VCLOCK
81                             MP_MAP, size 1 (vclock of 1 component)
01                               MP_UINT = 1 = id (part 1 of vclock)
06                               MP_UINT = 6 = lsn (part 2 of vclock)
5a                           Body Map Item #2 IPROTO_VCLOCK_SYNC
14                             MP_UINT = 20 (vclock sync value)
53                           Body Map Item #3 IPROTO_TERM
31                             MP_UINT = 49 (term value)
Found what you were looking for?
Feedback