Top.Mail.Ru
SQL statements and clauses | Tarantool
 
Reference / SQL reference / SQL statements and clauses
Reference / SQL reference / SQL statements and clauses

SQL statements and clauses

SQL statements and clauses

The Statements and Clauses guide shows all Tarantool/SQL statements’ syntax and use.

Statements that change data definition

ALTER TABLE

Syntax:

  • ALTER TABLE table-name RENAME TO new-table-name;
  • ALTER TABLE table-name ADD CONSTRAINT constraint-name constraint-definition;
  • ALTER TABLE table-name DROP CONSTRAINT constraint-name;
  • ALTER TABLE table-name ENABLE|DISABLE CHECK CONSTRAINT constraint-name;


../../../_images/alter_table.svg


ALTER is used to change a table’s name or a table’s constraints.

Examples:

For renaming a table with ALTER ... RENAME, the old-table must exist, the new-table must not exist. Example:
-- renaming a table: ALTER TABLE t1 RENAME TO t2;

For adding a table constraint with ADD CONSTRAINT, the table must exist, the table must be empty, the constraint name must be unique within the table. Example with a foreign-key constraint definition:
ALTER TABLE t1 ADD CONSTRAINT fk_s1_t1_1 FOREIGN KEY (s1) REFERENCES t1;

It is not possible to say CREATE TABLE table_a ... REFERENCES table_b ... if table b does not exist yet. This is a situation where ALTER TABLE is handy – users can CREATE TABLE table_a without the foreign key, then CREATE TABLE table_b, then ALTER TABLE table_a ... REFERENCES table_b ....

-- adding a primary-key constraint definition:
-- This is unusual because primary keys are created automatically
-- and it is illegal to have two primary keys for the same table.
-- However, it is possible to drop a primary-key index, and this
-- is a way to restore the primary key if that happens.
ALTER TABLE t1 ADD CONSTRAINT "pk_unnamed_T1_1" PRIMARY KEY (s1);

-- adding a unique-constraint definition:
-- Alternatively, you can say CREATE UNIQUE INDEX unique_key ON t1 (s1);
ALTER TABLE t1 ADD CONSTRAINT "unique_unnamed_T1_2" UNIQUE (s1);

-- Adding a check-constraint definition:
ALTER TABLE t1 ADD CONSTRAINT "ck_unnamed_T1_1" CHECK (s1 > 0);

For ALTER ... DROP CONSTRAINT, it is only legal to drop a named constraint. (Tarantool generates the constraint names automatically if the user does not provide them.) To remove a unique constraint, use use either ALTER ... DROP CONSTRAINT or DROP INDEX, which will drop the constraint as well.

To remove a unique constraint, use DROP INDEX, which will drop the constraint as well.

-- dropping a constraint:
ALTER TABLE t1 DROP CONSTRAINT "fk_unnamed_JJ2_1";

For ALTER ... ENABLE|DISABLE CHECK CONSTRAINT, it is only legal to enable or disable a named constraint, and Tarantool only looks for names of check constraints. By default a constraint is enabled. If a constraint is disabled, then the check will not be performed.

-- disabling and re-enabling a constraint:
ALTER TABLE t1 DISABLE CHECK CONSTRAINT c;
ALTER TABLE t1 ENABLE CHECK CONSTRAINT c;

Limitations:

  • It is not possible to add or drop a column.
  • It is not possible to modify NOT NULL constraints or column properties DEFAULT and data type. However, it is possible to modify them with Tarantool/NOSQL, for example by calling space_object:format() with a different is_nullable value.

CREATE TABLE

Syntax:

CREATE TABLE [IF NOT EXISTS] table-name (column-definition or table-constraint list) [WITH ENGINE = string];


../../../_images/create_table.svg


Create a new base table, usually called a “table”.

Note

A table is a base table if it is created with CREATE TABLE and contains data in persistent storage.

A table is a viewed table, or just “view”, if it is created with CREATE VIEW and gets its data from other views or from base tables.

The table-name must be an identifier which is valid according to the rules for identifiers, and must not be the name of an already existing base table or view.

The column-definition or table-constraint list is a comma-separated list of column definitions or table constraint definitions. Column definitions and table constraint definitions are sometimes called table elements.

Rules:

  • A primary key is necessary; it can be specified with a table constraint PRIMARY KEY.
  • There must be at least one column.
  • When IF NOT EXISTS is specified, and there is already a table with the same name, the statement is ignored.
  • When WITH ENGINE = string is specified, where string must be either ‘memtx’ or ‘vinyl’, the table is created with that storage engine. When this clause is not specified, the table is created with the default engine, which is ordinarily ‘memtx’ but may be changed by updating the box.space._session_settings system table..

Actions:

  1. Tarantool evaluates each column definition and table-constraint, and returns an error if any of the rules is violated.
  2. Tarantool makes a new definition in the schema.
  3. Tarantool makes new indexes for PRIMARY KEY or UNIQUE constraints. A unique index name is created automatically.
  4. Usually Tarantool effectively executes a COMMIT statement.

Examples:

-- the simplest form, with one column and one constraint:
CREATE TABLE t1 (s1 INTEGER, PRIMARY KEY (s1));

-- you can see the effect of the statement by querying
-- Tarantool system spaces:
SELECT * FROM "_space" WHERE "name" = 'T1';
SELECT * FROM "_index" JOIN "_space" ON "_index"."id" = "_space"."id"
         WHERE "_space"."name" = 'T1';

-- variation of the simplest form, with delimited identifiers
-- and a bracketed comment:
CREATE TABLE "T1" ("S1" INT /* synonym of INTEGER */, PRIMARY KEY ("S1"));

-- two columns, one named constraint
CREATE TABLE t1 (s1 INTEGER, s2 STRING, CONSTRAINT pk_s1s2_t1_1 PRIMARY KEY (s1, s2));

Limitations:

Column definition

Syntax:

column-name data-type [, column-constraint]

Define a column, which is a table-element used in a CREATE TABLE statement.

The column-name must be an identifier which is valid according to the rules for identifiers.

Each column-name must be unique within a table.

Column definition – data type
../../../_images/data_type.svg


Every column has a data type: BOOLEAN or DOUBLE or INTEGER or NUMBER or SCALAR or STRING or UNSIGNED or VARBINARY. The detailed description of data types is in the section Operands.

Column definition – the rules for the SCALAR data type

SCALAR is a “complex” data type, unlike all the other data types which are “primitive”. Two column values in a SCALAR column can have two different primitive data types.

  1. Any item defined as SCALAR has an underlying primitive type. For example, here:

    CREATE TABLE t (s1 SCALAR PRIMARY KEY);
    INSERT INTO t VALUES (55), ('41');
    

    the underlying primitive type of the item in the first row is INTEGER because literal 55 has data type INTEGER, and the underlying primitive type in the second row is STRING (the data type of a literal is always clear from its format).

    An item’s primitive type is far more important than its defined type. Incidentally Tarantool might find the primitive type by looking at the way MsgPack stores it, but that is an implementation detail.

  2. A SCALAR definition may not include a maximum length, as there is no suggested restriction.

  3. A SCALAR definition may include a COLLATE clause, which affects any items whose primitive data type is STRING. The default collation is “binary”.

  4. Some assignments are illegal when data types differ, but legal when the target is a SCALAR item. For example UPDATE ... SET column1 = 'a' is illegal if column1 is defined as INTEGER, but is legal if column1 is defined as SCALAR – values which happen to be INTEGER will be changed so their data type is STRING.

  5. There is no literal syntax which implies data type SCALAR.

  6. TYPEOF(x) is never SCALAR, it is always the underlying data type. This is true even if x is NULL (in that case the data type is BOOLEAN). In fact there is no function that is guaranteed to return the defined data type. For example, TYPEOF(CAST(1 AS SCALAR)); returns INTEGER, not SCALAR.

  7. For any operation that requires implicit casting from an item defined as SCALAR, the syntax is legal but the operation may fail at runtime. At runtime, Tarantool detects the underlying primitive data type and applies the rules for that. For example, if a definition is:

    CREATE TABLE t (s1 SCALAR PRIMARY KEY, s2 INTEGER);
    

    and within any row s1 = 'a', that is, its underlying primitive type is STRING to indicate character strings, then UPDATE t SET s2 = s1; is illegal. Tarantool usually does not know that in advance.

  8. For any dyadic operation that requires implicit casting for comparison, the syntax is legal and the operation will not fail at runtime. Take this situation: comparison with a primitive type VARBINARY and a primitive type STRING.

    CREATE TABLE t (s1 SCALAR PRIMARY KEY);
    INSERT INTO t VALUES (X'41');
    SELECT * FROM t WHERE s1 > 'a';
    

    The comparison is valid, because Tarantool knows the ordering of X’41’ and ‘a’ in Tarantool/NoSQL ‘scalar’.

  9. The result data type of min/max operation on a column defined as SCALAR is the data type of the minimum/maximum operand, unless the result value is NULL. For example:

    CREATE TABLE t (s1 INTEGER, s2 SCALAR PRIMARY KEY);
    INSERT INTO t VALUES (1, X'44'), (2, 11), (3, 1E4), (4, 'a');
    SELECT min(s2), hex(max(s2)) FROM t;
    

    The result is: - - [11, '44',]

    That is only possible with Tarantool/NoSQL scalar rules, but SELECT SUM(s2) would not be legal because addition would in this case require implicit casting from VARBINARY to a number, which is not sensible.

  10. The result data type of a primitive combination is never SCALAR because we in effect use TYPEOF(item) not the defined data type. (Here we use the word “combination” in the way that the standard document uses it for section “Result of data type combinations”.) Therefore for greatest(1E308, 'a', 0, X'00') the result is X’00’.

Column definition – relation to NoSQL

All the SQL data types correspond to Tarantool/NoSQL types with the same name. For example an SQL STRING is stored in a NoSQL space as type = ‘string’.

Therefore specifying an SQL data type X determines that the storage will be in a space with a format column saying that the NoSQL type is ‘x’.

The rules for that NoSQL type are applicable to the SQL data type.

If two items have SQL data types that have the same underlying type, then they are compatible for all assignment or comparison purposes.

If two items have SQL data types that have different underlying types, then the rules for explicit casts, or implicit (assignment) casts, or implicit (comparison) casts, apply.

There is one floating-point value which is not handled by SQL: -nan is seen as NULL.

There are also some Tarantool/NoSQL data types which have no corresponding SQL data types. For example, SELECT "flags" FROM "_space"; will return a column whose data type is ‘map’. Such columns can only be manipulated in SQL by invoking Lua functions.

Column definition – column-constraint or default clause
../../../_images/column_constraint.svg

The column-constraint or default clause may be as follows:

Type Comment
NOT NULL means “it is illegal to assign a NULL to this column”
PRIMARY KEY explained in the later section “Table Constraint Definition”
UNIQUE explained in the later section “Table Constraint Definition”
CHECK (expression) explained in the later section “Table Constraint Definition”
foreign-key-clause explained in the later section “Table Constraint Definition for foreign keys”
DEFAULT expression means “if INSERT does not assign to this column then assign expression result to this column” – if there is no DEFAULT clause then DEFAULT NULL is assumed.

If column-constraint is PRIMARY KEY, this is a shorthand for a separate table-constraint definition: “PRIMARY KEY (column-name)”.

If column-constraint is UNIQUE, this is a shorthand for a separate table-constraint definition: “UNIQUE (column-name)”.

If column-constraint is CHECK, this is a shorthand for a separate table-constraint definition: “CHECK (expression)”.

Columns defined with PRIMARY KEY are automatically NOT NULL.

To enforce some restrictions that Tarantool does not enforce automatically, add CHECK clauses, like these:

CREATE TABLE t ("smallint" INTEGER PRIMARY KEY CHECK ("smallint" <= 32767 AND "smallint" >= -32768));
CREATE TABLE t ("shorttext" STRING PRIMARY KEY CHECK (length("shorttext") <= 10));

but this may cause inserts or updates to be slow.

Column definition – examples

These are shown within CREATE TABLE statements. Data types may also appear in CAST functions.

-- the simple form with column-name and data-type
CREATE TABLE t (column1 INTEGER ...);
-- with column-name and data-type and column-constraint
CREATE TABLE t (column1 STRING PRIMARY KEY ...);
-- with column-name and data-type and collate-clause
CREATE TABLE t (column1 SCALAR COLLATE "unicode" ...);
-- with all possible data types and aliases
CREATE TABLE t
(column1 BOOLEAN, column2 BOOL,
 column3 INT PRIMARY KEY, column4 INTEGER,
 column5 DOUBLE,
 column6 NUMBER,
 column7 STRING, column8 STRING COLLATE "unicode",
 column9 TEXT, columna TEXT COLLATE "unicode_sv_s1",
 columnb VARCHAR(0), columnc VARCHAR(100000) COLLATE "binary",
 columnd VARBINARY,
 columne SCALAR, columnf SCALAR COLLATE "unicode_uk_s2");
-- with all possible column constraints and a default clause
CREATE TABLE t
(column1 INTEGER NOT NULL,
 column2 INTEGER PRIMARY KEY,
 column3 INTEGER UNIQUE,
 column4 INTEGER CHECK (column3 > column2),
 column5 INTEGER REFERENCES t,
 column6 INTEGER DEFAULT NULL);

Table Constraint Definition

Syntax:

[CONSTRAINT constraint-name] primary-key-constraint | unique-constraint | check-constraint | foreign-key-constraint


../../../_images/constraint.svg


Define a constraint, which is a table-element used in a CREATE TABLE statement.

The constraint-name must be an identifier which is valid according to the rules for identifiers. The constraint-name must be unique within the table.

PRIMARY KEY constraints look like this:
PRIMARY KEY (column-name [, column-name...])

There is a shorthand: specifying PRIMARY KEY in a column definition.

Every table must have one and only one primary key.
Primary-key columns are automatically NOT NULL.
Primary-key columns are automatically indexed.
Primary-key columns are unique, that is, it is illegal to have two rows which have the same values for the columns specified in the constraint.

Examples:

-- this is a table with a one-column primary-key constraint
CREATE TABLE t1 (s1 INTEGER, PRIMARY KEY (s1));
-- this is the column-definition shorthand for the same thing:
CREATE TABLE t1 (s1 INTEGER PRIMARY KEY);
-- this is a table with a two-column primary-key constraint
CREATE TABLE t2 (s1 INTEGER, s2 INTEGER, PRIMARY KEY (s1, s2));
-- this is an example of an attempted primary-key violation
-- (the third INSERT will fail because 55, 'a' is a duplicate)
CREATE TABLE t3 (s1 INTEGER, s2 STRING, PRIMARY KEY (s1, s2));
INSERT INTO t3 VALUES (55, 'a');
INSERT INTO t3 VALUES (55, 'b');
INSERT INTO t3 VALUES (55, 'a');

PRIMARY KEY plus AUTOINCREMENT modifier may be specified in one of two ways:
- In a column definition after the words PRIMARY KEY, as in CREATE TABLE t (c INTEGER PRIMARY KEY AUTOINCREMENT);
- In a PRIMARY KEY (column-list) after a column name, as in CREATE TABLE t (c INTEGER, PRIMARY KEY (c AUTOINCREMENT));
When AUTOINCREMENT is specified, the column must be a primary-key column and it must be INTEGER or UNSIGNED.
Only one column in the table may be autoincrement. However, it is legal to say PRIMARY KEY (a, b, c AUTOINCREMENT) – in that case, there are three columns in the primary key but only the first column (a) is AUTOINCREMENT.

As the name suggests, values in an autoincrement column are automatically incremented. That is: if a user inserts NULL in the column, then the stored value will be the smallest non-negative integer that has not already been used. This occurs because autoincrement columns are associated with sequences.

UNIQUE constraints look like this:
UNIQUE (column-name [, column-name...])

There is a shorthand: specifying UNIQUE in a column definition.

Unique constraints are similar to primary-key constraints, except that: a table may have any number of unique keys, and unique keys are not automatically NOT NULL.
Unique columns are automatically indexed.
Unique columns are unique, that is, it is illegal to have two rows with the same values in the unique-key columns.

Examples:

-- this is a table with a one-column primary-key constraint
-- and a one-column unique constraint
CREATE TABLE t1 (s1 INTEGER, s2 INTEGER, PRIMARY KEY (s1), UNIQUE (s2));
-- this is the column-definition shorthand for the same thing:
CREATE TABLE t1 (s1 INTEGER PRIMARY KEY, s2 INTEGER UNIQUE);
-- this is a table with a two-column unique constraint
CREATE TABLE t2 (s1 INTEGER PRIMARY KEY, s2 INTEGER, UNIQUE (s2, s1));
-- this is an example of an attempted unique-key violation
-- (the third INSERT will not fail because NULL is not a duplicate)
-- (the fourth INSERT will fail because 'a' is a duplicate)
CREATE TABLE t3 (s1 INTEGER PRIMARY KEY, s2 STRING, UNIQUE (s2));
INSERT INTO t3 VALUES (1, 'a');
INSERT INTO t3 VALUES (2, NULL);
INSERT INTO t3 VALUES (3, NULL);
INSERT INTO t3 VALUES (4, 'a');

CHECK constraints look like this:
CHECK (expression)

There is a shorthand: specifying CHECK in a column definition.

The expression may be anything that returns a BOOLEAN result = TRUE or FALSE or UNKNOWN.
The expression may not contain a subquery.
If the expression contains a column name, the column must exist in the table.
If a CHECK constraint is specified, the table must not contain rows where the expression is FALSE. (The table may contain rows where the expression is either TRUE or UNKNOWN.)
Constraint checking may be stopped with ALTER TABLE … DISABLE CHECK CONSTRAINT and restarted with ALTER TABLE … ENABLE CHECK CONSTRAINT.

Examples:

-- this is a table with a one-column primary-key constraint
-- and a check constraint
CREATE TABLE t1 (s1 INTEGER PRIMARY KEY, s2 INTEGER, CHECK (s2 <> s1));
-- this is an attempt to violate the constraint, it will fail
INSERT INTO t1 VALUES (1, 1);
-- this is okay because comparison with NULL will not return FALSE
INSERT INTO t1 VALUES (1, NULL);
-- a constraint that makes it difficult to insert lower case
CHECK (s1 = UPPER(s1))

Limitations: (Issue#3503):
* CREATE TABLE t99 (s1 INTEGER, UNIQUE(s1, s1),PRIMARY KEY(s1)); causes no error message, although (s1, s1) is probably a user error.

Table Constraint Definition for foreign keys

FOREIGN KEY constraints look like this:
FOREIGN KEY (referencing-column-name [, referencing-column-name...]) REFERENCES referenced-table-name [(referenced-column-name [, referenced-column-name...]]) [MATCH FULL] [update-or-delete-rules]

There is a shorthand: specifying REFERENCES in a column definition.

The referencing column names must be defined in the table that is being created. The referenced table name must refer to a table that already exists, or to the table that is being created. The referenced column names must be defined in the referenced table, and have similar data types. There must be a PRIMARY KEY or UNIQUE constraint or UNIQUE index on the referenced column names.

The words MATCH FULL are optional and have no effect.

If a foreign-key constraint exists, then the values in the referencing columns must equal values in the referenced columns of the referenced table, or at least one of the referencing columns must contain NULL.

Examples:

-- A foreign key referencing a primary key in the same table
CREATE TABLE t1 (s1 INTEGER PRIMARY KEY, s2 INTEGER, FOREIGN KEY (s2) REFERENCES t1 (s1));
-- The same thing with column shorthand
CREATE TABLE t1 (s1 INTEGER PRIMARY KEY, s2 INTEGER REFERENCES t1(s1));
-- An attempt to violate the constraint -- this will fail
INSERT INTO t1 VALUES (1, 2);
-- A NULL in the referencing column -- this will succeed
INSERT INTO t1 VALUES (1, NULL);
-- A reference to a primary key that now exists -- this will succeed
INSERT INTO t1 VALUES (2, 1);

The optional update-or-delete rules look like this:
ON {UPDATE|DELETE} { CASCADE | SET DEFAULT | SET NULL | RESTRICT | NO ACTION}
and the idea is: if something changes the referenced key, then one of these possible “referential actions” takes place:
CASCADE: the change that is applied for the referenced key is applied for the referencing key.
SET DEFAULT: the referencing key is set to its default value.
SET NULL: the referencing key is set to NULL.
RESTRICT: the UPDATE or DELETE fails if a referencing key exists; checked immediately.
NO ACTION: the UPDATE or DELETE fails if a referencing key exists; checked at statement end.
The default is NO ACTION.

For example:

CREATE TABLE f1 (ordinal INTEGER PRIMARY KEY,
             referenced_planet STRING UNIQUE NOT NULL);
CREATE TABLE f2 (
    ordinal INTEGER PRIMARY KEY,
    referring_planet STRING DEFAULT 'Earth',
    FOREIGN KEY (referring_planet) REFERENCES f1 (referenced_planet)
        ON UPDATE SET DEFAULT
        ON DELETE CASCADE);
INSERT INTO f1 VALUES (1, 'Mercury'), (2,' Venus'), (3, 'Earth');
INSERT INTO f2 VALUES (1, 'Mercury'), (2, 'Mercury');
UPDATE f1 SET referenced_planet = 'Mars'
    WHERE referenced_planet = 'Mercury';
SELECT * FROM f2;
DELETE FROM f1 WHERE referenced_planet = 'Earth';
SELECT * FROM f2;
... In this example, the UPDATE statement changes the referenced key,
    and the clause is ON UPDATE SET DEFAULT, therefore both of the
    rows in f2 have referring_planet set to their default value,
    which is 'Earth'. The DELETE statement deletes the row that
    has 'Earth', and the clause is ON DELETE CASCADE,
    therefore both of the rows in f2 are deleted.

Limitations:
* Foreign keys can have a MATCH clause (Issue#3455).

DROP TABLE

Syntax:

DROP TABLE [IF EXISTS] table-name;


../../../_images/drop_table.svg


Drop a table.

The table-name must identify a table that was created earlier with the CREATE TABLE statement.

Rules:

  • If there is a view that references the table, the drop will fail. Please drop the referencing view with DROP VIEW first.
  • If there is a foreign key that references the table, the drop will fail. Please drop the referencing constraint with ALTER TABLE … DROP first.

Actions:

  1. Tarantool returns an error if the table does not exist and there is no IF EXISTS clause.
  2. The table and all its data are dropped.
  3. All indexes for the table are dropped.
  4. All triggers for the table are dropped.
  5. Usually Tarantool effectively executes a COMMIT statement.

Examples:

-- the simple case:
DROP TABLE t31;
-- with an IF EXISTS clause:
DROP TABLE IF EXISTS t31;

See also: DROP VIEW.

CREATE VIEW

Syntax:

CREATE VIEW [IF NOT EXISTS] view-name [(column-list)] AS subquery;


../../../_images/create_view.svg


Create a new viewed table, usually called a “view”.

The view-name must be valid according to the rules for identifiers.

The optional column-list must be a comma-separated list of names of columns in the view.

The syntax of the subquery must be the same as the syntax of a SELECT statement, or of a VALUES clause.

Rules:

  • There must not already be a base table or view with the same name as view-name.
  • If column-list is specified, the number of columns in column-list must be the same as the number of columns in the select list of the subquery.

Actions:

  1. Tarantool will throw an error if a rule is violated.
  2. Tarantool will create a new persistent object with column-names equal to the names in the column-list or the names in the subquery’s select list.
  3. Usually Tarantool effectively executes a COMMIT statement.

Examples:

-- the simple case:
CREATE VIEW v AS SELECT column1, column2 FROM t;
-- with a column-list:
CREATE VIEW v (a,b) AS SELECT column1, column2 FROM t;

Limitations:

  • It is not possible to insert or update or delete from a view, although sometimes a possible substitution is to create an INSTEAD OF trigger.

DROP VIEW

Syntax:

DROP VIEW [IF EXISTS] view-name;


../../../_images/drop_view.svg


Drop a view.

The view-name must identify a view that was created earlier with the CREATE VIEW statement.

Rules: none

Actions:

  1. Tarantool returns an error if the view does not exist and there is no IF EXISTS clause.
  2. The view is dropped.
  3. All triggers for the view are dropped.
  4. Usually Tarantool effectively executes a COMMIT statement.

Examples:

-- the simple case:
DROP VIEW v31;
-- with an IF EXISTS clause:
DROP VIEW IF EXISTS v31;

See also: DROP TABLE.

CREATE INDEX

Syntax:

CREATE [UNIQUE] INDEX [IF NOT EXISTS] index-name ON table-name (column-list);


../../../_images/create_index.svg


Create an index.

The index-name must be valid according to the rules for identifiers.

The table-name must refer to an existing table.

The column-list must be a comma-separated list of names of columns in the table.

Rules:

  • There must not already be, for the same table, an index with the same name as index-name. But there may already be, for a different table, an index with the same name as index-name.
  • The maximum number of indexes per table is 128.

Actions:

  1. Tarantool will throw an error if a rule is violated.
  2. If the new index is UNIQUE, Tarantool will throw an error if any row exists with columns that have duplicate values.
  3. Tarantool will create a new index.
  4. Usually Tarantool effectively executes a COMMIT statement.

Automatic indexes:

Indexes may be created automatically for columns mentioned in the PRIMARY KEY or UNIQUE clauses of a CREATE TABLE statement. If an index was created automatically, then the index-name has four parts:

  1. pk if this is for a PRIMARY KEY clause, unique if this is for a UNIQUE clause;
  2. _unnamed_;
  3. the name of the table;
  4. _ and an ordinal number; the first index is 1, the second index is 2, and so on.

For example, after CREATE TABLE t (s1 INTEGER PRIMARY KEY, s2 INTEGER, UNIQUE (s2)); there are two indexes named pk_unnamed_T_1 and unique_unnamed_T_2. You can confirm this by saying SELECT * FROM "_index"; which will list all indexes on all tables. There is no need to say CREATE INDEX for columns that already have automatic indexes.

Examples:

-- the simple case
CREATE INDEX idx_column1_t_1 ON t (column1);
-- with IF NOT EXISTS clause
CREATE INDEX IF NOT EXISTS idx_column1_t_1 ON t (column1);
-- with UNIQUE specifier and more than one column
CREATE UNIQUE INDEX idx_unnamed_t_1 ON t (column1, column2);

Dropping an automatic index created for a unique constraint will drop the unique constraint as well.

DROP INDEX

Syntax:

DROP INDEX [IF EXISTS] index-name ON table-name;


../../../_images/drop_index.svg


The index-name must be the name of an existing index, which was created with CREATE INDEX. Or, the index-name must be the name of an index that was created automatically due to a PRIMARY KEY or UNIQUE clause in the CREATE TABLE statement. To see what a table’s indexes are, use PRAGMA index_list(table-name);.

Rules: none

Actions:

  1. Tarantool throws an error if the index does not exist, or is an automatically created index.
  2. Tarantool will drop the index.
  3. Usually Tarantool effectively executes a COMMIT statement.

Example:

-- the simplest form:
DROP INDEX idx_unnamed_t_1 ON t;

CREATE TRIGGER

Syntax:

CREATE TRIGGER [IF NOT EXISTS] trigger-name
BEFORE|AFTER|INSTEAD OF
DELETE|INSERT|UPDATE ON table-name
FOR EACH ROW
[WHEN search-condition]
BEGIN
delete-statement | insert-statement | replace-statement | select-statement | update-statement;
[delete-statement | insert-statement | replace-statement | select-statement | update-statement; ...]
END;


../../../_images/create_trigger.svg


The trigger-name must be valid according to the rules for identifiers.

If the trigger action time is BEFORE or AFTER, then the table-name must refer to an existing base table.

If the trigger action time is INSTEAD OF, then the table-name must refer to an existing view.

Rules:

  • There must not already be a trigger with the same name as trigger-name.
  • Triggers on different tables or views share the same namespace.
  • The statements between BEGIN and END should not refer to the table-name mentioned in the ON clause.
  • The statements between BEGIN and END should not contain an INDEXED BY clause.

SQL triggers are not activated by Tarantool/NoSQL requests. This will change in a future version.

On a replica, effects of trigger execution are applied, and the SQL triggers themselves are not activated upon replication events.

NoSQL triggers are activated both on replica and master, thus if you have a NoSQL trigger on a replica, it is activated when applying effects of an SQL trigger.

Actions:

  1. Tarantool will throw an error if a rule is violated.
  2. Tarantool will create a new trigger.
  3. Usually Tarantool effectively executes a COMMIT statement.

Examples:

-- the simple case:
CREATE TRIGGER stores_before_insert BEFORE INSERT ON stores FOR EACH ROW
  BEGIN DELETE FROM warehouses; END;
-- with IF NOT EXISTS clause:
CREATE TRIGGER IF NOT EXISTS stores_before_insert BEFORE INSERT ON stores FOR EACH ROW
  BEGIN DELETE FROM warehouses; END;
-- with FOR EACH ROW and WHEN clauses:
CREATE TRIGGER stores_before_insert BEFORE INSERT ON stores FOR EACH ROW WHEN a=5
  BEGIN DELETE FROM warehouses; END;
-- with multiple statements between BEGIN and END:
CREATE TRIGGER stores_before_insert BEFORE INSERT ON stores FOR EACH ROW
  BEGIN DELETE FROM warehouses; INSERT INTO inventories VALUES (1); END;

Trigger extra clauses

  • UPDATE OF column-list

    After BEFORE|AFTER UPDATE it is optional to add OF column-list. If any of the columns in column-list is affected at the time the row is processed, then the trigger will be activated for that row. For example:

    CREATE TRIGGER table1_before_update
     BEFORE UPDATE  OF column1, column2 ON table1
     FOR EACH ROW
     BEGIN UPDATE table2 SET column1 = column1 + 1; END;
    UPDATE table1 SET column3 = column3 + 1; -- Trigger will not be activated
    UPDATE table1 SET column2 = column2 + 0; -- Trigger will be activated
    
  • WHEN

    After table-name FOR EACH ROW it is optional to add [WHEN expression]. If the expression is true at the time the row is processed, only then will the trigger will be activated for that row. For example:

    CREATE TRIGGER table1_before_update BEFORE UPDATE ON table1 FOR EACH ROW
     WHEN (SELECT COUNT(*) FROM table1) > 1
     BEGIN UPDATE table2 SET column1 = column1 + 1; END;
    

    This trigger will not be activated unless there is more than one row in table1.

  • OLD and NEW

    The keywords OLD and NEW have special meaning in the context of trigger action:

    • OLD.column-name refers to the value of column-name before the change.
    • NEW.column-name refers to the value of column-name after the change.

    For example:

    CREATE TABLE table1 (column1 STRING, column2 INTEGER PRIMARY KEY);
    CREATE TABLE table2 (column1 STRING, column2 STRING, column3 INTEGER PRIMARY KEY);
    INSERT INTO table1 VALUES ('old value', 1);
    INSERT INTO table2 VALUES ('', '', 1);
    CREATE TRIGGER table1_before_update BEFORE UPDATE ON table1 FOR EACH ROW
     BEGIN UPDATE table2 SET column1 = old.column1, column2 = new.column1; END;
    UPDATE table1 SET column1 = 'new value';
    SELECT * FROM table2;
    

    At the beginning of the UPDATE for the single row of table1, the value in column1 is ‘old value’ – so that is what is seen as old.column1.

    At the end of the UPDATE for the single row of table1, the value in column1 is ‘new value’ – so that is what is seen as new.column1. (OLD and NEW are qualifiers for table1, not table2.)

    Therefore, SELECT * FROM table2; returns ['old value', 'new value'].

    OLD.column-name does not exist for an INSERT trigger.

    NEW.column-name does not exist for a DELETE trigger.

    OLD and NEW are read-only; you cannot change their values.

  • Deprecated or illegal statements:

    It is illegal for the trigger action to include a qualified column reference other than OLD.column-name or NEW.column-name. For example, CREATE TRIGGER ... BEGIN UPDATE table1 SET table1.column1 = 5; END; is illegal.

    It is illegal for the trigger action to include statements that include a WITH clause, a DEFAULT VALUES clause, or an INDEXED BY clause.

    It is usually not a good idea to have a trigger on table1 which causes a change on table2, and at the same time have a trigger on table2 which causes a change on table1. For example:

    CREATE TRIGGER table1_before_update
     BEFORE UPDATE ON table1
     FOR EACH ROW
     BEGIN UPDATE table2 SET column1 = column1 + 1; END;
    CREATE TRIGGER table2_before_update
     BEFORE UPDATE ON table2
     FOR EACH ROW
     BEGIN UPDATE table1 SET column1 = column1 + 1; END;
    

    Luckily UPDATE table1 ... will not cause an infinite loop, because Tarantool recognizes when it has already updated so it will stop. However, not every DBMS acts this way.

Trigger activation

These are remarks concerning trigger activation.

Standard terminology:

  • “trigger action time” = BEFORE or AFTER or INSTEAD OF
  • “trigger event” = INSERT or DELETE or UPDATE
  • “triggered statement” = BEGIN … DELETE|INSERT|REPLACE|SELECT|UPDATE … END
  • “triggered when clause” = WHEN search-condition
  • “activate” = execute a triggered statement
  • some vendors use the word “fire” instead of “activate”

If there is more than one trigger for the same trigger event, Tarantool may execute the triggers in any order.

It is possible for a triggered statement to cause activation of another triggered statement. For example, this is legal:

CREATE TRIGGER t1_before_delete BEFORE DELETE ON t1 FOR EACH ROW BEGIN DELETE FROM t2; END;
CREATE TRIGGER t2_before_delete BEFORE DELETE ON t2 FOR EACH ROW BEGIN DELETE FROM t3; END;

Activation occurs FOR EACH ROW, not FOR EACH STATEMENT. Therefore, if no rows are candidates for insert or update or delete, then no triggers are activated.

The BEFORE trigger is activated even if the trigger event fails.

If an UPDATE trigger event does not make a change, the trigger is activated anyway. For example, if row 1 column1 contains 'a', and the trigger event is UPDATE ... SET column1 = 'a';, the trigger is activated.

The triggered statement may refer to a function: RAISE(FAIL, error-message). If a triggered statement invokes a RAISE(FAIL, error-message) function, or if a triggered statement causes an error, then statement execution stops immediately.

The triggered statement may refer to column values within the rows being changed. in this case:

  • The row “as of before” the change is called the “old” row (which makes sense only for UPDATE and DELETE statements).
  • The row “as of after” the change is called the “new” row (which makes sense only for UPDATE and INSERT statements).

This example shows how an INSERT can be done to a view by referring to the “new” row:

CREATE TABLE t (s1 INTEGER PRIMARY KEY, s2 INTEGER);
CREATE VIEW v AS SELECT s1, s2 FROM t;
CREATE TRIGGER v_instead_of INSTEAD OF INSERT ON v
  FOR EACH ROW
  BEGIN INSERT INTO t VALUES (new.s1, new.s2); END;
INSERT INTO v VALUES (1, 2);

Ordinarily saying INSERT INTO view_name ... is illegal in Tarantool, so this is a workaround.

It is possible to generalize this so that all data-change statements on views will change the base tables, provided that the view contains all the columns of the base table, and provided that the triggers refer to those columns when necessary, as in this example:

CREATE TABLE base_table (primary_key_column INTEGER PRIMARY KEY, value_column INTEGER);
CREATE VIEW viewed_table AS SELECT primary_key_column, value_column FROM base_table;
CREATE TRIGGER viewed_table_instead_of_insert INSTEAD OF INSERT ON viewed_table FOR EACH ROW
  BEGIN
    INSERT INTO base_table VALUES (new.primary_key_column, new.value_column); END;
CREATE TRIGGER viewed_table_instead_of_update INSTEAD OF UPDATE ON viewed_table FOR EACH ROW
  BEGIN
    UPDATE base_table
    SET primary_key_column = new.primary_key_column, value_column = new.value_column
    WHERE primary_key_column = old.primary_key_column; END;
CREATE TRIGGER viewed_table_instead_of_delete INSTEAD OF DELETE ON viewed_table FOR EACH ROW
  BEGIN
    DELETE FROM base_table WHERE primary_key_column = old.primary_key_column; END;

When INSERT or UPDATE or DELETE occurs for table X, Tarantool usually operates in this order (a basic scheme):

For each row
  Perform constraint checks
  For each BEFORE trigger that refers to table X
    Check that the trigger's WHEN condition is true.
    Execute what is in the triggered statement.
  Insert or update or delete the row in table X.
  Perform more constraint checks
  For each AFTER trigger that refers to table X
    Check that the trigger's WHEN condition is true.
    Execute what is in the triggered statement.

However, Tarantool does not guarantee execution order when there are multiple constraints, or multiple triggers for the same event (including NoSQL on_replace triggers or SQL INSTEAD OF triggers that affect a view of table X).

The maximum number of trigger activations per statement is 32.

INSTEAD OF triggers

A trigger which is created with the clause
INSTEAD OF INSERT|UPDATE|DELETE ON view-name
is an INSTEAD OF trigger. For each affected row, the trigger action is performed “instead of” the INSERT or UPDATE or DELETE statement that causes trigger activation.

For example, ordinarily it is illegal to INSERT rows in a view, but it is legal to create a trigger which intercepts attempts to INSERT, and puts rows in the underlying base table:

CREATE TABLE t1 (column1 INTEGER PRIMARY KEY, column2 INTEGER);
CREATE VIEW v1 AS SELECT column1, column2 FROM t1;
CREATE TRIGGER v1_instead_of INSTEAD OF INSERT ON v1 FOR EACH ROW BEGIN
 INSERT INTO t1 VALUES (NEW.column1, NEW.column2); END;
INSERT INTO v1 VALUES (1, 1);
-- ... The result will be: table t1 will contain a new row.

INSTEAD OF triggers are only legal for views, while BEFORE or AFTER triggers are only legal for base tables.

It is legal to create INSTEAD OF triggers with triggered WHEN clauses.

Limitations:

  • It is legal to create INSTEAD OF triggers with UPDATE OF column-list clauses, but they are not standard SQL.

Example:

CREATE TRIGGER ev1_instead_of_update
  INSTEAD OF UPDATE OF column2,column1 ON ev1
  FOR EACH ROW BEGIN
  INSERT INTO et2 VALUES (NEW.column1, NEW.column2); END;

DROP TRIGGER

Syntax:

DROP TRIGGER [IF EXISTS] trigger-name;


../../../_images/drop_trigger.svg


Drop a trigger.

The trigger-name must identify a trigger that was created earlier with the CREATE TRIGGER statement.

Rules: none

Actions:

  1. Tarantool returns an error if the trigger does not exist and there is no IF EXISTS clause.
  2. The trigger is dropped.
  3. Usually Tarantool effectively executes a COMMIT statement.

Examples:

-- the simple case:
DROP TRIGGER table1_before_insert;
-- with an IF EXISTS clause:
DROP TRIGGER IF EXISTS table1_before_insert;

Statements that change data

INSERT

Syntax:

  • INSERT INTO table-name [(column-list)] VALUES (expression-list) [, (expression-list)];
  • INSERT INTO table-name [(column-list)]  select-statement;
  • INSERT INTO table-name DEFAULT VALUES;


../../../_images/insert.svg


Insert one or more new rows into a table.

The table-name must be a name of a table defined earlier with CREATE TABLE.

The optional column-list must be a comma-separated list of names of columns in the table.

The expression-list must be a comma-separated list of expressions; each expression may contain literals and operators and subqueries and function invocations.

Rules:

  • The values in the expression-list are evaluated from left to right.
  • The order of the values in the expression-list must correspond to the order of the columns in the table, or (if a column-list is specified) to the order of the columns in the column-list.
  • The data type of the value should correspond to the data type of the column, that is, the data type that was specified with CREATE TABLE.
  • If a column-list is not specified, then the number of expressions must be the same as the number of columns in the table.
  • If a column-list is specified, then some columns may be omitted; omitted columns will get default values.
  • The parenthesized expression-list may be repeated – (expression-list),(expression-list),... – for multiple rows.

Actions:

  1. Tarantool evaluates each expression in expression-list, and returns an error if any of the rules is violated.
  2. Tarantool creates zero or more new rows containing values based on the values in the VALUES list or based on the results of the select-expression or based on the default values.
  3. Tarantool executes constraint checks and trigger actions and the actual insertion.

Examples:

-- the simplest form:
INSERT INTO table1 VALUES (1, 'A');
-- with a column list:
INSERT INTO table1 (column1, column2) VALUES (2, 'B');
-- with an arithmetic operator in the first expression:
INSERT INTO table1 VALUES (2 + 1, 'C');
-- put two rows in the table:
INSERT INTO table1 VALUES (4, 'D'), (5, 'E');

See also: REPLACE statement.

UPDATE

Syntax:

UPDATE table-name SET column-name = expression [, column-name = expression ...] [WHERE search-condition];


../../../_images/update.svg


Update zero or more existing rows in a table.

The table-name must be a name of a table defined earlier with CREATE TABLE or CREATE VIEW.

The column-name must be an updatable column in the table.

The expression may contain literals and operators and subqueries and function invocations and column names.

Rules:

  • The values in the SET clause are evaluated from left to right.
  • The data type of the value should correspond to the data type of the column, that is, the data type that was specified with CREATE TABLE.
  • If a search-condition is not specified, then all rows in the table will be updated; otherwise only those rows which match the search-condition will be updated.

Actions:

  1. Tarantool evaluates each expression in the SET clause, and returns an error if any of the rules is violated. For each row that is found by the WHERE clause, a temporary new row is formed based on the original contents and the modifications caused by the SET clause.
  2. Tarantool executes constraint checks and trigger actions and the actual update.

Examples:

-- the simplest form:
UPDATE t SET column1 = 1;
-- with more than one assignment in the SET clause:
UPDATE t SET column1 = 1, column2 = 2;
-- with a WHERE clause:
UPDATE t SET column1 = 5 WHERE column2 = 6;

Special cases:

It is legal to say SET (list of columns) = (list of values). For example:

UPDATE t SET (column1, column2, column3) = (1, 2, 3);

It is not legal to assign to a column more than once. For example:

INSERT INTO t (column1) VALUES (0);
UPDATE t SET column1 = column1 + 1, column1 = column1 + 1;

The result is an error: “duplicate column name”.

It is not legal to assign to a primary-key column.

DELETE

Syntax:

DELETE FROM table-name [WHERE search-condition];


../../../_images/delete.svg


Delete zero or more existing rows in a table.

The table-name must be a name of a table defined earlier with CREATE TABLE or CREATE VIEW.

The search-condition may contain literals and operators and subqueries and function invocations and column names.

Rules:

  • If a search-condition is not specified, then all rows in the table will be deleted; otherwise only those rows which match the search-condition will be deleted.

Actions:

  1. Tarantool evaluates each expression in the search-condition, and returns an error if any of the rules is violated.
  2. Tarantool finds the set of rows that are to be deleted.
  3. Tarantool executes constraint checks and trigger actions and the actual deletion.

Examples:

-- the simplest form:
DELETE FROM t;
-- with a WHERE clause:
DELETE FROM t WHERE column2 = 6;

REPLACE

Syntax:

  • REPLACE INTO table-name [(column-list)] VALUES (expression-list) [, (expression-list)];
  • REPLACE INTO table-name [(column-list)] select-statement;
  • REPLACE INTO table-name DEFAULT VALUES;


../../../_images/replace.svg


Insert one or more new rows into a table, or update existing rows.

If a row already exists (as determined by the primary key or any unique key), then the action is delete + insert, and the rules are the same as for a DELETE statement followed by an INSERT statement. Otherwise the action is insert, and the rules are the same as for the INSERT statement.

Examples:

-- the simplest form:
REPLACE INTO table1 VALUES (1, 'A');
-- with a column list:
REPLACE INTO table1 (column1, column2) VALUES (2, 'B');
-- with an arithmetic operator in the first expression:
REPLACE INTO table1 VALUES (2 + 1, 'C');
-- put two rows in the table:
REPLACE INTO table1 VALUES (4, 'D'), (5, 'E');

See also: INSERT Statement, UPDATE Statement.

TRUNCATE

Syntax:

TRUNCATE TABLE table-name;


../../../_images/truncate.svg


Remove all rows in the table.

TRUNCATE is considered to be a schema-change rather than a data-change statement, so it does not work within transactions (it cannot be rolled back).

Rules:

  • It is illegal to truncate a table which is referenced by a foreign key.
  • It is illegal to truncate a table which is also a system space, such as _space.
  • The table must be a base table rather than a view.

Actions:

  1. All rows in the table are removed. Usually this is faster than DELETE FROM table-name;.
  2. If the table has an autoincrement primary key, its sequence is not reset to zero, but that may occur in a future Tarantool version.
  3. There is no effect for any triggers associated with the table.
  4. There is no effect on the counts for the ROW_COUNT() function.
  5. Only one action is written to the write-ahead log (with DELETE FROM table-name; there would be one action for each deleted row).

Example:

TRUNCATE TABLE t;

SET

Syntax:

  • SET SESSION setting-name = setting-value;
../../../_images/set.svg

SET SESSION is a shorthand way to update the box.space._session_settings temporary system space.

setting-name can be any one of "error_marshaling_enabled". "sql_default_engine". "sql_defer_foreign_keys", "sql_full_column_names", "sql_full_metadata", "sql_parser_debug", "sql_recursive_triggers", "sql_reverse_unordered_selects", "sql_select_debug", "sql_vdbe_debug". The quote marks are necessary.

If setting-name is "sql_default_engine", then setting-value can be either ‘vinyl’ or ‘memtx’. Otherwise, setting-value can be either TRUE or FALSE.

Example: SET SESSION "sql_default_engine" = 'vinyl'; changes the default engine to ‘vinyl’ instead of ‘memtx’, and returns:

---
- row_count: 1
...

It is functionally the same thing as an UPDATE Statement:

UPDATE "_session_settings"
SET "value" = 'vinyl'
WHERE "name" = 'sql_default_engine';

Statements that retrieve data

SELECT

Syntax:

SELECT [ALL|DISTINCT] select list [from clause] [where clause] [group-by clause] [having clause] [order-by clause];


../../../_images/select.svg


Select zero or more rows.

The clauses of the SELECT statement are discussed in the following five sections.

Select list

Syntax:

select-list-column [, select-list-column ...]

select-list-column:

../../../_images/select_list.svg


Define what will be in a result set; this is a clause in a SELECT statement.

The select list is a comma-delimited list of expressions, or * (asterisk). An expression can have an alias provided with an [[AS] column-name] clause.

The * “asterisk” shorthand is valid if and only if the SELECT statement also contains a FROM clause which specifies the table or tables (details about the FROM clause are in the next section). The simple form is * which means “all columns” – for example, if the select is done for a table which contains three columns s1 s2 s3, then SELECT * ... is equivalent to SELECT s1, s2, s3 .... The qualified form is table-name.* which means “all columns in the specified table”, which again must be a result of the FROM clause – for example, if the table is named table1, then table1.* is equivalent to a list of the columns of table1.

The [[AS] column-name] clause determines the column name. The column name is useful for two reasons:

  • in a tabular display, the column names are the headings
  • if the results of the SELECT are used when creating a new table (such as a view), then the column names in the new table will be the column names in the select list.

If [[AS] column-name] is missing, and the expression is not simply the name of a column in the table, then Tarantool makes a name COLUMN_n where n is the number of the non-simple expression within the select list, for example SELECT 5.88, table1.x, 'b' COLLATE "unicode_ci" FROM table1; will cause the column names to be COLUMN_1, X, COLUMN_2. This is a behavior change in Tarantool 2.6, in earlier versions the name would be equal to the expression, see Issue#3962. It is still legal to define tables with column names like COLUMN_1 but not recommended.

Examples:

-- the simple form:
SELECT 5;
-- with multiple expressions including operators:
SELECT 1, 2 * 2, 'Three' || 'Four';
-- with [[AS] column-name] clause:
SELECT 5 AS column1;
-- * which must be eventually followed by a FROM clause:
SELECT * FROM table1;
-- as a list:
SELECT 1 AS a, 2 AS b, table1.* FROM table1;

FROM clause

Syntax:

FROM table-reference [, table-reference ...]


../../../_images/from.svg


Specify the table or tables for the source of a SELECT statement.

The table-reference must be a name of an existing table, or a subquery, or a joined table.

A joined table looks like this:

table-reference-or-joined-table join-operator table-reference-or-joined-table [join-specification]

A join-operator must be any of the standard types:

  • [NATURAL] LEFT [OUTER] JOIN,
  • [NATURAL] INNER JOIN, or
  • CROSS JOIN

A join-specification must be any of:

  • ON expression, or
  • USING (column-name [, column-name …])

Parentheses are allowed, and [[AS] correlation-name] is allowed.

The maximum number of joins in a FROM clause is 64.

Examples:

-- the simplest form:
SELECT * FROM t;
-- with two tables, making a Cartesian join:
SELECT * FROM t1, t2;
-- with one table joined to itself, requiring correlation names:
SELECT a.*, b.* FROM t1 AS a, t1 AS b;
-- with a left outer join:
SELECT * FROM t1 LEFT JOIN t2;

WHERE clause

Syntax:

WHERE condition;


../../../_images/where.svg


Specify the condition for filtering rows from a table; this is a clause in a SELECT or UPDATE or DELETE statement.

The condition may contain any expression that returns a BOOLEAN (TRUE or FALSE or UNKNOWN) value.

For each row in the table:

  • if the condition is true, then the row is kept;
  • if the condition is false or unknown, then the row is ignored.

In effect, WHERE condition takes a table with n rows and returns a table with n or fewer rows.

Examples:

-- with a simple condition:
SELECT 1 FROM t WHERE column1 = 5;
-- with a condition that contains AND and OR and parentheses:
SELECT 1 FROM t WHERE column1 = 5 AND (x > 1 OR y < 1);

GROUP BY clause

Syntax:

GROUP BY expression [, expression ...]


../../../_images/group_by.svg


Make a grouped table; this is a clause in a SELECT statement.

The expressions should be column names in the table, and each column should be specified only once.

In effect, the GROUP BY clause takes a table with rows that may have matching values, combines rows that have matching values into single rows, and returns a table which, because it is the result of GROUP BY, is called a grouped table.

Thus, if the input is a table:

a    b      c
-    -      -
1    'a'   'b
1    'b'   'b'
2    'a'   'b'
3    'a'   'b'
1    'b'   'b'

then GROUP BY a, b will produce a grouped table:

a    b      c
-    -      -
1    'a'   'b'
1    'b'   'b'
2    'a'   'b'
3    'a'   'b'

The rows where column a and column b have the same value have been merged; column c has been preserved but its value should not be depended on – if the rows were not all ‘b’, Tarantool could pick any value.

It is useful to envisage a grouped table as having hidden extra columns for the aggregation of the values, for example:

a    b      c    COUNT(a) SUM(a) MIN(c)
-    -      -    -------- ------ ------
1    'a'    'b'         2      2    'b'
1    'b'    'b'         1      1    'b'
2    'a'    'b'         1      2    'b'
     'a'    'b'         1      3    'b'

These extra columns are what aggregate functions are for.

Examples:

-- with a single column:
SELECT 1 FROM t GROUP BY column1;
-- with two columns:
SELECT 1 FROM t GROUP BY column1, column2;

Limitations:

  • SELECT s1, s2 FROM t GROUP BY s1; is legal.
  • SELECT s1 AS q FROM t GROUP BY q; is legal.
  • SELECT s1 FROM t GROUP by 1; is legal.

Aggregate functions

Syntax:

function-name (one or more expressions)

Apply a built-in aggregate function to one or more expressions and return a scalar value.

Aggregate functions are only legal in certain clauses of a SELECT statement for grouped tables. (A table is a grouped table if a GROUP BY clause is present.) Also, if an aggregate function is used in a select list and the GROUP BY clause is omitted, then Tarantool assumes SELECT ... GROUP BY [all columns];.

NULLs are ignored for all aggregate functions except COUNT(*).

AVG([DISTINCT] expression)

Return the average value of expression.

Example: AVG(column1)

COUNT([DISTINCT] expression)

Return the number of occurrences of expression.

Example: COUNT(column1)

COUNT(*)

Return the number of occurrences of a row.

Example: COUNT(*)

GROUP_CONCAT(expression-1 [, expression-2]) or GROUP_CONCAT(DISTINCT expression-1)

Return a list of expression-1 values, separated by commas if expression-2 is omitted, or separated by the expression-2 value if expression-2 is not omitted.

Example: GROUP_CONCAT(column1)

MAX([DISTINCT] expression)

Return the maximum value of expression.

Example: MAX(column1)

MIN([DISTINCT] expression)

Return the minimum value of expression.

Example: MIN(column1)

SUM([DISTINCT] expression)

Return the sum of values of expression.

Example: SUM(column1)

TOTAL([DISTINCT] expression)

Return the sum of values of expression.

Example: TOTAL(column1)

HAVING clause

Syntax:

HAVING condition;


../../../_images/having.svg


Specify the condition for filtering rows from a grouped table; this is a clause in a SELECT statement.

The clause preceding the HAVING clause may be a GROUP BY clause. HAVING operates on the table that the GROUP BY produces, which may contain grouped columns and aggregates.

If the preceding clause is not a GROUP BY clause, then there is only one group and the HAVING clause may only contain aggregate functions or literals.

For each row in the table:

  • if the condition is true, then the row is kept;
  • if the condition is false or unknown, then the row is ignored.

In effect, HAVING condition takes a table with n rows and returns a table with n or fewer rows.

Examples:

-- with a simple condition:
SELECT 1 FROM t GROUP BY column1 HAVING column2 > 5;
-- with a more complicated condition:
SELECT 1 FROM t GROUP BY column1 HAVING column2 > 5 OR column2 < 5;
-- with an aggregate:
SELECT x, SUM(y) FROM t GROUP BY x HAVING SUM(y) > 0;
-- with no GROUP BY and an aggregate:
SELECT SUM(y) FROM t GROUP BY x HAVING MIN(y) < MAX(y);

Limitations:

  • HAVING without GROUP BY is not supported for multiple tables.

ORDER BY clause

Syntax:

ORDER BY expression [ASC|DESC] [, expression [ASC|DESC] ...]


../../../_images/order_by.svg


Put rows in order; this is a clause in a SELECT statement.

An ORDER BY expression has one of three types which are checked in order:

  1. Expression is a positive integer, representing the ordinal position of the column in the select list. For example, in the statement
    SELECT x, y, z FROM t ORDER BY 2;
    ORDER BY 2 means “order by the second column in the select list”, which is y.
  2. Expression is a name of a column in the select list, which is determined by an AS clause. For example, in the statement
    SELECT x, y AS x, z FROM t ORDER BY x;
    ORDER BY x means “order by the column explicitly named x in the select list”, which is the second column.
  3. Expression contains a name of a column in a table of the FROM clause. For example, in the statement
    SELECT x, y FROM t1 JOIN t2 ORDER BY z;
    ORDER BY z means “order by a column named z which is expected to be in table t1 or table t2”.

If both tables contain a column named z, then Tarantool will choose the first column that it finds.

The expression may also contain operators and function names and literals. For example, in the statement
SELECT x, y FROM t ORDER BY UPPER(z);
ORDER BY UPPER(z) means “order by the uppercase form of column t.z”, which may be similar to doing ordering in a case-insensitive manner.

Type 3 is illegal if the SELECT statement contains UNION or EXCEPT or INTERSECT.

If an ORDER BY clause contains multiple expressions, then expressions on the left are processed first and expressions on the right are processed only if necessary for tie-breaking. For example, in the statement
SELECT x, y FROM t ORDER BY x, y; if there are two rows which both have the same values for column x, then an additional check is made to see which row has a greater value for column y.

In effect, ORDER BY clause takes a table with rows that may be out of order, and returns a table with rows in order.

Sorting order:

  • The default order is ASC (ascending), the optional order is DESC (descending).
  • NULLs come first, then BOOLEANs, then numbers, then STRINGs, then VARBINARYs.
  • Within STRINGs, ordering is according to collation.
  • Collation may be specified with a COLLATE clause within the ORDER BY column-list, or may be default.

Examples:

-- with a single column:
SELECT 1 FROM t ORDER BY column1;
-- with two columns:
SELECT 1 FROM t ORDER BY column1, column2;
-- with a variety of data:
CREATE TABLE h (s1 NUMBER PRIMARY KEY, s2 SCALAR);
INSERT INTO h VALUES (7, 'A'), (4, 'a'), (-4, 'AZ'), (17, 17), (23, NULL);
INSERT INTO h VALUES (17.5, 'Д'), (1e+300, 'A'), (0, ''), (-1, '');
SELECT * FROM h ORDER BY s2 COLLATE "unicode_ci", s1;
-- The result of the above SELECT will be:
- - [23, null]
  - [17, 17]
  - [-1, '']
  - [0, '']
  - [4, 'a']
  - [7, 'A']
  - [1e+300, 'A']
  - [-4, 'AZ']
  - [17.5, 'Д']
...

Limitations:

  • ORDER BY 1 is legal. This is common but is not standard SQL nowadays.

LIMIT clause

Syntax:

  • LIMIT limit-expression [OFFSET offset-expression]
  • LIMIT offset-expression, limit-expression

Note

The above is not a typo: offset-expression and limit-expression are in reverse order if a comma is used.


../../../_images/limit.svg


Specify a maximum number of rows and a start row; this is a clause in a SELECT statement.

Expressions may contain integers and arithmetic operators or functions, for example ABS(-3 / 1). However, the result must be an integer value greater than or equal to zero.

Usually the LIMIT clause follows an ORDER BY clause, because otherwise Tarantool does not guarantee that rows are in order.

Examples:

-- simple case:
SELECT * FROM t LIMIT 3;
-- both limit and order:
SELECT * FROM t LIMIT 3 OFFSET 1;
-- applied to a UNIONed result (LIMIT clause must be the final clause):
SELECT column1 FROM table1 UNION SELECT column1 FROM table2 ORDER BY 1 LIMIT 1;

Limitations:

  • If ORDER BY … LIMIT is used, then all order-by columns must be ASC or all must be DESC.

Subquery

Syntax:

A subquery has the same syntax as a SELECT statement or VALUES statement embedded inside a main statement.

Note

The SELECT and VALUES statements are called “queries” because they return answers, in the form of result sets.

Subqueries may be the second part of INSERT statements. For example:

INSERT INTO t2 SELECT a, b, c FROM t1;

Subqueries may be in the FROM clause of SELECT statements.

Subqueries may be expressions, or be inside expressions. In this case they must be parenthesized, and usually the number of rows must be 1. For example:

SELECT 1, (SELECT 5), 3 FROM t WHERE c1 * (SELECT COUNT(*) FROM t2) > 5;

Subqueries may be expressions on the right side of certain comparison operators, and in this unusual case the number of rows may be greater than 1. The comparison operators are: [NOT] EXISTS and [NOT] IN. For example:

DELETE FROM t WHERE s1 NOT IN (SELECT s2 FROM t);

Subqueries may refer to values in the outer query. In this case, the subquery is called a “correlated subquery”.

Subqueries may refer to rows which are being updated or deleted by the main query. In that case, the subquery finds the matching rows first, before starting to update or delete. For example, after:

CREATE TABLE t (s1 INTEGER PRIMARY KEY, s2 INTEGER);
INSERT INTO t VALUES (1, 3), (2, 1);
DELETE FROM t WHERE s2 NOT IN (SELECT s1 FROM t);

only one of the rows is deleted, not both rows.

WITH clause

WITH clause (common table expression)

Syntax:

WITH temporary-table-name AS (subquery)
[, temporary-table-name AS (subquery)]
SELECT statement | INSERT statement | DELETE statement | UPDATE statement | REPLACE statement;


../../../_images/with.svg


WITH v AS (SELECT * FROM t) SELECT * FROM v;

is equivalent to creating a view and selecting from it:

CREATE VIEW v AS SELECT * FROM t;
SELECT * FROM v;

The difference is that a WITH-clause “view” is temporary and only useful within the same statement. No CREATE privilege is required.

The WITH-clause can also be thought of as a subquery that has a name. This is useful when the same subquery is being repeated. For example:

SELECT * FROM t WHERE a < (SELECT s1 FROM x) AND b < (SELECT s1 FROM x);

can be replaced with:

WITH s AS (SELECT s1 FROM x) SELECT * FROM t,s WHERE a < s.s1 AND b < s.s1;

This “factoring out” of a repeated expression is regarded as good practice.

Examples:

WITH cte AS (VALUES (7, '') INSERT INTO j SELECT * FROM cte;
WITH cte AS (SELECT s1 AS x FROM k) SELECT * FROM cte;
WITH cte AS (SELECT COUNT(*) FROM k WHERE s2 < 'x' GROUP BY s3)
  UPDATE j SET s2 = 5
  WHERE s1 = (SELECT s1 FROM cte) OR s3 = (SELECT s1 FROM cte);

WITH can only be used at the beginning of a statement, therefore it cannot be used at the beginning of a subquery or after a set operator or inside a CREATE statement.

A WITH-clause “view” is read-only because Tarantool does not support updatable views.

WITH RECURSIVE

WITH RECURSIVE clause (iterative common table expression)

The real power of WITH lies in the WITH RECURSIVE clause, which is useful when it is combined with UNION or UNION ALL:

WITH RECURSIVE recursive-table-name AS
(SELECT ... FROM non-recursive-table-name ...
UNION [ALL]
SELECT ... FROM recursive-table-name ...)
statement-that-uses-recursive-table-name;


../../../_images/with_recursive.svg


In non-SQL this can be read as: starting with a seed value from a non-recursive table, produce a recursive viewed table, UNION that with itself, UNION that with itself, UNION that with itself … forever, or until a condition in the WHERE clause says “stop”.

For example:

CREATE TABLE ts (s1 INTEGER PRIMARY KEY);
INSERT INTO ts VALUES (1);
WITH RECURSIVE w AS (
  SELECT s1 FROM ts
  UNION ALL
  SELECT s1 + 1 FROM w WHERE s1 < 4)
SELECT * FROM w;

First, table w is seeded from t1, so it has one row: [1].

Then, UNION ALL (SELECT s1 + 1 FROM w) takes the row from w – which contains [1] – adds 1 because the select list says “s1+1”, and so it has one row: [2].

Then, UNION ALL (SELECT s1 + 1 FROM w) takes the row from w – which contains [2] – adds 1 because the select list says “s1+1”, and so it has one row: [3].

Then, UNION ALL (SELECT s1 + 1 FROM w) takes the row from w – which contains [3] – adds 1 because the select list says “s1+1”, and so it has one row: [4].

Then, UNION ALL (SELECT s1 + 1 FROM w) takes the row from w – which contains [4] – and now the importance of the WHERE clause becomes evident, because “s1 < 4” is false for this row, and therefore we have reached the “stop” condition.

So, before the “stop”, table w got 4 rows – [1], [2], [3], [4] – and the result of the statement looks like:

tarantool> WITH RECURSIVE w AS (
         >   SELECT s1 FROM ts
         >   UNION ALL
         >   SELECT s1 + 1 FROM w WHERE s1 < 4)
         > SELECT * FROM w;
---
- - [1]
  - [2]
  - [3]
  - [4]
...

In other words, this WITH RECURSIVE ... SELECT produces a table of auto-incrementing values.

UNION, EXCEPT, and INTERSECT clauses

Syntax:

  • select-statement UNION [ALL] select-statement [ORDER BY clause] [LIMIT clause];
  • select-statement EXCEPT select-statement [ORDER BY clause] [LIMIT clause];
  • select-statement INTERSECT select-statement [ORDER BY clause] [LIMIT clause];


../../../_images/union.svg


../../../_images/except.svg


../../../_images/intersect.svg


UNION, EXCEPT, and INTERSECT are collectively called “set operators” or “table operators”. In particular:

  • a UNION b means “take rows which occur in a OR b”.
  • a EXCEPT b means “take rows which occur in a AND NOT b”.
  • a INTERSECT b means “take rows which occur in a AND b”.

Duplicate rows are eliminated unless ALL is specified.

The select-statements may be chained: SELECT ... SELECT ... SELECT ...;

Each select-statement must result in the same number of columns.

The select-statements may be replaced with VALUES statements.

The maximum number of set operations is 50.

Example:

CREATE TABLE t1 (s1 INTEGER PRIMARY KEY, s2 STRING);
CREATE TABLE t2 (s1 INTEGER PRIMARY KEY, s2 STRING);
INSERT INTO t1 VALUES (1, 'A'), (2, 'B'), (3, NULL);
INSERT INTO t2 VALUES (1, 'A'), (2, 'C'), (3,NULL);
SELECT s2 FROM t1 UNION SELECT s2 FROM t2;
SELECT s2 FROM t1 UNION ALL SELECT s2 FROM t2 ORDER BY s2;
SELECT s2 FROM t1 EXCEPT SELECT s2 FROM t2;
SELECT s2 FROM t1 INTERSECT SELECT s2 FROM t2;

In this example:

  • The UNION query returns 4 rows: NULL, ‘A’, ‘B’, ‘C’.
  • The UNION ALL query returns 6 rows: NULL, NULL, ‘A’, ‘A’, ‘B’, ‘C’.
  • The EXCEPT query returns 1 row: ‘B’.
  • The INTERSECT query returns 2 rows: NULL, ‘A’.

Limitations:

  • Parentheses are not allowed.
  • Evaluation is left to right, INTERSECT does not have precedence.

Example:

CREATE TABLE t01 (s1 INTEGER PRIMARY KEY, s2 STRING);
CREATE TABLE t02 (s1 INTEGER PRIMARY KEY, s2 STRING);
CREATE TABLE t03 (s1 INTEGER PRIMARY KEY, s2 STRING);
INSERT INTO t01 VALUES (1, 'A');
INSERT INTO t02 VALUES (1, 'B');
INSERT INTO t03 VALUES (1, 'A');
SELECT s2 FROM t01 INTERSECT SELECT s2 FROM t03 UNION SELECT s2 FROM t02;
SELECT s2 FROM t03 UNION SELECT s2 FROM t02 INTERSECT SELECT s2 FROM t03;
-- ... results are different.

INDEXED BY clause

Syntax:

INDEXED BY index-name


../../../_images/indexed_by.svg


The INDEXED BY clause may be used in a SELECT, DELETE, or UPDATE statement, immediately after the table-name. For example:

DELETE FROM table7 INDEXED BY index7 WHERE column1 = 'a';

In this case the search for ‘a’ will take place within index7. For example:

SELECT * FROM table7 NOT INDEXED WHERE column1 = 'a';

In this case the search for ‘a’ will be done via a search of the whole table, what is sometimes called a “full table scan”, even if there is an index for column1.

Ordinarily Tarantool chooses the appropriate index or lookup method depending on a complex set of “optimizer” rules; the INDEXED BY clause overrides the optimizer choice.

Example:

Suppose a table has two columns:

  • The first column is the primary key and therefore it has an automatic index named pk_unnamed_T_1.
  • The second column has an index created by the user.

The user selects with INDEXED BY the-index-on-column1, then selects with INDEXED BY the-index-on-column-2.

CREATE TABLE t (column1 INTEGER PRIMARY KEY, column2 INTEGER);
CREATE INDEX idx_column2_t_1 ON t (column2);
INSERT INTO t VALUES (1, 2), (2, 1);
SELECT * FROM t INDEXED BY "pk_unnamed_T_1";
SELECT * FROM t INDEXED BY idx_column2_t_1;
-- Result for the first select: (1, 2), (2, 1)
-- Result for the second select: (2, 1), (1, 2).

Limitations:
Often INDEXED BY has no effect.
Often INDEXED BY affects a choice of covering index, but not a WHERE clause.

VALUES

Syntax:

VALUES (expression [, expression ...]) [, (expression [, expression ...])


../../../_images/values.svg


Select one or more rows.

VALUES has the same effect as SELECT, that is, it returns a result set, but VALUES statements may not have FROM or GROUP or ORDER BY or LIMIT clauses.

VALUES may be used wherever SELECT may be used, for example in subqueries.

Examples:

-- simple case:
VALUES (1);
-- equivalent to SELECT 1, 2, 3:
VALUES (1, 2, 3);
-- two rows:
VALUES (1, 2, 3), (4, 5, 6);

PRAGMA

Syntax:

  • PRAGMA pragma-name (pragma-value);
  • or PRAGMA pragma-name;
../../../_images/pragma.svg

PRAGMA statements will give rudimentary information about database ‘metadata’ or server performance, although it is better to get metadata via system tables.

For PRAGMA statements that include (pragma-value), pragma values are strings and can be specified inside "" double quotes, or without quotes. When a string is used for searching, results must match according to a binary collation. If the object being searched has a lower-case name, use double quotes.

In an earlier version, there were some PRAGMA statements that determined behavior. Now that does not happen. Behavior change is done by updating the box.space._session_settings system table.

Pragma Parameter Effect
foreign_key_list string
table-name
Return a result set with one row for each foreign key of “table-name”. Each row contains:
(INTEGER) id – identification number
(INTEGER) seq – sequential number
(STRING) table – name of table
(STRING) from – referencing key
(STRING) to – referenced key
(STRING) on_update – ON UPDATE clause
(STRING) on_delete – ON DELETE clause
(STRING) match – MATCH clause
The system table is "_fk_constraint".
collation_list   Return a result set with one row for each supported collation. The first four collations are 'none' and 'unicode' and 'unicode_ci' and 'binary', then come about 270 predefined collations, the exact count may vary because users can add their own collations.
The system table is "_collation".
index_info string
table-name . index-name
Return a result set with one row for each column in “table-name.index-name”. Each row contains:
(INTEGER) seqno – the column’s ordinal position in the index (first column is 0)
(INTEGER) cid – the column’s ordinal position in the table (first column is 0)
(STRING) name – name of the column
(INTEGER) desc – 0 if ASC, 1 if DESC
(STRING) collation name
(STRING) type – data type
index_list string
table-name
Return a result set with one row for each index of “table-name”. Each row contains:
(INTEGER) seq – sequential number
(STRING) name – index name
(INTEGER) unique – whether the index is unique, 0 = false, 1 = true
The system table is "_index".
stats   Return a result set with one row for each index of each table. Each row contains:
(STRING) table – name of the table
(STRING) index – name of the index
(INTEGER) width – arbitrary information
(INTEGER) height – arbitrary information
table_info string
table-name
Return a result set with one row for each column in “table-name”. Each row contains:
(INTEGER) cid – ordinal position in the table
(first column number is 0)
(STRING) name – column name
(INTEGER) notnull – whether the column is NOT NULL. 0 is false, 1 is true.
(STRING) dflt_value – default value
(INTEGER) pk – whether the column is a PRIMARY KEY column. 0 is false, 1 is true.

Example: (not showing result set metadata)

PRAGMA table_info('T');
---
- - [0, 's1', 'integer', 1, null, 1]
  - [1, 's2', 'integer', 0, null, 0]
...

EXPLAIN

Syntax:

  • EXPLAIN explainable-statement;
../../../_images/explain.svg

EXPLAIN will show what steps Tarantool would take if it executed explainable-statement. This is primarily a debugging and optimization aid for the Tarantool team.

Example: EXPLAIN DELETE FROM m; returns:

- - [0, 'Init', 0, 3, 0, '', '00', 'Start at 3']
  - [1, 'Clear', 16416, 0, 0, '', '00', '']
  - [2, 'Halt', 0, 0, 0, '', '00', '']
  - [3, 'Transaction', 0, 1, 1, '0', '01', 'usesStmtJournal=0']
  - [4, 'Goto', 0, 1, 0, '', '00', '']

Variation: EXPLAIN QUERY PLAN statement; shows the steps of a search.

Statements for transactions

START TRANSACTION

Syntax:

START TRANSACTION;


../../../_images/start.svg


Start a transaction. After START TRANSACTION;, a transaction is “active”. If a transaction is already active, then START TRANSACTION; is illegal.

Transactions should be active for fairly short periods of time, to avoid concurrency issues. To end a transaction, say COMMIT; or ROLLBACK;.

Just as in NoSQL, transaction control statements are subject to limitations set by the storage engine involved:
* For the memtx storage engine, if a yield happens within an active transaction, the transaction is rolled back.
* For the vinyl engine, yields are allowed.
Also, although CREATE AND DROP and ALTER statements are legal in transactions, there are a few exceptions. For example, CREATE INDEX ON table_name ... will fail within a multi-statement transaction if the table is not empty.

However,transaction control statements still may not work as you expect when run over a network connection: a transaction is associated with a fiber, not a network connection, and different transaction control statements sent via the same network connection may be executed by different fibers from the fiber pool.

In order to ensure that all statements are part of the intended transaction, put all of them between START TRANSACTION; and COMMIT; or ROLLBACK; then send as a single batch. For example:

  • Enclose each separate SQL statement in a box.execute() function.

  • Pass all the box.execute() functions to the server in a single message.

    If you are using a console, you can do this by writing everything on a single line.

    If you are using net.box, you can do this by putting all the function calls in a single string and calling eval(string).

Example:

START TRANSACTION;

Example of a whole transaction sent to a server on localhost:3301 with eval(string):

net_box = require('net.box')
conn = net_box.new('localhost', 3301)
s = 'box.execute([[START TRANSACTION;]]) '
s = s .. 'box.execute([[INSERT INTO t VALUES (1);]]) '
s = s .. 'box.execute([[ROLLBACK;]]) '
conn:eval(s)

COMMIT

Syntax:

COMMIT;


../../../_images/commit.svg


Commit an active transaction, so all changes are made permanent and the transaction ends.

COMMIT is illegal unless a transaction is active. If a transaction is not active then SQL statements are committed automatically.

Example:

COMMIT;

SAVEPOINT

Syntax:

SAVEPOINT savepoint-name;


../../../_images/savepoint.svg


Set a savepoint, so that ROLLBACK TO savepoint-name is possible.

SAVEPOINT is illegal unless a transaction is active.

If a savepoint with the same name already exists, it is released before the new savepoint is set.

Example:

SAVEPOINT x;

RELEASE SAVEPOINT

Syntax:

RELEASE SAVEPOINT savepoint-name;


../../../_images/release.svg


Release (destroy) a savepoint created by a SAVEPOINT statement.

RELEASE is illegal unless a transaction is active.

Savepoints are released automatically when a transaction ends.

Example:

RELEASE SAVEPOINT x;

ROLLBACK

Syntax:

ROLLBACK [TO [SAVEPOINT] savepoint-name];


../../../_images/rollback.svg


If ROLLBACK does not specify a savepoint-name, rollback an active transaction, so all changes since START TRANSACTION are cancelled, and the transaction ends.

If ROLLBACK does specify a savepoint-name, rollback an active transaction, so all changes since SAVEPOINT savepoint-name are cancelled, and the transaction does not end.

ROLLBACK is illegal unless a transaction is active.

Examples:

-- the simple form:
ROLLBACK;
-- the form so changes before a savepoint are not cancelled:
ROLLBACK TO SAVEPOINT x;
-- An example of a Lua function that will do a transaction
-- containing savepoint and rollback to savepoint.
function f()
box.execute([[DROP TABLE IF EXISTS t;]]) -- commits automatically
box.execute([[CREATE TABLE t (s1 STRING PRIMARY KEY);]]) -- commits automatically
box.execute([[START TRANSACTION;]]) -- after this succeeds, a transaction is active
box.execute([[INSERT INTO t VALUES ('Data change #1');]])
box.execute([[SAVEPOINT "1";]])
box.execute([[INSERT INTO t VALUES ('Data change #2');]])
box.execute([[ROLLBACK TO SAVEPOINT "1";]]) -- rollback Data change #2
box.execute([[ROLLBACK TO SAVEPOINt "1";]]) -- this is legal but does nothing
box.execute([[COMMIT;]]) -- make Data change #1 permanent, end the transaction
end

Functions

Explanation of functions

Syntax:

function-name (one or more expressions)

Apply a built-in function to one or more expressions and return a scalar value.

Tarantool supports 32 built-in functions.

The maximum number of operands for any function is 127.

The required privileges for built-in functions will likely change in a future version.

List of functions

ABS

Syntax:

ABS(numeric-expression)

Return the absolute value of numeric-expression, which can be any numeric type.

Example: ABS(-1) is 1.

CAST

Syntax:

CAST(expression AS data-type)

Return the expression value after casting to the specified data type.

Examples: CAST('AB' AS VARBINARY), CAST(X'4142' AS STRING)

CHAR

Syntax:

CHAR([numeric-expression [,numeric-expression...])

Return the characters whose Unicode code point values are equal to the numeric expressions.

Short example:

The first 128 Unicode characters are the “ASCII” characters, so CHAR(65, 66, 67) is ‘ABC’.

Long example:

For the current list of Unicode characters, in order by code point, see www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt. In that list, there is a line for a Linear B ideogram

100CC;LINEAR B IDEOGRAM B240 WHEELED CHARIOT ...

Therefore, for a string with a chariot in the middle, use the concatenation operator || and the CHAR function

'start of string ' || CHAR(0X100CC) || ' end of string'.

COALESCE

Syntax:

COALESCE(expression, expression [, expression ...])

Return the value of the first non-NULL expression, or, if all expression values are NULL, return NULL.

Example:
COALESCE(NULL, 17, 32) is 17.

GREATEST

Syntax:

GREATEST(expression-1, expression-2, [expression-3 ...])

Return the greatest value of the supplied expressions, or, if any expression is NULL, return NULL. The reverse of GREATEST is LEAST.

Examples: GREATEST(7, 44, -1) is 44; GREATEST(1E308, 'a', 0, X'00') is ‘0’ = the nul character; GREATEST(3, NULL, 2) is NULL

HEX

Syntax:

HEX(expression)

Return the hexadecimal code for each byte in expression, which may be either a string or a byte sequence. For ASCII characters, this is straightforward because the encoding is the same as the code point value. For non-ASCII characters, since character strings are usually encoded in UTF-8, each character will require two or more bytes.

Examples:

  • HEX('A') will return 41.
  • HEX('Д') will return D094.

IFNULL

Syntax:

IFNULL(expression, expression)

Return the value of the first non-NULL expression, or, if both expression values are NULL, return NULL. Thus IFNULL(expression, expression) is the same as COALESCE(expression, expression).

Example:
IFNULL(NULL, 17) is 17

LEAST

Syntax:

LEAST(expression-1, expression-2, [expression-3 ...])

Return the least value of the supplied expressions, or, if any expression is NULL, return NULL. The reverse of LEAST is GREATEST.

Examples: LEAST(7, 44, -1) is -1; LEAST(1E308, 'a', 0, X'00') is 0; LEAST(3, NULL, 2) is NULL.

LENGTH

Syntax:

LENGTH(expression)

Return the number of characters in the expression, or the number of bytes in the expression. It depends on the data type: strings with data type STRING are counted in characters, byte sequences with data type VARBINARY are counted in bytes and are not ended by the nul character. There are two aliases for LENGTH(expression)CHAR_LENGTH(expression) and CHARACTER_LENGTH(expression) do the same thing.

Examples:

  • LENGTH('ДД') is 2, the string has 2 characters.
  • LENGTH(CAST('ДД' AS VARBINARY)) is 4, the string has 4 bytes.
  • LENGTH(CHAR(0, 65)) is 2, ‘0’ does not mean ‘end of string’.
  • LENGTH(X'410041') is 3, X’…’ byte sequences have type VARBINARY.

LIKELIHOOD

Syntax:

LIKELIHOOD(expression, number literal)

Return the result of the expression, provided that the number literal is between 0.0 and 1.0.

Example: LIKELIHOOD('a' = 'b', .0) is FALSE

LIKELY

Syntax:

LIKELY(expression)

Return TRUE if the expression is probably true.

Example: LIKELY('a' <= 'b') is TRUE

LOWER

Syntax:

LOWER(string-expression)

Return the expression, with upper-case characters converted to lower case. The reverse of LOWER is UPPER.

Example: LOWER('ДA') is ‘дa’

NULLIF

Syntax:

NULLIF(expression-1, expression-2)

Return expression-1 if expression-1 <> expression-2, otherwise return NULL.

Examples:

  • NULLIF('a', 'A') is ‘a’.
  • NULLIF(1.00, 1) is NULL.

POSITION

Syntax:

POSITION(expression-1, expression-2)

Return the position of expression-1 within expression-2, or return 0 if expression-1 does not appear within expression-2. The data types of the expressions must be either STRING or VARBINARY. If the expressions have data type STRING, then the result is the character position. If the expressions have data type VARBINARY, then the result is the byte position.

Short example: POSITION('C', 'ABC') is 3

Long example: The UTF-8 encoding for the Latin letter A is hexadecimal 41; the UTF-8 encoding for the Cyrillic letter Д is hexadecimal D094 – you can confirm this by saying SELECT HEX(‘ДA’); and seeing that the result is ‘D09441’. If you now execute SELECT POSITION('A', 'ДA'); the result will be 2, because ‘A’ is the second character in the string. However, if you now execute SELECT POSITION(X'41', X'D09441'); the result will be 3, because X’41’ is the third byte in the byte sequence.

PRINTF

Syntax:

PRINTF(string-expression [, expression ...])

Return a string formatted according to the rules of the C sprintf() function, where %d%s means the next two arguments are a number and a string, and so on.

If an argument is missing or is NULL, it becomes:

  • ‘0’ if the format requires an integer,
  • ‘0.0’ if the format requires a number with a decimal point,
  • ‘’ if the format requires a string.

Example: PRINTF('%da', 5) is ‘5a’.

QUOTE

Syntax:

QUOTE(string-literal)

Return a string with enclosing quotes if necessary, and with quotes inside the enclosing quotes if necessary. This function is useful for creating strings which are part of SQL statements, because of SQL’s rules that string literals are enclosed by single quotes, and single quotes inside such strings are shown as two single quotes in a row.

Example: QUOTE('a') is 'a'.

RAISE

Syntax:

RAISE(FAIL, error-message)

This may only be used within a triggered statement. See also Trigger Activation.

RANDOM

Syntax: RANDOM()

Return a 19-digit integer which is generated by a pseudo-random number generator,

Example: RANDOM() is 6832175749978026034, or it is any other integer

RANDOMBLOB

Syntax:

RANDOMBLOB(n)

Return a byte sequence, n bytes long, data type = VARBINARY, containing bytes generated by a pseudo-random byte generator. The result can be translated to hexadecimal. If n is less than 1 or is NULL or is infinity, then NULL is returned.

Example: HEX(RANDOMBLOB(3)) is ‘9EAAA8’, or it is the hex value for any other three-byte string

REPLACE

Syntax:

REPLACE(expression-1, expression-2, expression-3)

Return expression-1, except that wherever expression-1 contains expression-2, replace expression-2 with expression-3. The expressions should all have data type STRING or VARBINARY.

Example: REPLACE('AAABCCCBD', 'B', '!') is ‘AAA!CCC!D’

ROUND

Syntax:

ROUND(numeric-expression-1 [, numeric-expression-2])

Return the rounded value of numeric-expression-1, always rounding .5 upward for floating-point positive numbers or downward for negative numbers. If numeric-expression-2 is supplied then rounding is to the nearest numeric-expression-2 digits after the decimal point; if numeric-expression-2 is not supplied then rounding is to the nearest integer.

Example: ROUND(-1.5) is -2, ROUND(1.7766E1,2) is 17.77.

ROW_COUNT

ROW_COUNT()

Return the number of rows that were inserted / updated / deleted by the last INSERT or UPDATE or DELETE or REPLACE statement. Rows which were updated by an UPDATE statement are counted even if there was no change. Rows which were inserted / updated / deleted due to foreign-key action are not counted. Rows which were inserted / updated / deleted due to a view’s INSTEAD OF triggers are not counted. After a CREATE or DROP statement, ROW_COUNT() is 1. After other statements, ROW_COUNT() is 0.

Example: ROW_COUNT() is 1 after a successful INSERT of a single row.

Special rule if there are BEFORE or AFTER triggers: In effect the ROW_COUNT() counter is pushed at the beginning of a series of triggered statements, and popped at the end. Therefore, after the following statements:

CREATE TABLE t1 (s1 INTEGER PRIMARY KEY);
CREATE TABLE t2 (s1 INTEGER, s2 STRING, s3 INTEGER, PRIMARY KEY (s1, s2, s3));
CREATE TRIGGER tt1 BEFORE DELETE ON t1 FOR EACH ROW BEGIN
  INSERT INTO t2 VALUES (old.s1, '#2 Triggered', ROW_COUNT());
  INSERT INTO t2 VALUES (old.s1, '#3 Triggered', ROW_COUNT());
  END;
INSERT INTO t1 VALUES (1),(2),(3);
DELETE FROM t1;
INSERT INTO t2 VALUES (4, '#4 Untriggered', ROW_COUNT());
SELECT * FROM t2;

The result is:

---
- - [1, '#2 Triggered', 3]
  - [1, '#3 Triggered', 1]
  - [2, '#2 Triggered', 3]
  - [2, '#3 Triggered', 1]
  - [3, '#2 Triggered', 3]
  - [3, '#3 Triggered', 1]
  - [4, '#4 Untriggered', 3]
...

SOUNDEX

Syntax:

SOUNDEX(string-expression)

Return a four-character string which represents the sound of string-expression. Often words and names which have different spellings will have the same Soundex representation if they are pronounced similarly, so it is possible to search by what they sound like. The algorithm works with characters in the Latin alphabet and works best with English words.

Example: SOUNDEX('Crater') and SOUNDEX('Creature') both return C636.

SUBSTR

Syntax:

SUBSTR(expression-1, numeric-expression-1 [, numeric-expression-2])

If expression-1 has data type STRING, then return the substring which begins at character position numeric-expression-1 and continues for numeric-expression-2 characters (if numeric-expression-2 is supplied), or continues till the end of string-expression-1 (if numeric-expression-2 is not supplied).

If expression-1 has data type VARBINARY rather than data type STRING, then positioning and counting is by bytes rather than by characters.

Example: SUBSTR('ABCDEFG', 3, 2) is ‘CD’

TRIM

Syntax:

TRIM([[LEADING|TRAILING|BOTH] [expression-1] FROM] expression-2)

Return expression-2 after removing all leading and/or trailing characters or bytes. The expressions should have data type STRING or VARBINARY. If LEADING|TRAILING|BOTH is omitted, the default is BOTH. If expression-1 is omitted, the default is ‘ ‘ (space) for data type STRING or X’00’ (nul) for data type VARBINARY.

Examples:

TRIM('a' FROM 'abaaaaa') is ‘b’ – all repetitions of ‘a’ are removed on both sides; TRIM(TRAILING 'ב' FROM 'אב') is ‘א’ – if all characters are Hebrew, TRAILING means “left”; TRIM(X'004400') is X’44’ – the default byte sequence to trim is X’00’ when data type is VARBINARY; TRIM(LEADING 'abc' FROM 'abcd') is ‘d’ – expression-1 can have more than 1 character.

TYPEOF

Syntax:

TYPEOF(expression)

Return the data type of the expression.

Examples:

TYPEOF('A') returns ‘string’; TYPEOF(RANDOMBLOB(1)) returns ‘varbinary’; TYPEOF(1e44) returns ‘double’ or ‘number’; TYPEOF(-44) returns ‘integer’; TYPEOF(NULL) returns ‘boolean’

UNICODE

Syntax:

UNICODE(string-expression)

Return the Unicode code point value of the first character of string-expression. If string-expression is empty, the return is NULL. This is the reverse of CHAR(integer).

Example: UNICODE('Щ') is 1065 (hexadecimal 0429).

UNLIKELY

Syntax:

UNLIKELY(expression)

Return TRUE if the expression is probably false. Limitation: in fact UNLIKELY may return the same thing as LIKELY.

Example: UNLIKELY('a' <= 'b') is TRUE.

UPPER

Syntax:

UPPER(string-expression)

Return the expression, with lower-case characters converted to upper case. The reverse of UPPER is LOWER.

Example: UPPER('-4щl') is ‘-4ЩL’.

VERSION

Syntax:

VERSION()

Return the Tarantool version.

Example: for a February 2020 build VERSION() is '2.4.0-35-g57f6fc932'.

ZEROBLOB

Syntax:

ZEROBLOB(n)

Return a byte sequence, data type = VARBINARY, n bytes long.

COLLATE clause

COLLATE collation-name

The collation-name must identify an existing collation.

The COLLATE clause is allowed for STRING or SCALAR items:
() in CREATE INDEX
() in CREATE TABLE as part of column definition
() in CREATE TABLE as part of UNIQUE definition
() in string expressions

Examples:

-- In CREATE INDEX
CREATE INDEX idx_unicode_mb_1 ON mb (s1 COLLATE "unicode");
-- In CREATE TABLE
CREATE TABLE t1 (s1 INTEGER PRIMARY KEY, s2 STRING COLLATE "unicode_ci");
-- In CREATE TABLE ... UNIQUE
CREATE TABLE mb (a STRING, b STRING, PRIMARY KEY(a), UNIQUE(b COLLATE "unicode_ci" DESC));
-- In string expressions
SELECT 'a' = 'b' COLLATE "unicode"
    FROM t
    WHERE s1 = 'b' COLLATE "unicode"
    ORDER BY s1 COLLATE "unicode";

The list of collations can be seen with: PRAGMA collation_list;

The collation rules comply completely with the Unicode Technical Standard #10 (“Unicode Collation Algorithm”) and the default character order is as in the Default Unicode Collation Element Table (DUCET). There are many permanent collations; the commonly used ones include:
    "none" (not applicable)
    "unicode" (characters are in DUCET order with strength = ‘tertiary’)
    "unicode_ci" (characters are in DUCET order with strength = ‘primary’)
    "binary" (characters are in code point order)
These identifiers must be quoted and in lower case because they are in lower case in Tarantool/NoSQL collations.

If one says COLLATE "binary", this is equivalent to asking for what is sometimes called “code point order” because, if the contents are in the UTF-8 character set, characters with larger code points will appear after characters with lower code points.

In an expression, COLLATE is an operator with higher precedence than anything except ~. This is fine because there are no other useful operators except || and comparison. After ||, collation is preserved.

In an expression with more than one COLLATE clause, if the collation names differ, there is an error: “Illegal mix of collations”. In an expression with no COLLATE clauses, literals have collation "binary", columns have the collation specified by CREATE TABLE.

In other words, to pick a collation, we use:
the first COLLATE clause in an expression if it was specified,
else the the column’s COLLATE clause if it was specified,
else "binary".

However, for searches and sometimes for sorting, the collation may be an index’s collation, so all non-index COLLATE clauses are ignored.

EXPLAIN will not show the name of what collation was used, but will show the collation’s characteristics.

Example with Swedish collation:
Knowing that “sv” is the two-letter code for Swedish,
and knowing that “s1” means strength = 1,
and seeing with PRAGMA collation_list; that there is a collation named unicode_sv_s1,
check whether two strings are equal according to Swedish rules (yes they are):
SELECT 'ÄÄ' = 'ĘĘ' COLLATE "unicode_sv_s1";

Example with Russian and Ukrainian and Kyrgyz collations:
Knowing that Russian collation is practically the same as Unicode default,
and knowing that the two-letter codes for Ukrainian and Kyrgyz are ‘uk’ and ‘ky’,
and knowing that in Russian (but not Ukrainian) ‘Г’ = ‘Ґ’ with strength=primary,
and knowing that in Russian (but not Kyrgyz) ‘Е’ = ‘Ё’ with strength=primary,
the three SELECT statements here will return results in three different orders:
CREATE TABLE things (remark STRING PRIMARY KEY);
INSERT INTO things VALUES ('Е2'), ('Ё1');
INSERT INTO things VALUES ('Г2'), ('Ґ1');
SELECT remark FROM things ORDER BY remark COLLATE "unicode";
SELECT remark FROM things ORDER BY remark COLLATE "unicode_uk_s1";
SELECT remark FROM things ORDER BY remark COLLATE "unicode_ky_s1";