This chapter describes the syntax for the SQL statements supported in MySQL.
DELETE Syntax
Single-table syntax:
DELETE [LOW_PRIORITY] [QUICK] [IGNORE] FROM tbl_name
[WHERE where_definition]
[ORDER BY ...]
[LIMIT row_count]
Multiple-table syntax:
DELETE [LOW_PRIORITY] [QUICK] [IGNORE]
tbl_name[.*] [, tbl_name[.*] ...]
FROM table_references
[WHERE where_definition]
Or:
DELETE [LOW_PRIORITY] [QUICK] [IGNORE]
FROM tbl_name[.*] [, tbl_name[.*] ...]
USING table_references
[WHERE where_definition]
DELETE deletes rows from tbl_name that satisfy the condition
given by where_definition, and returns the number of records deleted.
If you issue a DELETE statement with no WHERE clause, all
rows are deleted. A faster way to do this, when you don't need to know
the number of deleted rows, is to use TRUNCATE TABLE. See section 14.1.9 TRUNCATE Syntax.
In MySQL 3.23, DELETE without a WHERE clause returns zero
as the number of affected records.
In MySQL 3.23, if you really want to know how many records are deleted
when you are deleting all rows, and are willing to suffer a speed
penalty, you can use a DELETE statement that includes a
WHERE clause with an expression that is true for every row. For
example:
mysql> DELETE FROM tbl_name WHERE 1>0;
This is much slower than TRUNCATE tbl_name, because it deletes
rows one at a time.
The DELETE statement supports the following modifiers:
If you specify the LOW_PRIORITY keyword, execution of the
DELETE is delayed until no other clients are reading from the table.
For MyISAM tables, if you specify the QUICK keyword, the
storage engine does not merge index leaves during delete, which may speed up
certain kinds of deletes.
The IGNORE keyword causes MySQL to ignore all errors during the
process of deleting rows. (Errors encountered during the parsing stage are
processed in the usual manner.) Errors that are ignored due to the use of
this option are returned as warnings. This option first appeared in MySQL
4.1.1.
The speed of delete operations may also be affected by factors discussed in
section 7.2.13 Speed of DELETE Queries.
In MyISAM tables, deleted records are maintained in a linked list and
subsequent INSERT operations reuse old record positions. To
reclaim unused space and reduce file sizes, use the OPTIMIZE
TABLE statement or the myisamchk utility to reorganize tables.
OPTIMIZE TABLE is easier, but myisamchk is faster. See
section 14.5.2.5 OPTIMIZE TABLE Syntax and section 5.6.2.10 Table Optimization.
The MySQL-specific LIMIT row_count option to DELETE tells
the server the maximum number of rows to be deleted before control is
returned to the client. This can be used to ensure that a specific
DELETE statement doesn't take too much time. You can simply repeat
the DELETE statement until the number of affected rows is less than
the LIMIT value.
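As a minimal sketch of this batching approach, the statement below removes old rows from the somelog table used in this section in chunks of at most 1000 rows. The cutoff date and the batch size are assumptions chosen for illustration only.

```sql
-- Delete at most 1000 matching rows per call so that each statement
-- returns quickly. The cutoff value and batch size are illustrative.
DELETE FROM somelog
WHERE timestamp < '2004-01-01 00:00:00'
LIMIT 1000;
-- Repeat the statement from the client until the affected-rows count
-- it reports is smaller than the LIMIT value (1000 here).
```

Keeping each batch small limits how long other clients have to wait for the table between batches.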
If the DELETE statement includes an ORDER BY clause, the rows
are deleted in the order specified by the clause. This is really useful only
in conjunction with LIMIT. For example, the following statement
finds rows matching the WHERE clause, sorts them in timestamp
order, and deletes the first (oldest) one:
DELETE FROM somelog WHERE user = 'jcole' ORDER BY timestamp LIMIT 1
ORDER BY can be used with DELETE beginning with MySQL 4.0.0.
From MySQL 4.0, you can specify multiple tables in the DELETE
statement to delete rows from one or more tables depending on a particular
condition in multiple tables. However, you cannot use ORDER BY
or LIMIT in a multiple-table DELETE.
The first multiple-table DELETE syntax is supported starting from
MySQL 4.0.0. The second is supported starting from MySQL 4.0.2. The
table_references part lists the tables involved in the join.
Its syntax is described in section 14.1.7.1 JOIN Syntax.
For the first syntax, only matching rows from the tables listed before the
FROM clause are deleted. For the second syntax, only matching rows
from the tables listed in the FROM clause (before the USING
clause) are deleted. The effect is that you can delete rows from many
tables at the same time and also have additional tables that are used for
searching:
DELETE t1,t2 FROM t1,t2,t3 WHERE t1.id=t2.id AND t2.id=t3.id;
Or:
DELETE FROM t1,t2 USING t1,t2,t3 WHERE t1.id=t2.id AND t2.id=t3.id;
These statements use all three tables when searching for rows to delete, but
delete matching rows only from tables t1 and t2.
The examples show inner joins using the comma operator, but
multiple-table DELETE statements can use any type of
join allowed in SELECT statements, such as LEFT JOIN.
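For instance, a LEFT JOIN can be used to delete rows that have no counterpart in another table. The following is a sketch that reuses the t1 and t2 tables from the preceding examples and assumes that both have an id column:

```sql
-- Delete rows from t1 whose id has no matching row in t2.
-- For unmatched t1 rows, the LEFT JOIN sets the t2 columns to NULL.
DELETE t1 FROM t1 LEFT JOIN t2 ON t1.id = t2.id
WHERE t2.id IS NULL;
```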
The syntax allows .* after the table names for compatibility with
Access.
If you use a multiple-table DELETE statement involving
InnoDB tables for which there are foreign key constraints,
the MySQL optimizer might process tables in an order that differs from
that of their parent/child relationship. In this case, the statement
fails and rolls back. Instead, delete from a single table and rely on the
ON DELETE capabilities that InnoDB provides to cause the
other tables to be modified accordingly.
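A sketch of that approach follows. The parent and child tables, their columns, and the id value are hypothetical. The ENGINE keyword is assumed to be available; older servers accept TYPE=InnoDB instead.

```sql
-- Hypothetical parent/child tables; names and columns are illustrative.
CREATE TABLE parent (
  id INT NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

CREATE TABLE child (
  id INT NOT NULL,
  parent_id INT NOT NULL,
  PRIMARY KEY (id),
  INDEX (parent_id),
  FOREIGN KEY (parent_id) REFERENCES parent (id) ON DELETE CASCADE
) ENGINE=InnoDB;

-- A single-table DELETE on the parent; InnoDB removes the child rows
-- that reference the deleted parent row.
DELETE FROM parent WHERE id = 1;
```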
Note: In MySQL 4.0, you should refer to the tables to be deleted by their true table names. In MySQL 4.1, you must use the alias (if one exists) when referring to a table:
In MySQL 4.0:
DELETE test FROM test AS t1, test2 WHERE ...
In MySQL 4.1:
DELETE t1 FROM test AS t1, test2 WHERE ...
This change was not made in 4.0 because it would have broken existing 4.0 applications that use the old syntax.
DO Syntax
DO expression [, expression ...]
DO executes the expressions but doesn't return any results. This is
shorthand for SELECT expression, ..., but has the advantage that it's
slightly faster when you don't care about the result.
DO is useful mainly with functions that have side effects, such as
RELEASE_LOCK().
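A short sketch of this usage, assuming an illustrative lock name of my_lock and a 10-second timeout:

```sql
-- Acquire a named lock, waiting up to 10 seconds ('my_lock' is illustrative).
SELECT GET_LOCK('my_lock', 10);
-- ... statements protected by the lock go here ...
-- Release the lock; DO discards the return value of RELEASE_LOCK().
DO RELEASE_LOCK('my_lock');
```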
HANDLER Syntax
HANDLER tbl_name OPEN [ AS alias ]
HANDLER tbl_name READ index_name { = | >= | <= | < } (value1,value2,...)
[ WHERE where_condition ] [LIMIT ... ]
HANDLER tbl_name READ index_name { FIRST | NEXT | PREV | LAST }
[ WHERE where_condition ] [LIMIT ... ]
HANDLER tbl_name READ { FIRST | NEXT }
[ WHERE where_condition ] [LIMIT ... ]
HANDLER tbl_name CLOSE
The HANDLER statement provides direct access to table storage engine
interfaces. It is available for MyISAM tables as of MySQL 4.0.0 and
InnoDB tables as of MySQL 4.0.3.
The HANDLER ... OPEN statement opens a table, making
it accessible via subsequent HANDLER ... READ statements.
This table object is not shared by other threads and is not closed
until the thread calls HANDLER ... CLOSE or the thread terminates.
If you open the table using an alias, further references to the table with
other HANDLER statements must use the alias rather than the table
name.
The first HANDLER ... READ syntax fetches a row where the index
specified satisfies the given values and the WHERE condition is met.
If you have a multiple-column index, specify the index column values as a
comma-separated list. Either specify values for all the columns in the
index, or specify values for a leftmost prefix of the index columns. Suppose
an index includes three columns named col_a, col_b, and
col_c, in that order. The HANDLER statement can specify
values for all three columns in the index, or for the columns in a leftmost
prefix. For example:
HANDLER ... index_name = (col_a_val,col_b_val,col_c_val) ...
HANDLER ... index_name = (col_a_val,col_b_val) ...
HANDLER ... index_name = (col_a_val) ...
The second HANDLER ... READ syntax fetches a row from the table in
index order that matches the WHERE condition.
The third HANDLER ... READ syntax fetches a row from the table in
natural row order that matches the WHERE condition. It is faster than
HANDLER tbl_name READ index_name when a full table scan is desired.
Natural row order is the order in which rows are stored in a MyISAM
table datafile. This statement works for InnoDB tables as well, but
there is no such concept because there is no separate datafile.
Without a LIMIT clause, all forms of HANDLER ... READ fetch a
single row if one is available. To return a specific number of rows, include a
LIMIT clause. It has the same syntax as for the SELECT
statement.
See section 14.1.7 SELECT Syntax.
HANDLER ... CLOSE closes a table that was opened with
HANDLER ... OPEN.
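The statements described above might be combined in a session such as the following sketch. The table t, its index ab, and the lookup values are hypothetical and exist only to make the example self-contained.

```sql
-- Hypothetical table with a two-column index named ab.
CREATE TABLE t (a INT NOT NULL, b INT NOT NULL, INDEX ab (a, b));

HANDLER t OPEN AS h;               -- later statements must use the alias h
HANDLER h READ ab = (1, 2);        -- values for both index columns
HANDLER h READ ab >= (1) LIMIT 3;  -- leftmost prefix only, up to 3 rows
HANDLER h READ ab FIRST;           -- first row in index order
HANDLER h READ ab NEXT LIMIT 2;    -- continue the scan in index order
HANDLER h READ FIRST;              -- first row in natural row order
HANDLER h CLOSE;
```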
Note: To use the HANDLER interface to refer to a table's
PRIMARY KEY, use the quoted identifier `PRIMARY`:
HANDLER tbl READ `PRIMARY` > (...);
HANDLER is a somewhat low-level statement. For example, it does not
provide consistency. That is, HANDLER ... OPEN does NOT
take a snapshot of the table, and does NOT lock the table. This
means that after a HANDLER ... OPEN statement is issued, table data
can be modified (by this or any other thread) and these modifications might
appear only partially in HANDLER ... NEXT or HANDLER ... PREV
scans.
There are several reasons to use the HANDLER interface instead of
normal SELECT statements:
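HANDLER is faster than SELECT:
A designated storage engine handler object is allocated for the
HANDLER ... OPEN. The object is reused for the following
HANDLER statements for the table; it need not be reinitialized for
each one.
The handler interface does not have to provide a consistent view of the
data (for example, dirty reads are allowed), so the storage engine can use
optimizations that SELECT doesn't normally allow.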
HANDLER makes it much easier to port applications that use an
ISAM-like interface to MySQL.
HANDLER allows you to traverse a database in a manner that is not
easy to do with SELECT (or perhaps even impossible). The HANDLER
interface is a more natural way to look at data when working with
applications that provide an interactive user interface to the database.
INSERT Syntax
INSERT [LOW_PRIORITY | DELAYED] [IGNORE]
[INTO] tbl_name [(col_name,...)]
VALUES ({expression | DEFAULT},...),(...),...
[ ON DUPLICATE KEY UPDATE col_name=expression, ... ]
Or:
INSERT [LOW_PRIORITY | DELAYED] [IGNORE]
[INTO] tbl_name
SET col_name={expression | DEFAULT}, ...
[ ON DUPLICATE KEY UPDATE col_name=expression, ... ]
Or:
INSERT [LOW_PRIORITY | DELAYED] [IGNORE]
[INTO] tbl_name [(col_name,...)]
SELECT ...
INSERT inserts new rows into an existing table. The INSERT ...
VALUES and INSERT ... SET forms of the statement insert rows based
on explicitly specified values. The INSERT ... SELECT form inserts
rows selected from another table or tables. The INSERT ... VALUES
form with multiple value lists is supported in MySQL 3.22.5 or
later. The INSERT ... SET syntax is supported in MySQL
3.22.10 or later.
INSERT ... SELECT is discussed further in
section 14.1.4.1 INSERT ... SELECT Syntax.
tbl_name is the table into which rows should be inserted. The columns
for which the statement provides values can be specified as follows:
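The column name list or the SET clause indicates the columns
explicitly.
If you do not specify the column list for INSERT ... VALUES or INSERT
... SELECT, values for every column in the table must be provided in the
VALUES() list or by the SELECT. If you don't know the order of
the columns in the table, use DESCRIBE tbl_name to find out.
Column values can be given in several ways:
Any column not explicitly given a value is set to its default value. For
example, if you specify a column list that doesn't name all the columns in
the table, unnamed columns are set to their default values. Default value
assignment is described in the section CREATE TABLE Syntax.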
MySQL always has a default value for all columns. This is something
that is imposed on MySQL to be able to work with both transactional
and non-transactional tables.
Our view is that checking of column content should be done in the
application and not in the database server.
Note: If you want INSERT statements to generate an error unless you
explicitly specify values for all columns that require a non-NULL
value, you can configure MySQL using the DONT_USE_DEFAULT_FIELDS
option. This behavior is available only if you compile MySQL from source.
See section 2.3.2 Typical configure Options.
You can use the keyword DEFAULT to explicitly set a column to its
default value. (New in MySQL 4.0.3.) This makes it easier to write
INSERT statements that assign values to all but a few columns,
because it allows you to avoid writing an incomplete VALUES list
(a list that does not include a value for each column in the table).
Otherwise, you would have to write out the list of column names
corresponding to each value in the VALUES list.
If both the column list and the VALUES list are empty, INSERT
creates a row with each column set to its default value.
An expression can refer to any column that was set earlier in a value
list. For example, you can do this because the value for col2 refers
to col1, which has already been assigned:
mysql> INSERT INTO tbl_name (col1,col2) VALUES(15,col1*2);
But you cannot do this because the value for col1 refers to
col2, which is assigned after col1:
mysql> INSERT INTO tbl_name (col1,col2) VALUES(col2*2,15);
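The different ways of supplying column values can be summarized with a short sketch. The person table and its data are hypothetical and serve only as an illustration.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE person (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(40) NOT NULL DEFAULT 'unknown',
  age INT DEFAULT NULL
);

-- Explicit column list with several value lists.
INSERT INTO person (name, age) VALUES ('Ann', 31), ('Bob', 45);
-- SET form, naming the columns explicitly.
INSERT INTO person SET name = 'Cy', age = 27;
-- DEFAULT keyword (MySQL 4.0.3 and later): name takes its default value.
INSERT INTO person (name, age) VALUES (DEFAULT, 52);
-- Empty column list and empty VALUES list: every column gets its default.
INSERT INTO person () VALUES ();
```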
The INSERT statement supports the following modifiers:
If you specify the DELAYED keyword, the server puts the row or
rows to be inserted into a buffer, and the client issuing the INSERT
DELAYED statement then can continue on. If the table is busy, the server
holds the rows. When the table becomes free, it begins inserting rows,
checking periodically to see whether there are new read requests for the
table. If there are, the delayed row queue is suspended until the table
becomes free again.
See section 14.1.4.2 INSERT DELAYED Syntax.
If you specify the LOW_PRIORITY keyword, execution of the
INSERT is delayed until no other clients are reading from the
table. This includes other clients that began reading while existing
clients are reading, and while the INSERT LOW_PRIORITY statement
is waiting. It is possible, therefore, for a client that issues an
INSERT LOW_PRIORITY statement to wait for a very long time (or
even forever) in a read-heavy environment.
(This is in contrast to INSERT DELAYED, which lets the client
continue at once.) See section 14.1.4.2 INSERT DELAYED Syntax. Note
that LOW_PRIORITY should normally not be used with MyISAM
tables as this disables concurrent inserts.
See section 15.1 The MyISAM Storage Engine.
If you specify the IGNORE keyword in an INSERT with many rows,
any rows that duplicate an existing UNIQUE index or PRIMARY
KEY value in the table are ignored and are not inserted. If you do not
specify IGNORE, the insert is aborted if there is any row that
duplicates an existing key value. You can determine with the
mysql_info() C API function how many rows were inserted into the
table.
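As an illustration of the IGNORE modifier, the following sketch uses a hypothetical item table with a unique key on its code column:

```sql
-- Hypothetical table with a unique key on code.
CREATE TABLE item (code INT NOT NULL, label VARCHAR(20), UNIQUE (code));
INSERT INTO item VALUES (1, 'first');

-- Without IGNORE, this statement would be aborted at the duplicate code 1.
-- With IGNORE, the duplicate row is skipped and the row with code 2 is inserted.
INSERT IGNORE INTO item VALUES (1, 'again'), (2, 'second');
```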
If you specify the ON DUPLICATE KEY UPDATE clause (new in MySQL 4.1.0), and
a row is inserted that would cause a duplicate value in a UNIQUE index
or PRIMARY KEY, an UPDATE of the old row is performed. For example,
if column a is declared as UNIQUE and already contains the value
1, the following two statements have identical effect:
mysql> INSERT INTO tbl_name (a,b,c) VALUES (1,2,3)
-> ON DUPLICATE KEY UPDATE c=c+1;
mysql> UPDATE tbl_name SET c=c+1 WHERE a=1;
Note: If column b is unique too, the INSERT would be
equivalent to this UPDATE statement instead:
mysql> UPDATE tbl_name SET c=c+1 WHERE a=1 OR b=2 LIMIT 1;
If a=1 OR b=2 matches several rows, only one row
is updated! In general, you should try to avoid using
ON DUPLICATE KEY clause on tables with multiple UNIQUE keys.
Since MySQL 4.1.1, you can use the VALUES(col_name) function in the
UPDATE clause to refer to column values from the INSERT part
of the INSERT ... UPDATE statement. In other words,
VALUES(col_name) in the UPDATE clause refers to the value of
col_name that would be inserted if no duplicate-key conflict
occurred. This function is especially useful in multiple-row inserts. The
VALUES() function is meaningful only in INSERT ... UPDATE
statements and returns NULL otherwise.
Example:
mysql> INSERT INTO tbl_name (a,b,c) VALUES (1,2,3),(4,5,6)
-> ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);
That statement is identical to the following two statements:
mysql> INSERT INTO tbl_name (a,b,c) VALUES (1,2,3)
-> ON DUPLICATE KEY UPDATE c=3;
mysql> INSERT INTO tbl_name (a,b,c) VALUES (4,5,6)
-> ON DUPLICATE KEY UPDATE c=9;
When you use ON DUPLICATE KEY UPDATE, the DELAYED option is
ignored.
You can find the value used for an AUTO_INCREMENT column by using the
LAST_INSERT_ID() function. From within the C API, use the
mysql_insert_id() function. However, note that the two functions do
not behave quite identically under all circumstances.
The behavior of INSERT statements with respect to AUTO_INCREMENT
columns is discussed further in section 13.8.3 Information Functions and
section 20.2.3.32 mysql_insert_id().
If you use an INSERT ... VALUES statement with multiple value lists
or INSERT ... SELECT, the statement returns an information string in
this format:
Records: 100 Duplicates: 0 Warnings: 0
Records indicates the number of rows processed by the statement.
(This is not necessarily the number of rows actually inserted;
Duplicates can be nonzero.)
Duplicates indicates the number of rows that couldn't be inserted
because they would duplicate some existing unique index value.
Warnings indicates the number of attempts to insert column values that
were problematic in some way. Warnings can occur under any of the following
conditions:
Inserting NULL into a column that has been declared NOT NULL.
For multiple-row INSERT statements or INSERT ... SELECT
statements,
the column is set to the default value appropriate for the column type.
This is 0 for numeric types, the empty string ('') for
string types, and the "zero" value for date and time types.
Assigning a value such as '10.34 a' to a numeric column. The
trailing non-numeric text is stripped off and the remaining numeric part is
inserted. If the string value has no leading numeric part, the column is
set to 0.
Inserting a string into a string column (CHAR, VARCHAR, TEXT, or
BLOB) that exceeds the column's maximum length. The value is
truncated to the column's maximum length.
If you are using the C API, the information string can be obtained by invoking
the mysql_info() function.
See section 20.2.3.30 mysql_info().
INSERT ... SELECT Syntax
INSERT [LOW_PRIORITY] [IGNORE] [INTO] tbl_name [(column list)]
SELECT ...
With INSERT ... SELECT, you can quickly insert many rows
into a table from one or many tables.
For example:
INSERT INTO tblTemp2 (fldID)
SELECT tblTemp1.fldOrder_ID FROM tblTemp1
WHERE tblTemp1.fldOrder_ID > 100;
The following conditions hold for an INSERT ... SELECT statement:
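Prior to MySQL 4.0.1, INSERT ... SELECT implicitly operates in
IGNORE mode. As of MySQL 4.0.1, specify IGNORE
explicitly to ignore records that would cause duplicate-key violations.
Do not use DELAYED with INSERT ... SELECT.
Prior to MySQL 4.0.14, the target table of the INSERT statement cannot appear in the
FROM clause of the SELECT part of the query.
This limitation is lifted in 4.0.14.
AUTO_INCREMENT columns work as usual.
To ensure that the binary log can be used to re-create the original tables,
MySQL does not allow concurrent inserts during INSERT ... SELECT.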
You can use REPLACE instead of INSERT to overwrite old rows.
REPLACE is the counterpart to INSERT IGNORE in the treatment
of new rows that contain unique key values that duplicate old rows:
The new rows are used to replace the old rows rather than being discarded.
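The difference between the two behaviors can be sketched with the tblTemp1 and tblTemp2 tables from the example above, under the assumption that fldID is a unique key in tblTemp2:

```sql
-- Skip selected rows whose key already exists in tblTemp2
-- (explicit IGNORE, MySQL 4.0.1 and later).
INSERT IGNORE INTO tblTemp2 (fldID)
  SELECT tblTemp1.fldOrder_ID FROM tblTemp1
  WHERE tblTemp1.fldOrder_ID > 100;

-- Overwrite existing rows in tblTemp2 that have the same key.
REPLACE INTO tblTemp2 (fldID)
  SELECT tblTemp1.fldOrder_ID FROM tblTemp1
  WHERE tblTemp1.fldOrder_ID > 100;
```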
INSERT DELAYED Syntax
INSERT DELAYED ...
The DELAYED option for the INSERT statement is a
MySQL extension to standard SQL that is very useful if you have clients
that can't wait for the INSERT to complete. This is a common
problem when you use MySQL for logging and you also
periodically run SELECT and UPDATE statements that take a
long time to complete. DELAYED was introduced in MySQL
3.22.15.
There are some constraints on the use of DELAYED:
INSERT DELAYED works only with MyISAM and ISAM
tables.
For MyISAM tables, if there are no free blocks in the middle of the
datafile, concurrent SELECT and INSERT statements are supported.
Under these circumstances, you very seldom need to use INSERT
DELAYED with MyISAM. See section 15.1 The MyISAM Storage Engine.
INSERT DELAYED should be used only for INSERT statements that
specify value lists. This is enforced as of MySQL 4.0.18. The server ignores
DELAYED for INSERT DELAYED ... SELECT statements.
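The server ignores
DELAYED for INSERT DELAYED ... ON DUPLICATE UPDATE statements.
Because the statement returns immediately, before the rows are inserted, you
cannot use LAST_INSERT_ID() to get the AUTO_INCREMENT
value the statement might generate.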
When a client uses INSERT DELAYED, it gets an okay from the server at
once, and the row is queued to be inserted when the table is not in use by
any other thread.
Another major benefit of using INSERT DELAYED is that inserts
from many clients are bundled together and written in one block. This is much
faster than doing many separate inserts.
Note that currently the queued rows are only held in memory until they are
inserted into the table. This means that if you terminate mysqld
forcefully (for example, with kill -9) or if mysqld dies
unexpectedly, any queued rows that have not been written to disk are lost!
The following describes in detail what happens when you use the
DELAYED option to INSERT or REPLACE. In this
description, the "thread" is the thread that received an INSERT
DELAYED statement and "handler" is the thread that handles all
INSERT DELAYED statements for a particular table.
When a thread executes a DELAYED statement for a table, a handler
thread is created to process all DELAYED statements for the table, if
no such handler already exists.
The thread checks whether the handler has acquired a DELAYED
lock already; if not, it tells the handler thread to do so. The
DELAYED lock can be obtained even if other threads have a READ
or WRITE lock on the table. However, the handler will wait for all
ALTER TABLE locks or FLUSH TABLES to ensure that the table
structure is up to date.
The thread executes the INSERT statement, but instead of writing
the row to the table, it puts a copy of the final row into a queue that
is managed by the handler thread. Any syntax errors are noticed by the
thread and reported to the client program.
The client cannot obtain from the server the number of duplicate records or
the AUTO_INCREMENT value for the resulting row, because the
INSERT returns before the insert operation has been completed. (If
you use the C API, the mysql_info() function doesn't return anything
meaningful, for the same reason.)
After every delayed_insert_limit rows are written, the handler checks
whether any SELECT statements are still pending. If so, it
allows these to execute before continuing.
If no new INSERT DELAYED statements are received within
delayed_insert_timeout seconds, the handler terminates.
If more than delayed_queue_size rows are pending already in a
specific handler queue, the thread requesting INSERT DELAYED
waits until there is room in the queue. This is done to ensure that
the mysqld server doesn't use all memory for the delayed memory
queue.
The handler thread shows up in the MySQL process list with
delayed_insert in the Command column. It will be killed if
you execute a FLUSH TABLES statement or kill it with KILL
thread_id. However, before exiting it will first store all queued rows into
the table. During this time it will not accept any new INSERT
statements from another thread. If you execute an INSERT DELAYED
statement after this, a new handler thread will be created.
Note that this means that INSERT DELAYED statements have higher
priority than normal INSERT statements if there is an INSERT
DELAYED handler already running! Other update statements will have to wait
until the INSERT DELAYED queue is empty, someone terminates the handler
thread (with KILL thread_id), or someone executes FLUSH TABLES.
The following status variables provide information about INSERT
DELAYED statements:
| Variable | Meaning |
| Delayed_insert_threads | Number of handler threads |
| Delayed_writes | Number of rows written with INSERT DELAYED |
| Not_flushed_delayed_rows | Number of rows waiting to be written |
You can view these variables by issuing a SHOW STATUS statement or
by executing a mysqladmin extended-status command.
Note that INSERT DELAYED is slower than a normal INSERT if the
table is not in use. There is also the additional overhead for the server
to handle a separate thread for each table for which there are delayed rows.
This means that you should use INSERT DELAYED only when you are
really sure you need it!
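The following sketch shows a delayed insert together with statements for inspecting the status and system variables named in this section. The access_log table and its columns are hypothetical, and the table is assumed to use the MyISAM storage engine.

```sql
-- Hypothetical logging table, assumed to be a MyISAM table.
CREATE TABLE access_log (client VARCHAR(15), msg VARCHAR(80));

-- Queue a row; the statement returns as soon as the row has been
-- placed in the handler queue.
INSERT DELAYED INTO access_log (client, msg) VALUES ('10.0.0.1', 'login');

-- Inspect the delayed-insert status counters.
SHOW STATUS LIKE 'Delayed%';
SHOW STATUS LIKE 'Not_flushed_delayed_rows';

-- Inspect the settings that control the handler thread
-- (delayed_insert_limit, delayed_insert_timeout, delayed_queue_size).
SHOW VARIABLES LIKE 'delayed%';
```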
LOAD DATA INFILE Syntax
LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name.txt'
[REPLACE | IGNORE]
INTO TABLE tbl_name
[FIELDS
[TERMINATED BY '\t']
[[OPTIONALLY] ENCLOSED BY '']
[ESCAPED BY '\\' ]
]
[LINES
[STARTING BY '']
[TERMINATED BY '\n']
]
[IGNORE number LINES]
[(col_name,...)]
The LOAD DATA INFILE statement reads rows from a text file into a
table at a very high speed.
For more information about the efficiency of INSERT versus
LOAD DATA INFILE and speeding up LOAD DATA INFILE,
see section 7.2.11 Speed of INSERT Queries.
You can also load datafiles by using the mysqlimport utility; it
operates by sending a LOAD DATA INFILE statement to the server. The
--local option causes mysqlimport to read datafiles from the
client host. You can specify the --compress option to get better
performance over slow networks if the client and server support the
compressed protocol.
See section 8.10 The mysqlimport Data Import Program.
If you specify the LOW_PRIORITY keyword, execution of the
LOAD DATA statement is delayed until no other clients are reading
from the table.
If you specify the CONCURRENT keyword with a MyISAM table that
satisfies the condition for concurrent inserts (that is, it contains no free
blocks in the middle),
then other threads can retrieve data from the table while LOAD DATA
is executing. Using this option affects the performance of LOAD DATA
a bit, even if no other thread is using the table at the same time.
The LOCAL keyword, if specified, is
interpreted with respect to the client end of the connection:
If LOCAL is specified, the file is read by the client program on the
client host and sent to the server.
If LOCAL is not specified, the
file must be located on the server host and is read directly by the server.
LOCAL is available in MySQL 3.22.6 or later.
For security reasons, when reading text files located on the server, the
files must either reside in the database directory or be readable by all.
Also, to use LOAD DATA INFILE on server files, you must have the
FILE privilege.
See section 5.4.3 Privileges Provided by MySQL.
Using LOCAL is a bit slower than letting the server access the files
directly, because the contents of the file must be sent over the connection
by the client to the server. On the other hand, you do not need the
FILE privilege to load local files.
As of MySQL 3.23.49 and MySQL 4.0.2 (4.0.13 on Windows),
LOCAL works only if your server
and your client both have been enabled to allow it. For example, if
mysqld was started with --local-infile=0, LOCAL will
not work.
See section 5.3.4 Security Issues with LOAD DATA LOCAL.
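The two variants can be contrasted with a brief sketch. The file path is hypothetical, and tbl_name stands for an existing table whose columns match the file.

```sql
-- Server-side file: requires the FILE privilege, and the file must be
-- located on the server host where the server can read it.
LOAD DATA INFILE '/tmp/data.txt' INTO TABLE tbl_name;

-- Client-side file: read by the client program and sent to the server.
-- Works only if LOCAL is enabled on both the client and the server.
LOAD DATA LOCAL INFILE '/tmp/data.txt' INTO TABLE tbl_name;
```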
If you need LOAD DATA to read from a pipe, you can use the
following technique:
mkfifo /mysql/db/x/x
chmod 666 /mysql/db/x/x
cat < /dev/tcp/10.1.1.12/4711 > /mysql/db/x/x
mysql -e "LOAD DATA INFILE 'x' INTO TABLE x" x
If you are using a version of MySQL older than 3.23.25,
you can use this technique only with LOAD DATA LOCAL INFILE.
If you are using MySQL before Version 3.23.24 you can't read from a
FIFO with LOAD DATA INFILE. If you need to read from a FIFO (for
example the output from gunzip), use LOAD DATA LOCAL INFILE
instead.
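When locating files on the server host, the server uses the following rules:
If an absolute pathname is given, the server uses the pathname as is.
If a relative pathname with one or more leading components is given,
the server searches for the file relative to the server's data directory.
If a filename with no leading components is given, the server looks for
the file in the database directory of the current database.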
Note that these rules mean a file named as `./myfile.txt' is read from
the server's data directory, whereas the same file named as `myfile.txt' is
read from the database directory of the default database. For example,
the following LOAD DATA statement reads the file `data.txt'
from the database directory for db1 because db1 is the current
database, even though the statement explicitly loads the file into a
table in the db2 database:
mysql> USE db1;
mysql> LOAD DATA INFILE 'data.txt' INTO TABLE db2.my_table;
The REPLACE and IGNORE keywords control handling of input
records that duplicate existing records on unique key values.
If you specify REPLACE, input rows replace existing rows (in other
words rows that have the same value for a primary or unique index as an
existing row). See section 14.1.6 REPLACE Syntax.
If you specify IGNORE, input rows that duplicate an existing row
on a unique key value are skipped. If you don't specify either option,
the behavior depends on whether or not the LOCAL keyword is specified.
Without LOCAL, an error occurs when a duplicate key value is
found, and the rest of the text file is ignored. With LOCAL,
the default behavior is the same as if IGNORE is specified;
this is because the server has no way to stop transmission of the file
in the middle of the operation.
If you want to ignore foreign key constraints during the load operation, you
can issue a SET FOREIGN_KEY_CHECKS=0 statement before executing
LOAD DATA.
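A sketch of that sequence follows. The file path and the child table are hypothetical, and re-enabling the checks afterward is an assumption about typical usage rather than a requirement stated here.

```sql
-- Turn off foreign key checking for this connection.
SET FOREIGN_KEY_CHECKS=0;
-- Load rows without checking them against the referenced tables.
LOAD DATA INFILE '/tmp/child.txt' INTO TABLE child;
-- Turn foreign key checking back on.
SET FOREIGN_KEY_CHECKS=1;
```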
If you use LOAD DATA INFILE on an empty MyISAM table, all
non-unique indexes are created in a separate batch (like in
REPAIR). This normally makes LOAD DATA INFILE much faster
when you have many indexes. Normally this is very fast, but in some
extreme cases you can create the indexes even faster by turning them off
with ALTER TABLE .. DISABLE KEYS before loading the file into the
table and using ALTER TABLE .. ENABLE KEYS to re-create the indexes
after loading the file.
See section 7.2.11 Speed of INSERT Queries.
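The index-disabling technique described above might look like the following sketch, where the file path is hypothetical and tbl_name stands for a MyISAM table:

```sql
-- Stop maintaining the non-unique indexes while loading.
ALTER TABLE tbl_name DISABLE KEYS;
LOAD DATA INFILE '/tmp/data.txt' INTO TABLE tbl_name;
-- Re-create the disabled indexes in one pass after the load.
ALTER TABLE tbl_name ENABLE KEYS;
```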
LOAD DATA INFILE is the complement of SELECT ... INTO OUTFILE.
See section 14.1.7 SELECT Syntax.
To write data from a table to a file, use SELECT ... INTO OUTFILE.
To read the file back into a table, use LOAD DATA INFILE.
The syntax of the FIELDS and LINES clauses is the same for
both statements. Both clauses are optional, but FIELDS
must precede LINES if both are specified.
If you specify a FIELDS clause,
each of its subclauses (TERMINATED BY, [OPTIONALLY] ENCLOSED
BY, and ESCAPED BY) is also optional, except that you must
specify at least one of them.
If you don't specify a FIELDS clause, the defaults are the
same as if you had written this:
FIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'
If you don't specify a LINES clause, the default
is the same as if you had written this:
LINES TERMINATED BY '\n' STARTING BY ''
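In other words, the defaults cause LOAD DATA INFILE to act as follows
when reading input:
Look for line boundaries at newlines.
Do not skip over any line prefix.
Break lines into fields at tabs.
Do not expect fields to be enclosed within any quoting characters.
Interpret occurrences of tab, newline, or `\' preceded by `\' as literal
characters that are part of field values.
Conversely, the defaults cause SELECT ... INTO OUTFILE to act as
follows when writing output:
Write tabs between fields.
Do not enclose fields within any quoting characters.
Use `\' to escape instances of tab, newline, or `\' that occur within
field values.
Write newlines at the ends of lines.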
Note that to write FIELDS ESCAPED BY '\\', you must specify two
backslashes for the value to be read as a single backslash.
Note: If you have generated the text file on a Windows system, you
might have to use LINES TERMINATED BY '\r\n' to read the file
properly, because Windows programs typically use two characters as a line
terminator. Some programs, such as WordPad, might use \r as a line
terminator when writing files. To read such files, use LINES
TERMINATED BY '\r'.
If all the lines you want to read in have a common prefix that you want to
ignore, you can use LINES STARTING BY 'prefix_string' to skip over
the prefix. If a line doesn't include the prefix, the entire line is
skipped.
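As a sketch, suppose that every wanted line in a hypothetical file /tmp/prefixed.txt begins with the prefix row: followed by comma-separated values (for example, row:1,abc). The prefix string, the file name, and the field separator are assumptions made for illustration.

```sql
-- Skip the 'row:' prefix on each line, then split the rest at commas.
-- According to the rule above, lines without the prefix are skipped.
LOAD DATA INFILE '/tmp/prefixed.txt' INTO TABLE tbl_name
  FIELDS TERMINATED BY ','
  LINES STARTING BY 'row:' TERMINATED BY '\n';
```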
The IGNORE number LINES option can be used to ignore lines at
the start of the file. For example, you can use IGNORE 1 LINES
to skip over an initial header line containing column names:
mysql> LOAD DATA INFILE '/tmp/file_name'
-> INTO TABLE test IGNORE 1 LINES;
When you use SELECT ... INTO OUTFILE in tandem with LOAD
DATA INFILE to write data from a database into a file and then read
the file back into the database later, the field and line handling
options for both statements must match. Otherwise, LOAD DATA
INFILE will not interpret the contents of the file properly. Suppose
you use SELECT ... INTO OUTFILE to write a file with
fields delimited by commas:
mysql> SELECT * INTO OUTFILE 'data.txt'
-> FIELDS TERMINATED BY ','
-> FROM table2;
To read the comma-delimited file back in, the correct statement would be:
mysql> LOAD DATA INFILE 'data.txt' INTO TABLE table2
-> FIELDS TERMINATED BY ',';
If instead you tried to read in the file with the statement shown here, it
wouldn't work because it instructs LOAD DATA INFILE to look for
tabs between fields:
mysql> LOAD DATA INFILE 'data.txt' INTO TABLE table2
-> FIELDS TERMINATED BY '\t';
The likely result is that each input line would be interpreted as a single field.
LOAD DATA INFILE can be used to read files obtained from
external sources, too. For example, a file in dBASE format will have
fields separated by commas and enclosed within double quotes. If lines in
the file are terminated by newlines, the statement shown here
illustrates the field and line handling options you would use to load
the file:
mysql> LOAD DATA INFILE 'data.txt' INTO TABLE tbl_name
-> FIELDS TERMINATED BY ',' ENCLOSED BY '"'
-> LINES TERMINATED BY '\n';
Any of the field or line handling options can specify an empty string
(''). If not empty, the FIELDS [OPTIONALLY] ENCLOSED BY
and FIELDS ESCAPED BY values must be a single character. The
FIELDS TERMINATED BY, LINES STARTING BY, and LINES
TERMINATED BY values can be more than one character. For example, to write
lines that are terminated by carriage return/linefeed pairs, or to read a
file containing such lines, specify a LINES TERMINATED BY '\r\n'
clause.
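As a minimal sketch of the multi-character terminator case just described, the following statement reads a file whose lines end in carriage return/linefeed pairs. The file name and table name are placeholders, and the fields are assumed to be tab-separated (the default):

```sql
-- Hypothetical file and table names. Fields are assumed to use the
-- default tab separator; only the line terminator is overridden.
LOAD DATA INFILE '/tmp/windows_export.txt' INTO TABLE tbl_name
  LINES TERMINATED BY '\r\n';
```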
To read a file containing jokes that are separated by lines consisting
of %%, you can do this:
mysql> CREATE TABLE jokes
-> (a INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
-> joke TEXT NOT NULL);
mysql> LOAD DATA INFILE '/tmp/jokes.txt' INTO TABLE jokes
-> FIELDS TERMINATED BY ''
-> LINES TERMINATED BY '\n%%\n' (joke);
FIELDS [OPTIONALLY] ENCLOSED BY controls quoting of fields. For
output (SELECT ... INTO OUTFILE), if you omit the word
OPTIONALLY, all fields are enclosed by the ENCLOSED BY
character. An example of such output (using a comma as the field
delimiter) is shown here:
"1","a string","100.20"
"2","a string containing a , comma","102.20"
"3","a string containing a \" quote","102.20"
"4","a string containing a \", quote and comma","102.20"
If you specify OPTIONALLY, the ENCLOSED BY character is
used only to enclose CHAR and VARCHAR fields:
1,"a string",100.20
2,"a string containing a , comma",102.20
3,"a string containing a \" quote",102.20
4,"a string containing a \", quote and comma",102.20
Note that occurrences of the ENCLOSED BY character within a
field value are escaped by prefixing them with the ESCAPED BY
character. Also note that if you specify an empty ESCAPED BY
value, it is possible to generate output that cannot be read properly by
LOAD DATA INFILE. For example, the preceding output just shown would
appear as follows if the escape character is empty. Observe that the
second field in the fourth line contains a comma following the quote, which
(erroneously) appears to terminate the field:
1,"a string",100.20
2,"a string containing a , comma",102.20
3,"a string containing a " quote",102.20
4,"a string containing a ", quote and comma",102.20
For input, the ENCLOSED BY character, if present, is stripped
from the ends of field values. (This is true whether or not OPTIONALLY
is specified; OPTIONALLY has no effect on input interpretation.)
Occurrences of the ENCLOSED BY character preceded by the
ESCAPED BY character are interpreted as part of the current
field value.
If the field begins with the ENCLOSED BY character, instances
of that character are recognized as terminating a field value only
if followed by the field or line TERMINATED BY sequence.
To avoid ambiguity, occurrences of the ENCLOSED BY character
within a field value can be doubled and will be interpreted as a
single instance of the character. For example, if ENCLOSED
BY '"' is specified, quotes are handled as shown here:
"The ""BIG"" boss"  -> The "BIG" boss
The "BIG" boss      -> The "BIG" boss
The ""BIG"" boss    -> The ""BIG"" boss
FIELDS ESCAPED BY controls how to write or read special characters.
If the FIELDS ESCAPED BY character is not empty, it is used to prefix
the following characters on output:
The FIELDS ESCAPED BY character
The FIELDS [OPTIONALLY] ENCLOSED BY character
The first character of the FIELDS TERMINATED BY and
LINES TERMINATED BY values
ASCII 0 (what is actually written following the escape character is
ASCII '0', not a zero-valued byte)
If the FIELDS ESCAPED BY character is empty, no characters are
escaped and NULL is output as NULL, not \N. It is
probably not a good idea to specify an empty escape character,
particularly if field values in your data contain any of the characters
in the list just given.
For input, if the FIELDS ESCAPED BY character is not empty, occurrences
of that character are stripped and the following character is taken literally
as part of a field value. The exceptions are an escaped `0' or
`N' (for example, \0 or \N if the escape character is
`\'). These sequences are interpreted as ASCII NUL (a zero-valued
byte) and NULL. The rules on NULL handling are described later
in this section.
For more information about `\'-escape syntax, see section 10.1 Literal Values.
In certain cases, field and line handling options interact:
If LINES TERMINATED BY is an empty string and FIELDS
TERMINATED BY is non-empty, lines are also terminated with
FIELDS TERMINATED BY.
If the FIELDS TERMINATED BY and FIELDS ENCLOSED BY values
are both empty (''), a fixed-row (non-delimited) format is used.
With fixed-row format, no delimiters are used between fields (but you
can still have a line terminator). Instead, column values are written
and read using the ``display'' widths of the columns. For example, if a
column is declared as INT(7), values for the column are written
using 7-character fields. On input, values for the column are obtained
by reading 7 characters.
LINES TERMINATED BY is still used to separate lines. If a line
doesn't contain all fields, the rest of the columns are set to their
default values. If you don't have a line terminator, you should set this
to ''. In this case, the text file must contain all fields for
each row.
Fixed-row format also affects handling of NULL values, as described
later.
Note that fixed-row format will not work if you are using a multi-byte
character set.
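The fixed-row rules above can be sketched as follows. The table, column, and file names are hypothetical. Based on the description above, a line such as 0000042abc would be expected to load as 42 and 'abc':

```sql
-- Hypothetical table: the display widths (7 and 3) determine how many
-- characters are read for each column in fixed-row format.
CREATE TABLE fixed_demo (id INT(7), code CHAR(3));

-- Both FIELDS values are empty, so no delimiters are expected between
-- fields; the line terminator is still used to separate rows.
LOAD DATA INFILE '/tmp/fixed.txt' INTO TABLE fixed_demo
  FIELDS TERMINATED BY '' ENCLOSED BY ''
  LINES TERMINATED BY '\n';
```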
Handling of NULL values varies according to the FIELDS and
LINES options in use:
For the default FIELDS and LINES values, NULL is
written as a field value of \N for output, and a field value of
\N is read as NULL for input (assuming the ESCAPED BY
character is `\').
If FIELDS ENCLOSED BY is not empty, a field containing the literal
word NULL as its value is read as a NULL value. This differs
from the word NULL enclosed within FIELDS ENCLOSED BY
characters, which is read as the string 'NULL'.
If FIELDS ESCAPED BY is empty, NULL is written as the word
NULL.
With fixed-row format (which happens when FIELDS TERMINATED BY and
FIELDS ENCLOSED BY are both empty), NULL is written as an empty
string. Note that this causes both NULL values and empty strings in
the table to be indistinguishable when written to the file because they are
both written as empty strings. If you need to be able to tell the two apart
when reading the file back in, you should not use fixed-row format.
Some cases are not supported by LOAD DATA INFILE:
Fixed-size rows (FIELDS TERMINATED BY and FIELDS ENCLOSED
BY both empty) and BLOB or TEXT columns.
If you specify one separator that is the same as or a prefix of another,
LOAD DATA INFILE won't be able to interpret the input properly.
For example, the following FIELDS clause would cause problems:
FIELDS TERMINATED BY '"' ENCLOSED BY '"'
If FIELDS ESCAPED BY is empty, a field value that contains an occurrence
of FIELDS ENCLOSED BY or LINES TERMINATED BY
followed by the FIELDS TERMINATED BY value will cause LOAD
DATA INFILE to stop reading a field or line too early.
This happens because LOAD DATA INFILE cannot properly determine
where the field or line value ends.
The following example loads all columns of the persondata table:
mysql> LOAD DATA INFILE 'persondata.txt' INTO TABLE persondata;
By default, when no column list is provided at the end of the LOAD
DATA INFILE statement, input lines are expected to contain a field for each
table column. If you want to load only some of a table's columns, specify a
column list:
mysql> LOAD DATA INFILE 'persondata.txt'
-> INTO TABLE persondata (col1,col2,...);
You must also specify a column list if the order of the fields in the input file differs from the order of the columns in the table. Otherwise, MySQL cannot tell how to match up input fields with table columns.
If an input line has too few fields, the table columns for which no input
field is present are set to their default values. Default value assignment
is described in section 14.2.5 CREATE TABLE Syntax.
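An empty field value is interpreted differently than if the field value is missing:
For string types, the column is set to the empty string.
For numeric types, the column is set to 0.
For date and time types, the column is set to the appropriate ``zero'' value for the type.
These are the same values that result if you assign an empty
string explicitly to a string, numeric, or date or time type
in an INSERT or UPDATE statement.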
TIMESTAMP columns are set to the current date and time only if there
is a NULL value for the column (that is, \N), or (for the
first TIMESTAMP column only) if the TIMESTAMP column is
omitted from the field list when a field list is specified.
LOAD DATA INFILE regards all input as strings, so you can't use
numeric values for ENUM or SET columns the way you can with
INSERT statements. All ENUM and SET values must be
specified as strings!
If an input row has too many fields, the extra fields are ignored and the number of warnings is incremented.
When the LOAD DATA INFILE
statement finishes it returns an information string in the following format:
Records: 1 Deleted: 0 Skipped: 0 Warnings: 0
If you are using the C API, you can get information about the statement by
calling the mysql_info() C API function.
See section 20.2.3.30 mysql_info().
Warnings occur under the same circumstances as when values are inserted
via the INSERT statement (see section 14.1.4 INSERT Syntax), except
that LOAD DATA INFILE also generates warnings when there are too few
or too many fields in the input row. The warnings are not stored anywhere;
the number of warnings can be used only as an indication of whether
everything went well.
From MySQL 4.1.1 on, you can use SHOW WARNINGS to get a list of the
first max_error_count warnings as information about what went wrong.
See section 14.5.3.20 SHOW WARNINGS Syntax.
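A short sketch of that workflow, assuming MySQL 4.1.1 or later and a placeholder file name:

```sql
-- Load the file; the information string reports a Warnings count.
LOAD DATA INFILE '/tmp/persondata.txt' INTO TABLE persondata;

-- List the warnings (up to max_error_count of them) that were
-- generated by the statement above.
SHOW WARNINGS;
```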
Before MySQL 4.1.1, only a warning count is available to indicate that
something went wrong. If you get warnings and want to know exactly why you
got them, one way to do this is to dump the table into another file using
SELECT ... INTO OUTFILE and compare the file to your original input
file.
REPLACE Syntax
REPLACE [LOW_PRIORITY | DELAYED]
[INTO] tbl_name [(col_name,...)]
VALUES ({expression | DEFAULT},...),(...),...
Or:
REPLACE [LOW_PRIORITY | DELAYED]
[INTO] tbl_name
SET col_name={expression | DEFAULT}, ...
Or:
REPLACE [LOW_PRIORITY | DELAYED]
[INTO] tbl_name [(col_name,...)]
SELECT ...
REPLACE works exactly like INSERT, except that if an old
record in the table has the same value as a new record for a PRIMARY
KEY or a UNIQUE index, the old record is deleted before the new
record is inserted.
See section 14.1.4 INSERT Syntax.
Note that unless the table has a PRIMARY KEY or UNIQUE index,
using a REPLACE statement makes no sense. It becomes equivalent to
INSERT, because there is no index to be used to determine whether a new
row duplicates another.
Values for all columns are taken from the values specified in the
REPLACE statement. Any missing columns are set to their default
values, just as happens for INSERT. You can't refer to values from
the old row and use them in the new row. It appeared that you could do this
in some old MySQL versions, but that was a bug that has been corrected.
To be able to use REPLACE, you must have INSERT and
DELETE privileges for the table.
The REPLACE statement returns a count to indicate the number of rows
affected. This is the sum of the rows deleted and inserted. If the count is 1
for a single-row REPLACE, a row was inserted and no rows were deleted.
If the count is greater than 1, one or more old rows were deleted before the
new row was inserted. It is possible for a single row to replace more than one
old row if the table contains multiple unique indexes and the new row
duplicates values for different old rows in different unique indexes.
The affected-rows count makes it easy to determine whether REPLACE
only added a row or whether it also replaced any rows: Check whether the
count is 1 (added) or greater (replaced).
If you are using the C API, the affected-rows count can be obtained using the
mysql_affected_rows() function.
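The affected-rows rule can be illustrated with a minimal sketch. The table and column names here are hypothetical:

```sql
-- Hypothetical table with a PRIMARY KEY so that REPLACE can detect
-- a duplicate row.
CREATE TABLE replace_demo (
  id   INT NOT NULL PRIMARY KEY,
  note VARCHAR(20)
);

-- No row with id=1 exists yet: one row is inserted, so the count is 1.
REPLACE INTO replace_demo (id, note) VALUES (1, 'first');

-- A row with id=1 now exists: it is deleted and the new row is
-- inserted, so the count is 2 (one deleted plus one inserted).
REPLACE INTO replace_demo (id, note) VALUES (1, 'second');
```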
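Here is the algorithm that is used, in more detail
(it is also used with LOAD DATA ... REPLACE):
1. Try to insert the new row into the table.
2. While the insertion fails because a duplicate-key error occurs for a primary or unique key:
   a. Delete from the table the conflicting row that has the duplicate key value.
   b. Try again to insert the new row into the table.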
SELECT Syntax
SELECT
[ALL | DISTINCT | DISTINCTROW ]
[HIGH_PRIORITY]
[STRAIGHT_JOIN]
[SQL_SMALL_RESULT] [SQL_BIG_RESULT] [SQL_BUFFER_RESULT]
[SQL_CACHE | SQL_NO_CACHE] [SQL_CALC_FOUND_ROWS]
select_expression,...
[INTO OUTFILE 'file_name' export_options
| INTO DUMPFILE 'file_name']
[FROM table_references
[WHERE where_definition]
[GROUP BY {col_name | expr | position}
[ASC | DESC], ... [WITH ROLLUP]]
[HAVING where_definition]
[ORDER BY {col_name | expr | position}
[ASC | DESC] ,...]
[LIMIT [offset,] row_count | row_count OFFSET offset]
[PROCEDURE procedure_name(argument_list)]
[FOR UPDATE | LOCK IN SHARE MODE]]
SELECT is used to retrieve rows selected from one or more tables.
select_expression indicates a column you want to retrieve.
table_references indicates the table or tables from which to retrieve rows.
Its syntax is described in section 14.1.7.1 JOIN Syntax.
where_definition indicates any conditions that selected rows must
satisfy.
SELECT can also be used to retrieve rows computed without reference to
any table.
For example:
mysql> SELECT 1 + 1;
-> 2
All clauses used must be given in exactly the order shown in the syntax
description. For example,
a HAVING clause must come after any GROUP BY clause and before
any ORDER BY clause.
select_expression can be given an alias using AS alias_name.
The alias is used as the expression's column name and can be used in
GROUP BY,
ORDER BY, or HAVING clauses. For example:
mysql> SELECT CONCAT(last_name,', ',first_name) AS full_name
-> FROM mytable ORDER BY full_name;
The AS keyword is optional when aliasing a select_expression.
The preceding example could have been written like this:
mysql> SELECT CONCAT(last_name,', ',first_name) full_name
-> FROM mytable ORDER BY full_name;
Because the AS is optional, a subtle problem can occur
if you forget the comma between two SELECT expressions: MySQL
interprets the second as an alias name. For example, in the following
statement, columnb is treated as an alias name:
mysql> SELECT columna columnb FROM mytable;
You cannot use a column alias in a WHERE clause,
because the column value might not yet be determined when the
WHERE clause is executed.
See section A.5.4 Problems with Column Aliases.
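One way around the alias restriction, sketched here with the mytable columns from the earlier examples, is to repeat the expression in the WHERE clause. The LIKE pattern is only illustrative:

```sql
-- Not allowed: the alias full_name is not yet known in WHERE.
-- SELECT CONCAT(last_name,', ',first_name) AS full_name
--   FROM mytable WHERE full_name LIKE 'S%';

-- Repeat the expression in the WHERE clause instead:
SELECT CONCAT(last_name,', ',first_name) AS full_name
  FROM mytable
  WHERE CONCAT(last_name,', ',first_name) LIKE 'S%';
```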
The FROM table_references clause indicates the tables from which to
retrieve rows. If you name more than one table, you are performing a
join. For information on join syntax, see section 14.1.7.1 JOIN Syntax.
For each table specified, you can optionally specify an alias.
tbl_name [[AS] alias]
[[USE INDEX (key_list)]
| [IGNORE INDEX (key_list)]
| [FORCE INDEX (key_list)]]
The use of USE INDEX, IGNORE INDEX, and FORCE INDEX
to give the optimizer hints about how to choose indexes is described in
section 14.1.7.1 JOIN Syntax.
In MySQL 4.0.14, you can use SET MAX_SEEKS_FOR_KEY=value as an
alternative way to force MySQL to prefer key scans instead of table scans.
You can refer to a table as tbl_name
(within the current database), or as db_name.tbl_name to explicitly
specify a database. You can refer to a column as col_name,
tbl_name.col_name, or db_name.tbl_name.col_name. You need not
specify a tbl_name or db_name.tbl_name prefix for a column
reference unless the reference would be ambiguous. See section 10.2 Database, Table, Index, Column, and Alias Names
for examples of ambiguity that require the more explicit column reference
forms.
You are allowed to specify DUAL as a dummy
table name in situations where no tables are referenced:
mysql> SELECT 1 + 1 FROM DUAL;
-> 2
DUAL is purely a compatibility feature. Some other servers require
this syntax.
A table reference can be aliased using tbl_name [AS] alias_name:
mysql> SELECT t1.name, t2.salary FROM employee AS t1, info AS t2
-> WHERE t1.name = t2.name;
mysql> SELECT t1.name, t2.salary FROM employee t1, info t2
-> WHERE t1.name = t2.name;
In the WHERE clause, you can use any of the functions that
MySQL supports, except for aggregate (summary) functions.
See section 13 Functions and Operators.
Columns selected for output can be referred to in ORDER BY and
GROUP BY clauses using column names, column aliases, or column
positions. Column positions are integers and begin with 1:
mysql> SELECT college, region, seed FROM tournament
-> ORDER BY region, seed;
mysql> SELECT college, region AS r, seed AS s FROM tournament
-> ORDER BY r, s;
mysql> SELECT college, region, seed FROM tournament
-> ORDER BY 2, 3;
To sort in reverse order, add the DESC (descending) keyword to the
name of the column in the ORDER BY clause that you are sorting by.
The default is ascending order; this can be specified explicitly using
the ASC keyword.
Use of column positions is deprecated because the syntax has been removed from
the SQL standard.
If you use GROUP BY, output rows are sorted according to the
GROUP BY columns as if you had an ORDER BY for the same columns.
MySQL has extended the GROUP BY clause as of version 3.23.34 so that
you can also specify ASC and DESC after columns named in the
clause:
SELECT a,COUNT(b) FROM test_table GROUP BY a DESC
MySQL has extended the use of GROUP BY to allow you to
select fields that are not mentioned in the GROUP BY clause.
If you are not getting the results you expect from your query, please
read the GROUP BY description.
See section 13.9 Functions and Modifiers for Use with GROUP BY Clauses.
GROUP BY allows a WITH ROLLUP modifier.
See section 13.9.2 GROUP BY Modifiers.
The HAVING clause can refer to any column or alias named in a
select_expression. It is applied nearly last, just before items are
sent to the client, with no optimization.
(LIMIT is applied after HAVING.)
Don't use HAVING for items that
should be in the WHERE clause. For example, do not write this:
mysql> SELECT col_name FROM tbl_name HAVING col_name > 0;
Write this instead:
mysql> SELECT col_name FROM tbl_name WHERE col_name > 0;
The HAVING clause can refer to aggregate functions, which the WHERE clause cannot:
mysql> SELECT user, MAX(salary) FROM users
-> GROUP BY user HAVING MAX(salary)>10;
However, that does not work in older MySQL servers (before version 3.22.5).
Instead, you can use a column alias in the select list and refer to the
alias in the HAVING clause:
mysql> SELECT user, MAX(salary) AS max_salary FROM users
-> GROUP BY user HAVING max_salary>10;
The LIMIT clause can be used to constrain the number of rows returned
by the SELECT statement. LIMIT takes one or two numeric
arguments, which must be integer constants.
With two arguments, the first argument specifies the offset of the first row to
return, and the second specifies the maximum number of rows to return.
The offset of the initial row is 0 (not 1):
mysql> SELECT * FROM tbl_name LIMIT 5,10;  # Retrieve rows 6-15
For compatibility with PostgreSQL, MySQL also supports the
LIMIT row_count OFFSET offset syntax.
To retrieve all rows from a certain offset up to the end of the result set,
you can use some large number for the second parameter. This statement
retrieves all rows from the 96th row to the last:
mysql> SELECT * FROM tbl_name LIMIT 95,18446744073709551615;
With one argument, the value specifies the number of rows to return from the beginning of the result set:
mysql> SELECT * FROM tbl_name LIMIT 5;     # Retrieve first 5 rows
In other words, LIMIT n is equivalent to LIMIT 0,n.
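As a brief sketch of the two equivalent spellings of an offset and a row count (tbl_name is a placeholder for a real table name):

```sql
-- Both statements skip the first 5 rows and return at most the
-- next 10, that is, rows 6-15 of the result set.
SELECT * FROM tbl_name LIMIT 5,10;
SELECT * FROM tbl_name LIMIT 10 OFFSET 5;
```

Note that the argument order differs: the comma form gives the offset first, whereas the OFFSET form gives the row count first.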
The SELECT ... INTO OUTFILE 'file_name' form of SELECT writes
the selected rows to a file. The file is created on the server host, so you
must have the FILE privilege to use this form of SELECT. The
file cannot already exist, which among other things prevents database tables
and files such as `/etc/passwd' from being destroyed.
The SELECT ... INTO OUTFILE statement is intended primarily to let
you very quickly dump a table on the server machine. If you want to create
the resulting file on some host other than the server host, you can't use
SELECT ... INTO OUTFILE. In that case, you should instead use some
client program like mysql -e "SELECT ..." > outfile to generate the file.
SELECT ... INTO OUTFILE is the complement of LOAD DATA
INFILE; the syntax for the export_options part of the statement
consists of the same FIELDS and LINES clauses that are used
with the LOAD DATA INFILE statement.
See section 14.1.5 LOAD DATA INFILE Syntax.
In the resulting text file, only the following characters are escaped by
the ESCAPED BY character:
The ESCAPED BY character
The first character in FIELDS TERMINATED BY
The first character in LINES TERMINATED BY
Additionally, ASCII 0 is converted to ESCAPED BY
followed by '0' (ASCII 48).
The reason for the above is that you must escape any FIELDS
TERMINATED BY, ESCAPED BY, or LINES TERMINATED BY
characters to reliably be able to read the file back. ASCII NUL is
escaped to make it easier to view with some pagers.
The resulting file doesn't have to conform to the SQL syntax, so nothing
else need be escaped.
Here is an example that produces a file in the comma-separated values format
used by many programs:
SELECT a,b,a+b INTO OUTFILE '/tmp/result.text'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM test_table;
If you use INTO DUMPFILE instead of INTO OUTFILE, MySQL writes
only one row into the file, without any column or line termination and
without performing any escape processing. This is useful if you want to
store a BLOB value in a file.
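A minimal sketch of storing a single BLOB value in a file this way. The table, column, and file names are hypothetical:

```sql
-- The WHERE clause is assumed to select exactly one row, because
-- INTO DUMPFILE writes only a single row to the file.
SELECT image_data INTO DUMPFILE '/tmp/image.bin'
  FROM pictures
  WHERE id = 1;
```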
Note that any file created by INTO OUTFILE or INTO
DUMPFILE is writable by all users on the server host. The reason for
this is that the MySQL server can't create a file that is owned by anyone
other than the user it's running as (you should never run mysqld as
root). The file thus must be world-writable so that you can
manipulate its contents.
The PROCEDURE clause names a procedure that should process the data
in the result set. For an example, see section 22.3.1 Procedure Analyse.
If you use FOR UPDATE on a storage engine that uses page or row locks,
the examined rows are write-locked until the end of the current
transaction.
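The locking behavior of FOR UPDATE might be used as in the following sketch. The table and column names are hypothetical, and the table is assumed to use a transactional storage engine with row locks, such as InnoDB:

```sql
BEGIN;

-- The examined row stays write-locked until COMMIT or ROLLBACK,
-- so no other transaction can change counter in the meantime.
SELECT counter FROM child_codes WHERE id = 1 FOR UPDATE;
UPDATE child_codes SET counter = counter + 1 WHERE id = 1;

COMMIT;
```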
Following the SELECT keyword, you can give a number of options
that affect the operation of the statement.
The ALL, DISTINCT, and DISTINCTROW options specify
whether duplicate rows should be returned. If none of these options are
given, the default is ALL (all matching rows are returned).
DISTINCT and DISTINCTROW are synonyms and specify that
duplicate rows in the result set should be removed.
HIGH_PRIORITY, STRAIGHT_JOIN, and options beginning with
SQL_ are MySQL extensions to standard SQL.
HIGH_PRIORITY will give the SELECT higher priority than
a statement that updates a table. You should use this only for queries
that are very fast and must be done at once. A SELECT HIGH_PRIORITY
query that is issued while the table is locked for reading will run even if
there is already an update statement waiting for the table to be free.
STRAIGHT_JOIN forces the optimizer to join the tables in the order in
which they are listed in the FROM clause. You can use this to speed up
a query if the optimizer joins the tables in non-optimal order.
See section 7.2.1 EXPLAIN Syntax (Get Information About a SELECT).
STRAIGHT_JOIN also can be used in the table_references list.
See section 14.1.7.1 JOIN Syntax.
SQL_BIG_RESULT can be used with GROUP BY or DISTINCT
to tell the optimizer that the result set will have many rows. In this case,
MySQL will directly use disk-based temporary tables if needed.
MySQL will also, in this case, prefer sorting to using a
temporary table with a key on the GROUP BY elements.
SQL_BUFFER_RESULT forces the result to be put into a temporary
table. This helps MySQL free the table locks early and helps
in cases where it takes a long time to send the result set to the client.
SQL_SMALL_RESULT can be used
with GROUP BY or DISTINCT to tell the optimizer that the
result set will be small. In this case, MySQL uses fast
temporary tables to store the resulting table instead of using sorting. In
MySQL 3.23 and up, this shouldn't normally be needed.
SQL_CALC_FOUND_ROWS (available in MySQL 4.0.0 and up) tells MySQL
to calculate how many rows there would be in the result set, disregarding
any LIMIT clause. The number of rows can then be retrieved with
SELECT FOUND_ROWS().
See section 13.8.3 Information Functions.
Before MySQL 4.1.0, this option does not work with
LIMIT 0, which is optimized to return instantly (resulting in a
row count of 0).
See section 7.2.10 How MySQL Optimizes LIMIT.
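A typical use of SQL_CALC_FOUND_ROWS, sketched with placeholder table and column names, pairs it with a second statement that calls FOUND_ROWS():

```sql
-- Return at most 10 rows, but have the server also count every row
-- that matches the WHERE clause.
SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name
  WHERE id > 100 LIMIT 10;

-- Number of rows the previous SELECT would have returned
-- had it been written without the LIMIT clause.
SELECT FOUND_ROWS();
```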
SQL_CACHE tells MySQL to store the query result in the query cache if
you are using a query_cache_type value of 2 or DEMAND.
For a query that uses UNION or subqueries, this option takes effect
if it is used in any SELECT of the query.
See section 5.10 The MySQL Query Cache.
SQL_NO_CACHE tells MySQL not to store the query result
in the query cache. See section 5.10 The MySQL Query Cache.
For a query that uses UNION or subqueries, this
option takes effect if it is used in any SELECT of the query.
JOIN Syntax
MySQL supports the following JOIN syntaxes for the
table_references part of SELECT statements and multiple-table
DELETE and UPDATE statements:
table_reference, table_reference
table_reference [INNER | CROSS] JOIN table_reference [join_condition]
table_reference STRAIGHT_JOIN table_reference
table_reference LEFT [OUTER] JOIN table_reference [join_condition]
table_reference NATURAL [LEFT [OUTER]] JOIN table_reference
{ OJ table_reference LEFT OUTER JOIN table_reference
ON conditional_expr }
table_reference RIGHT [OUTER] JOIN table_reference [join_condition]
table_reference NATURAL [RIGHT [OUTER]] JOIN table_reference
Where table_reference is defined as:
tbl_name [[AS] alias]
[[USE INDEX (key_list)]
| [IGNORE INDEX (key_list)]
| [FORCE INDEX (key_list)]]
and join_condition is defined as:
ON conditional_expr | USING (column_list)
You should generally not have any conditions in the ON part that are
used to restrict which rows you want in the result set, but rather specify
these conditions in the WHERE clause. There are exceptions to this rule.
Note that INNER JOIN syntax allows a join_condition only from
MySQL 3.23.17 on. The same is true for JOIN and CROSS JOIN only
as of MySQL 4.0.11.
The {OJ ... LEFT OUTER JOIN ...} syntax shown in the preceding list
exists only for compatibility with ODBC.
A table reference can be aliased using tbl_name AS alias_name or
tbl_name alias_name:
mysql> SELECT t1.name, t2.salary FROM employee AS t1, info AS t2
-> WHERE t1.name = t2.name;
The ON conditional is any conditional of the form that can be used in
a WHERE clause.
If there is no matching record for the right table in the ON or
USING part in a LEFT JOIN, a row with all columns set to
NULL is used for the right table. You can use this fact to find
records in a table that have no counterpart in another table:
mysql> SELECT table1.* FROM table1
-> LEFT JOIN table2 ON table1.id=table2.id
-> WHERE table2.id IS NULL;
This example finds all rows in table1 with an id value that is
not present in table2 (that is, all rows in table1 with no
corresponding row in table2). This assumes that table2.id is
declared NOT NULL.
See section 7.2.8 How MySQL Optimizes LEFT JOIN and RIGHT JOIN.
The USING (column_list) clause names a list of columns that must
exist in both tables. The following two clauses are semantically identical:
a LEFT JOIN b USING (c1,c2,c3)
a LEFT JOIN b ON a.c1=b.c1 AND a.c2=b.c2 AND a.c3=b.c3
The NATURAL [LEFT] JOIN of two tables is defined to be
semantically equivalent to an INNER JOIN or a LEFT JOIN
with a USING clause that names all columns that exist in both
tables.
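The NATURAL JOIN equivalence can be sketched with two hypothetical tables whose only common column name is id:

```sql
CREATE TABLE owners (id INT NOT NULL, name VARCHAR(20));
CREATE TABLE pets   (id INT NOT NULL, species VARCHAR(20));

-- Because id is the only column that exists in both tables, these
-- two statements match rows in the same way:
SELECT * FROM owners NATURAL LEFT JOIN pets;
SELECT * FROM owners LEFT JOIN pets USING (id);
```

If a second column with the same name were later added to both tables, the NATURAL form would silently start joining on it as well, whereas the USING form would not.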
INNER JOIN and , (comma) are semantically equivalent in
the absence of a join condition: both will produce a Cartesian product
between the specified tables (that is, each and every row in the first table
will be joined onto all rows in the second table).
RIGHT JOIN works analogously to LEFT JOIN. To keep code
portable across databases, it's recommended to use LEFT JOIN
instead of RIGHT JOIN.
STRAIGHT_JOIN is identical to JOIN, except that the left table
is always read before the right table. This can be used for those (few)
cases in which the join optimizer puts the tables in the wrong order.
As of MySQL 3.23.12, you can give hints about which index MySQL
should use when retrieving information from a table. By specifying
USE INDEX (key_list), you can tell MySQL to use only one of the
possible indexes to find rows in the table. The alternative syntax
IGNORE INDEX (key_list) can be used to tell MySQL to not use some
particular index. These hints are useful if EXPLAIN shows that MySQL
is using the wrong index from the list of possible indexes.
From MySQL 4.0.9 on, you can also use FORCE INDEX. This acts like
USE INDEX (key_list) but with the addition that a table scan
is assumed to be very expensive. In other words, a table scan will
only be used if there is no way to use one of the given indexes to
find rows in the table.
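A FORCE INDEX hint is written in the same position as USE INDEX. In this sketch, table1, key1, and key2 are placeholder names, and MySQL 4.0.9 or later is assumed:

```sql
-- A table scan is considered only if the key1 index cannot be
-- used at all to find the rows.
SELECT * FROM table1 FORCE INDEX (key1)
  WHERE key1=1 AND key2=2;
```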
USE KEY, IGNORE KEY, and FORCE KEY are synonyms for
USE INDEX, IGNORE INDEX, and FORCE INDEX.
Note: USE INDEX, IGNORE INDEX, and FORCE INDEX
only affect which indexes are used when MySQL decides how to find rows in
the table and how to do the join. They do not affect whether an index will
be used when resolving an ORDER BY or GROUP BY.
Some examples:
mysql> SELECT * FROM table1,table2 WHERE table1.id=table2.id;
mysql> SELECT * FROM table1 LEFT JOIN table2 ON table1.id=table2.id;
mysql> SELECT * FROM table1 LEFT JOIN table2 USING (id);
mysql> SELECT * FROM table1 LEFT JOIN table2 ON table1.id=table2.id
-> LEFT JOIN table3 ON table2.id=table3.id;
mysql> SELECT * FROM table1 USE INDEX (key1,key2)
-> WHERE key1=1 AND key2=2 AND key3=3;
mysql> SELECT * FROM table1 IGNORE INDEX (key3)
-> WHERE key1=1 AND key2=2 AND key3=3;
See section 7.2.8 How MySQL Optimizes LEFT JOIN and RIGHT JOIN.
UNION Syntax
SELECT ...
UNION [ALL | DISTINCT]
SELECT ...
[UNION [ALL | DISTINCT]
SELECT ...]
UNION is used to combine the result from many SELECT
statements into one result set. UNION is available from MySQL 4.0.0
on.
The columns listed in the select_expression portion of the SELECT
should have the same type. The column names used in the first
SELECT query will be used as the column names for the results
returned.
The SELECT statements are normal select statements, but with the following
restrictions:
Only the last SELECT statement can have INTO OUTFILE.
If you don't use the keyword ALL for the UNION, all
returned rows will be unique, as if you had done a DISTINCT for
the total result set. If you specify ALL, you will get all
matching rows from all the used SELECT statements.
The DISTINCT keyword is an optional word (introduced in MySQL 4.0.17).
It does nothing, but is required by the SQL standard.
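The difference between the default behavior and ALL can be seen with two constant queries, which need no tables (MySQL 4.0.0 or later is assumed):

```sql
-- Duplicate rows are removed by default: this returns a single row.
SELECT 1 UNION SELECT 1;

-- With ALL, every row from every SELECT is kept: this returns two rows.
SELECT 1 UNION ALL SELECT 1;
```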
If you want to use an ORDER BY to sort the entire UNION result,
you should use parentheses:
(SELECT a FROM tbl_name WHERE a=10 AND B=1 ORDER BY a LIMIT 10)
UNION
(SELECT a FROM tbl_name WHERE a=11 AND B=2 ORDER BY a LIMIT 10)
ORDER BY a;
Note: You cannot mix UNION ALL and UNION
DISTINCT in the same query yet. If you use ALL for one
UNION then it is used for all of them.
The types and lengths of the columns in the result set of a UNION
take into account the values retrieved by all the SELECT statements.
Before MySQL 4.1.1, a limitation of UNION is that only the values from
the first SELECT were used to determine result column types and lengths.
This could result in value truncation if, for example, the first
SELECT retrieves shorter values than the second SELECT:
mysql> SELECT REPEAT('a',1) UNION SELECT REPEAT('b',10);
+---------------+
| REPEAT('a',1) |
+---------------+
| a             |
| b             |
+---------------+
That limitation has been removed as of MySQL 4.1.1:
mysql> SELECT REPEAT('a',1) UNION SELECT REPEAT('b',10);
+---------------+
| REPEAT('a',1) |
+---------------+
| a             |
| bbbbbbbbbb    |
+---------------+
Subquery Syntax
A subquery is a SELECT statement inside another statement.
For example:
SELECT * FROM t1 WHERE column1 = (SELECT column1 FROM t2);
In this example, SELECT * FROM t1 ... is the outer query
(or outer statement), and (SELECT column1 FROM t2) is the
subquery.
We say that the subquery is nested in the outer query, and in fact
it's possible to nest subqueries within other subqueries, to a great depth.
A subquery must always appear within parentheses.
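The main advantages of subqueries are:
They allow queries that are structured so that it's possible to isolate each part of a statement.
They provide alternative ways to perform operations that would otherwise require complex joins and unions.
They are, in many people's opinion, readable. Indeed, it was the innovation of subqueries that gave people the original idea of calling the early SQL ``Structured Query Language''.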
Starting with MySQL 4.1, all subquery forms and operations that the SQL standard requires are supported, as well as a few features that are MySQL-specific.
With earlier MySQL versions it was necessary to work around or avoid the use of subqueries, but people starting to write code now will find that subqueries are a very useful part of the MySQL toolkit.
Here is an example statement that shows the major points about subquery syntax as specified by the SQL standard and supported in MySQL:
DELETE FROM t1
WHERE s11 > ANY
(SELECT COUNT(*) /* no hint */ FROM t2
WHERE NOT EXISTS
(SELECT * FROM t3
WHERE ROW(5*t2.s1,77)=
(SELECT 50,11*s1 FROM t4 UNION SELECT 50,77 FROM
(SELECT * FROM t5) AS t5)));
For MySQL versions prior to 4.1, most subqueries can be successfully rewritten using joins and other methods. See section 14.1.8.11 Rewriting Subqueries for Earlier MySQL Versions.
In its simplest form (the scalar subquery as opposed to the
row or table subqueries that are discussed later),
a subquery is a simple operand. Thus, you can use it wherever a column value
or literal is legal, and you can expect it to have those characteristics
that all operands have: a data type, a length, an indication whether it can
be NULL, and so on.
For example:
CREATE TABLE t1 (s1 INT, s2 CHAR(5) NOT NULL);
SELECT (SELECT s2 FROM t1);
The subquery in this SELECT has a data type of CHAR,
a length of 5, a character set and collation equal to the defaults in
effect at CREATE TABLE time, and an indication that its value
can be NULL, even though column s2 itself is declared NOT NULL.
In fact, almost all subqueries can be NULL, because if the table is
empty, as in the example, the value of the subquery is NULL.
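The NULL result for an empty table can be checked with a short script. This is only a sketch: it uses Python's bundled sqlite3 module as a stand-in for a MySQL server (an assumption made so that the example runs without any setup), and relies on the standard rule that a scalar subquery over zero rows yields NULL. The inserted row (1, 'abc') is invented sample data.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for a MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (s1 INT, s2 CHAR(5) NOT NULL)")

# t1 is empty, so the scalar subquery has no row to take a value from.
empty_result = conn.execute("SELECT (SELECT s2 FROM t1)").fetchone()[0]

# With one row present, the same subquery yields that row's s2 value.
conn.execute("INSERT INTO t1 VALUES (1, 'abc')")
filled_result = conn.execute("SELECT (SELECT s2 FROM t1)").fetchone()[0]

# SQL NULL arrives in Python as None.
print(empty_result, filled_result)
```

The NOT NULL declaration on s2 constrains stored rows only; it says nothing about the value of a subquery that finds no rows.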
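There are few restrictions.
The outer statement of a subquery can be any one of:
SELECT, INSERT, UPDATE, DELETE,
SET, or DO.
A subquery can contain anything that an ordinary SELECT can contain:
DISTINCT, GROUP BY, ORDER BY, LIMIT,
joins, hints, UNION constructs, comments, functions, and so on.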
So, when you see examples in the following sections that contain the rather
spartan construct (SELECT column1 FROM t1), imagine that your own
code will contain much more diverse and complex constructions.
For example, suppose we make two tables:
CREATE TABLE t1 (s1 INT);
INSERT INTO t1 VALUES (1);
CREATE TABLE t2 (s1 INT);
INSERT INTO t2 VALUES (2);
Then perform a SELECT:
SELECT (SELECT s1 FROM t2) FROM t1;
The result will be 2 because there is a row in t2 containing a
column s1 that has a value of 2.
The subquery can be part of an expression. If it is an operand for a function, don't forget the parentheses. For example:
SELECT UPPER((SELECT s1 FROM t1)) FROM t2;
The most common use of a subquery is in the form:
non_subquery_operand comparison_operator (subquery)
Where comparison_operator is one of:
= > < >= <= <>
For example:
... 'a' = (SELECT column1 FROM t1)
At one time the only legal place for a subquery was on the right side of a comparison, and you might still find some old DBMSs which insist on that.
Here is an example of a common-form subquery comparison that you can't do
with a join. It finds all the values in table t1 that are equal to a
maximum value in table t2:
SELECT column1 FROM t1
WHERE column1 = (SELECT MAX(column2) FROM t2);
Here is another example, which again is impossible with a join because it
involves aggregating for one of the tables. It finds all rows in table
t1 whose column1 value occurs exactly twice in t1:
SELECT * FROM t1 AS t
WHERE 2 = (SELECT COUNT(*) FROM t1 WHERE t1.column1 = t.column1);
ANY, IN, and SOME
Syntax:
operand comparison_operator ANY (subquery)
operand IN (subquery)
operand comparison_operator SOME (subquery)
The ANY keyword, which must follow a comparison operator, means
``return TRUE if the comparison is TRUE for ANY of the
rows that the subquery returns.''
For example,
SELECT s1 FROM t1 WHERE s1 > ANY (SELECT s1 FROM t2);
Suppose that there is a row in table t1 containing (10).
The expression is TRUE if table t2 contains (21,14,7) because
there is a value 7 in t2 that is less than 10.
The expression is FALSE if table t2 contains (20,10),
or if table t2 is empty.
The expression is UNKNOWN if table t2 contains
(NULL,NULL,NULL).
The word IN is an alias for = ANY. Thus these two statements
are the same:
SELECT s1 FROM t1 WHERE s1 = ANY (SELECT s1 FROM t2);
SELECT s1 FROM t1 WHERE s1 IN (SELECT s1 FROM t2);
The word SOME is an alias for ANY. Thus these two statements
are the same:
SELECT s1 FROM t1 WHERE s1 <> ANY (SELECT s1 FROM t2);
SELECT s1 FROM t1 WHERE s1 <> SOME (SELECT s1 FROM t2);
Use of the word SOME is rare, but this example shows why it might be
useful. To most people's ears, the English phrase ``a is not equal to any
b'' means ``there is no b which is equal to a,'' but that isn't what is
meant by the SQL syntax. Using <> SOME instead helps ensure that
everyone understands the true meaning of the query.
ALL
Syntax:
operand comparison_operator ALL (subquery)
The word ALL, which must follow a comparison operator, means
``return TRUE if the comparison is TRUE for ALL of
the rows that the subquery returns''.
For example,
SELECT s1 FROM t1 WHERE s1 > ALL (SELECT s1 FROM t2);
Suppose that there is a row in table t1 containing (10).
The expression is TRUE if table t2 contains (-5,0,+5)
because 10 is greater than all three values in t2.
The expression is FALSE if table t2 contains
(12,6,NULL,-100) because there is a single value 12 in table t2
that is greater than 10.
The expression is UNKNOWN if table t2 contains (0,NULL,1).
Finally, if table t2 is empty, the result is TRUE.
You might think the result should be UNKNOWN, but
sorry, it's TRUE. So, rather oddly,
SELECT * FROM t1 WHERE 1 > ALL (SELECT s1 FROM t2);
is TRUE when table t2 is empty, but
SELECT * FROM t1 WHERE 1 > (SELECT s1 FROM t2);
is UNKNOWN when table t2 is empty. In addition,
SELECT * FROM t1 WHERE 1 > ALL (SELECT MAX(s1) FROM t2);
is UNKNOWN when table t2 is empty.
In general, tables with NULL values and empty tables are
edge cases. When writing subquery code, always consider whether
you have taken those two possibilities into account.
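The TRUE, FALSE, and UNKNOWN cases listed for ANY and ALL follow one pattern, which the sketch below models in plain Python. It is a model of the rules as stated in this section, not MySQL code: the function names are hypothetical, None plays the role of both NULL and UNKNOWN, and a list stands in for the rows that the subquery returns.

```python
import operator

def sql_compare(op, left, right):
    """Compare the SQL way: a NULL (None) operand makes the result UNKNOWN (None)."""
    if left is None or right is None:
        return None
    return op(left, right)

def any_subquery(op, operand, rows):
    """operand <op> ANY (rows): TRUE if some comparison is TRUE, FALSE if
    every comparison is FALSE (or there are no rows), otherwise UNKNOWN."""
    results = [sql_compare(op, operand, r) for r in rows]
    if any(r is True for r in results):
        return True
    if all(r is False for r in results):
        return False
    return None

def all_subquery(op, operand, rows):
    """operand <op> ALL (rows): FALSE if some comparison is FALSE, TRUE if
    every comparison is TRUE (or there are no rows), otherwise UNKNOWN."""
    results = [sql_compare(op, operand, r) for r in rows]
    if any(r is False for r in results):
        return False
    if all(r is True for r in results):
        return True
    return None

gt = operator.gt
# The ANY cases from the text, for a t1 row containing 10.
print(any_subquery(gt, 10, [21, 14, 7]), any_subquery(gt, 10, [20, 10]),
      any_subquery(gt, 10, []), any_subquery(gt, 10, [None, None, None]))
# The ALL cases from the text, for a t1 row containing 10.
print(all_subquery(gt, 10, [-5, 0, 5]), all_subquery(gt, 10, [12, 6, None, -100]),
      all_subquery(gt, 10, [0, None, 1]), all_subquery(gt, 10, []))
```

The empty-table results fall out of the definitions: with no rows there is no comparison that could be TRUE (so ANY is FALSE) and none that could be FALSE (so ALL is TRUE).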
A correlated subquery is a subquery that contains a reference to a column that also appears in the outer query. For example:
SELECT * FROM t1 WHERE column1 = ANY
(SELECT column1 FROM t2 WHERE t2.column2 = t1.column2);
Notice that the subquery contains a reference to a column
of t1, even though the subquery's FROM clause doesn't mention
a table t1. So MySQL looks outside the subquery, and finds t1 in the
outer query.
Suppose that table t1 contains a row where column1 = 5 and
column2 = 6; meanwhile table t2 contains a row where
column1 = 5 and column2 = 7. The simple expression
... WHERE column1 = ANY (SELECT column1 FROM t2) would be
TRUE, but in this example the WHERE clause within the
subquery is FALSE (because 7 <> 6), so the subquery returns no rows and
the expression as a whole is FALSE.
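The effect of the correlation can be checked with the two rows from this example. The following is a sketch only: it runs against an in-memory SQLite database through Python's sqlite3 module rather than against MySQL (an assumption made for runnability), and it writes the comparison as IN, which this section describes as an alias for = ANY, because SQLite does not accept the ANY keyword.

```python
import sqlite3

# Assumption: SQLite stands in for a MySQL server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (column1 INT, column2 INT);
    CREATE TABLE t2 (column1 INT, column2 INT);
    INSERT INTO t1 VALUES (5, 6);
    INSERT INTO t2 VALUES (5, 7);
""")

# Without the correlation, the t1 row matches because 5 appears in t2.column1.
uncorrelated = conn.execute(
    "SELECT * FROM t1 WHERE column1 IN (SELECT column1 FROM t2)"
).fetchall()

# With the correlation, the subquery only sees t2 rows whose column2 equals
# the column2 of the current t1 row; here there are none.
correlated = conn.execute(
    "SELECT * FROM t1 WHERE column1 IN "
    "(SELECT column1 FROM t2 WHERE t2.column2 = t1.column2)"
).fetchall()

print(uncorrelated, correlated)
```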
Scoping rule: MySQL evaluates from inside to outside. For example:
SELECT column1 FROM t1 AS x
WHERE x.column1 = (SELECT column1 FROM t2 AS x
WHERE x.column1 = (SELECT column1 FROM t3
WHERE x.column2 = t3.column1));
In this statement, x.column2 must be a column in table t2 because
SELECT column1 FROM t2 AS x ... renames t2. It is not a
column in table t1 because SELECT column1 FROM t1 ... is an
outer query that is further out.
For subqueries in HAVING or ORDER BY clauses, MySQL also
looks for column names in the outer select list.
For certain cases, a correlated subquery is optimized. For example:
val IN (SELECT key_val FROM tbl_name WHERE correlated_condition)
Otherwise, correlated subqueries are inefficient and likely to be slow. Rewriting the query as a join might improve performance.
EXISTS and NOT EXISTS
If a subquery returns any rows at all, then EXISTS subquery is
TRUE, and NOT EXISTS subquery is FALSE.
For example:
SELECT column1 FROM t1 WHERE EXISTS (SELECT * FROM t2);
Traditionally, an EXISTS subquery starts with SELECT * but it
could begin with SELECT 5 or SELECT column1 or anything at
all. MySQL ignores the SELECT list in such a subquery, so it
doesn't matter.
For the preceding example, if t2 contains any rows, even rows with
nothing but NULL values, then the EXISTS condition is
TRUE. This is actually an unlikely example, since almost always a
[NOT] EXISTS subquery will contain correlations.
Here are some more realistic examples:
SELECT DISTINCT store_type FROM Stores
  WHERE EXISTS (SELECT * FROM Cities_Stores
                WHERE Cities_Stores.store_type = Stores.store_type);
SELECT DISTINCT store_type FROM Stores
  WHERE NOT EXISTS (SELECT * FROM Cities_Stores
                    WHERE Cities_Stores.store_type = Stores.store_type);
SELECT DISTINCT store_type FROM Stores S1
  WHERE NOT EXISTS (
    SELECT * FROM Cities WHERE NOT EXISTS (
      SELECT * FROM Cities_Stores
       WHERE Cities_Stores.city = Cities.city
       AND Cities_Stores.store_type = S1.store_type));
The last example is a double-nested NOT EXISTS query. It has a
NOT EXISTS clause within a NOT EXISTS clause. Formally, it
answers the question ``does a city exist with a store which is not in
Stores?''. But it's easier to say that a nested NOT EXISTS answers
the question ``is x TRUE for all y?''.
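To see what the three queries return, it helps to run them on a small data set. The sketch below does so with Python's sqlite3 module standing in for a MySQL server (an assumption for runnability). The city and store names are invented sample data, the helper store_types() is hypothetical, and the third query refers to the outer table through its alias S1.

```python
import sqlite3

# Assumption: SQLite stands in for a MySQL server; all rows are sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Stores (store_type TEXT);
    CREATE TABLE Cities (city TEXT);
    CREATE TABLE Cities_Stores (city TEXT, store_type TEXT);
    INSERT INTO Stores VALUES ('bakery'), ('bookshop'), ('florist');
    INSERT INTO Cities VALUES ('Oslo'), ('Rome');
    INSERT INTO Cities_Stores VALUES
        ('Oslo', 'bakery'), ('Rome', 'bakery'), ('Oslo', 'bookshop');
""")

def store_types(sql):
    """Run a one-column query and return its values as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))

# Store types found in at least one city.
in_some_city = store_types("""
    SELECT DISTINCT store_type FROM Stores
    WHERE EXISTS (SELECT * FROM Cities_Stores
                  WHERE Cities_Stores.store_type = Stores.store_type)""")

# Store types found in no city.
in_no_city = store_types("""
    SELECT DISTINCT store_type FROM Stores
    WHERE NOT EXISTS (SELECT * FROM Cities_Stores
                      WHERE Cities_Stores.store_type = Stores.store_type)""")

# Double NOT EXISTS: store types for which no city lacks that store type.
in_every_city = store_types("""
    SELECT DISTINCT store_type FROM Stores AS S1
    WHERE NOT EXISTS (
        SELECT * FROM Cities WHERE NOT EXISTS (
            SELECT * FROM Cities_Stores
            WHERE Cities_Stores.city = Cities.city
              AND Cities_Stores.store_type = S1.store_type))""")

print(in_some_city, in_no_city, in_every_city)
```

With this data, only the bakery appears in both cities, so it is the only store type for which the inner NOT EXISTS finds no "missing" city.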
The discussion to this point has been of column (or scalar) subqueries: subqueries that return a single column value. A row subquery is a subquery variant that returns a single row value -- and can thus return more than one column value. Here are two examples:
SELECT * FROM t1 WHERE (1,2) = (SELECT column1, column2 FROM t2);
SELECT * FROM t1 WHERE ROW(1,2) = (SELECT column1, column2 FROM t2);
The queries here are both TRUE if table t2 has
a row where column1 = 1 and column2 = 2.
The expressions (1,2) and ROW(1,2) are sometimes called
row constructors. The two are equivalent.
They are legal in other contexts, too. For example, the following two
statements are semantically equivalent (though currently only the second one
can be optimized):
SELECT * FROM t1 WHERE (column1,column2) = (1,1);
SELECT * FROM t1 WHERE column1 = 1 AND column2 = 1;
The normal use of row constructors, though, is for comparisons with subqueries that return two or more columns. For example, the following query answers the request, ``find all rows in table t1 which are duplicated in table t2'':
SELECT column1,column2,column3
FROM t1
WHERE (column1,column2,column3) IN
(SELECT column1,column2,column3 FROM t2);
FROM clause
Subqueries are legal in a SELECT statement's FROM clause.
The syntax that you'll actually see is:
SELECT ... FROM (subquery) AS name ...
The AS name clause is mandatory, because every table in a
FROM clause must have a name. Any columns in the subquery
select list must have unique names. You can find this syntax described
elsewhere in this manual, where the term used is ``derived tables''.
For illustration, assume you have this table:
CREATE TABLE t1 (s1 INT, s2 CHAR(5), s3 FLOAT);
Here's how to use a subquery in the FROM clause, using
the example table:
INSERT INTO t1 VALUES (1,'1',1.0);
INSERT INTO t1 VALUES (2,'2',2.0);
SELECT sb1,sb2,sb3
FROM (SELECT s1 AS sb1, s2 AS sb2, s3*2 AS sb3 FROM t1) AS sb
WHERE sb1 > 1;
Result: 2, '2', 4.0.
Here's another example: Suppose you want to know the average of the sum for a grouped table. This won't work:
SELECT AVG(SUM(column1)) FROM t1 GROUP BY column1;
But this query will provide the desired information:
SELECT AVG(sum_column1)
FROM (SELECT SUM(column1) AS sum_column1
FROM t1 GROUP BY column1) AS t1;
Notice that the column name used within the subquery
(sum_column1) is recognized in the outer query.
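The two-step aggregation can be tried out on sample data. This is a sketch under stated assumptions: Python's sqlite3 module stands in for a MySQL server, the five rows are invented, and the exact error raised for the nested aggregate differs between database systems (the script only records that the statement is rejected).

```python
import sqlite3

# Assumption: SQLite stands in for a MySQL server; the rows are sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (column1 INT);
    INSERT INTO t1 VALUES (1), (1), (2), (2), (2);
""")

# Nesting one aggregate directly inside another is rejected.
try:
    conn.execute("SELECT AVG(SUM(column1)) FROM t1 GROUP BY column1").fetchall()
    nested_rejected = False
except sqlite3.Error:
    nested_rejected = True

# The derived table computes one sum per group (2 and 6 here);
# the outer query then averages those sums.
average_of_sums = conn.execute("""
    SELECT AVG(sum_column1)
    FROM (SELECT SUM(column1) AS sum_column1
          FROM t1 GROUP BY column1) AS t1""").fetchone()[0]

print(nested_rejected, average_of_sums)
```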
At the moment, subqueries in the FROM clause cannot be correlated
subqueries.
Subqueries in the FROM clause are executed (that is, derived temporary
tables are built) even for the EXPLAIN statement, because upper-level
queries need information about all tables during the optimization phase.
There are some new error returns that apply only to subqueries. This section groups them together because reviewing them will help remind you of some points.
ERROR 1235 (ER_NOT_SUPPORTED_YET)
SQLSTATE = 42000
Message = "This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'"
This means that
SELECT * FROM t1 WHERE s1 IN (SELECT s2 FROM t2 ORDER BY s1 LIMIT 1)
will not work, but only in some early versions, such as MySQL 4.1.1.
ERROR 1240 (ER_CARDINALITY_COL)
SQLSTATE = 21000
Message = "Operand should contain 1 column(s)"
This error will occur in cases like this:
SELECT (SELECT column1, column2 FROM t2) FROM t1;
It's okay to use a subquery that returns multiple columns, if the purpose is comparison. See section 14.1.8.7 Row Subqueries. But in other contexts the subquery must be a scalar operand.
ERROR 1241 (ER_SUBSELECT_NO_1_ROW)
SQLSTATE = 21000
Message = "Subquery returns more than 1 row"
This error will occur in cases like this:
SELECT * FROM t1 WHERE column1 = (SELECT column1 FROM t2);
but only when there is more than one row in
t2. That means this
error might occur in code that has been working for years, because somebody
happened to make a change that affected the number of rows that the
subquery can return. Remember that if the object is to find any number of
rows, not just one, then the correct statement would look like this:
SELECT * FROM t1 WHERE column1 = ANY (SELECT column1 FROM t2);
ERROR 1093 (ER_UPDATE_TABLE_USED)
SQLSTATE = HY000
Message = "You can't specify target table 'x' for update in FROM clause"
This error will occur in cases like this:
UPDATE t1 SET column2 = (SELECT MAX(column1) FROM t1);
It's okay to use a subquery for assignment within an
UPDATE statement, since subqueries are legal in UPDATE
and DELETE statements as well as in SELECT statements.
However, you cannot use the same table, in this case table t1, for
both the subquery's FROM clause and the update target.
Usually, failure of the subquery causes the entire statement to fail.
Development is ongoing, so no optimization tip is reliable for the long term. Some interesting tricks that you might want to play with are:
SELECT * FROM t1 WHERE t1.column1 IN (SELECT column1 FROM t2 ORDER BY column1);
SELECT * FROM t1 WHERE t1.column1 IN (SELECT DISTINCT column1 FROM t2);
SELECT * FROM t1 WHERE EXISTS (SELECT * FROM t2 LIMIT 1);
Use this query:
SELECT DISTINCT column1 FROM t1 WHERE t1.column1 IN (SELECT column1 FROM t2);
Instead of this query:
SELECT DISTINCT t1.column1 FROM t1, t2 WHERE t1.column1 = t2.column1;
Use this query:
SELECT * FROM t1 WHERE s1 IN (SELECT s1 FROM t1 UNION ALL SELECT s1 FROM t2);
Instead of this query:
SELECT * FROM t1 WHERE s1 IN (SELECT s1 FROM t1) OR s1 IN (SELECT s1 FROM t2);
For another example, use this query:
SELECT (SELECT column1 + 5 FROM t1) FROM t2;
Instead of this query:
SELECT (SELECT column1 FROM t1) + 5 FROM t2;
Use this query:
SELECT * FROM t1 WHERE (column1,column2) IN (SELECT column1,column2 FROM t2);
Instead of this query:
SELECT * FROM t1 WHERE EXISTS (SELECT * FROM t2 WHERE t2.column1=t1.column1 AND t2.column2=t1.column2);
Use NOT (a = ANY (...)) rather than a <> ALL (...).
Use x = ANY (table containing (1,2)) rather than
x=1 OR x=2.
Use = ANY rather than EXISTS.
These tricks might cause programs to go faster or slower. Using MySQL
facilities like the BENCHMARK() function, you can get an idea about
what helps in your own situation. Don't worry too much about transforming
to joins except for compatibility with older versions of MySQL (before 4.1)
that do not support subqueries.
Some optimizations that MySQL itself will make are:
MySQL executes non-correlated subqueries only once. Use EXPLAIN
to make sure that a given subquery really is non-correlated.
MySQL rewrites IN/ALL/ANY/SOME subqueries
in an attempt to take advantage of the possibility that the select-list
columns in the subquery are indexed.
MySQL replaces subqueries of the following form:
... IN (SELECT indexed_column FROM single_table ...)
with an index-lookup function, which
EXPLAIN will describe as a
special join type.
MySQL replaces expressions of the following form:
value {ALL|ANY|SOME} {> | < | >= | <=} (non-correlated subquery)
with an expression involving MIN() or MAX() (unless NULL
values or empty sets are involved). For example, this WHERE clause:
WHERE 5 > ALL (SELECT x FROM t)
might be treated by the optimizer like this:
WHERE 5 > (SELECT MAX(x) FROM t)
There is a chapter titled ``How MySQL Transforms Subqueries'' in the MySQL Internals Manual. You can obtain this document by downloading the MySQL source package and looking for a file named `internals.texi' in the `Docs' directory.
Before MySQL 4.1, only nested queries of the form
INSERT ... SELECT ... and REPLACE ... SELECT ...
were supported.
The IN() construct can be used in other contexts to test membership in
a set of values.
It is often possible to rewrite a query without a subquery:
SELECT * FROM t1 WHERE id IN (SELECT id FROM t2);
This can be rewritten as:
SELECT DISTINCT t1.* FROM t1,t2 WHERE t1.id=t2.id;
The queries:
SELECT * FROM t1 WHERE id NOT IN (SELECT id FROM t2);
SELECT * FROM t1 WHERE NOT EXISTS (SELECT id FROM t2 WHERE t1.id=t2.id);
Can be rewritten as:
SELECT t1.* FROM t1 LEFT JOIN t2 ON t1.id=t2.id
WHERE t2.id IS NULL;
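The equivalence of these rewrites can be checked on sample data. The sketch below is illustrative only: it uses Python's sqlite3 module as a stand-in for a MySQL server, the id values are invented, and it assumes that id contains no NULL values (with NULLs in t2.id, NOT IN and the LEFT JOIN form can give different results). Because a plain join returns a t1 row once for every matching t2 row, the join form uses DISTINCT to match the IN form when t2 holds duplicate ids.

```python
import sqlite3

# Assumption: SQLite stands in for a MySQL server; ids are sample data
# without NULLs. t2 deliberately contains the id 4 twice.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INT);
    CREATE TABLE t2 (id INT);
    INSERT INTO t1 VALUES (1), (2), (3), (4);
    INSERT INTO t2 VALUES (2), (4), (4);
""")

def ids(sql):
    """Run a one-column query and return its values as a list."""
    return [row[0] for row in conn.execute(sql)]

in_rows = ids("SELECT id FROM t1 WHERE id IN (SELECT id FROM t2) ORDER BY id")
join_rows = ids("SELECT DISTINCT t1.id FROM t1, t2 "
                "WHERE t1.id = t2.id ORDER BY t1.id")
plain_join_rows = ids("SELECT t1.id FROM t1, t2 "
                      "WHERE t1.id = t2.id ORDER BY t1.id")

not_in_rows = ids("SELECT id FROM t1 WHERE id NOT IN (SELECT id FROM t2) "
                  "ORDER BY id")
not_exists_rows = ids("SELECT id FROM t1 WHERE NOT EXISTS "
                      "(SELECT id FROM t2 WHERE t1.id = t2.id) ORDER BY id")
left_join_rows = ids("SELECT t1.id FROM t1 LEFT JOIN t2 ON t1.id = t2.id "
                     "WHERE t2.id IS NULL ORDER BY t1.id")

print(in_rows, join_rows, plain_join_rows)
print(not_in_rows, not_exists_rows, left_join_rows)
```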
A LEFT [OUTER] JOIN can be faster than an equivalent subquery
because the server might be able to optimize it better--a fact that is
not specific to MySQL Server alone.
Prior to SQL-92, outer joins did not exist, so subqueries were the only way
to do certain things in those bygone days. Today, MySQL Server and many
other modern database systems offer a whole range of outer join types.
For more complicated subqueries, you can often create temporary tables
to hold the subquery. In some cases, however, this option will not
work. The most frequently encountered of these cases arises with
DELETE statements, for which standard SQL does not support joins
(except in subqueries). For this situation, there are three options
available:
Use multiple-table DELETE statements, which are available as of MySQL 4.0.
Run a SELECT query to obtain the primary keys
for the records to be deleted, and then use these values to construct
the DELETE statement (DELETE FROM ... WHERE key_col IN (key1,
key2, ...)).
Construct a set of DELETE statements automatically, using the MySQL
extension CONCAT() (in lieu of the standard || operator).
For example:
SELECT
CONCAT('DELETE FROM tab1 WHERE pkid = ', "'", tab1.pkid, "'", ';')
FROM tab1, tab2
WHERE tab1.col1 = tab2.col2;
You can place this query in a script file, use the file as input to one
instance of the mysql program, and use the program output
as input to a second instance of mysql:
shell> mysql --skip-column-names mydb < myscript.sql | mysql mydb
MySQL Server 4.0 supports multiple-table DELETE statements that can be used to
efficiently delete rows based on information from one table or even
from many tables at the same time.
Multiple-table UPDATE statements are also supported as of MySQL 4.0.
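The approach of selecting the primary keys first and then building a DELETE ... WHERE key_col IN (...) statement from them might look like the following when driven from a program. This is a sketch, not a recommended implementation: Python's sqlite3 module stands in for a MySQL client library, the rows in tab1 and tab2 are invented, and the ? placeholder style is specific to sqlite3 (other database drivers may use a different placeholder).

```python
import sqlite3

# Assumption: SQLite stands in for a MySQL server; the rows are sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (pkid INTEGER PRIMARY KEY, col1 INT);
    CREATE TABLE tab2 (col2 INT);
    INSERT INTO tab1 VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO tab2 VALUES (10), (30);
""")

# Step 1: obtain the primary keys of the rows to delete, using a join.
keys = sorted(row[0] for row in conn.execute(
    "SELECT DISTINCT tab1.pkid FROM tab1, tab2 WHERE tab1.col1 = tab2.col2"))

# Step 2: build DELETE ... WHERE pkid IN (?, ?, ...) with one placeholder
# per key and pass the keys as parameters. An empty key list is skipped
# because IN () is not valid SQL.
if keys:
    placeholders = ", ".join("?" for _ in keys)
    conn.execute(f"DELETE FROM tab1 WHERE pkid IN ({placeholders})", keys)

remaining = conn.execute("SELECT pkid FROM tab1 ORDER BY pkid").fetchall()
print(keys, remaining)
```

Passing the keys as parameters rather than pasting them into the statement text avoids quoting problems when the key values are strings.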
TRUNCATE Syntax
TRUNCATE TABLE tbl_name
In MySQL 3.23, TRUNCATE TABLE is mapped to
COMMIT; DELETE FROM tbl_name. See section 14.1.1 DELETE Syntax.
TRUNCATE TABLE differs from DELETE FROM ...
in the following ways:
The table handler may not remember the last used AUTO_INCREMENT value
but may start counting from the beginning. This is true for
MyISAM, ISAM, and BDB tables.
TRUNCATE TABLE is an Oracle SQL extension.
This statement was added in MySQL 3.23.28, although from 3.23.28
to 3.23.32, the keyword TABLE must be omitted.
UPDATE Syntax
Single-table syntax:
UPDATE [LOW_PRIORITY] [IGNORE] tbl_name
SET col_name1=expr1 [, col_name2=expr2 ...]
[WHERE where_definition]
[ORDER BY ...]
[LIMIT row_count]
Multiple-table syntax:
UPDATE [LOW_PRIORITY] [IGNORE] tbl_name [, tbl_name ...]
SET col_name1=expr1 [, col_name2=expr2 ...]
[WHERE where_definition]
The UPDATE statement updates columns in existing table rows with new values.
The SET clause indicates which columns to modify and the values
they should be given. The WHERE clause, if given, specifies
which rows should be updated. Otherwise, all rows are updated. If the
ORDER BY clause is specified, the rows will be updated in the
order that is specified. The LIMIT clause places a limit on the number
of rows that can be updated.
The UPDATE statement supports the following modifiers:
If you specify the LOW_PRIORITY keyword, execution of the
UPDATE is delayed until no other clients are reading from the table.
If you specify the IGNORE keyword, the update statement will not
abort even if duplicate-key errors occur during the update. Rows for which
conflicts occur are not updated.
If you access a column from tbl_name in an expression,
UPDATE uses the current value of the column. For example, the
following statement sets the age column to one more than its
current value:
mysql> UPDATE persondata SET age=age+1;
UPDATE assignments are evaluated from left to right. For example, the
following statement doubles the age column, then increments it:
mysql> UPDATE persondata SET age=age*2, age=age+1;
If you set a column to the value it currently has, MySQL notices this and doesn't update it.
If you update a column that has been declared NOT NULL by
setting it to NULL, the column is set to the default value appropriate
for the column type and the warning count is incremented. The default
value is 0 for numeric types, the empty string ('')
for string types, and the ``zero'' value for date and time types.
UPDATE returns the number of rows that were actually changed.
In MySQL 3.22 or later, the C API function mysql_info()
returns the number of rows that were matched and updated and the number of
warnings that occurred during the UPDATE.
Starting from MySQL 3.23, you can use LIMIT row_count to
restrict the scope of the UPDATE. A LIMIT clause works as
follows:
Before MySQL 4.0.13, LIMIT is a rows-affected restriction.
The statement stops as soon as it has changed row_count rows that
satisfy the WHERE clause.
From MySQL 4.0.13 on, LIMIT is a rows-matched restriction. The statement
stops as soon as it has found row_count rows that satisfy the
WHERE clause, whether or not they actually were changed.
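The difference between a rows-affected and a rows-matched restriction only shows when some matching rows already hold the new value. The following plain-Python model illustrates this. It is a sketch of the two rules as worded above, not MySQL code: the function update_with_limit(), its parameters, and the sample ages are all hypothetical, and a list of values stands in for one column of a table.

```python
def update_with_limit(values, matches, new_value, row_count, mode):
    """Model UPDATE ... SET col=new_value WHERE matches(col) LIMIT row_count.

    mode "affected": stop once row_count rows were actually changed.
    mode "matched":  stop once row_count rows satisfied the WHERE clause.
    Returns (updated_values, number_of_rows_changed).
    """
    result = list(values)
    matched = changed = 0
    for i, value in enumerate(result):
        done = changed if mode == "affected" else matched
        if done >= row_count:
            break
        if not matches(value):
            continue
        matched += 1
        # Setting a column to the value it already has is not a change.
        if value != new_value:
            result[i] = new_value
            changed += 1
    return result, changed

# Models: UPDATE persondata SET age=5 WHERE age>0 LIMIT 2
ages = [5, 5, 3, 3, 3]

def is_positive(age):
    return age > 0

affected_result = update_with_limit(ages, is_positive, 5, 2, "affected")
matched_result = update_with_limit(ages, is_positive, 5, 2, "matched")
print(affected_result)  # ([5, 5, 5, 5, 3], 2)
print(matched_result)   # ([5, 5, 3, 3, 3], 0)
```

Under the rows-matched rule, the first two rows use up the limit even though neither is changed, so the statement reports zero changed rows.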
If an UPDATE statement includes an ORDER BY clause, the rows
are updated in the order specified by the clause.
ORDER BY can be used from MySQL 4.0.0.
Starting with MySQL 4.0.4, you can also perform UPDATE
operations that cover multiple tables:
UPDATE items,month SET items.price=month.price WHERE items.id=month.id;
The example shows an inner join using the comma operator, but
multiple-table UPDATE statements can use any type of
join allowed in SELECT statements, such as LEFT JOIN.
Note: You cannot use ORDER BY or LIMIT with multiple-table
UPDATE.
Before MySQL 4.0.18, you needed the UPDATE privilege for all
tables used in a multiple-table UPDATE, even if they were not
updated. As of MySQL 4.0.18, you need only the SELECT privilege for
any columns that are read but not modified.
If you use a multiple-table UPDATE statement involving
InnoDB tables for which there are foreign key constraints,
the MySQL optimizer might process tables in an order that differs from
that of their parent/child relationship. In this case, the statement will
fail and roll back. Instead, update a single table and rely on the
ON UPDATE capabilities that InnoDB provides to cause the
other tables to be modified accordingly.
ALTER DATABASE Syntax
ALTER DATABASE db_name
    alter_specification [, alter_specification] ...
alter_specification:
    [DEFAULT] CHARACTER SET charset_name
  | [DEFAULT] COLLATE collation_name
ALTER DATABASE allows you to change the overall characteristics of a
database. These characteristics are stored in the `db.opt' file in the
database directory.
The CHARACTER SET clause changes the default database character set.
The COLLATE clause changes the default database collation.
Character set and collation names are discussed in
section 11 Character Set Support.
To use ALTER DATABASE, you need the ALTER privilege on the
database.
ALTER DATABASE was added in MySQL 4.1.1.
ALTER TABLE Syntax
ALTER [IGNORE] TABLE tbl_name
    alter_specification [, alter_specification] ...
alter_specification:
    ADD [COLUMN] create_definition [FIRST | AFTER col_name ]
  | ADD [COLUMN] (create_definition,...)
  | ADD INDEX [index_name] [index_type] (index_col_name,...)
  | ADD [CONSTRAINT [symbol]]
        PRIMARY KEY [index_type] (index_col_name,...)
  | ADD [CONSTRAINT [symbol]]
        UNIQUE [index_name] [index_type] (index_col_name,...)
  | ADD FULLTEXT [index_name] (index_col_name,...)
  | ADD [CONSTRAINT [symbol]]
        FOREIGN KEY [index_name] (index_col_name,...)
        [reference_definition]
  | ALTER [COLUMN] col_name {SET DEFAULT literal | DROP DEFAULT}
  | CHANGE [COLUMN] old_col_name create_definition
        [FIRST | AFTER col_name]
  | MODIFY [COLUMN] create_definition [FIRST | AFTER col_name]
  | DROP [COLUMN] col_name
  | DROP PRIMARY KEY
  | DROP INDEX index_name
  | DISABLE KEYS
  | ENABLE KEYS
  | RENAME [TO] new_tbl_name
  | ORDER BY col_name
  | CHARACTER SET character_set_name [COLLATE collation_name]
  | table_options
ALTER TABLE allows you to change the structure of an existing table.
For example, you can add or delete columns, create or destroy indexes, change
the type of existing columns, or rename columns or the table itself. You can
also change the comment for the table and type of the table.
The syntax for many of the allowable alterations is similar to clauses of the
CREATE TABLE statement.
See section 14.2.5 CREATE TABLE Syntax.
If you use ALTER TABLE to change a column specification but
DESCRIBE tbl_name indicates that your column was not changed, it is
possible that MySQL ignored your modification for one of the reasons
described in section 14.2.5.1 Silent Column Specification Changes. For example, if you try to change
a VARCHAR column to CHAR, MySQL will still use
VARCHAR if the table contains other variable-length columns.
ALTER TABLE works by making a temporary copy of the original table.
The alteration is performed on the copy, then the original table is
deleted and the new one is renamed. This is done in such a way that
all updates are automatically redirected to the new table without
any failed updates. While ALTER TABLE is executing, the original
table is readable by other clients. Updates and writes to the table
are stalled until the new table is ready.
Note that if you use any other option to ALTER TABLE than
RENAME, MySQL always creates a temporary table, even if the data
wouldn't strictly need to be copied (such as when you change the name of a
column). We plan to fix this in the future, but because ALTER TABLE
is not normally a statement that is used frequently, this isn't high on our
TODO list. For MyISAM tables, you can speed up the index re-creation
operation (which is the slowest part of the alteration process) by setting
the myisam_sort_buffer_size system variable to a high value.
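The copy-alter-rename sequence described above can be imitated by hand to see what each step does. The sketch below performs the steps explicitly, using Python's sqlite3 module as a stand-in database (an assumption for runnability). It only illustrates the general idea of adding a column by copying: it does not show MySQL's internal implementation, and the table name t1_copy and the sample rows are invented.

```python
import sqlite3

# Assumption: SQLite stands in for a MySQL server; the rows are sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (a INT, b CHAR(10));
    INSERT INTO t1 VALUES (1, 'x'), (2, 'y');

    -- Step 1: create the copy with the altered structure (new column c).
    CREATE TABLE t1_copy (a INT, b CHAR(10), c INT);
    -- Step 2: copy the existing rows into it.
    INSERT INTO t1_copy (a, b) SELECT a, b FROM t1;
    -- Step 3: delete the original table and rename the copy.
    DROP TABLE t1;
    ALTER TABLE t1_copy RENAME TO t1;
""")

# PRAGMA table_info is SQLite-specific; column 1 of each row is the name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(t1)")]
rows = conn.execute("SELECT a, b, c FROM t1 ORDER BY a").fetchall()
print(columns, rows)
```

Because every row is copied, the cost of the alteration grows with the size of the table, which is why index re-creation dominates for large tables.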
To use ALTER TABLE, you need ALTER, INSERT,
and CREATE privileges for the table.
IGNORE is a MySQL extension to standard SQL.
It controls how ALTER TABLE works if there are duplicates on
unique keys in the new table.
If IGNORE isn't specified, the copy is aborted and rolled back if
duplicate-key errors occur.
If IGNORE is specified, then for rows with duplicates on a unique
key, only the first row is used; the others are deleted.
You can issue multiple ADD, ALTER, DROP, and
CHANGE clauses in a single ALTER TABLE statement. This is a
MySQL extension to standard SQL, which allows only one of each clause
per ALTER TABLE statement.
CHANGE col_name, DROP col_name, and DROP
INDEX are MySQL extensions to standard SQL.
MODIFY is an Oracle extension to ALTER TABLE.
COLUMN is a pure noise word and can be omitted.
If you use ALTER TABLE tbl_name RENAME TO new_tbl_name without any other
options, MySQL simply renames any files that correspond to the table
tbl_name. There is no need to create a temporary table.
You can also use the RENAME TABLE statement to rename tables.
See section 14.2.9 RENAME TABLE Syntax.
create_definition clauses use the same syntax for ADD and
CHANGE as for CREATE TABLE. Note that this syntax includes
the column name, not just the column type.
See section 14.2.5 CREATE TABLE Syntax.
You can rename a column using a CHANGE old_col_name create_definition
clause. To do so, specify the old and new column names and the type that
the column currently has. For example, to rename an INTEGER column
from a to b, you can do this:
mysql> ALTER TABLE t1 CHANGE a b INTEGER;
If you want to change a column's type but not the name, CHANGE
syntax still requires an old and new column name, even if they are the same.
For example:
mysql> ALTER TABLE t1 CHANGE b b BIGINT NOT NULL;
However, as of MySQL 3.22.16a, you can also use MODIFY
to change a column's type without renaming it:
mysql> ALTER TABLE t1 MODIFY b BIGINT NOT NULL;
If you use CHANGE or MODIFY to shorten a column for which
an index exists on part of the column (for instance, if you have an index
on the first 10 characters of a VARCHAR column), you cannot make
the column shorter than the number of characters that are indexed.
When you change a column type using CHANGE or MODIFY, MySQL
tries to convert existing column values to the new type as well as possible.
You can use FIRST or
AFTER col_name to add a column at a specific position
within a table row. The default is to add the column last.
From MySQL 4.0.1 on, you can also use FIRST and
AFTER in CHANGE or MODIFY operations.
ALTER COLUMN specifies a new default value for a column
or removes the old default value.
If the old default is removed and the column can be NULL, the new
default is NULL. If the column cannot be NULL, MySQL
assigns a default value, as described in
section 14.2.5 CREATE TABLE Syntax.
DROP INDEX removes an index. This is a MySQL extension to
standard SQL. See section 14.2.7 DROP INDEX Syntax.
DROP TABLE instead.
DROP PRIMARY KEY drops the primary index. (Prior to MySQL 4.1.2,
if no primary index exists, DROP PRIMARY KEY drops the first