mirror of
https://github.com/launchbadge/sqlx.git
synced 2025-12-29 21:00:54 +00:00
doc(FAQ): add entry explaining prepared statements (#2997)
This commit is contained in:
parent
609fcc33e3
commit
fd53e95ff9
156
FAQ.md
156
FAQ.md
@ -69,6 +69,162 @@ However, if you do encounter this error, please try to capture a Wireshark or `t
|
||||
in covering cases that trigger this (as it might indicate a protocol handling bug or the server is doing something non-standard):
|
||||
https://github.com/rustls/rustls/issues/893
|
||||
|
||||
----------------------------------------------------------------
|
||||
### How does SQLx help prevent SQL Injection?
|
||||
### How do Query Parameters work?
|
||||
### Why does SQLx use Prepared Statements for most queries?
|
||||
### Can I Use Query Parameters to add conditional SQL to my query?
|
||||
### Why can't I use DDL (e.g. `CREATE TABLE`, `ALTER TABLE`, etc.) with the `sqlx::query*()` functions or `sqlx::query*!()` macros?
|
||||
|
||||
These questions can all be answered by a thorough explanation of prepared statements. Feel free to skip the parts you already know.
|
||||
|
||||
Back in the day, if a web application wanted to include user input in a SQL query,
|
||||
a search parameter for example, it had no choice but to simply format that data into the query.
|
||||
PHP applications used to be full of snippets like this:
|
||||
|
||||
```php
|
||||
/* Imagine this is user input */
|
||||
$city = "Munich";
|
||||
|
||||
/* $query = "SELECT country FROM city WHERE name='Munich'" */
|
||||
$query = sprintf("SELECT country FROM city WHERE name='%s'", $city);
|
||||
$result = $mysqli->query($query);
|
||||
```
|
||||
|
||||
However, this leaves the application vulnerable to [SQL injection attacks](https://en.wikipedia.org/wiki/SQL_injection),
|
||||
because it's trivial to craft an input string that will terminate the existing query and begin a new one,
|
||||
and the database won't know the difference and will execute both. As illustrated in the famous XKCD #327:
|
||||
|
||||
<a href="https://xkcd.com/327/"><img src="https://imgs.xkcd.com/comics/exploits_of_a_mom.png" title="Her daughter is named Help I'm trapped in a driver's license factory." alt="Exploits of a Mom" srcset="https:////imgs.xkcd.com/comics/exploits_of_a_mom_2x.png 2x" style="image-orientation:none">
|
||||
|
||||
The fictional school's student database application might have contained a query that looked like this:
|
||||
```php
|
||||
$student_name = "Robert');DROP TABLE Students;--"
|
||||
|
||||
$query = sprintf("INSERT INTO Students (name) VALUES ('%s')", $student_name);
|
||||
$result = $mysqli->query($query);
|
||||
```
|
||||
|
||||
When formatted into the middle of this query, the maliciously crafted input string closes the quotes and finishes the statement (`Robert');`),
|
||||
then starts another one with the nefarious payload (`DROP TABLE Students;`), and causes the rest of the original query to be ignored by starting a SQL comment (`--`).
|
||||
Thus, the database server sees, and executes, three separate statements like so:
|
||||
|
||||
```SQL
|
||||
INSERT INTO Students(firstname) VALUES ('Robert');
|
||||
DROP TABLE Students;
|
||||
--');
|
||||
```
|
||||
|
||||
And thus the school has lost this year's student records (at least they had last years' backed up?).
|
||||
|
||||
The original mitigation for this attack was to make sure that any untrustworthy user input was properly escaped (or "sanitized"),
|
||||
and many frameworks provided utility functions for this, such as PHP's [`mysqli::real_escape_string()`](https://www.php.net/manual/en/mysqli.real-escape-string.php) (not to be confused with the obsolete [`mysql_real_escape_string()`](https://www.php.net/manual/en/function.mysql-real-escape-string) or [`mysql_escape_string()`](https://www.php.net/manual/en/function.mysql-escape-string.php)).
|
||||
|
||||
These would prefix any syntactically significant characters (in this case, quotation marks) with a backslash,
|
||||
so it's less likely to affect the database server's interpretation of the query:
|
||||
|
||||
```php
|
||||
$student_name = $mysqli->real_escape_string("Robert');DROP TABLE Students;--");
|
||||
|
||||
/*
|
||||
Everything is okay now as the dastardly single-quote has been inactivated by the backslash:
|
||||
"INSERT INTO Students (name) VALUES ('Robert\');DROP TABLE Students;--');"
|
||||
*/
|
||||
$query = sprintf("INSERT INTO Students (name) VALUES ('%s')", $student_name);
|
||||
```
|
||||
|
||||
The database server sees the backslash and knows that the single-quote is part of the string content, not its terminating character.
|
||||
|
||||
However, this was something that you still had to _remember_ to do, making it only half a solution. Additionally, properly escaping the string requires knowledge of the current character set of the connection which is why the `mysqli` object is a required parameter
|
||||
(or the receiver in object-oriented style). And you could always just forget to wrap the string parameter in quotes (`'%s'`) in the first place, which these wouldn't help with.
|
||||
|
||||
Even when everything is working correctly, formatting dynamic data into a query still requires the database server to
|
||||
re-parse and generate a new query plan with every new variant--caching helps, but is not a silver bullet.
|
||||
|
||||
#### Prepared Statements to the rescue!
|
||||
|
||||
These solve both problems (injection and re-parsing) by **completely separating** the query from any dynamic input data.
|
||||
|
||||
Instead of formatting data into the query, you use a (database-specific) token to signify a value that will be passed separately:
|
||||
|
||||
```SQL
|
||||
-- MySQL
|
||||
INSERT INTO Students (name) VALUES(?);
|
||||
-- Postgres and SQLite
|
||||
INSERT INTO Students (name) VALUES($1);
|
||||
```
|
||||
|
||||
The database will substitute a given value when _executing_ the query, long after it's finished parsing it.
|
||||
The database will effectively treat the parameter as a variable.
|
||||
There is, by design, **no way** for a query parameter to modify the SQL of a query,
|
||||
unless you're using some `exec()`-like SQL function that lets you execute a string as a query,
|
||||
but then hopefully you know what you're doing.
|
||||
|
||||
In fact, parsing and executing prepared statements are explicitly separate steps in pretty much every database's protocol,
|
||||
where the query string, without any values attached, is parsed first and given an identifier, then a separate execution step
|
||||
simply passes that identifier along with the values to substitute.
|
||||
|
||||
The response from the initial parsing often contains useful metadata about the query, which SQLx's query macros use to great effect
|
||||
(see "How do the query macros work under the hood?" below).
|
||||
|
||||
Unfortunately, query parameters do not appear to be standardized, as every database has a different syntax.
|
||||
Look through the project for specific examples for your database, and consult your database manual about prepared statements
|
||||
for more information.
|
||||
|
||||
The syntax SQLite supports is effectively a superset of many databases' syntaxes, including MySQL and Postgres.
|
||||
To simplify our examples, we use the same syntax for Postgres and SQLite; though SQLite's syntax technically allows
|
||||
alphanumeric identifiers, that's not currently exposed in SQLx, and it's expected to be a numeric 1-based index like Postgres.
|
||||
|
||||
Some databases, like MySQL and PostgreSQL, may have special statements that let the user explicitly create and execute prepared statements (often `PREPARE` and `EXECUTE`, respectively),
|
||||
but most of the time an application, or library like SQLx, will interact with prepared statements using specialized messages in the database's client/server protocol.
|
||||
Prepared statements created through this protocol may or may not be accessible using explicit SQL statements, depending on the database flavor.
|
||||
|
||||
Since the dynamic data is handled separately, an application only needs to prepare a statement once,
|
||||
and then it can execute it as many times as it wants with all kinds of different data (at least of the same type and number).
|
||||
Prepared statements are generally tracked per-connection, so an application may need to re-prepare a statement several times over its lifetime as it opens new connections.
|
||||
If it uses a connection pool, ideally all connections will eventually have all statements already prepared (assuming a closed set of statements),
|
||||
so the overhead of parsing and generating a query plan is amortized.
|
||||
|
||||
Query parameters are also usually transmitted in a compact binary format, which saves bandwidth over having to send them as human-readable strings.
|
||||
|
||||
Because of the obvious security and performance benefits of prepared statements, the design of SQLx tries to make them as easy to use and transparent as possible.
|
||||
The `sqlx::query*()` family of functions, as well as the `sqlx::query*!()` macros, will always prefer prepared statements. This was an explicit goal from day one.
|
||||
|
||||
SQLx will **never** substitute query parameters for values on the client-side, it will always let the database server handle that. We have concepts for making certain usage patterns easier,
|
||||
like expanding a dynamic list of parameters (e.g. `?, ?, ?, ?, ...`) since MySQL and SQLite don't really support arrays, but will never simply format data into a query implicitly.
|
||||
|
||||
Our pervasive use of prepared statements can cause some problems with third-party database implementations, e.g. projects like CockroachDB or PGBouncer that support the Postgres protocol but have their own semantics.
|
||||
In this case, you might try setting [`.persistent(false)`](https://docs.rs/sqlx/latest/sqlx/query/struct.Query.html#method.persistent) before executing a query, which will cause the connection not to retain
|
||||
the prepared statement after executing it.
|
||||
|
||||
Not all SQL statements are allowed in prepared statements, either.
|
||||
As a general rule, DML (Data Manipulation Language, i.e. `SELECT`, `INSERT`, `UPDATE`, `DELETE`) is allowed while DDL (Data Definition Language, e.g. `CREATE TABLE`, `ALTER TABLE`, etc.) is not.
|
||||
Consult your database manual for details.
|
||||
|
||||
To execute DDL requires using a different API than `query*()` or `query*!()` in SQLx.
|
||||
Ideally, we'd like to encourage you to use SQLx's built-in support for migrations (though that could be better documented, we'll get to it).
|
||||
However, in the event that isn't feasible, or you have different needs, you can execute pretty much any statement,
|
||||
including multiple statements separated by semicolons (`;`), by directly invoking methods of the [`Executor` trait](https://docs.rs/sqlx/latest/sqlx/trait.Executor.html#method.execute)
|
||||
on any type that implements it, and passing your query string, e.g.:
|
||||
|
||||
```rust
|
||||
use sqlx::postgres::PgConnection;
|
||||
use sqlx::Executor;
|
||||
|
||||
let mut conn: PgConnection = connect().await?;
|
||||
|
||||
conn
|
||||
.execute(
|
||||
"CREATE TABLE IF NOT EXISTS StudentContactInfo (student_id INTEGER, person_name TEXT, relation TEXT, phone TEXT);\
|
||||
INSERT INTO StudentContactInfo (student_id, person_name, relation, phone) \
|
||||
SELECT student_id, guardian_name, guardian_relation, guardian_phone FROM Students;\
|
||||
ALTER TABLE Students DROP guardian_name, guardian_relation, guardian_phone;"
|
||||
)
|
||||
.await?;
|
||||
```
|
||||
|
||||
This is also pending a redesign to make it easier to discover and utilize.
|
||||
|
||||
----------------------------------------------------------------
|
||||
### How can I do a `SELECT ... WHERE foo IN (...)` query?
|
||||
|
||||
|
||||
Loading…
x
Reference in New Issue
Block a user