SP Imp Spec
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
There are three structures involved in the execution of a query which are of
interest to the stored procedure implementation:
- Lex (mentioned above) is the "compiled" query, that is the output from
the parser and what is then interpreted to do the actual work.
It contains an enum value (sql_command) which is the query type, and
all the data collected by the parser needed for the execution (table
names, fields, values, etc).
- THD is the "run-time" state of a connection, containing all that is
needed for a particular client connection, and, among other things, the
Lex structure currently being executed.
- Item_*: During parsing, all data is translated into "items", objects of
the subclasses of "Item", such as Item_int, Item_real, Item_string, etc,
for basic datatypes, and also various more specialized Item types for
expressions to be evaluated (Item_func objects).
- Parameters:
name, type and mode (IN/OUT/INOUT) are pushed to spcont
- Declared local variables:
Same as parameters (mode is then IN)
- Local Variable references:
If an identifier is found in spcont, an Item_splocal is created
with the variable's frame index, otherwise an Item_field or Item_ref
is created (as before).
- Statements:
The Lex in THD is replaced by a new Lex structure and the statement
is parsed as usual. A sp_instr_stmt is created, containing the new
Lex, and added to the instructions in sphead.
Afterwards, the procedure's Lex is restored in THD.
- SET var:
Setting a local variable generates a sp_instr_set instruction,
containing the variable's frame offset, the expression (an Item),
and the type.
- Flow control:
Flow control constructs like IF, WHILE, etc., generate conditional
and unconditional jumps in the "obvious" way, but a few notes may
be required:
- Forward jumps: When jumping forward, the exact destination is not
known at the time of the creation of the jump instruction. The
sphead therefore contains a list of instruction-label pairs for
each forward reference. When the position is later known, the
instructions in the list are updated with the correct location.
- Loop constructs have optional labels. If a loop doesn't have a
label, an anonymous label is generated to simplify the parsing.
- There are two types of CASE. The "simple" case is implemented
with an anonymous variable bound to the value to be tested.
- A simple example
Note that the contents of the spcont change during parsing,
at all times reflecting the state of the would-be runtime frame.
The m_instr is an array of instructions:
Pos. Instruction
0 sp_instr_set(1, '3')
1 sp_instr_jump_if_not(5, 'x>0')
2 sp_instr_set(1, 'x-1')
3 sp_instr_stmt('insert into ...')
4 sp_instr_jump(1)
5 <end>
Here, '3', 'x>0', etc, represent the Items or Lex for the respective
expressions or statements.
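The forward-jump bookkeeping and the instruction array above can be sketched as follows. This is a simplified model, not the server's code: Instr, Program, add_forward_jump, backpatch, build_example and run are hypothetical names, variables are plain ints, and expressions are std::function objects standing in for Items.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified model; none of these types exist in the server.
using Frame = std::vector<int>;                  // runtime variable slots
using Expr = std::function<int(const Frame &)>;  // stand-in for an Item
constexpr std::size_t kX = 1;                    // frame offset of x

struct Instr {
  enum Kind { SET, JUMP, JUMP_IF_NOT, STMT };
  Kind kind = STMT;
  std::size_t offset = 0;  // SET: frame offset of the variable
  std::size_t dest = 0;    // JUMP, JUMP_IF_NOT: destination position
  Expr expr;               // SET: value; JUMP_IF_NOT: condition
  std::string text;        // STMT: statement text (stand-in for a Lex)
};

struct Program {
  std::vector<Instr> instrs;
  // Instruction-label pairs for forward jumps whose target is still unknown.
  std::vector<std::pair<std::size_t, std::string>> pending;

  // 'arg' is the frame offset for SET and the destination for jumps.
  std::size_t add(Instr::Kind k, std::size_t arg, Expr e = Expr(),
                  std::string text = "") {
    Instr i;
    i.kind = k;
    i.offset = arg;
    i.dest = arg;
    i.expr = std::move(e);
    i.text = std::move(text);
    instrs.push_back(std::move(i));
    return instrs.size() - 1;
  }

  // Emit a jump with no destination yet and remember which label it needs.
  void add_forward_jump(Instr::Kind k, Expr cond, const std::string &label) {
    pending.emplace_back(add(k, 0, std::move(cond)), label);
  }

  // The label's position is now known: patch every jump that refers to it.
  void backpatch(const std::string &label) {
    const std::size_t here = instrs.size();  // position of next instruction
    for (std::size_t n = 0; n < pending.size();) {
      if (pending[n].second == label) {
        instrs[pending[n].first].dest = here;
        pending.erase(pending.begin() + n);
      } else {
        ++n;
      }
    }
  }
};

// set x = 3; while x > 0 do set x = x-1; insert into ...; end while
Program build_example() {
  Program p;
  p.add(Instr::SET, kX, [](const Frame &) { return 3; });
  const std::size_t loop = p.instrs.size();
  p.add_forward_jump(Instr::JUMP_IF_NOT,
                     [](const Frame &f) { return f[kX] > 0; }, "end");
  p.add(Instr::SET, kX, [](const Frame &f) { return f[kX] - 1; });
  p.add(Instr::STMT, 0, Expr(), "insert into ...");
  p.add(Instr::JUMP, loop);  // backward jump: destination already known
  p.backpatch("end");        // resolves the jump at position 1
  return p;
}

// Interpret the array; returns the texts of the statements executed.
std::vector<std::string> run(const Program &p, Frame &frame) {
  std::vector<std::string> executed;
  std::size_t ip = 0;
  while (ip < p.instrs.size()) {
    const Instr &i = p.instrs[ip];
    std::size_t next = ip + 1;
    switch (i.kind) {
      case Instr::SET:
        assert(i.offset < frame.size());
        frame[i.offset] = i.expr(frame);
        break;
      case Instr::JUMP: next = i.dest; break;
      case Instr::JUMP_IF_NOT: if (!i.expr(frame)) next = i.dest; break;
      case Instr::STMT: executed.push_back(i.text); break;
    }
    ip = next;
  }
  return executed;
}
```

In this model the conditional jump is emitted with destination 0 and only receives its real target (one past the last instruction) when backpatch("end") runs, which mirrors the instruction-label list described for sphead.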
The main difference during parsing is that we store the result type
in the sp_head. However, there are big differences when it comes to
invoking a FUNCTION. (See below.)
This means that we can reparse the procedure as many times as we want.
The first time, the resulting Lex is used to store the procedure in
the database (using the function sp.c:sp_create_procedure()).
The simplest way would be to just leave it at that, and re-read the
procedure from the database each time it is called. (And in fact, that's
the way the earliest implementation will work.)
However, this is not very efficient, and we can do better. The full
implementation should work like this:
1) At creation time, parse and store the procedure. Note that we still
need to parse it to catch syntax errors, but we can't check, for
instance, whether called procedures exist.
2) Upon first CALL, read from the database, parse it, and cache the
resulting Lex in memory. This time we can do more error checking.
3) Upon subsequent CALLs, use the cached Lex.
Note that this implies that the Lex structure with its sphead must be
reentrant, that is, reusable and shareable between different threads
and calls. The runtime state for a procedure is kept in the sp_rcontext
in THD.
The mechanisms of storing, finding, and dropping procedures are
encapsulated in the files sp.{cc,h}.
- CALLing a procedure
A CALL is parsed just like any statement. The resulting Lex has the
sql_command SQLCOM_CALL; the procedure's name and the parameters are
pushed to the Lex's value_list.
- USE database
- Evaluating Items
- Calling a FUNCTION
The existence of UDFs is checked during the lexical analysis (in
sql_lex.cc:find_keyword()). This has the drawback that they must
exist before they are referred to, which was OK before SPs existed,
but then it becomes a problem. The first implementation of SP FUNCTIONs
will work the same way, but this should be fixed a.s.a.p. (This will
require some reworking of the way UDFs are handled, which is why it's
not done from the start.)
For the time being, a FUNCTION is detected the same way, and returns
the token SP_FUNC. During the parsing we only check for the *existence*
of the function; we don't parse it, since we can't call the parser
recursively.
So, the solution is to collect the names of the referred FUNCTIONs during
parsing in the lex.
Then, before doing anything else in mysql_execute_command(), read all
functions from the database and keep them in the THD, where the function
sp_find_function() can find them during the execution.
Note: Even with an in-memory cache, we must still make sure that the
functions are indeed read and cached at this point.
The code that reads and caches functions from the database must also be
invoked recursively for each FUNCTION read, to make sure we have *all* the
functions we need.
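The recursive read can be pictured as a worklist over function names. The sketch below is only an illustration under stated assumptions: FunctionTable stands in for the stored function bodies in the database, each entry already lists the functions its body refers to (in the server this would come from parsing the body), and collect_functions is a hypothetical name.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the database: each stored function name maps to
// the names of the functions its body refers to.
using FunctionTable = std::map<std::string, std::vector<std::string>>;

// Returns every function that must be read and cached before executing a
// statement that refers directly to 'roots'. Names that are not stored are
// skipped here; a real implementation would report an error instead.
std::set<std::string> collect_functions(const FunctionTable &db,
                                        const std::vector<std::string> &roots) {
  std::set<std::string> loaded;
  std::vector<std::string> todo(roots.begin(), roots.end());
  while (!todo.empty()) {
    const std::string name = todo.back();
    todo.pop_back();
    const auto it = db.find(name);
    if (it == db.end() || !loaded.insert(name).second)
      continue;  // not stored, or already read (this also stops on cycles)
    for (const std::string &callee : it->second)
      todo.push_back(callee);  // queue what the newly read body refers to
  }
  assert(loaded.size() <= db.size());
  return loaded;
}
```

Using an explicit worklist rather than true recursion keeps the sketch independent of the restriction that the parser cannot be called recursively; the set of already loaded names makes mutually referring functions terminate.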
Condition names are lexical entities and are kept in the parser context
just like variables. But conditions are just "aliases" for SQLSTATE
strings, or mysqld error codes (which is a non-standard extension in
MySQL), and are only used during parsing.
Handlers come in three types: CONTINUE, EXIT and UNDO. The last is
like an EXIT handler with an implicit rollback, and is currently not
implemented.
The EXIT handler jumps to the end of its BEGIN-END block when finished.
The CONTINUE handler returns to the statement following that which
invoked the handler.
It might seem strange to jump past the handlers like that, but there's
no extra cost in doing this, and for technical reasons it's easiest for
the parser to generate the handler instructions when they occur in the
source.
When an error occurs, one of the error routines is called and an error
message is normally sent back to the client immediately.
Catching a condition must be done in these error routines (there are
quite a few) to prevent them from doing this. We do this by calling
a method in the THD's sp_rcontext (if there is one). If a handler is
found, this is recorded in the context and the routine returns without
sending the error message.
The execution loop (sp_head::execute()) checks for this after each
statement and invokes the handler that has been found. If several
errors or warnings occur during one statement, only the first is
caught; the rest are ignored.
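The split between the error routines and the execution loop can be sketched like this. The method names find_handler and found_handler follow the sp_rcontext declarations in this document, but the class RuntimeContext, the Handler struct, push_handler, pop_handlers and all method bodies are assumptions made for illustration; in particular, clearing the pending state inside found_handler is a simplification.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified model of the handler bookkeeping.
struct Handler {
  unsigned sql_errno;  // condition this handler catches
  unsigned ip;         // location of the handler's first instruction
  unsigned fsize;      // frame size at the handler's declaration level
};

class RuntimeContext {
 public:
  void push_handler(const Handler &h) { handlers_.push_back(h); }

  void pop_handlers(unsigned count) {
    assert(count <= handlers_.size());
    handlers_.resize(handlers_.size() - count);
  }

  // Called from an error routine. Returns 1 if the error is caught, in which
  // case the routine returns without sending the message to the client.
  int find_handler(unsigned sql_errno) {
    if (found_ >= 0) return 1;  // an earlier error is pending: keep that one
    for (int i = static_cast<int>(handlers_.size()) - 1; i >= 0; --i) {
      if (handlers_[i].sql_errno == sql_errno) {  // innermost handler first
        found_ = i;
        return 1;
      }
    }
    return 0;
  }

  // Called by the execution loop after each statement. Returns 1 and sets
  // '*ip' and '*fp' if a handler was found while the statement ran.
  int found_handler(unsigned *ip, unsigned *fp) {
    if (found_ < 0) return 0;
    *ip = handlers_[found_].ip;
    *fp = handlers_[found_].fsize;
    found_ = -1;  // assumption: clear so the next statement starts fresh
    return 1;
  }

 private:
  std::vector<Handler> handlers_;
  int found_ = -1;  // index of the pending handler, or -1 if none
};
```

With this shape, the loop only needs one call to found_handler after each instruction: if it returns 1, the loop continues at the returned location instead of the next instruction.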
- Examples:
- EXIT handler
begin
declare x int default 0;
begin
declare exit handler for 'XXXXX' set x = 1;
(statement1);
(statement2);
end;
(statement3);
end
Pos. Instruction
0 sp_instr_set(0, '0')
1 sp_instr_hpush_jump(4, 1) # location and frame size
2 sp_instr_set(0, '1')
3 sp_instr_jump(6)
4 sp_instr_stmt('statement1')
5 sp_instr_stmt('statement2')
6 sp_instr_hpop(1)
7 sp_instr_stmt('statement3')
- CONTINUE handler
create procedure hndlr1(val int)
begin
declare x int default 0;
declare foo condition for 1146;
declare continue handler for foo set x = 1;
insert into t3 values ("hndlr1", val); # Non-existing table?
if x>0 then
insert into t1 values ("hndlr1", val); # This instead then
end if;
end|
Pos. Instruction
0 sp_instr_set(1, '0')
1 sp_instr_hpush_jump(4, 2)
2 sp_instr_set(1, '1')
3 sp_instr_hreturn(2) # frame size
4 sp_instr_stmt('insert ... t3 ...')
5 sp_instr_jump_if_not(7, 'x>0')
6 sp_instr_stmt('insert ... t1 ...')
7 sp_instr_hpop(2)
- Cursors
begin
declare x int;
declare c cursor for select a from t1;
open c;
fetch c into x;
close c;
end
Pos. Instruction
0 sp_instr_cpush('select a from ...')
1 sp_instr_copen(0) # The 0'th cursor
2 sp_instr_cfetch(0) # Contains the variable list
3 sp_instr_cclose(0)
4 sp_instr_cpop(1)
- The SP cache
There is however one issue with multiple caches: dropping and altering
procedures. Normally, this should be a very rare event in a running
system; it's typically something you do during development and testing,
so it's not unthinkable that we would simply ignore the issue and let
any threads running with a cached version of an SP keep doing so until
they are disconnected.
But assuming we want to keep the caches consistent with respect to drop
and alter, it can be done:
1) A global counter is needed, initialized to 0 at start.
2) At each DROP or ALTER, increase the counter by one.
3) Each cache has its own copy of the counter, copied at the last read.
4) When looking up a name in the cache, first check if the global counter
is larger than the local copy.
If so, clear the cache and return "not found", and update the local
counter; otherwise, look up as usual.
This minimizes the cost to a single brief lock for the access of an
integer when operating normally. Only in the event of an actual drop or
alter is the cache cleared. This may seem drastic, but since we
assume that this is a rare event, it's not a problem.
It would of course be possible to have a much more fine-grained solution,
keeping track of each SP, but the overhead of doing so is not worth the
effort.
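The four steps above can be sketched as follows. All names here (g_sp_version, sp_invalidate_all, CachedProc, SpCache) are hypothetical, and the sketch uses std::atomic for the global counter where the text speaks of a brief lock around an integer; it illustrates the scheme and is not the server's sp_cache.

```cpp
#include <atomic>
#include <cassert>
#include <map>
#include <string>

// 1) Global counter, initialized to 0 at start (hypothetical name).
std::atomic<unsigned long> g_sp_version{0};

// 2) Called at each DROP or ALTER of a stored procedure.
void sp_invalidate_all() { ++g_sp_version; }

// Stand-in for a parsed procedure (an sp_head in the server).
struct CachedProc {
  std::string body;
};

// One instance per cache; assumed here to be used by a single thread.
class SpCache {
 public:
  void insert(const std::string &name, const CachedProc &proc) {
    refresh();
    procs_[name] = proc;
  }

  // 4) Returns nullptr ("not found") if the name is absent or the cache had
  // to be cleared because of a DROP or ALTER since the last access.
  const CachedProc *lookup(const std::string &name) {
    refresh();
    const auto it = procs_.find(name);
    return it == procs_.end() ? nullptr : &it->second;
  }

 private:
  // 3) Compare the local copy of the counter with the global one.
  void refresh() {
    const unsigned long current = g_sp_version.load();
    if (current > version_) {
      procs_.clear();
      version_ = current;
    }
  }

  std::map<std::string, CachedProc> procs_;
  unsigned long version_ = 0;
};
```

A lookup that returns "not found" would then fall back to reading the procedure from the database and inserting it again, so a stale entry costs one extra read per cache.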
typedef enum
{
sp_param_in,
sp_param_out,
sp_param_inout
} sp_param_mode_t;
typedef struct
{
LEX_STRING name;
enum enum_field_types type;
sp_param_mode_t mode;
uint offset; // Offset in current frame
bool isset;
} sp_pvar_t;
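To illustrate how such variable records are used during parsing, here is a minimal sketch of a parser context that hands out frame offsets and resolves identifiers. ParserContext, ParsedVar and ParamMode are hypothetical simplifications (names are compared case-sensitively and types are omitted); they are not the sp_pcontext interface.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class ParamMode { In, Out, InOut };

// Hypothetical record, loosely mirroring sp_pvar_t.
struct ParsedVar {
  std::string name;
  ParamMode mode;
  unsigned offset;  // offset in the current frame
};

// Hypothetical stand-in for the parser context.
class ParserContext {
 public:
  // Push a parameter or declared local variable; returns its frame offset.
  // Declared local variables use the default mode IN.
  unsigned push(const std::string &name, ParamMode mode = ParamMode::In) {
    const unsigned offset = static_cast<unsigned>(vars_.size());
    vars_.push_back(ParsedVar{name, mode, offset});
    return offset;
  }

  // Find the most recently declared variable called 'name'. A nullptr result
  // means the identifier is not a local variable, so the parser would create
  // an ordinary field or reference item instead.
  const ParsedVar *find(const std::string &name) const {
    for (auto it = vars_.rbegin(); it != vars_.rend(); ++it)
      if (it->name == name) return &*it;
    return nullptr;
  }

  // Leave a BEGIN-END block: forget the 'count' most recent variables.
  void pop(unsigned count) {
    assert(count <= vars_.size());
    vars_.resize(vars_.size() - count);
  }

 private:
  std::vector<ParsedVar> vars_;
};
```

For a procedure with one parameter val and one local variable x, this model gives x the frame offset 1, which matches the sp_instr_set(1, ...) instructions in the CONTINUE handler example.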
class sp_pcontext
{
sp_pcontext();
// Push a cursor
void push_cursor(LEX_STRING *name);
// Find a cursor
bool find_cursor(LEX_STRING *name, uint *poff);
#define SP_HANDLER_NONE 0
#define SP_HANDLER_EXIT 1
#define SP_HANDLER_CONTINUE 2
#define SP_HANDLER_UNDO 3
typedef struct
{
struct sp_cond_type *cond;
uint handler; // Location of handler
int type;
uint foffset; // Frame offset for the handlers declare level
} sp_handler_t;
class sp_rcontext
{
// 'fsize' is the max size of the context, 'hmax' the number of handlers,
// 'cmax' the number of cursors
sp_rcontext(uint fsize, uint hmax, uint cmax);
// Set the "out" index 'oidx' for slot 'idx'. If it's an IN slot,
// use 'oidx' -1.
void set_oindex(uint idx, int oidx);
// Find a handler for this error. This sets the state for a found
// handler in the context. If called repeatedly without clearing,
// only the first call's state is kept.
int find_handler(uint sql_errno);
// Returns 1 if a handler has been found, with '*ip' and '*fp' set
// to the handler location and frame size respectively.
int found_handler(uint *ip, uint *fp);
#define TYPE_ENUM_FUNCTION 1
#define TYPE_ENUM_PROCEDURE 2
class sp_head
{
int m_type; // TYPE_ENUM_FUNCTION or TYPE_ENUM_PROCEDURE
sp_head();
// Invoke a FUNCTION
int
execute_function(THD *thd, Item **args, uint argcount, Item **resp);
// CALL a PROCEDURE
int
execute_procedure(THD *thd, List<Item> *args);
// Add the instruction to this procedure.
void add_instr(sp_instr *);
// Restores lex in 'thd' from our copy, but keeps some status from the
// one in 'thd', like ptr, tables, fields, etc.
void restore_lex(THD *);
- Instructions
- Statement instruction:
class sp_instr_stmt : public sp_instr
{
sp_instr_stmt(uint ip);
int execute(THD *, uint *nextp);
- SET instruction:
class sp_instr_set : public sp_instr
{
// 'offset' is the variable's frame offset, 'val' the value,
// and 'type' the variable type.
sp_instr_set(uint ip,
uint offset, Item *val, enum enum_field_types type);
- Unconditional jump
class sp_instr_jump : public sp_instr
{
// No destination, must be set.
sp_instr_jump(uint ip);
- Conditional jump
class sp_instr_jump_if_not : public sp_instr_jump
{
// Jump if 'i' evaluates to false. Destination not set yet.
sp_instr_jump_if_not(uint ip, Item *i);
- Pops handlers
class sp_instr_hpop : public sp_instr
{
// Pop 'count' handlers
sp_instr_hpop(uint ip, uint count);
- Push a CURSOR
class sp_instr_cpush : public sp_instr_stmt
{
// Push a cursor for statement 'lex'
sp_instr_cpush(uint ip, LEX *lex);
- Pop CURSORs
class sp_instr_cpop : public sp_instr_stmt
{
// Pop 'count' cursors
sp_instr_cpop(uint ip, uint count);
- Open a CURSOR
class sp_instr_copen : public sp_instr_stmt
{
// Open the 'c'th cursor
sp_instr_copen(uint ip, uint c);
- Close a CURSOR
class sp_instr_cclose : public sp_instr
{
// Close the 'c'th cursor
sp_instr_cclose(uint ip, uint c);
int execute(THD *thd, uint *nextp);
}
#define SP_OK 0
#define SP_KEY_NOT_FOUND -1
#define SP_OPEN_TABLE_FAILED -2
#define SP_WRITE_ROW_FAILED -3
#define SP_DELETE_ROW_FAILED -4
#define SP_GET_FIELD_FAILED -5
#define SP_PARSE_ERROR -6
// Finds a stored procedure given its name. Returns NULL if not found.
sp_head *sp_find_procedure(THD *, LEX_STRING *name);
// Finds a stored function given its name. Returns NULL if not found.
sp_head *sp_find_function(THD *, LEX_STRING *name);
/* Lookup an SP in cache */
sp_head *sp_cache_lookup(sp_cache **cp, char *name, uint namelen);