NAME
    SQL::Statement - SQL parsing and processing engine

SYNOPSIS
        require SQL::Statement;

        # Create a parser
        my($parser) = SQL::Parser->new('Ansi');

        # Parse an SQL statement
        $@ = '';
        my ($stmt) = eval {
            SQL::Statement->new("SELECT id, name FROM foo WHERE id > 1",
                                $parser);
        };
        if ($@) {
            die "Cannot parse statement: $@";
        }

        # Query the list of result columns;
        my $numColums = $stmt->columns();  # Scalar context
        my @columns = $stmt->columns();    # Array context
        # @columns now contains SQL::Statement::Column instances

        # Likewise, query the tables being used in the statement:
        my $numTables = $stmt->tables();   # Scalar context
        my @tables = $stmt->tables();      # Array context
        # @tables now contains SQL::Statement::Table instances

        # Query the WHERE clause; this will retrieve an
        # SQL::Statement::Op instance
        my $where = $stmt->where();

        # Evaluate the WHERE clause with concrete data, represented
        # by an SQL::Eval object
        my $result = $stmt->eval_where($eval);

        # Execute a statement:
        $stmt->execute($data, $params);

DESCRIPTION
    For installing the module, see the section on "INSTALLATION"
    below.

    The SQL::Statement module implements a small, abstract SQL
    engine. This module is not usefull itself, but as a base class
    for deriving concrete SQL engines. The implementation is
    designed to work fine with the DBI driver DBD::CSV, thus
    probably not so well suited for a larger environment, but I'd
    hope it is extendable without too much problems.

    By parsing an SQL query you create an SQL::Statement instance.
    This instance offers methods for retrieving syntax, for WHERE
    clause and statement evaluation.

  Creating a parser object

    What's accepted as valid SQL, depends on the parser object.
    There is a set of so-called features that the parsers may have
    or not. Usually you start with a builtin parser:

        my $parser = SQL::Parser->new($name, [ \%attr ]);

    Currently two parsers are builtin: The *Ansi* parser implements
    a proper subset of ANSI SQL. (At least I hope so. :-) The
    *SQL::Statement* parser is used by the DBD:CSV driver.

    You can query or set individual features. Currently available
    are:

    create.type_blob
    create.type_real
    create.type_text
            These enable the respective column types in a *CREATE
            TABLE* clause. They are all disabled in the *Ansi*
            parser, but enabled in the *SQL::Statement* parser.
            Example:

    select.join
            This enables the use of multiple tables in a SELECT
            statement, for example

              SELECT a.id, b.name FROM a, b WHERE a.id = b.id AND a.id = 2

    To enable or disable a feature, for example *select.join*, use
    the following:

      # Enable feature
      $parser->feature("select", "join", 1);
      # Disable feature
      $parser->feature("select", "join", 0);

    Of course you can query features:

      # Query feature
      my $haveSelectJoin = $parser->feature("select", "join");

    The `new' method allows a shorthand for setting features. For
    example, the following is equivalent to the *SQL::Statement*
    parser:

      $parser = SQL::Statement->new('Ansi',
                                    { 'create' => { 'type_text' => 1,
                                                    'type_real' => 1,
                                                    'type_blob' => 1 },
                                      'select' => { 'join' => 0 }});

  Parsing a query

    A statement can be parsed with

        my $stmt = SQL::Statement->new($query, $parser);

    In case of syntax errors or other problems, the method throws a
    Perl exception. Thus, if you want to catch exceptions, the above
    becomes

        $@ = '';
        my $stmt = eval { SQL::Statement->new($query, $parser) };
        if ($@) { print "An error occurred: $@"; }

    The accepted SQL syntax is restricted, though easily extendable.
    See the section on "SQL syntax" below. See the section on
    "Creating a parser object" above.

  Retrieving query information

    The following methods can be used to obtain information about a
    query:

    command Returns the SQL command, currently one of *SELECT*,
            *INSERT*, *UPDATE*, *DELETE*, *CREATE* or *DROP*, the
            last two referring to *CREATE TABLE* and *DROP TABLE*.
            See the section on "SQL syntax" below. Example:

                my $command = $stmt->command();

    columns my $numColumns = $stmt->columns(); # Scalar context my
            @columnList = $stmt->columns(); # Array context
            my($col1, $col2) = ($stmt->columns(0), $stmt-
            >columns(1));

            This method is used to retrieve column lists. The
            meaning depends on the query command:

                SELECT $col1, $col2, ... $colN FROM $table WHERE ...
                UPDATE $table SET $col1 = $val1, $col2 = $val2, ...
                    $colN = $valN WHERE ...
                INSERT INTO $table ($col1, $col2, ..., $colN) VALUES (...)

            When used without arguments, the method returns a list
            of the columns $col1, $col2, ..., $colN, you may
            alternatively use a column number as argument. Note that
            the column list may be empty, like in

                INSERT INTO $table VALUES (...)

            and in *CREATE* or *DROP* statements.

            But what does "returning a column" mean? It is returning
            an SQL::Statement::Column instance, a class that
            implements the methods `table' and `name', both
            returning the respective scalar. For example, consider
            the following statements:

                INSERT INTO foo (bar) VALUES (1)
                SELECT bar FROM foo WHERE ...
                SELECT foo.bar FROM foo WHERE ...

            In all these cases exactly one column instance would be
            returned with

                $col->name() eq 'bar'
                $col->table() eq 'foo'

    tables  my $tableNum = $stmt->tables(); # Scalar context my @tables
            = $stmt->tables(); # Array context my($table1, $table2)
            = ($stmt->tables(0), $stmt->tables(1));

            Similar to `columns', this method returns instances of
            `SQL::Statement::Table'. For *UPDATE*, *DELETE*,
            *INSERT*, *CREATE* and *DROP*, a single table will
            always be returned. *SELECT* statements can return more
            than one table, in case of joins. Table objects offer a
            single method, `name' which returns the table name.

    params  my $paramNum = $stmt->params(); # Scalar context my @params
            = $stmt->params(); # Array context my($p1, $p2) =
            ($stmt->params(0), $stmt->params(1));

            The `params' method returns information about the input
            parameters used in a statement. For example, consider
            the following:

                INSERT INTO foo VALUES (?, ?)

            This would return two instances of
            SQL::Statement::Param. Param objects implement a single
            method, `$param-'num()>, which retrieves the parameter
            number. (0 and 1, in the above example). As of now, not
            very usefull ... :-)

    row_values
                my $rowValueNum = $stmt->row_values(); # Scalar context
                my @rowValues = $stmt->row_values();   # Array context
                my($rval1, $rval2) = ($stmt->row_values(0),
                                      $stmt->row_values(1));

            This method is used for statements like

                UPDATE $table SET $col1 = $val1, $col2 = $val2, ...
                    $colN = $valN WHERE ...
                INSERT INTO $table (...) VALUES ($val1, $val2, ..., $valN)

            to read the values $val1, $val2, ... $valN. It returns
            scalar values or SQL::Statement::Param instances.

    order   my $orderNum = $stmt->order(); # Scalar context my @order =
            $stmt->order(); # Array context my($o1, $o2) = ($stmt-
            >order(0), $stmt->order(1));

            In *SELECT* statements you can use this for looking at
            the ORDER clause. Example:

                SELECT * FROM FOO ORDER BY id DESC, name

            In this case, `order' could return 2 instances of
            SQL::Statement::Order. You can use the methods `$o-
            >table()', `$o->column()' and `$o->desc()' to examine
            the order object.

    limit   my $l = $stmt->limit(); if ($l) { my $offset = $l->offset();
            my $limit = $l->limit(); }

            In a SELECT statement you can use a `LIMIT' clause to
            implement cursoring:

                SELECT * FROM FOO LIMIT 5
                SELECT * FROM FOO LIMIT 5, 5
                SELECT * FROM FOO LIMIT 10, 5

            These three statements would retrieve the rows 0..4,
            5..9, 10..14 of the table FOO, respectively. If no
            `LIMIT' clause is used, then the method `$stmt->limit'
            returns undef. Otherwise it returns an instance of
            SQL::Statement::Limit. This object has the methods
            `offset' and `limit' to retrieve the index of the first
            row and the maximum number of rows, respectively.

    where   my $where = $stmt->where();

            This method is used to examine the syntax tree of the
            `WHERE' clause. It returns undef (if no WHERE clause was
            used) or an instance of SQL::Statement::Op. The Op
            instance offers 4 methods:

    op                  returns the operator, one of `AND', `OR', `=',
                        `<>', `>=', `>', `<=', `<', `LIKE', `CLIKE'
                        or `IS'.

    arg1
    arg2                returns the left-hand and right-hand sides of
                        the operator. This can be a scalar value, an
                        SQL::Statement::Param object or yet another
                        SQL::Statement::Op instance.

    neg                 returns a TRUE value, if the operation result
                        must be negated after evalution.

            To evaluate the *WHERE* clause, fetch the topmost Op
            instance with the `where' method. Then evaluate the
            left-hand and right-hand side of the operation, perhaps
            recursively. Once that is done, apply the operator and
            finally negate the result, if required.

    To illustrate the above, consider the following WHERE clause:

        WHERE NOT (id > 2 AND name = 'joe') OR name IS NULL

    We can represent this clause by the following tree:

                  (id > 2)   (name = 'joe')
                         \   /
              NOT         AND
                             \      (name IS NULL)
                              \    /
                                OR

    Thus the WHERE clause would return an SQL::Statement::Op
    instance with the op() field set to 'OR'. The arg2() field would
    return another SQL::Statement::Op instance with arg1() being the
    SQL::Statement::Column instance representing id, the arg2()
    field containing the value undef (NULL) and the op() field being
    'IS'.

    The arg1() field of the topmost Op instance would return an Op
    instance with op() eq 'AND' and neg() returning TRUE. The arg1()
    and arg2() fields would be Op's representing "id > 2" and "name
    = 'joe'".

    Of course there's a ready-for-use method for WHERE clause
    evaluation:

  Evaluating a WHERE clause

    The WHERE clause evaluation depends on an object being used for
    fetching parameter and column values. Usually this can be an
    SQL::Eval object, but in fact it can be any object that supplies
    the methods

        $val = $eval->param($paramNum);
        $val = $eval->column($table, $column);

    See the SQL::Eval manpage for a detailed description of these
    methods. Once you have such an object, you can call a

        $match = $stmt->eval_where($eval);

  Evaluating queries

    So far all methods have been concrete. However, the interface
    for executing and evaluating queries is abstract. That means,
    for using them you have to derive a subclass from SQL::Statement
    that implements at least certain missing methods and/or
    overwrites others. See the `test.pl' script for an example
    subclass.

    Something that all methods have in common is that they simply
    throw a Perl exception in case of errors.

    execute After creating a statement, you must execute it by calling
            the `execute' method. Usually you put an eval statement
            around this call:

                $@ = '';
                my $rows = eval { $self->execute($data); };
                if ($@) { die "An error occurred!"; }

            In case of success the method returns the number of
            affected rows or -1, if unknown. Additionally it sets
            the attributes

                $self->{'NUM_OF_FIELDS'}
                $self->{'NUM_OF_ROWS'}
                $self->{'data'}

            the latter being an array ref of result rows. The
            argument $data is for private use by concrete subclasses
            and will be passed through to all methods. (It is
            intentionally not implemented as attribute: Otherwise we
            might well become self referencing data structures which
            could prevent garbage collection.)

    CREATE
    DROP
    INSERT
    UPDATE
    DELETE
    SELECT  Called by `execute' for doing the real work. Usually they
            create an SQL::Eval object by calling `$self-
            >open_tables()', call `$self->verify_columns()' and then
            do their job. Finally they return the triple

                ($self->{'NUM_OF_ROWS'}, $self->{'NUM_OF_FIELDS'},
                 $self->{'data'})

            so that execute can setup these attributes. Example:

                ($self->{'NUM_OF_ROWS'}, $self->{'NUM_OF_FIELDS'},
                 $self->{'data'}) = $self->SELECT($data);

    verify_columns
            Called for verifying the row names that are used in the
            statement. Example:

                $self->verify_columns($eval, $data);

    open_tables
            Called for creating an SQL::Eval object. In fact what it
            returns doesn't need to be derived from SQL::Eval, it's
            completely sufficient to implement the same interface of
            methods. See the SQL::Eval manpage for details. The
            arguments `$data', `$createMode' and `$lockMode' are
            corresponding to those of SQL::Eval::Table::open_table
            and usually passed through. Example:

                my $eval = $self->open_tables($data, $createMode, $lockMode);

            The eval object can be used for calling `$self-
            'verify_columns> or `$self-'eval_where>.

    open_table
            This method is completely abstract and *must* be
            implemented by subclasses. The default implementation of
            `$self-'open_tables> calls this method for any table
            used by the statement. See the `test.pl' script for an
            example of imlplementing a subclass.

SQL syntax
    The SQL::Statement module is far away from ANSI SQL or something
    similar, it is designed for implementing the DBD::CSV module.
    See the DBD::CSV(3) manpage.

    I do not want to give a formal grammar here, more an informal
    description: Read the statement definition in sql_yacc.y, if you
    need something precise.

    The main lexical elements of the grammar are:

    Integers
    Reals   Syntax obvious

    Strings Surrounded by either single or double quotes; some
            characters need to be escaped with a backslash, in
            particular the backslash itself (\\), the NUL byte (\0),
            Line feeds (\n), Carriage return (\r), and the quotes
            (\' or \").

    Parameters
            Parameters represent scalar values, like Integers, Reals
            and Strings do. However, their values are read inside
            Execute() and not inside Prepare(). Parameters are
            represented by question marks (?).

    Identifiers
            Identifiers are table or column names. Syntactically
            they consist of alphabetic characters, followed by an
            arbitrary number of alphanumeric characters. Identifiers
            like SELECT, INSERT, INTO, ORDER, BY, WHERE, ... are
            forbidden and reserved for other tokens.

    What it offers is the following:

  CREATE

    This is the CREATE TABLE command:

        CREATE TABLE $table ( $col1 $type1, ..., $colN $typeN,
                              [ PRIMARY KEY ($col1, ... $colM) ] )

    The column names are $col1, ... $colN. The column types can be
    `INTEGER', `CHAR(n)', `VARCHAR(n)', `REAL' or `BLOB'. These
    types are currently completely ignored. So is the (optional)
    `PRIMARY KEY' clause.

  DROP

    Very simple:

        DROP TABLE $table

  INSERT

    This can be

        INSERT INTO $table [ ( $col1, ..., $colN ) ]
            VALUES ( $val1, ... $valN )

  DELETE

        DELETE FROM $table [ WHERE $where_clause ]

    See the SELECT manpage below for a decsription of $where_clause

  UPDATE

        UPDATE $table SET $col1 = $val1, ... $colN = $valN
            [ WHERE $where_clause ]

    See the SELECT manpage below for a decsription of $where_clause

  SELECT

        SELECT [DISTINCT] $col1, ... $colN FROM $table
            [ WHERE $where_clause ] [ ORDER BY $ocol1, ... $ocolM ]

    The $where_clause is based on boolean expressions of the form
    $val1 $op $val2, with $op being one of '=', '<>', '>', '<',
    '>=', '<=', 'LIKE', 'CLIKE' or IS. You may use OR, AND and
    brackets to combine such boolean expressions or NOT to negate
    them.

INSTALLATION
    Like most other Perl modules, you simply do a

        perl Makefile.PL
        make                (nmake or dmake, if you are using Win32)
        make test           (Let me know, if any tests fail)
        make install

    Known problems are:

    *       Some flavours of SCO Unix don't seem to have alloca() or
            something similar. I recommend using gcc or egcs for
            compiling Perl and the SQL::Statement module: Both
            compilers have a builtin alloca().

            Another option could be to use external alloca.c, for
            example

              http://www.pu.informatik.th-darmstadt.de/FTP/pub/pu/alloca.c
              http://www.cs.purdue.edu/homes/young/src2www-example/alloca.c.html

            I did test neither of them and cannot give detailed
            instructions for including them into the SQL::Statement
            module. However, it should be sufficient to compile
            alloca.c with the same instructions than, for example,
            sql_yacc.c and finally repeat the linker command by
            inserting alloca.o after sql_yacc.o.

            Note that I cannot modify the sources to work without
            alloca(), as it is the bison parser that's using
            alloca() and I don't have the bison generated code in my
            hands.

            My thanks to Theo Petersen, <theo@acsp.com>, for
            pointing out this problem and the possible workarounds.

INTERNALS
    Internally the module is splitted into three parts:

  Perl-independent C part

    This part, contained in the files `sql_yacc.y', `sql_data.h',
    `sql_data.c' and `sql_op.c', is completely independent from
    Perl. It might well be used from within another script language,
    Tcl say, or from a true C application.

    You probably ask, why Perl independence? Well, first of all, I
    think this is a valuable target in itself. But the main reason
    was the impossibility to use the Perl headers inside bison
    generated code. The Perl headers export almost the complete Yacc
    interface to XS, for whatever reason, thus redefining constants
    and structures created by your own bison code. :-(

  Perl-dependent C part

    This is contained in `Statement.xs'. The both C parts
    communicate via a C structure sql_stmt_t. In fact, an
    SQL::Statement object is nothing else than a pointer to such a
    structure. The XS calls columns(), Table(), where(), ... do
    nothing more than fetching data from this structure and
    converting it to Perl objects. See the section on "The
    sql_stmt_t structure" below for details on the structure.

  Perl part

    Besides some stub functions for retrieving statement data, this
    is mainly the query processing with the exception of WHERE
    clause evaluation.

  The sql_stmt_t structure

    This structure is designed for optimal performance. A typical
    query will be parsed with only 4 or 5 malloc() calls; in
    particular no memory will be aquired for storing strings; only
    pointers into the query string are used.

    The statement stores its tokens in the values array. The array
    elements are of type sql_val_t, a union, that can represent the
    most interesting tokens; for example integers and reals are
    stored in the data.i and data.d parts of the union, strings are
    stored in the data.str part, columns in the data.col part and so
    on. Arrays are allocated in chunks of 64 elements, thus a single
    malloc() will be usually sufficient for allocating the complete
    array. Some types use pointers into the values array: For
    example, operations are stored in an sql_op_t structure that
    containes elements arg1 and arg2 which are pointers into the
    value table, pointing to other operations or scalars. These
    pointers are stored as indices, so that the array can be
    extended using realloc().

    The sql_stmt_t structure contains other arrays: columns, tables,
    rowvals, order, ... representing the data returned by the
    columns(), tables(), row_values() and order() methods. All of
    these contain pointers into the values array, again stored as
    integers.

    Arrays are initialized with the _InitArray call in
    SQL_Statement_Prepare and deallocated with _DestroyArray in
    SQL_Statement_Destroy. Array elements are obtained by calling
    _AllocData, which returns an index. The number -1 is used for
    errors or as a NULL value.

  The WHERE clause evaluation

    A WHERE clause is evaluated by calling
    SQL_Statement_EvalWhere(). This function is in the Perl
    independent part, but it needs the possibility to retrieve data
    from the Perl part, for example column or parameter values.
    These values are retrieved via callbacks, stored in the
    sql_eval_t structure. The field stmt->evalData points to such a
    structure. Of course the calling method can extend the
    sql_eval_t structure (like eval_where in Statement.xs does) to
    include private data not used by SQL_Statement_EvalWhere.

  Features

    Different parsers are implemented via the sql_parser_t
    structure. This is mainly a set of yes/no flags. If you'd like
    to add features, do the following:

    First of all, extend the sql_parser_t structure. If your feature
    is part of a certain statement, place it into the statements
    section, for example "select.join". Otherwise choose a section
    like "misc" or "general". (There's no particular for the section
    design, but structure never hurts.)

    Second, add your feature to sql_yacc.y. If your feature needs to
    extend the lexer, do it like this:

        if (FEATURE(misc, myfeature) {
            /*  Scan your new symbols  */
            ...
        }

    See the *BOOL* symbol as an example.

    If you need to extend the parser, do it like this:

        my_new_rule:
            /*  NULL, old behaviour, doesn't use my feature  */
            | my_feature
                { YFEATURE(misc, myfeature); }
        ;

    Thus all parsers not having FEATURE(misc, myfeature) set will
    produce a parse error here. Again, see the BOOL symbol for an
    example.

    Third thing is to extend the builtin parsers. If they support
    your feature, add a 1, otherwise a 0. Currently there are two
    builtin parsers: The *ansiParser* in sql_yacc.y and the
    sqlEvalParser in Statement.xs.

    Finally add support for your feature to the `feature' method in
    Statement.xs. That's it!

MULTITHREADING
    The complete module code is reentrant. In particular the parser
    is created with `%pure_parser'. See the bison(1) manpage for
    details on reentrant parsers. That means, the module is ready
    for multithreading, as long as you don't share handles between
    threads. Read-only handles, for example parsers, can even be
    shared.

    Statement handles cannot be shared among threads, at least not,
    if you don't grant serialized access. Per-thread handles are
    always safe.

AUTHOR AND COPYRIGHT
    This module is Copyright (C) 1998 by

        Jochen Wiedmann
        Am Eisteich 9
        72555 Metzingen
        Germany

        Email: joe@ispsoft.de
        Phone: +49 7123 14887

    All rights reserved.

    You may distribute this module under the terms of either the GNU
    General Public License or the Artistic License, as specified in
    the Perl README file.

SEE ALSO
    the DBI(3) manpage, the DBD::CSV(3) manpage