NAME

    DBIx::OnlineDDL - Run DDL on online databases safely

VERSION

    version v0.940.0

SYNOPSIS

        use DBIx::OnlineDDL;
        use DBIx::BatchChunker;
    
        DBIx::OnlineDDL->construct_and_execute(
            rsrc          => $dbic_schema->source('Account'),
            ### OR ###
            dbi_connector => $dbix_connector_retry_object,
            table_name    => 'accounts',
    
            coderef_hooks => {
                # This is the phase where the DDL is actually run
                before_triggers => \&drop_foobar,
    
                # Run other operations right before the swap
                before_swap => \&delete_deprecated_accounts,
            },
    
            process_name => 'Dropping foobar from accounts',
    
            copy_opts => {
                chunk_size => 5000,
                debug => 1,
            },
        );
    
        sub drop_foobar {
            my $oddl  = shift;
            my $name  = $oddl->new_table_name;
            my $qname = $oddl->dbh->quote_identifier($name);
    
            # Drop the 'foobar' column, since it is no longer used
            $oddl->dbh_runner_do("ALTER TABLE $qname DROP COLUMN foobar");
        }
    
        sub delete_deprecated_accounts {
            my $oddl = shift;
            my $name = $oddl->new_table_name;
            my $dbh  = $oddl->dbh;  # only use for quoting!
    
            my $qname = $dbh->quote_identifier($name);
    
            DBIx::BatchChunker->construct_and_execute(
                chunk_size  => 5000,
    
                debug => 1,
    
                process_name     => 'Deleting deprecated accounts',
                process_past_max => 1,
    
                dbic_storage => $oddl->rsrc->storage,
                min_stmt => "SELECT MIN(account_id) FROM $qname",
                max_stmt => "SELECT MAX(account_id) FROM $qname",
                stmt     => join("\n",
                    "DELETE FROM $qname",
                    "WHERE",
                    "    account_type = ".$dbh->quote('deprecated')." AND",
                    "    account_id BETWEEN ? AND ?",
                ),
            );
        }

DESCRIPTION

    This is a database utility class for running DDL operations (like ALTER
    TABLE) safely on large tables. It has a similar scope as
    DBIx::BatchChunker, but is designed for DDL, rather than DML. It also
    has a similar function to other utilities like pt-online-schema-change
    <https://www.percona.com/doc/percona-toolkit/LATEST/pt-online-schema-ch
    ange.html> or gh-ost <https://github.com/github/gh-ost>, but actually
    works properly with foreign keys, and is written as a Perl module to
    hook directly into a DBI handle.

    Like most online schema change tools, this works by creating a new
    shell table that looks just like the old table, running the DDL changes
    (through the "before_triggers" hook), copying data to the new table,
    and swapping the tables. Triggers are created to keep the data in sync.
    See "STEP METHODS" for more information.

    The full operation is protected with an undo stack via
    Eval::Reversible. If any step in the process fails, the undo stack is
    run to return the DB back to normal.

    This module uses as many of the DBI info methods as possible, along
    with ANSI SQL in most places, to be compatible with multiple RDBMS. So
    far, it will work with MySQL or SQLite, but can be expanded to include
    more systems with a relatively small amount of code changes. (See
    DBIx::OnlineDDL::Helper::Base for details.)

    DISCLAIMER: You should not rely on this class to magically fix any and
    all locking problems the DB might experience just because it's being
    used. Thorough testing and best practices are still required.

 When you shouldn't use this module

  Online DDL is already available in the RDBMS

    If you're running MySQL 5.6+ without clustering, just use LOCK=NONE for
    every DDL statement. It is seriously simple and guarantees that the
    table changes you make are not going to lock the table, or it will fail
    right away to tell you it's an incompatible change.

    If you're running something like Galera clusters, this typically
    wouldn't be an option, as it would lock up the clusters while the ALTER
    TABLE statement is running, despite the LOCK=NONE statement. (Galera
    clusters were the prime motivation for writing this module.)

    Other RDBMSs may have support for online DDL as well. Check the
    documentation first. If they don't, patches for this tool are welcome!

  The operation is small

    Does your DDL only take 2 seconds? Just do it! Don't bother with trying
    to swap tables around, wasting time with full table copies, etc. It's
    not worth the time spent or risk.

  When you actually want to run DML, not DDL

    DBIx::BatchChunker is more appropriate for running DML operations (like
    INSERT, UPDATE, DELETE). If you need to do both, you can use the
    "before_triggers" hook for DDL, and the "before_swap" hook for DML. Or
    just run DBIx::BatchChunker after the OnlineDDL process is complete.

  Other online schema change tools fit your needs

    Don't have foreign key constraints and gh-ost is already working for
    you? Great! Keep using it.

ATTRIBUTES

 DBIC Attributes

  rsrc

    A DBIx::Class::ResultSource. This will be the source used for all
    operations, DDL or otherwise. Optional, but recommended for DBIC users.

    The DBIC storage handler's connect_info will be tweaked to ensure sane
    defaults and proper post-connection details.

  dbic_retry_opts

    A hashref of DBIC retry options. These options control how retry
    protection works within DBIC. Right now, this is just limited to
    max_attempts, which controls the number of times to retry. The default
    max_attempts is 20.

 DBI Attributes

  dbi_connector

    A DBIx::Connector::Retry object. Instead of DBI statement handles, this
    is the recommended non-DBIC way for OnlineDDL (and BatchChunker) to
    interface with the DBI, as it handles retries on failures. The
    connection mode used is whatever default is set within the object.

    Required, except for DBIC users, who should be setting "rsrc" above. It
    is also assumed that the correct database is already active.

    The object will be tweaked to ensure sane defaults, proper
    post-connection details, a custom retry_handler, and set a default
    max_attempts of 20, if not already set.

  table_name

    The table name to be copied and eventually replaced. Required unless
    "rsrc" is specified.

  new_table_name

    The new table name to be created, copied to, and eventually used as the
    final table. Optional.

    If not defined, a name will be created automatically. This might be the
    better route, since the default builder will search for an unused name
    in the DB right before OnlineDDL needs it.

 Progress Bar Attributes

  progress_bar

    The progress bar used for most of the process. A different one is used
    for the actual table copy with DBIx::BatchChunker, since that step
    takes longer.

    Optional. If the progress bar isn't specified, a default one will be
    created. If the terminal isn't interactive, the default
    Term::ProgressBar will be set to silent to naturally skip the output.

  progress_name

    A string used to assist in creating a progress bar. Ignored if
    "progress_bar" is already specified.

    This is the preferred way of customizing the progress bar without
    having to create one from scratch.

 Other Attributes

  coderef_hooks

    A hashref of coderefs. Each of these are used in different steps in the
    process. All of these are optional, but it is highly recommended that
    before_triggers is specified. Otherwise, you're not actually running
    any DDL and the table copy is essentially a no-op.

    All of these triggers pass the DBIx::OnlineDDL object as the only
    argument. The "new_table_name" can be acquired from that and used in
    SQL statements. The "dbh_runner" and "dbh_runner_do" methods should be
    used to protect against disconnections or locks.

    There is room to add more hooks here, but only if there's a good reason
    to do so. (Running the wrong kind of SQL at the wrong time could be
    dangerous.) Create a GitHub issue if you can think of one.

   before_triggers

    This is called before the table triggers are applied. Your DDL should
    take place here, for a few reasons:

        1. The table is empty, so DDL should take no time at all now.
    
        2. After this hook, the table is reanalyzed to make sure it has an accurate picture
        of the new columns.  This is critical for the creation of the triggers.

   before_swap

    This is called after the new table has been analyzed, but before the
    big table swap. This hook might be used if a large DML operation needs
    to be done while the new table is still available. If you use this
    hook, it's highly recommended that you use something like
    DBIx::BatchChunker to make sure the changes are made in a safe and
    batched manner.

  copy_opts

    A hashref of different options to pass to DBIx::BatchChunker, which is
    used in the "copy_rows" step. Some of these are defined automatically.
    It's recommended that you specify at least these options:

        chunk_size  => 5000,     # or whatever is a reasonable size for that table
        id_name     => 'pk_id',  # especially if there isn't an obvious integer PK

    Specifying "coderef" in DBIx::BatchChunker is not recommended, since
    Active DBI Processing mode will be used.

    These options will be included into the hashref, unless specifically
    overridden by key name:

        id_name      => $first_pk_column,  # will warn if the PK is multi-column
        target_time  => 1,
        sleep        => 0.5,
    
        # If using DBIC
        dbic_storage => $rsrc->storage,
        rsc          => $id_rsc,
        dbic_retry_opts => {
            max_attempts  => 20,
            # best not to change this, unless you know what you're doing
            retry_handler => $onlineddl_retry_handler,
        },
    
        # If using DBI
        dbi_connector => $oddl->dbi_connector,
        min_stmt      => $min_sql,
        max_stmt      => $max_sql,
    
        # For both
        count_stmt    => $count_sql,
        stmt          => $insert_select_sql,
        progress_name => $copying_msg,

  db_timeouts

    A hashref of timeouts used for various DB operations, and usually set
    at the beginning of each connection. Some of these settings may be
    RDBMS-specific.

   lock_file

    Amount of time (in seconds) to wait when attempting to acquire
    filesystem locks (on filesystems which support locking). Float or
    fractional values are allowed. This currently only applies to SQLite.

    Default value is 1 second. The downside is that the SQLite default is
    actually 0, so other (non-OnlineDDL) connections should have a setting
    that is more than that to prevent lock contention.

   lock_db

    Amount of time (in whole seconds) to wait when attempting to acquire
    table and/or database level locks before falling back to retry.

    Default value is 60 seconds.

   lock_row

    Amount of time (in whole seconds) to wait when attempting to acquire
    row-level locks, which apply to much lower-level operations than
    "lock_db". At this scope, the lesser of either of these two settings
    will take precedence.

    Default value is 2 seconds. Lower values are preferred for row lock
    wait timeouts, so that OnlineDDL is more likely to be the victim of
    lock contention. OnlineDDL can simply retry the connection at that
    point.

   session

    Amount of time (in whole seconds) for inactive session timeouts on the
    database side.

    Default value is 28,800 seconds (8 hours), which is MySQL's default.

  reversible

    A Eval::Reversible object, used for rollbacks. A default will be
    created, if not specified.

CONSTRUCTORS

    See "ATTRIBUTES" for information on what can be passed into these
    constructors.

 new

        my $online_ddl = DBIx::OnlineDDL->new(...);

    A standard object constructor. If you use this constructor, you will
    need to manually call "execute" to execute the DB changes.

    You'll probably just want to use "construct_and_execute".

 construct_and_execute

        my $online_ddl = DBIx::OnlineDDL->construct_and_execute(...);

    Constructs a DBIx::OnlineDDL object and automatically calls each method
    step, including hooks. Anything passed to this method will be passed
    through to the constructor.

    Returns the constructed object, post-execution. This is typically only
    useful if you want to inspect the attributes after the process has
    finished. Otherwise, it's safe to just ignore the return and throw away
    the object immediately.

METHODS

 Step Runners

  execute

    Runs all of the steps as documented in "STEP METHODS". This also
    includes undo protection, in case of exceptions.

  fire_hook

        $online_ddl->fire_hook('before_triggers');

    Fires one of the coderef hooks, if it exists. This also updates the
    progress bar.

    See "coderef_hooks" for more details.

 DBI Helpers

  dbh

        $online_ddl->dbh;

    Acquires a database handle, either from "rsrc" or "dbi_connector". Not
    recommended for active work, as it doesn't offer retry protection.
    Instead, use "dbh_runner" or "dbh_runner_do".

  dbh_runner

        my @items = $online_ddl->dbh_runner(run => sub {
            my $dbh = $_;  # or $_[0]
            $dbh->selectall_array(...);
        });

    Runs the $coderef, locally setting $_ to and passing in the database
    handle. This is essentially a shortcut interface into either
    dbi_connector or DBIC's BlockRunner.

    The first argument can either be run or txn, which controls whether to
    wrap the code in a DB transaction or not. The return is passed directly
    back, and return context is honored.

  dbh_runner_do

        $online_ddl->dbh_runner_do(
            "ALTER TABLE $table_name ADD COLUMN foobar",
            ["ALTER TABLE ? DROP COLUMN ?", undef, $table_name, 'baz'],
        );

    Runs a list of commands, encapsulating each of them in a "dbh_runner"
    coderef with calls to "do" in DBI. This is handy when you want to run a
    list of DDL commands, which you don't care about the output of, but
    don't want to bundle them into a single non-idempotant repeatable
    coderef. Or if you want to save typing on a single do-able SQL command.

    The items can either be a SQL string or an arrayref of options to pass
    to "do" in DBI.

    The statement is assumed to be non-transactional. If you want to run a
    DB transaction, you should use "dbh_runner" instead.

STEP METHODS

    You can call these methods individually, but using
    "construct_and_execute" instead is highly recommended. If you do run
    these yourself, the exception will need to be caught and the
    "reversible" undo stack should be run to get the DB back to normal.

 create_new_table

    Creates the new table, making sure to preserve as much of the original
    table properties as possible.

 create_triggers

    Creates triggers on the original table to make sure any new changes are
    captured into the new table.

 copy_rows

    Fires up a DBIx::BatchChunker process to copy all of the rows from the
    old table to the new.

 swap_tables

    With the new table completely modified and set up, this swaps the
    old/new tables.

 drop_old_table

    Drops the old table. This will also remove old foreign keys on child
    tables. (Those FKs are re-applied to the new table in the next step.)

 cleanup_foreign_keys

    Clean up foreign keys on both the new and child tables.

SEE ALSO

      * Percona's pt-online-schema-change
      <https://www.percona.com/doc/percona-toolkit/LATEST/pt-online-schema-
      change.html>

      * GitHub's gh-ost <https://github.com/github/gh-ost>

      * Facebook's OSC
      <https://www.facebook.com/notes/mysql-at-facebook/online-schema-chang
      e-for-mysql/430801045932/>

      * MySQL's Online DDL
      <https://dev.mysql.com/doc/refman/5.6/en/innodb-online-ddl.html>

WHY YET ANOTHER OSC?

    The biggest reason is that none of the above fully support foreign key
    constraints. Percona's pt-osc comes close, but also includes this
    paragraph:

        Due to a limitation in MySQL, foreign keys will not have the same names after the ALTER
        that they did prior to it. The tool has to rename the foreign key when it redefines it,
        which adds a leading underscore to the name. In some cases, MySQL also automatically
        renames indexes required for the foreign key.

    So, tables swapped with pt-osc are not exactly what they used to be
    before the swap. It also had a number of other quirks that just didn't
    work out for us, related to FKs and the amount of switches required to
    make it (semi-)work.

    Additionally, by making DBIx::OnlineDDL its own Perl module, it's a lot
    easier to run Perl-based schema changes along side DBIx::BatchChunker
    without having to switch between Perl and CLI. If other people want to
    subclass this module for their own environment-specific quirks, they
    have the power to do so, too.

AUTHOR

    Grant Street Group <developers@grantstreet.com>

COPYRIGHT AND LICENSE

    This software is Copyright (c) 2018 - 2021 by Grant Street Group.

    This is free software, licensed under:

      The Artistic License 2.0 (GPL Compatible)