We are running MariaDB v 10.1.30, testing a script to run database maintenance script for defragmenting tables and rebuilding indexes using OPTIMIZE TABLE command by using the new 10.1.1 patch of setting innodb_defragment = 1.

I've tested Alter Table with Alogorithm = INPLACE, works fine but I'm trying to make use of innodb_defragment and use optimize to avoid creating temp files when the tables are being rebuilt as done by Alter table INPLACE algorithm.

On using Optimize, there are no temp tables created however the table gets locked not allowing concurrent connections which is not the case with Alter Table with Alogorithm = INPLACE, the documentation however mentions that the optimize is done using INPLACE algorithm.

https://mariadb.org/defragmenting-unused-space-on-innodb-tablespace/

Is this a bug or am i missing something here, please advise.

1

Best Answer


The benefit for speed is virtually nil.

  • A "point query" (where you have the key and can go directly to the row) depends on the depth of the BTree. For a million rows, the depth will be about 3. For a trillion rows, about 6. Optimizing a table is very unlikely to shrink the depth.

  • A "range scan" (BETWEEN, >, etc) walks across a block, looking at each row. Then it hops (via a link) to the next block until it has found all the rows needed. Sure, you will touch more blocks in an un-optimized table, but the bulk of the effort is in accessing each row.

The benefit for space is limited.

  • An INSERT may add to a non-full block or it may split a full block into two half-full blocks. Later, two adjacent, somewhat empty, blocks will be merged together. Hence, a BTree naturally gravitates toward a state where the average block is 69% full. That is, the benefit of OPTIMIZE TABLE for space is limited.

  • Phrased differently, OPTIMIZE might shrink the disk footprint for a table to only 69% of what it was, but subsequent operations will just grow the table again.

  • If you are using innodb_file_per_table=OFF, then OPTIMIZE cannot return the free blocks to the Operating system. Such blocks can be reused for future INSERTs.

OPTIMIZE TABLE is invasive.

  • It copies the table over, locking it during the process. This is unacceptable to sites that need 100% uptime.

  • If you are using replication, subsequent writes may stack up behind the OPTIMIZE, thereby making the Slave not up-to-the-second.

Big DELETEs

  • After deleting lots of rows, there may be benefit to OPTIMIZE, but check the 69% estimate.

  • If big deletes are a common occurrence, perhaps there are other things that you should be doing. See http://mysql.rjweb.org/doc.php/deletebig

History and internals

  • Old version did OPTIMIZE in a straightforward way: Create a new table (same schema); copy rows into it: rename table; drop. Writes could not be allowed.
  • ALGORITHM=INPLACE probably locks a few blocks, combines them to fill up one block, then slides forward. This requires some degree of locking. Based on the Question, it sounds like it simply locks the whole table.
  • Note that each BTree (the PK+Data, or a secondary index), could be 'optimized' independently. But no command allows for doing such for just the main BTree (PK+data). Optimizing a single secondary index can be done by DROP INDEX + ADD INDEX, but that loses the index. Instead, consider do a NOCOPY ADD INDEX, then INSTANT DROP INDEX. Caution: This could impact USE_INDEX or FORCE INDEX if you are using such.

(Caveat: This Answer applies to InnoDB, not MyISAM.)