We are running MariaDB v 10.1.30, testing a script to run database maintenance script for defragmenting tables and rebuilding indexes using OPTIMIZE TABLE command by using the new 10.1.1 patch of setting innodb_defragment = 1.
I've tested Alter Table with Alogorithm = INPLACE, works fine but I'm trying to make use of innodb_defragment and use optimize to avoid creating temp files when the tables are being rebuilt as done by Alter table INPLACE algorithm.
On using Optimize, there are no temp tables created however the table gets locked not allowing concurrent connections which is not the case with Alter Table with Alogorithm = INPLACE, the documentation however mentions that the optimize is done using INPLACE algorithm.
https://mariadb.org/defragmenting-unused-space-on-innodb-tablespace/
Is this a bug or am i missing something here, please advise.
Best Answer
The benefit for speed is virtually nil.
A "point query" (where you have the key and can go directly to the row) depends on the depth of the BTree. For a million rows, the depth will be about 3. For a trillion rows, about 6. Optimizing a table is very unlikely to shrink the depth.
A "range scan" (
BETWEEN
,>
, etc) walks across a block, looking at each row. Then it hops (via a link) to the next block until it has found all the rows needed. Sure, you will touch more blocks in an un-optimized table, but the bulk of the effort is in accessing each row.
The benefit for space is limited.
An
INSERT
may add to a non-full block or it may split a full block into two half-full blocks. Later, two adjacent, somewhat empty, blocks will be merged together. Hence, a BTree naturally gravitates toward a state where the average block is 69% full. That is, the benefit ofOPTIMIZE TABLE
for space is limited.Phrased differently,
OPTIMIZE
might shrink the disk footprint for a table to only 69% of what it was, but subsequent operations will just grow the table again.If you are using
innodb_file_per_table=OFF
, thenOPTIMIZE
cannot return the free blocks to the Operating system. Such blocks can be reused for futureINSERTs
.
OPTIMIZE TABLE
is invasive.
It copies the table over, locking it during the process. This is unacceptable to sites that need 100% uptime.
If you are using replication, subsequent writes may stack up behind the
OPTIMIZE
, thereby making the Slave not up-to-the-second.
Big DELETEs
After deleting lots of rows, there may be benefit to
OPTIMIZE
, but check the 69% estimate.If big deletes are a common occurrence, perhaps there are other things that you should be doing. See http://mysql.rjweb.org/doc.php/deletebig
History and internals
- Old version did
OPTIMIZE
in a straightforward way: Create a new table (same schema); copy rows into it: rename table; drop. Writes could not be allowed. ALGORITHM=INPLACE
probably locks a few blocks, combines them to fill up one block, then slides forward. This requires some degree of locking. Based on the Question, it sounds like it simply locks the whole table.- Note that each BTree (the PK+Data, or a secondary index), could be 'optimized' independently. But no command allows for doing such for just the main BTree (PK+data). Optimizing a single secondary index can be done by
DROP INDEX
+ADD INDEX
, but that loses the index. Instead, consider do aNOCOPY
ADD INDEX
, thenINSTANT
DROP INDEX
. Caution: This could impactUSE_INDEX
orFORCE INDEX
if you are using such.
(Caveat: This Answer applies to InnoDB, not MyISAM.)