TokuDB is a transactional, fully ACID-compliant storage engine that uses fractal trees for data and indexes instead of MySQL's standard B-tree implementation. Thanks to fractal trees and TokuDB's message-based architecture, TokuDB data and indexes do not fragment, columns and indexes can be added and removed completely online, and, unlike InnoDB and XtraDB, TokuDB does not fall apart when indexes no longer fit into memory.
We spoke with Martin Farach-Colton, co-founder and Chief Scientist of Tokutek, about the TokuDB storage engine, including the new features in TokuDB 5.0, which was announced on Tuesday.
Percona's detailed review of TokuDB, including some great graphs and these sound-bite quotes:
"What makes fractal indexes so interesting is the amount of IO operations to update index tree is significantly less than for usual B-Tree index."
"One consequence of having such fast indexes, is that you can maintain a richer set of indexes at a given incoming data rate, enabling much higher query performance. "
Note that TokuDB is now fully transactional and supports more than just the SERIALIZABLE isolation level. The review is almost 2 years old, so some of its points, like this one, are no longer issues. Similarly, recovery logs now exist.
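The I/O claim in the first quote can be illustrated with a toy model. This is only a sketch, and it is not TokuDB's actual implementation: it assumes a two-level index with a single message buffer at the root, whereas real fractal trees have many levels and flush a buffer toward one child at a time. The function name count_leaf_writes and its parameters are made up for illustration. It counts how many leaf writes are needed when every insert goes straight to its leaf, versus when inserts are queued as messages and flushed in batches.

```python
import random


def count_leaf_writes(keys, num_leaves, buffer_size):
    """Return (unbuffered_writes, buffered_writes) for a toy two-level index."""
    key_space = max(keys) + 1

    def leaf_of(key):
        # Map a key to one of num_leaves equal-width key ranges.
        return key * num_leaves // key_space

    # Unbuffered: every insert is applied to its leaf immediately,
    # so each insert costs one leaf write.
    unbuffered = len(keys)

    # Buffered: inserts queue up as messages at the root. When the buffer
    # fills, it is flushed, and each leaf that receives at least one
    # message is written once for the whole batch.
    buffered = 0
    buffer = []
    for key in keys:
        buffer.append(key)
        if len(buffer) == buffer_size:
            buffered += len({leaf_of(m) for m in buffer})
            buffer.clear()
    if buffer:
        # Flush whatever is left at the end.
        buffered += len({leaf_of(m) for m in buffer})
    return unbuffered, buffered


random.seed(0)
keys = [random.randrange(1_000_000) for _ in range(10_000)]
plain, batched = count_leaf_writes(keys, num_leaves=100, buffer_size=1_000)
print("unbuffered leaf writes:", plain)
print("buffered leaf writes:  ", batched)
```

With 10,000 random inserts, 100 leaves, and a 1,000-message buffer, the buffered version can do at most 10 flushes of at most 100 leaf writes each, so it needs no more than 1,000 writes against 10,000 for the unbuffered version. The savings come purely from batching many updates into each write, which is the general idea behind the quote.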
I did some tests in Nov 2009 - no pretty graphs, but there are great charts of numbers in there.
Zardosht explains what a Fractal Tree is at the March 2011 Boston MySQL User Group meeting.
Tokutek sessions at Collaborate:
Understanding Indexing without needing to know about data structures, by Zardosht Kasheff - Monday, April 11 from 10:30 am - 11:30 am
How Fractal Trees Speed Up Trickle Loading While Maintaining Indexes by Bradley Kuszmaul - Monday, April 11 from 2:30 pm - 3:30 pm
Exploiting Fast Indexing in MySQL by Bradley Kuszmaul - Monday, April 11 from 3:45 pm - 4:45 pm
Tokutek session at the O'Reilly MySQL Conference:
Understanding Indexing without needing to know about data structures, by Zardosht Kasheff - Tuesday, April 12 from 11:55 am - 12:40 pm
In this week’s ear candy, we talk about how to monitor a long-running batch of inserts with a simple SHOW statement. (Unfortunately, the command is not SHOW PROCESSLIST, although that does work in TokuDB!)
Where you can see us
On Thursday, April 7, 2011 at 6:30 PM, Giuseppe Maxia will speak at the San Francisco MySQL User Group about "Advanced Replication for the Masses". The talk will cover how to do advanced replication tasks not possible with the standard MySQL build, such as allowing a slave to have more than one master and using a multi-threaded process to apply SQL statements to the slave.
Sheeri will be at Collaborate in Orlando, Florida from Sunday April 10th through Thursday April 14th. She is organizing the Community dinner, eastern US edition on Sunday April 10th at 5 pm at Maggiano's Little Italy near the Convention Center.
Sarah will be at the O'Reilly MySQL Conference in Santa Clara, California from Monday, April 11th through Thursday April 14th, including the community dinner, western US edition on Monday April 11th at 7 pm at Pedro's.
Sheeri will be speaking about monitoring MySQL efficiently with Nagios at the Professional IT Community Conference, otherwise known as PICC, in New Brunswick, New Jersey. The conference is Friday, April 29 – Saturday April 30, 2011.
Sheeri will be at OpenDBCamp, Fri May 6th - Sun May 8th in Sardinia, Italy.
Sarah will be at the Velocity Conference, Tuesday, June 14 through Thursday, June 16 in Santa Clara, California, speaking about "Where is your data cached (and where should it be cached)?"