Data Warehousing Best Practices: Comparing Oracle to MySQL, part 1 (introduction and power)
At Kscope this year, I attended a half-day, in-depth session entitled Data Warehousing Performance Best Practices, given by Maria Colgan of Oracle. My impression, which was confirmed by folks in the Oracle world, is that she knows her way around the Oracle optimizer.
These are my notes from the session. They compare how Oracle works (which Maria explained) with how MySQL works (which I researched myself, and which is why this blog post took a month after the conference to write). Note that I am not an expert on data warehousing in either Oracle or MySQL, so these are more concepts to think about than hard-and-fast advice. In some places I still have questions, and I am happy to have folks comment and contribute what they know.
Determining I/O throughput for a system
At Kscope this year, I attended a half-day, in-depth session entitled Data Warehousing Performance Best Practices, given by Maria Colgan of Oracle. One section of that session covered how to determine I/O throughput for a system, because in data warehousing the number of I/O operations per second (IOPS) matters less than I/O throughput (how much data actually moves, not just how many reads and writes occur).
The section covered two tools: an Oracle-specific in-database tool, and a standalone tool that can be used on many operating systems whether or not a database is installed:
If Oracle is installed, run DBMS_RESOURCE_MANAGER.CALIBRATE_IO:
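For readers who have not used it, a call to this procedure might look roughly like the PL/SQL block below. This is a minimal sketch and not necessarily the invocation shown in the session: the disk count (4) and the maximum tolerated latency (10 ms) are placeholder assumptions that need to be replaced with values for the system being tested, and the variable names are made up for illustration. The procedure typically needs a privileged account and asynchronous I/O enabled on the datafiles.

```sql
-- Sketch only: the input values below are assumptions, not recommendations.
SET SERVEROUTPUT ON
DECLARE
  l_max_iops       PLS_INTEGER;  -- OUT: maximum sustainable I/O requests per second
  l_max_mbps       PLS_INTEGER;  -- OUT: maximum sustainable throughput in MB per second
  l_actual_latency PLS_INTEGER;  -- OUT: average latency observed, in milliseconds
BEGIN
  DBMS_RESOURCE_MANAGER.CALIBRATE_IO(
    num_physical_disks => 4,     -- assumed number of physical disks
    max_latency        => 10,    -- assumed maximum tolerable latency, in ms
    max_iops           => l_max_iops,
    max_mbps           => l_max_mbps,
    actual_latency     => l_actual_latency);

  DBMS_OUTPUT.PUT_LINE('max_iops       = ' || l_max_iops);
  DBMS_OUTPUT.PUT_LINE('max_mbps       = ' || l_max_mbps);
  DBMS_OUTPUT.PUT_LINE('actual_latency = ' || l_actual_latency);
END;
/
```

For data warehousing, the max_mbps figure is the one that corresponds to the throughput discussed above.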
Online Verification That Master and Slaves are in Sync
In October 2008, Baron posted How to Check MySQL Replication Integrity Continually. Here at Pythian we have developed a method based on that post, and added “verifying that masters and slaves are in sync” to our standard battery of tests.
We call it “Continual replication sync checking”. This article will explain how it works, how to test and make the procedure non-blocking, benchmarks from the real world, issues we encountered along the way, and finally Pythian’s procedure of setting up continual replication sync in a new environment.
Webinar: What you need to know for a MySQL 5.0 -> 5.1 upgrade
IOUG has a free series of three webinars on upgrading MySQL. Each webinar is an hour long, and the series starts with one by me tomorrow at 12 noon Central time (GMT-5) on “Why and How to Upgrade to MySQL 5.1”. The webinar assumes you are upgrading from MySQL 5.0 to MySQL 5.1, and talks a little bit about the new features, server variables, and what you need to know when upgrading to MySQL 5.1.
The software used is GoToWebinar (formerly GoToMeeting), so you will need to install that software. To register, use the links on the IOUG MySQL Upgrade Webinar Series page.
Using MySQL Partitioning Instead of MERGE Tables
One common question I get is how to use partitioning instead of MERGE tables. The process I use involves using stored procedures to create and drop partitions. This article will go over the stored procedures I use; special thanks to Roland Bouman for taking a look and giving great feedback to optimize this process.
First, a simple table, not partitioned (yet):
USE test;
DROP TABLE IF EXISTS my_part;
CREATE TABLE IF NOT EXISTS my_part (
  id int NOT NULL,
  creationDate datetime NOT NULL,
  PRIMARY KEY (id, creationDate)
) ENGINE=InnoDB;
In real life, there is more to the table than just id and creationDate. The most important part is that the field(s) used for partitioning must be part of the primary key.
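To make the primary key requirement concrete, here is a minimal sketch of what partitioning this table by day could look like, along with the kind of statements a stored procedure would issue to add and drop partitions. The partition names and dates are made up for illustration, and this is not the stored procedure code itself, just the underlying statements under those assumptions.

```sql
-- Assumes the my_part table defined above already exists.
-- Partition names and dates are hypothetical.
ALTER TABLE my_part
  PARTITION BY RANGE (TO_DAYS(creationDate)) (
    PARTITION p20100801 VALUES LESS THAN (TO_DAYS('2010-08-02')),
    PARTITION p20100802 VALUES LESS THAN (TO_DAYS('2010-08-03'))
  );

-- Add a partition for the next day (what a "create" procedure would run).
ALTER TABLE my_part
  ADD PARTITION (
    PARTITION p20100803 VALUES LESS THAN (TO_DAYS('2010-08-04'))
  );

-- Drop the oldest partition and its rows (what a "drop" procedure would run).
ALTER TABLE my_part DROP PARTITION p20100801;

-- If the primary key were only (id), the first ALTER TABLE would be rejected,
-- because creationDate appears in the partitioning expression but not in the key.
```

Because there is no catch-all MAXVALUE partition in this sketch, an insert with a creationDate beyond the last partition fails until a new partition is added, which is one reason to automate partition creation.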