MySQL Proxy: more R/W splitting

In MySQL Proxy learns R/W splitting we did a first implementation on idea of sending non-transactional reads to the slaves and writes and transactions to the master.

While that was mostly a proof of concept it already worked pretty well. In this second round on R/W splitting several issues are taken care of:

SELECT SQL_CALC_FOUND_ROWS ... + SELECT FOUND_ROWS()
INSERT ... + SELECT LAST_INSERT_ID()
default_db not in sync between client and backend

These changes should make Read/Write splitting more robust and make it a drop-in for more users.

Up to now we used the tutorial-keepalive.lua to implement the Read/Write-splitting. As the name says, it is a tutorial for connection keepalive, while rw-splitting is far more complex. To address this the rw-splitting is now a separate script in th lib/ folder which uses a set of base-classes we ship now.

$ cd trunk/
$ LUA_PATH="lib/?.lua" ./src/mysql-proxy \
  --proxy-lua-script=./lib/rw-splitting.lua \
  --proxy-read-only-backend-addresses=:3307 \
  --proxy-read-only-backend-addresses=:3308 \
  --proxy-read-only-backend-addresses=:3309

To start it we have to set LUA_PATH for now, you tell where it find the standard-proxy libs.

Debugging

By default the rw-splitting script will spit out a lot debug information about the different stages. You can either disable it in the script:

-- debug
proxy.global.config.is_debug = true

or by sending:

set GLOBAL rwsplit.debug = 0

on a connection.

[connect_server]
  [1].connected_clients = 0
  [1].idling_connections = 0
  [1].type = 1
  [1].state = 0
  [1] open new connection
[read_query]
  current backend   = 0
  client default db = repl
  client username   = root
  query             = select * from foo
  sending to backend : 127.0.0.1:3306
    is_slave         : false
    server default db: repl
    server username  : root
    in_trans        : false
    in_calc_found   : false
    COM_QUERY       : true

Improvements

keeping default-db in sync

One of the basic problems with rw-splitting is that each connection has a state, e.g. the default_db. If you switch to another backend you have to make sure that before we issue a SQL query, that also set the new default-db, if they are not in sync.

[read_query]
  current backend   = 0
  client default db = mysql
  client username   = root
  query             = select * from user
    server default db: repl
    client default db: mysql
    syncronizing

The client-side did a USE mysql against the master and wanted to SELECT from the a slave afterwards. As the connection the slave was still using repl from the previous query we have apply the DB-change now. For achieve this we insert a USE mysql before sending the SELECT to the slave.

In case the DB want to switch to doesn't exist on the slave, you will get an error like:

ERROR 1000 (00S00): can't change DB 'norepl' to on slave ':3307'

for the SELECT statement.

Stateful SQL Statements

Not all statements in MySQL are stateless and allow easy R/W splitting. Some of them need special support to make sure that still work:

SELECT SQL_CALC_FOUND_ROWS ... will lead to a SELECT FOUND_ROWS() which has to be executed on the same connection
INSERTing into table with auto_increment fields might lead to a SELECT LAST_INSERT_ID() which has to be execute on the same master-connection.
SHOW WARNINGS is similar

The solution is simple: don't give away the connection in that cases and don't send the SELECT FOUND_ROWS() to a slave. For sure there are some more statements which might not harmonize with r/w splitting. For example prepared statements. But that's another story.

Debugging

Improvements

keeping default-db in sync

Stateful SQL Statements

Comments