In http://jan.kneschke.de/2007/8/1/mysql-proxy-learns-r-w-splitting we did a first implementation on idea of sending non-transactional reads to the slaves and writes and transactions to the master.

While that was mostly a proof of concept it already worked pretty well. In this second round on R/W splitting several issues are taken care of:

  • SELECT SQL_CALC_FOUND_ROWS ... + SELECT FOUND_ROWS()
  • INSERT ... + SELECT LAST_INSERT_ID()
  • default_db not in sync between client and backend

These changes should make Read/Write splitting more robust and make it a drop-in for more users.

Up to now we used the tutorial-keepalive.lua to implement the Read/Write-splitting. As the name says, it is a tutorial for connection keepalive, while rw-splitting is far more complex. To address this the rw-splitting is now a separate script in th lib/ folder which uses a set of base-classes we ship now.

$ cd trunk/
$ LUA_PATH="lib/?.lua" ./src/mysql-proxy \
  --proxy-lua-script=./lib/rw-splitting.lua \
  --proxy-read-only-backend-addresses=:3307 \
  --proxy-read-only-backend-addresses=:3308 \
  --proxy-read-only-backend-addresses=:3309

To start it we have to set LUA_PATH for now, you tell where it find the standard-proxy libs.

Debugging

By default the rw-splitting script will spit out a lot debug information about the different stages. You can either disable it in the script:

-- debug
proxy.global.config.is_debug = true

or by sending:

set GLOBAL rwsplit.debug = 0

on a connection.

[connect_server]
  [1].connected_clients = 0
  [1].idling_connections = 0
  [1].type = 1
  [1].state = 0
  [1] open new connection
[read_query]
  current backend   = 0
  client default db = repl
  client username   = root
  query             = select * from foo
  sending to backend : 127.0.0.1:3306
    is_slave         : false
    server default db: repl
    server username  : root
    in_trans        : false
    in_calc_found   : false
    COM_QUERY       : true

Improvements

keeping default-db in sync

One of the basic problems with rw-splitting is that each connection has a state, e.g. the default_db. If you switch to another backend you have to make sure that before we issue a SQL query, that also set the new default-db, if they are not in sync.

[read_query]
  current backend   = 0
  client default db = mysql
  client username   = root
  query             = select * from user
    server default db: repl
    client default db: mysql
    syncronizing

The client-side did a USE mysql against the master and wanted to SELECT from the a slave afterwards. As the connection the slave was still using repl from the previous query we have apply the DB-change now. For achieve this we insert a USE mysql before sending the SELECT to the slave.

In case the DB want to switch to doesn't exist on the slave, you will get an error like:

ERROR 1000 (00S00): can't change DB 'norepl' to on slave ':3307'

for the SELECT statement.

Stateful SQL Statements

Not all statements in MySQL are stateless and allow easy R/W splitting. Some of them need special support to make sure that still work:

  • SELECT SQL_CALC_FOUND_ROWS ... will lead to a SELECT FOUND_ROWS() which has to be executed on the same connection
  • INSERTing into table with auto_increment fields might lead to a SELECT LAST_INSERT_ID() which has to be execute on the same master-connection.
  • SHOW WARNINGS is similar

The solution is simple: don't give away the connection in that cases and don't send the SELECT FOUND_ROWS() to a slave. For sure there are some more statements which might not harmonize with r/w splitting. For example prepared statements. But that's another story.

5 Responses to “MySQL Proxy: more R/W splitting”

  1. safari Says:
    Hi Jan, It seems that these below lines do not make sense: -- we found the two special token, we can't find more if is_insert_id and is_in_select_calc_found_rows then break end I think it should be: if is_insert_id or is_in_select_calc_found_rows then ???
  2. Jan Kneschke Says:
    We might have `SELECT SQL_CALC_FOUND_ROWS LAST_INSERT_ID()` and expect to get both flags set. In case both are set in the loop, we can leave as this is all the loop does, setting those flags. Looping of the next tokens doesn't add anything. It just is just a tiny optimization.
  3. Pao Says:
    Hi Jan, Wow this is very nice, especially how you tokenized the queries so they're easier to manage. Will you continue to ship these base libraries as part of mysql-proxy?
  4. Jan Kneschke Says:
    Pao, the base libs are part of the packaging now and are in SVN.
  5. maple Says:
    Hi Jan, Found error in tokenizer any query with unclosed ' (select ') lead to panic with: file sql-tokenizer.l: line 408 (sql_token_append_last_token): assertion failed: (tokens->len > 0) last svn mysql-proxy

Sorry, comments are closed for this article.