Date

I was wondering for quite a while if MySQLs C-API (libmysqlclient) can really do async queries and where it is lacking support.

Scanning through the source (sql/client.c) you see:

int mysql_real_query(MYSQL *mysql, const char *query, ulong length)
{

  if (mysql_send_query(mysql,query,length))
    return 1;

  return((int) (*mysql->methods->read_query_result)(mysql));
}

Digging a bit deeper you discover that mysql_send_query() and mysql_read_query_result() are public, but undocumented. Well, but that doesn't stop us.

When I talk about async queries I mean:

  • being able to issue several queries in parallel without using threads

That is usually what you need in event-driven applications and allows you to create parallel application without having to use too many system resources (memory, threads, ...).

As long as you use mysql_real_query() you can't get there. It waits until the resultset is received and you can not do anything in this application until the result-set is received. The usual way is "create a thread, do the work in the other thread". In the event-driven model you send your query to the server and ask the kernel (with select(), poll(), ...) to tell you when the server has something for you. In the meantime to you do something with the application (like send more queries) and check from time to time if something was returned from the server.

As an example I would expect a output like below:

async-clib.c:53: connected m->net.fd = 9
async-clib.c:53: connected m->net.fd = 10
async-clib.c:53: connected m->net.fd = 11
async-clib.c:53: connected m->net.fd = 14
async-clib.c:73: sending 'SELECT SLEEP(0.1 * 4)' on m->net.fd = 9
async-clib.c:73: sending 'SELECT SLEEP(0.1 * 3)' on m->net.fd = 10
async-clib.c:73: sending 'SELECT SLEEP(0.1 * 2)' on m->net.fd = 11
async-clib.c:73: sending 'SELECT SLEEP(0.1 * 1)' on m->net.fd = 14
async-clib.c:15: a result for m->net.fd = 14
async-clib.c:15: a result for m->net.fd = 11
async-clib.c:15: a result for m->net.fd = 10
async-clib.c:15: a result for m->net.fd = 9
  • 4 connections are opened
  • each sends a query with a different cost
  • we read the result back as it returns

The Code

 #include <string.h>
 #include <mysql.h>
 #include <glib.h>
 #include <event.h>

We want to use glib2 for some data structures (matter of taste) and libevent for the event-handling to make it portable (also a matter of taste).

To get everything going, we init libevent and setup some structures.

int main(int argc, char **argv) {
    GPtrArray *connections;
    GPtrArray *events;
    int i;
    struct event_base *evbase;

    evbase = event_base_new();

    connections = g_ptr_array_new();
    events = g_ptr_array_new();

... and start up the connections

    /* init all the connections */
    for (i = 0; i < 4; i++) {
            MYSQL *m;

            m = mysql_init(NULL);
            m->free_me = 1;

            if (!mysql_real_connect(m, "127.0.0.1", "root", "password", NULL, 3306, NULL, 0)) {
                    g_error("%s: %s (%d)",
                                    G_STRLOC,
                                    mysql_error(m),
                                    mysql_errno(m));
            }

            g_print("%s: connected m->net.fd = %d\n", G_STRLOC, m->net.fd);
            g_ptr_array_add(connections, m);
    }

... all are connected now and we can actually run queries on them:

    /* send a query to all connections
     *
     * the proof that it works, we give each of the query decreasing cost:
     * - 1st connection waits 4ms
     * - 2nd connection waits 3ms ...
     *
     * we should see that the queries return in reverse order
     */
    for (i = 0; i < connections->len; i++) {
            MYSQL *m = connections->pdata[i];
            struct event *ev;
            gchar *q = g_strdup_printf("SELECT SLEEP(0.1 * %d)", (connections->len - i));

            mysql_send_query(m, q, strlen(q));

            /* register the fd for this fd */
            ev = g_new0(struct event, 1);
            event_set(ev, m->net.fd, EV_READ, read_result, m);
            event_base_set(evbase, ev);

            event_add(ev, NULL); /* poke us when the server sends us a resultset */
            g_ptr_array_add(events, ev);

            g_print("%s: sending '%s' on m->net.fd = %d\n",
                            G_STRLOC,
                            q,
                            m->net.fd);

            g_free(q);
    }

... we call mysql_send_query() to send the query to the server and register the event in the event-pool. In case we can read something from the mysql-socket we want libevent to call our hook.

Everything is setup nicely. Let's wait until the server sends something back:

    /* the event-loop */
    while (0 == event_base_loop(evbase, 0)); /* event_base_loop() returns 1 no events are registered */

If all queries are handled, we just leave the loop, cleanup and exit:

    /* cleanup */
    for (i = 0; i < connections->len; i++) mysql_close(g_ptr_array_index(connections, i));
    g_ptr_array_free(connections, TRUE);

    for (i = 0; i < events->len; i++) g_free(g_ptr_array_index(events, i));
    g_ptr_array_free(events, TRUE);

    event_base_free(evbase);

    return 0;
}

Our callback now only has to read the query result and do something with it. As a dummy we store it and free it right away.

/**
 * handle the result-sets we receive from the server
 */
static void read_result(int fd, short event, void *_userdata) {
    MYSQL *m = _userdata;

    if (0 == mysql_read_query_result(m)) {
            MYSQL_RES *res = mysql_store_result(m);

            g_print("%s: a result for m->net.fd = %d\n",
                            G_STRLOC,
                            m->net.fd);

            mysql_free_result(res);
    } else {
            g_print("%s: mysql_read_query_result() failed for fd = %d: %s (%d)\n",
                            G_STRLOC,
                            m->net.fd,
                            mysql_error(m),
                            mysql_errno(m));
    }
}

That's it. All the pieces are there to create a nice little benchmarking tool that can create several thousand connections in parallel without eating too many resources.

For 10 connections the RSS is 1949k, for 100 connections it is 3212k.

Conclusion

Async Queries are possible with the mysqlclient lib and they are very efficient to write something like a load-generator.

But the API isn't complete at all. Several pieces are missing to make it fully async:

  • mysql_connect() is blocking only
  • mysql_send_query() doesn't seem to handle EAGAIN (send buffer is full)
  • I don't know if mysql_use_result() can be return EAGAIN easily

Even without the above available in the full non-blocking way, you can also get pretty far with mysql_send_query().


Comments

Enable javascript to load comments.