MEMSQL vs. MySQL

Question

I need to start off by pointing out that by no means am I a database expert in any way. I do know how to get around to programming applications in several languages that require database backends, and am relatively familiar with MySQL, Microsoft SQL Server and now MEMSQL - but again, not an expert at databases so your input is very much appreciated.

I have been working on developing an application that has to cross reference several different tables. One very simple example of an issue I recently had, is I have to:

On a daily basis, pull down 600K to 1M records into a temporary table.
Compare what has changed between this new data pull and the old one. Record that information on a separate table.
Repopulate the table with the new records.

Running #2 is a query similar to:

SELECT * FROM (NEW TABLE) LEFT JOIN (OLD TABLE) ON (JOINED FIELD) WHERE (OLD TABLE.FIELD) IS NULL

In this case, I'm comparing the two tables on a given field and then pulling the information of what has changed.

In MySQL (v5.6.26, x64), my query times out. I'm running 4 vCPUs and 8 GB of RAM but note that the rest of my configuration is default configuration (did not tweak any parameters).

In MEMSQL (v5.5.8, x64), my query runs in about 3 seconds on the first try. I'm running the exact same virtual server configuration with 4 vCPUs and 8 GB of RAM, also note that the rest of my configuration is default configuration (did not tweak any parameters).

Also, in MEMSQL, I am running a single node configuration. Same thing for MySQL.

I love the fact that using MEMSQL allowed me to continue developing my project, and I'm coming across even bigger cross-table calculation queries and views that I can run that are running fantastically on MEMSQL... but, in an ideal world, i'd use MySQL. I've already come across the fact that I need to use a different set of tools to manage my instance (ie: MySQL Workbench works relatively well with a MEMSQL server but I actually need to build views and tables using the open source SQL Workbench and the mysql java adapter. Same thing for using the Visual Studio MySQL connector, works, but can be painful at times, for some reason I can add queries but can't add table adapters)... sorry, I'll submit a separate question for that :)

Considering both virtual machines are exactly the same configuration, and SSD backed, can anyone give me any recommendations on how to tweak my MySQL instance to run big queries like the one above on MySQL? I understand I can also create an in-memory database but I've read there might be some persistence issues with doing that, not sure.

Thank you!

Answer 1

The most likely reason this happens is because you don't have index on your joined field in one or both tables. According to this article:

https://www.percona.com/blog/2012/04/04/join-optimizations-in-mysql-5-6-and-mariadb-5-5/

Vanilla MySQL only supports nested loop joins, that require the index to perform well (otherwise they take quadratic time).

Both MemSQL and MariaDB support so-called hash join, which does not require you to have indexes on the tables, but consumes more memory. Since your dataset is negligibly small for modern RAM sizes, that extra memory overhead is not noticed in your case.

So all you need to do to address the issue is to add indexes on joined field in both tables.

Also, please describe the issues you are facing with the open source tools when connect to MemSQL in a separate question, or at chat.memsql.com , so that we can fix it in the next version (I work for MemSQL, and compatibility with MySQL tools is one of the priorities for us).

MEMSQL vs. MySQL

Question

1 answers

solution1
3 2015-09-25 16:59:21

MEMSQL vs. MySQL

Question

1 answers

solution1 3 2015-09-25 16:59:21

solution1
3 2015-09-25 16:59:21