
How can I speed up a MySQL update?

I have a users table with a datetime field last_seen_at. Updating this field takes around 120ms, and I'd like it to be a lot quicker, as I do it on pretty much every page load on my site. I can't work out why it's so slow: there are around 55,000 records, which shouldn't be problematically large (I'd have thought).

Here's the table info:

mysql> show table status like 'users';
+-------+--------+---------+------------+-------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+
| Name  | Engine | Version | Row_format | Rows  | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time | Check_time | Collation       | Checksum | Create_options | Comment |
+-------+--------+---------+------------+-------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+
| users | InnoDB |      10 | Compact    | 55609 |            954 |    53051392 |               0 |     43352064 |  26214400 |          67183 | 2015-09-22 13:12:13 | NULL        | NULL       | utf8_general_ci |     NULL |                |         |
+-------+--------+---------+------------+-------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+-----------------+----------+----------------+---------+

mysql> desc users;
+---------------------------------+--------------+------+-----+-----------------+----------------+
| Field                           | Type         | Null | Key | Default         | Extra          |
+---------------------------------+--------------+------+-----+-----------------+----------------+
| id                              | int(11)      | NO   | PRI | NULL            | auto_increment |
| last_seen_at                    | datetime     | YES  | MUL | NULL            |                |
+---------------------------------+--------------+------+-----+-----------------+----------------+

mysql> show indexes from users;
+-------+------------+------------------------------------------------+--------------+---------------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name                                       | Seq_in_index | Column_name                     | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+------------------------------------------------+--------------+---------------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| users |          0 | PRIMARY                                        |            1 | id                              | A         |       57609 |     NULL | NULL   |      | BTREE      |         |               |
| users |          1 | index_users_on_last_seen_at                    |            1 | last_seen_at                    | A         |       57609 |     NULL | NULL   | YES  | BTREE      |         |               |
+-------+------------+------------------------------------------------+--------------+---------------------------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

As you can see, I've already got an index on the last_seen_at column. I've omitted all other columns (apart from id) for clarity's sake.

When I update last_seen_at I do it like so:

update users set last_seen_at = '2015-10-05 12:34:45' where id = 1182;

MySQL server info: Server version: 5.5.44-0ubuntu0.12.04.1 (Ubuntu)

Is there anything I can do to speed up the update?

EDIT - I'd previously said the query was taking 700ms. It's actually more like 120ms; sorry, I was looking at the wrong query. This still feels a bit too long, though. Is this actually a reasonable write time after all?

EDIT - all my timings come from manually entering SQL queries in the mysql shell client. I do use MySQL in my Ruby on Rails web app, but that app is not involved for the purposes of this question: I'm purely looking at the database level.

Well, you appear to be performing the update in the most efficient manner - i.e. using the primary key on the table - so there is not much that can be done there. Assuming the 120ms to update is purely the time taken by the db server (as opposed to the round trip in the web page), I can only think of a few things that might help:

  • You have indexed the column being updated - that typically adds a little time to the update, as the index has to be maintained. I see that you need to use that column, so you can't get rid of the index; but if you could, you might well see better performance.

  • Batching updates is sometimes a good way of avoiding the real-time performance hit while still achieving what you want.
    You could have the web-triggered insert go into a holding table with a timestamp field, then (offline) batch-update the real data. See https://dba.stackexchange.com/questions/28282/whats-the-most-efficient-way-to-batch-update-queries-in-mysql for an example batch update statement.

  • DB optimisation may help, but only if the db is not in good shape already - so things like memory allocation, tablespace fragmentation, buffer pools, etc.
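The holding-table idea in the second bullet can be sketched as follows. This is a minimal illustration using an in-memory SQLite database as a stand-in for MySQL; the `last_seen_log` table and its columns are hypothetical names, not from the original question.

```python
import sqlite3

# In-memory SQLite as a stand-in for the MySQL database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, last_seen_at TEXT);
    CREATE TABLE last_seen_log (user_id INTEGER, seen_at TEXT);  -- holding table
    INSERT INTO users VALUES (1182, NULL);
""")

# Each page load does a cheap INSERT into the holding table...
conn.execute("INSERT INTO last_seen_log VALUES (1182, '2015-10-05 12:34:45')")
conn.execute("INSERT INTO last_seen_log VALUES (1182, '2015-10-05 12:40:00')")

# ...and an offline job periodically folds the newest timestamp per user
# back into users, then clears the log.
conn.executescript("""
    UPDATE users
    SET last_seen_at = (SELECT MAX(seen_at) FROM last_seen_log
                        WHERE user_id = users.id)
    WHERE id IN (SELECT user_id FROM last_seen_log);
    DELETE FROM last_seen_log;
""")

print(conn.execute("SELECT last_seen_at FROM users WHERE id = 1182").fetchone()[0])
# prints 2015-10-05 12:40:00
```

The page-load path becomes a plain append-style insert with no index maintenance on users, and the expensive index update happens in one batch off the request path.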

Good luck!

There is not much you can do about this. You already have an index on your column, and it just takes some time to find the row using the index and update it.

The index might be fragmented, which will slow down your lookup. You can update the index statistics with ANALYZE TABLE, or rebuild the table and its indexes with OPTIMIZE TABLE.

An option might be to delay the update, or to prevent it from blocking page building, by using some asynchronous / background task in the programming environment you are using (aka fire-and-forget).
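A fire-and-forget setup might look like the sketch below: the request handler only enqueues the event (cheap and non-blocking) and a background worker performs the slow UPDATE. `apply_update` is a hypothetical placeholder for the real database call.

```python
import queue
import threading

events = queue.Queue()
applied = []  # stands in for rows actually written to the database

def apply_update(user_id, seen_at):
    # Real code would run: UPDATE users SET last_seen_at = ? WHERE id = ?
    applied.append((user_id, seen_at))

def worker():
    # Background worker: drain the queue until the shutdown sentinel arrives.
    while True:
        item = events.get()
        if item is None:
            break
        apply_update(*item)
        events.task_done()

t = threading.Thread(target=worker, daemon=True)
t.start()

# During a page load: just enqueue and return immediately.
events.put((1182, "2015-10-05 12:34:45"))

events.put(None)  # shut the worker down for this demo
t.join()
print(applied)  # [(1182, '2015-10-05 12:34:45')]
```

In a Rails app the same role would typically be played by a job queue (e.g. a background-job library) rather than a raw thread, but the principle is identical: the 120ms write no longer sits on the request path.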

Write user events (the id plus a now() equivalent) to a log file. Process the log file from another process, such as a CREATE EVENT, or entirely in another programming language such as Java - you name it. Let's call that the worker process (wp).

So the user operates in an environment where the activity occurs, but does not endure the blocking overhead of the update call slowing his/her UX (user experience). Blocking means they wait. Rather, the activity is logged much more quickly, such as with an fwrite (language specific) to a log file.

The log file (open for append) concept can be deployed to a dedicated directory that has either all user activity in one file, or one file per user. In the latter case, the wp has an easy task: just get the last line logged for the single update statement. For instance, if there are 11 lines in there, there is 1 update call, not 11.
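The one-file-per-user variant can be sketched like this; the directory layout and function names are made up for illustration.

```python
import os
import tempfile

# Hypothetical dedicated directory for per-user activity logs.
log_dir = tempfile.mkdtemp()

def record_seen(user_id, seen_at):
    # What the page load does: a cheap open-for-append write.
    with open(os.path.join(log_dir, f"{user_id}.log"), "a") as f:
        f.write(seen_at + "\n")

def latest_seen(user_id):
    # What the worker process (wp) does: only the last line matters,
    # so 11 logged lines still produce just 1 UPDATE.
    with open(os.path.join(log_dir, f"{user_id}.log")) as f:
        return f.readlines()[-1].strip()

record_seen(1182, "2015-10-05 12:34:45")
record_seen(1182, "2015-10-05 12:40:00")

# The wp would then issue a single statement, e.g.:
# UPDATE users SET last_seen_at = '2015-10-05 12:40:00' WHERE id = 1182;
print(latest_seen(1182))  # 2015-10-05 12:40:00
```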

The wp runs in the background - in a cron job, a CREATE EVENT, anything. It updates as necessary. With 55k users, this system is relatively small. It can fire once every nnn minutes, every 10 seconds, whatever.

As for a mysql CREATE EVENT stub to contemplate:

CREATE EVENT userUpdateActivity
    ON SCHEDULE
      EVERY 10 SECOND
    DO
(something)

or some other wp strategy.

The wp processes and deletes the open-for-append log file. A periodic (daily?) locking and deletion strategy for the log file can be dreamt up.

The problem with a single log file is that the wp must either:

  • Read all rows and update each row manually
  • or, read all rows, get the last one for a given user, and update just that one

It is also more difficult to clean up - that is, to delete - at the user level.

The benefit of a single log file is that it is self-contained and no directory searching is required.

See the MySQL CREATE EVENT manual page. One would still need to do a LOAD DATA INFILE to get to the data if done purely in mysql.

I would opt for a programming language that is well-suited to such logfile processing - Java, C#, Python, just about anything - rather than a clunky CREATE EVENT into a processing table.

The main takeaway here, though, is to make it asynchronous.

If the record is wide and the table is busy, it may be best to move this column (plus id) into a "parallel" table.

When updating a row, another copy of the row is kept until the transaction (and possibly other transactions) complete. This involves copying the entire record, possibly involving a block split. Plus there are issues with the REDO log and UNDO log. And if you are using replication, there is the binlog. A narrow row will lessen all these issues.

120ms sounds very high, so I guess a lot of other stuff is going on in this table. Splitting the table may decrease contention.
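The parallel-table idea can be sketched as below. SQLite again stands in for MySQL, and the `users_last_seen` table name is hypothetical; in MySQL the upsert would use INSERT ... ON DUPLICATE KEY UPDATE rather than SQLite's INSERT OR REPLACE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The wide, busy table stays untouched by last-seen tracking.
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT /* ...many columns... */);
    -- A narrow parallel table holds only the hot column, keyed by the same id.
    CREATE TABLE users_last_seen (
        user_id INTEGER PRIMARY KEY,
        last_seen_at TEXT
    );
    INSERT INTO users VALUES (1182, 'alice');
""")

# Each update rewrites a tiny two-column row instead of the wide users row.
conn.execute("INSERT OR REPLACE INTO users_last_seen VALUES (?, ?)",
             (1182, "2015-10-05 12:34:45"))
conn.execute("INSERT OR REPLACE INTO users_last_seen VALUES (?, ?)",
             (1182, "2015-10-05 12:40:00"))

row = conn.execute(
    "SELECT last_seen_at FROM users_last_seen WHERE user_id = 1182"
).fetchone()
print(row[0])  # 2015-10-05 12:40:00
```

Because the copied-on-write row, REDO/UNDO entries, and binlog records now cover only (user_id, last_seen_at), the per-update cost shrinks accordingly.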

Also, is this UPDATE part of a bigger transaction? Or done outside a transaction, but with autocommit=1? The latter makes more sense.

It's just really bad design to issue a db write on every page view; it scales very badly. It is considered good style not to issue any writes during a GET request - and while you don't necessarily need to be religious about it, it is a very good practice for scaling.

If you absolutely need those timestamps, a simple way to do it would be to dump them into a key-value store - memcached, redis, whatever - and write to the db from time to time.

A super-simple way to increase your throughput would be to write updated values only if they differ from the previous one by at least an hour (or a day) - that would guarantee every user basically gets one write per browsing session, cutting your writes 10-100 times depending on your site's usage patterns.
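That throttling rule is a few lines of code. In this sketch the `last_written` dict stands in for whatever cache (memcached, redis, ...) holds the last value written to the database, and `writes` stands in for the actual UPDATE statements.

```python
import datetime

last_written = {}  # user_id -> last timestamp persisted to the db
writes = []        # stands in for UPDATE statements actually issued

def touch(user_id, now, min_gap=datetime.timedelta(hours=1)):
    # Only hit the database if we have never written for this user,
    # or the new timestamp is at least min_gap newer than the last write.
    prev = last_written.get(user_id)
    if prev is None or now - prev >= min_gap:
        writes.append((user_id, now))  # real code: UPDATE users SET last_seen_at ...
        last_written[user_id] = now

t0 = datetime.datetime(2015, 10, 5, 12, 0, 0)
touch(1182, t0)                                  # first sight: written
touch(1182, t0 + datetime.timedelta(minutes=5))  # too soon: skipped
touch(1182, t0 + datetime.timedelta(hours=2))    # gap >= 1h: written again

print(len(writes))  # 2
```

Three page views within the session produce two writes here; with realistic browsing patterns the reduction is far larger.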
