Speed up local MySQL to run normalization queries on big tables
I'm normalizing and "cleaning" a MySQL database in which the biggest table has about 3 million records.
What I have to do is rename some fields (very fast), change their order (quite fast), do some trimming and string sanitization, and extract some fields into other tables while keeping the foreign key id...
Is there a way to speed up the queries on my local machine?
I have MariaDB 10.1.21 (from XAMPP), running on a MacBook Air with 8 GB of RAM.
I've already put indexes on many fields, but it's still as slow as a turtle.
Any tip will be appreciated. Thanks!
Edit: as requested, more info and some of the optimizations I am performing.
I basically have one big table containing non-normalized columns that would normally be distributed across three tables.
What I have:
companies ( id, name, street, city_name, category_name, subcategory_name )
What I want:
companies ( id, name, street, id_city, id_subcategory, ... )
cities( id, name, ... )
categories( id, name )
subcategories( id, name, id_category )
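For reference, the three lookup tables in the target layout could be declared roughly as follows. This is only a sketch: the column types, lengths, the InnoDB engine, and the foreign key constraint are assumptions, and any columns hidden behind the "..." are left out.

```sql
-- Sketch only: types, lengths, engine and the FK constraint are assumptions.
CREATE TABLE cities (
  id   INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(255) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE categories (
  id   INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(255) NOT NULL
) ENGINE=InnoDB;

CREATE TABLE subcategories (
  id          INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name        VARCHAR(255) NOT NULL,
  id_category INT NOT NULL,
  -- Each subcategory belongs to exactly one category.
  FOREIGN KEY (id_category) REFERENCES categories (id)
) ENGINE=InnoDB;
```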
So I clean and extract the data as follows.
Trim and remove carriage returns and line feeds from "dirty" fields:
update companies set mic_cat = TRIM(REPLACE(REPLACE(mic_cat, '\r', ''), '\n', ''));
Delete companies that don't have a valid category:
delete from companies where mic_cat is null or mic_cat = '' or mac_cat is null or mac_cat = '';
Extract the data from the fields and place it in the new tables:
insert into categories (name) select distinct mac_cat from companies;
insert into subcategories (name, id_category) select distinct mic_cat,categories.id from companies JOIN categories ON mac_cat = categories.name;
Add the id reference column:
ALTER TABLE companies ADD COLUMN id_subcategory int;
Get the keys...
UPDATE companies left join subcategories on companies.mic_cat = subcategories.name set id_subcategory = subcategories.id;
The last one was very slow, so I dropped all the indexes and then created just two indexes, on companies.mic_cat and subcategories.name, and it sped up quite a bit.
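The two indexes mentioned above could be created along these lines. The index names are hypothetical, and the sketch assumes both columns are VARCHAR; a TEXT column would need a prefix length such as mic_cat(100).

```sql
-- Index names are made up for illustration; the columns are the two join columns.
CREATE INDEX idx_companies_mic_cat ON companies (mic_cat);
CREATE INDEX idx_subcategories_name ON subcategories (name);
```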
Do all the updates in a single UPDATE statement.
If the updates touch indexed columns, DROP those indexes first and ADD them back later. (This may help.)
Do all the ALTERs in a single ALTER statement. (This is not always the best advice.)
Some issues that the above tries to address:
An UPDATE without a WHERE clause (and sometimes with a WHERE) will scan through the entire table, which is rather costly.
Updating an indexed column costs a DELETE plus an INSERT in the index, which is rather costly.
ALTER may or may not be able to do the work "in place". If several of your alters cannot be done that way, then it is best to do a single copy (i.e., a single ALTER) to make all the changes simultaneously. It effectively creates a new empty table, alters it, copies all the data into it, recreates all the indexes, then renames it back into place.
More on indexes...
Look at the WHERE clauses to see what indexes would be useful.
INDEX(a,b) may be much better than INDEX(a), INDEX(b) for some queries.
3M rows is possibly a lot. In many situations, it is better to UPDATE (or DELETE) in "chunks". See my blog.
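As a rough illustration of chunking, the key lookup from the question could be run one primary-key range at a time. This is a sketch, not the method from the blog: the procedure name and the chunk size of 10000 are hypothetical, and it assumes companies.id is an integer primary key and that autocommit is on, so each chunk commits separately. It uses an inner JOIN rather than the question's LEFT JOIN, which gives the same result while id_subcategory is still NULL everywhere.

```sql
DELIMITER //

-- Hypothetical helper: fills companies.id_subcategory in id ranges.
CREATE PROCEDURE fill_id_subcategory_in_chunks()
BEGIN
  DECLARE chunk_size  INT DEFAULT 10000;  -- arbitrary; tune for your machine
  DECLARE chunk_start INT DEFAULT 0;
  DECLARE max_id      INT;

  SELECT MAX(id) INTO max_id FROM companies;

  -- If the table is empty, max_id is NULL and the loop body never runs.
  WHILE chunk_start <= max_id DO
    UPDATE companies
      JOIN subcategories ON companies.mic_cat = subcategories.name
       SET companies.id_subcategory = subcategories.id
     WHERE companies.id >= chunk_start
       AND companies.id <  chunk_start + chunk_size;

    SET chunk_start = chunk_start + chunk_size;
  END WHILE;
END //

DELIMITER ;

CALL fill_id_subcategory_in_chunks();
```

Walking the primary key keeps each statement to a bounded range of rows instead of touching all 3M rows in one transaction.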