MyISAM is much faster than InnoDB in MySQL in this case

Question

I have been writing the results of an algorithm that calculates distances between customers to an InnoDB table. For example, if my customers were A, B, C and D, the table in the database would look like this, among other columns:

From | To | Distance
A    | B  | 344
A    | C  | 274
A    | D  | 182
B    | C  | 338

And so on. It is a lot of rows; I expect to reach 50 million.

The other columns are product_type and value. They tell me how much customer B (customer_to in the columns) buys of that product_type. This means each pair appears multiple times, once for each product_type that customer B buys.

I needed to run a query that groups each customer with the products its neighbors buy and the value. The query looks like this:

select customer_from, product_type, avg(value) as opportunity
from customer_distances
where distance < 500
group by customer_from, product_type
order by opportunity desc;

The InnoDB table could not answer that query. Even though I changed net_read_timeout to 28800, the MySQL connection was lost during the query.

I thought it had something to do with InnoDB being built for transactional processing rather than for intensive queries. So I created a new table with MyISAM as the engine and copied all the records from the InnoDB table with an INSERT ... SELECT.

As expected, the select was very fast (70 seconds), and all other selects, such as count(distinct customer_from), were almost instantaneous.

Out of curiosity, I tried continuing the process of inserting the distances into the MyISAM table. I was surprised when the program started to run at least 100 times faster than it did on the InnoDB table, and that is for INSERTs!

For each customer the program inserts about 3000 rows (one per neighbor per product_type: roughly 300 neighbors and 10 product_types per customer). With the InnoDB table, inserting a single customer (approx. 3000 rows) took between 40 and 60 seconds. With the MyISAM table, it takes 1 second to insert 3 customers (approx. 9000 rows).

Some extra information:

The MySQL database is on my PC (localhost).
The program is written in Java and runs from my PC.
I'm using prepared statements, and I only change the data between each row and the next.
This is related to this question: Why is myisam storage engine is faster than Innodb storage engine

So, in summary, the question is: why is MyISAM that fast with INSERT statements? What do you think?

EDIT 1: I'm adding the CREATE statements for both tables, InnoDB and MyISAM.
EDIT 2: I deleted some unhelpful information and reformatted a little here and there.

/* INNODB TABLE */
CREATE TABLE `customer_distances` (
  `customer_from` varchar(50) NOT NULL,
  `customer_from_type` varchar(50) DEFAULT NULL,
  `customer_from_segment` varchar(50) DEFAULT NULL,
  `customer_from_district` int(11) DEFAULT NULL,
  `customer_from_zone` int(11) DEFAULT NULL,
  `customer_from_longitud` decimal(15,6) DEFAULT NULL,
  `customer_from_latitud` decimal(15,6) DEFAULT NULL,
  `customer_to` varchar(50) NOT NULL,
  `customer_to_type` varchar(50) DEFAULT NULL,
  `customer_to_segment` varchar(50) DEFAULT NULL,
  `customer_to_district` int(11) DEFAULT NULL,
  `customer_to_zone` int(11) DEFAULT NULL,
  `customer_to_longitud` decimal(15,6) DEFAULT NULL,
  `customer_to_latitud` decimal(15,6) DEFAULT NULL,
  `distance` decimal(10,2) DEFAULT NULL,
  `product_business_line` varchar(50) DEFAULT NULL,
  `product_type` varchar(50) NOT NULL,
  `customer_from_liters` decimal(10,2) DEFAULT NULL,
  `customer_from_dollars` decimal(10,2) DEFAULT NULL,
  `customer_from_units` decimal(10,2) DEFAULT NULL,
  `customer_to_liters` decimal(10,2) DEFAULT NULL,
  `customer_to_dollars` decimal(10,2) DEFAULT NULL,
  `customer_to_units` decimal(10,2) DEFAULT NULL,
  `liters_opportunity` decimal(10,2) DEFAULT NULL,
  `dollars_opportunity` decimal(10,2) DEFAULT NULL,
  `units_oportunity` decimal(10,2) DEFAULT NULL,
  PRIMARY KEY (`customer_from`,`customer_to`,`product_type`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

/* MYISAM TABLE */
CREATE TABLE `customer_distances` (
  `customer_from` varchar(50) NOT NULL,
  `customer_from_type` varchar(50) DEFAULT NULL,
  `customer_from_segment` varchar(50) DEFAULT NULL,
  `customer_from_district` int(11) DEFAULT NULL,
  `customer_from_zone` int(11) DEFAULT NULL,
  `customer_from_longitud` decimal(15,6) DEFAULT NULL,
  `customer_from_latitud` decimal(15,6) DEFAULT NULL,
  `customer_to` varchar(50) NOT NULL,
  `customer_to_type` varchar(50) DEFAULT NULL,
  `customer_to_segment` varchar(50) DEFAULT NULL,
  `customer_to_district` int(11) DEFAULT NULL,
  `customer_to_zone` int(11) DEFAULT NULL,
  `customer_to_longitud` decimal(15,6) DEFAULT NULL,
  `customer_to_latitud` decimal(15,6) DEFAULT NULL,
  `distance` decimal(10,2) DEFAULT NULL,
  `product_business_line` varchar(50) DEFAULT NULL,
  `product_type` varchar(50) NOT NULL,
  `customer_from_liters` decimal(10,2) DEFAULT NULL,
  `customer_from_dollars` decimal(10,2) DEFAULT NULL,
  `customer_from_units` decimal(10,2) DEFAULT NULL,
  `customer_to_liters` decimal(10,2) DEFAULT NULL,
  `customer_to_dollars` decimal(10,2) DEFAULT NULL,
  `customer_to_units` decimal(10,2) DEFAULT NULL,
  `liters_opportunity` decimal(10,2) DEFAULT NULL,
  `dollars_opportunity` decimal(10,2) DEFAULT NULL,
  `units_oportunity` decimal(10,2) DEFAULT NULL,
  PRIMARY KEY (`customer_from`,`customer_to`,`product_type`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

Answer 1

Inserts

InnoDB, by default, "commits" each INSERT immediately (autocommit is on), so every row is its own transaction. This can be remedied by batching 100-1000 rows at a time.
Batching inserts will speed up both MyISAM and InnoDB, perhaps by 10x.
Learn about autocommit and BEGIN..COMMIT.
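
As a rough illustration of the batching advice above, the statements below show two common ways to avoid one commit per row. This is only a sketch: the product_type values ('type_1', 'type_2') are made up, and only a few of the table's columns are filled in.

```sql
-- Option 1: group many single-row INSERTs into one explicit transaction,
-- so InnoDB commits once per batch instead of once per row.
START TRANSACTION;
INSERT INTO customer_distances (customer_from, customer_to, product_type, distance)
VALUES ('A', 'B', 'type_1', 344.00);
INSERT INTO customer_distances (customer_from, customer_to, product_type, distance)
VALUES ('A', 'C', 'type_1', 274.00);
-- ... a few hundred more rows ...
COMMIT;

-- Option 2: send many rows in a single multi-row INSERT statement.
INSERT INTO customer_distances (customer_from, customer_to, product_type, distance)
VALUES
  ('A', 'D', 'type_1', 182.00),
  ('B', 'C', 'type_1', 338.00),
  ('B', 'C', 'type_2', 338.00);
```

From the Java program, the equivalent of option 1 is to call setAutoCommit(false) on the JDBC Connection, queue rows on the PreparedStatement with addBatch(), and call executeBatch() followed by commit() every few hundred rows.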

Select

InnoDB consumes more disk space than MyISAM, typically 2x-3x; this impacts table scans, which you are probably doing.
For that query, a composite index on (customer_from, product_type, distance) would probably help both engines.
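
A sketch of how that index could be added and checked. The index name idx_from_type_distance is an arbitrary choice, and whether the optimizer actually uses the index depends on the data and the MySQL version, so it is worth comparing the EXPLAIN output before and after.

```sql
-- Add the suggested composite index (the name is hypothetical).
ALTER TABLE customer_distances
  ADD INDEX idx_from_type_distance (customer_from, product_type, distance);

-- Inspect the plan for the query from the question.
EXPLAIN
SELECT customer_from, product_type, AVG(value) AS opportunity
FROM customer_distances
WHERE distance < 500
GROUP BY customer_from, product_type
ORDER BY opportunity DESC;
```

On a table with tens of millions of rows, building the index will itself take some time.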

Tuning

When running just MyISAM, set key_buffer_size to 20% of RAM and innodb_buffer_pool_size=0.
When running just InnoDB, set key_buffer_size to only 10M and innodb_buffer_pool_size to 70% of RAM.
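
For illustration only, assuming a dedicated machine with 8 GB of RAM (a figure not taken from the question), the MyISAM-only setting works out to roughly 1.6 GB. key_buffer_size can be changed at runtime; innodb_buffer_pool_size can be resized at runtime only on MySQL 5.7.5 and later, and otherwise has to be set in the server configuration file before a restart.

```sql
-- Check the current values (reported in bytes).
SHOW VARIABLES LIKE 'key_buffer_size';
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- MyISAM-only server, assumed 8 GB of RAM: about 20% for the key buffer.
SET GLOBAL key_buffer_size = 1600 * 1024 * 1024;

-- InnoDB-only server, same assumed machine: a small key buffer and
-- roughly 70% of RAM for the buffer pool (online resize needs 5.7.5+).
-- SET GLOBAL key_buffer_size = 10 * 1024 * 1024;
-- SET GLOBAL innodb_buffer_pool_size = 5600 * 1024 * 1024;
```

Values set with SET GLOBAL are lost on restart, so settings that prove useful should also go into the configuration file.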

Normalization and saving space

Smaller --> more cacheable --> less I/O --> faster (in either engine)
DECIMAL(10,2) is not the best choice in most cases. Consider FLOAT for non-money values (such as distance). Consider fewer digits; DECIMAL(10,2) handles up to 99,999,999.99 and takes 5 bytes.
It is usually not a good idea to have replicated columns, such as the 10 columns of customer_from and customer_to. Have a Customers table, with both in it.
Each of your latitud and longitud columns is 7 bytes and has unnecessary resolution. Suggest latitud DECIMAL(6,4) and longitud DECIMAL(7,4), for a total of 7 bytes. (These give 16m/52ft resolution.)
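
One possible shape for the normalized layout, as a sketch rather than a finished design. The table names customers and customer_pair_distances and the shortened column names are hypothetical, and the per-product columns (liters, dollars, units) are left out for brevity.

```sql
-- Per-customer attributes are stored once instead of on every distance row.
CREATE TABLE customers (
  customer_id   VARCHAR(50) NOT NULL,
  customer_type VARCHAR(50) DEFAULT NULL,
  segment       VARCHAR(50) DEFAULT NULL,
  district      INT DEFAULT NULL,
  zone          INT DEFAULT NULL,
  latitud       DECIMAL(6,4) DEFAULT NULL,  -- 3 bytes
  longitud      DECIMAL(7,4) DEFAULT NULL,  -- 4 bytes
  PRIMARY KEY (customer_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- One row per customer pair; FLOAT (4 bytes) replaces DECIMAL(10,2) (5 bytes).
CREATE TABLE customer_pair_distances (
  customer_from VARCHAR(50) NOT NULL,
  customer_to   VARCHAR(50) NOT NULL,
  distance      FLOAT DEFAULT NULL,
  PRIMARY KEY (customer_from, customer_to)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
```

Queries that need a customer's type, segment or coordinates would then join customer_pair_distances to customers on customer_from or customer_to.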

Result

After those suggestions, the 50M-row table will be much smaller and will run much faster in both engines. Then run the comparison again.

Answer 1: 4, accepted, 2016-07-15 17:23:47
