
Use MariaDB to store future market data (large number of records)

For future market data, we need to store at least 1,000,000 records each day; each record has fewer than 10 fields of a few characters each. I chose MariaDB 5.5 on CentOS 7, with the InnoDB engine. my.cnf has the following configuration:

[server]
innodb_file_per_table=1
innodb_flush_log_at_trx_commit=2
innodb_buffer_pool_size=2G
innodb_log_file_size=256M
innodb_log_buffer_size=8M
bulk_insert_buffer_size=256M

Inserting records is not very fast, but it is acceptable. Exporting data, however, is very slow once the InnoDB table grows beyond a few GB. The fields are: id, bid, ask, time, xx, xx; id is auto-increment and is the key. My query looks like this: select * from table where instrument="xx" and time >= "xx" and time <= "xx"

Any advice on how to speed up SELECT performance? Thanks!

To tailor the table to the SELECT, make it InnoDB and set up the clustered PRIMARY KEY so that the desired rows are consecutive. This is likely to slow down the INSERT process, but that is not an issue -- 1,000,000 rows per day averages about 12 inserts/second, which is easily handled.

But let me digress for a moment -- do the 1M rows come in all at once? Or are they trickling in over 7 hours? Or what? If all at once, sort the data according to the PK before doing the massive LOAD DATA.
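If sorting the file beforehand is inconvenient, one variant is to load it into an index-free staging table and let the server do the sort. This is only a sketch under assumptions: the table names ticks and ticks_staging, the column types, and the file path are all hypothetical, and it assumes the target table's primary key begins with (instrument, time), as the query suggests.

```sql
-- Hypothetical names and types; adjust to the real schema.
-- Assumes a target table `ticks` whose PRIMARY KEY starts with (instrument, time).

-- 1. Staging table with the same data columns and no indexes.
CREATE TABLE ticks_staging (
    instrument VARCHAR(16)   NOT NULL,
    `time`     DATETIME      NOT NULL,
    bid        DECIMAL(12,4) NOT NULL,
    ask        DECIMAL(12,4) NOT NULL
) ENGINE=InnoDB;

-- 2. Bulk-load the day's file in whatever order it arrived.
--    The path is a placeholder for a file readable by the server.
LOAD DATA INFILE '/tmp/ticks.csv'
INTO TABLE ticks_staging
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(instrument, `time`, bid, ask);

-- 3. Copy into the real table in primary-key order, so new rows are
--    appended to each instrument's range rather than scattered.
INSERT INTO ticks (instrument, `time`, bid, ask)
SELECT instrument, `time`, bid, ask
FROM ticks_staging
ORDER BY instrument, `time`;

-- 4. Discard the staging data.
DROP TABLE ticks_staging;
```

If the rows trickle in during the day instead, this step does not apply; ordinary single-row INSERTs at that rate are fine.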

Your query begs for PRIMARY KEY(instrument, time). But a PK must be "unique"; is that unique? If not, then another column (id?) should be tacked onto the end to make it unique.

Note that if it is unique, then you don't need an AUTO_INCREMENT; get rid of it. For such large tables, minimizing the number of indexes is critical, not just for performance, but for even being able to survive.
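As a minimal sketch of the case where (instrument, time) is not unique, the table might look like the following. The table name ticks, the column types, and the sample time range are assumptions for illustration, not your actual schema.

```sql
-- Hypothetical table name and column types.
CREATE TABLE ticks (
    id         BIGINT UNSIGNED NOT NULL AUTO_INCREMENT,
    instrument VARCHAR(16)     NOT NULL,
    `time`     DATETIME        NOT NULL,
    bid        DECIMAL(12,4)   NOT NULL,
    ask        DECIMAL(12,4)   NOT NULL,
    -- InnoDB clusters rows by the PK, so all rows for one instrument
    -- are stored together in time order. id only makes the key unique.
    PRIMARY KEY (instrument, `time`, id),
    -- AUTO_INCREMENT needs its column to lead some index.
    INDEX (id)
) ENGINE=InnoDB DEFAULT CHARACTER SET ascii;

-- The range query then reads one contiguous slice of the clustered index.
-- The instrument and timestamps are example values only.
SELECT *
FROM ticks
WHERE instrument = 'xx'
  AND `time` >= '2024-01-02 09:00:00'
  AND `time` <= '2024-01-02 15:00:00';
```

If (instrument, time) does turn out to be unique, drop the id column and INDEX(id) and use PRIMARY KEY (instrument, `time`) alone.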

Other things to do...

  • Normalize the instrument. That is, have a table of instruments and map each one to an id, probably SMALLINT UNSIGNED (2 bytes) if there are under 65K. See my blog for more discussion of normalizing as you ingest.
  • Shrink any fields you can -- FLOAT (4 bytes) is tempting, but it has round-off errors. DECIMAL is tricky because you need to worry about penny stocks at one extreme and BRK-A at the other.
  • Look at the rest of the queries to make sure this change in PK does not hurt them.
  • Set innodb_buffer_pool_size to about 70% of available RAM (assuming you have more than 4GB of RAM).
  • If you do have to keep id as an AUTO_INCREMENT, then add INDEX(id); that is all that is needed to keep A_I happy.
  • Use CHARACTER SET ascii unless you need utf8 somewhere.
  • Volume can exceed 4 billion in rare cases; ponder what to do.
  • Fetching 10K rows in PK order will take only seconds.
  • FULLTEXT is not useful for this application.
  • PARTITIONing is not likely to be useful; we can revisit it if you care to share the rest of the queries. On the other hand, if you will be deleting 'old' data, then PARTITIONing is an excellent idea. See my partition blog.
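
To illustrate the normalization item, here is a rough sketch of a lookup table plus a fact table keyed by the 2-byte id. All names (instruments, ticks_normalized, symbol, instrument_id), the column types, and the sample time range are hypothetical.

```sql
-- Hypothetical names and types throughout.

-- Lookup table: one row per instrument, mapped to a 2-byte id.
CREATE TABLE instruments (
    instrument_id SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
    symbol        VARCHAR(16)       NOT NULL,
    PRIMARY KEY (instrument_id),
    UNIQUE KEY (symbol)
) ENGINE=InnoDB DEFAULT CHARACTER SET ascii;

-- Register an instrument once, before its first tick is stored.
INSERT INTO instruments (symbol) VALUES ('xx');

-- Fact table: stores the small id instead of repeating the symbol.
CREATE TABLE ticks_normalized (
    instrument_id SMALLINT UNSIGNED NOT NULL,
    `time`        DATETIME          NOT NULL,
    id            BIGINT UNSIGNED   NOT NULL AUTO_INCREMENT,
    bid           DECIMAL(12,4)     NOT NULL,
    ask           DECIMAL(12,4)     NOT NULL,
    PRIMARY KEY (instrument_id, `time`, id),
    INDEX (id)
) ENGINE=InnoDB DEFAULT CHARACTER SET ascii;

-- Export query: resolve the symbol to its id, then range-scan the PK.
-- The timestamps are example values only.
SELECT t.*
FROM instruments AS i
JOIN ticks_normalized AS t ON t.instrument_id = i.instrument_id
WHERE i.symbol = 'xx'
  AND t.`time` >= '2024-01-02 09:00:00'
  AND t.`time` <= '2024-01-02 15:00:00';
```

The DECIMAL(12,4) columns are placeholders; pick precision to fit the price range you actually need to cover.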
