
Which database table schema for storing survey data?

Original post: 2012-05-08 15:47:06 | Tags: php, mysql, database, zend-framework, server-load

I'm developing software for conducting online surveys. When many users fill in a survey simultaneously, I have trouble handling the high database write load. My current table (MySQL, InnoDB) for storing survey data has the following columns: dataID, userID, item_1 .. item_n. The item_* columns have different data types, depending on the kind of data each item collects. Most item columns are TINYINT(1), but there are also some TEXT item columns. Large surveys can have more than a hundred items, leading to a table with more than a hundred columns. A user answers around 20 items in one HTTP POST, and the corresponding row has to be updated accordingly. The user may skip a lot of items, leading to a lot of NULL values in the row.
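To make the current layout concrete, here is a minimal sketch of the wide table and of the single UPDATE that one POST triggers. It uses Python's built-in `sqlite3` purely as a runnable stand-in for MySQL/InnoDB; the table name `survey_data`, the item count, and the helper names are assumptions, not part of the original setup.

```python
import sqlite3

# Stand-in for the MySQL/InnoDB table described above; SQLite is used only so
# the sketch runs anywhere. Column names follow the post; N_ITEMS is assumed.
N_ITEMS = 100


def create_wide_table(conn, n_items=N_ITEMS, text_items=(50,)):
    """Create one column per survey item (INTEGER by default, TEXT for text_items)."""
    cols = ["dataID INTEGER PRIMARY KEY", "userID INTEGER NOT NULL UNIQUE"]
    for i in range(1, n_items + 1):
        col_type = "TEXT" if i in text_items else "INTEGER"
        cols.append(f"item_{i} {col_type}")  # stays NULL until the user answers
    conn.execute(f"CREATE TABLE survey_data ({', '.join(cols)})")


def save_page(conn, user_id, answers):
    """Apply one HTTP POST (dict of int item number -> value) as a single UPDATE."""
    # Item numbers are coerced to int so column names cannot carry SQL.
    items = sorted(int(k) for k in answers)
    assignments = ", ".join(f"item_{i} = ?" for i in items)
    params = [answers[i] for i in items] + [user_id]
    with conn:  # commits on success, rolls back on error
        conn.execute(f"UPDATE survey_data SET {assignments} WHERE userID = ?", params)


conn = sqlite3.connect(":memory:")
create_wide_table(conn)
conn.execute("INSERT INTO survey_data (userID) VALUES (?)", (7,))
save_page(conn, 7, {1: 3, 2: 5, 50: "free text"})
```

Every page of answers rewrites the same row, so a user's whole survey lives in one row that is updated repeatedly.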

I'm considering the following solution to my write load problem. Instead of having a single table with many columns, I would set up several tables corresponding to the data types in use, e.g. data_tinyint_1, data_smallint_6, data_text. Each of these tables would have only the following columns: userID, itemID, value (the value column has the data type corresponding to its table). For one HTTP POST with, say, 20 items, I might then have to create 19 rows in data_tinyint_1 and one row in data_text (instead of updating one large row with many columns). However, for every item I need to determine its data type (via two table joins) so that I know in which table to create the new row. My Zend Framework based application code will get more complicated with this approach.
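A rough sketch of the proposed per-type layout, again with `sqlite3` standing in for MySQL. The three table names come from the post; the `items` lookup table, its `target_table` column, and the function names are hypothetical. The sketch replaces the two joins per item with a mapping that is read once and cached, which is one possible way to keep the routing cheap.

```python
import sqlite3

# Table names follow the post; the column types are SQLite approximations.
TYPED_TABLES = {
    "data_tinyint_1": "INTEGER",
    "data_smallint_6": "INTEGER",
    "data_text": "TEXT",
}


def create_schema(conn):
    # Hypothetical lookup table: which typed table stores each item's answers.
    conn.execute("CREATE TABLE items (itemID INTEGER PRIMARY KEY, target_table TEXT NOT NULL)")
    for table, col_type in TYPED_TABLES.items():
        conn.execute(
            f"CREATE TABLE {table} (userID INTEGER NOT NULL, itemID INTEGER NOT NULL, "
            f"value {col_type}, PRIMARY KEY (userID, itemID))"
        )


def load_item_types(conn):
    """Read the item -> table mapping once so it can be cached per survey."""
    return dict(conn.execute("SELECT itemID, target_table FROM items"))


def save_page(conn, user_id, answers, item_types):
    """Group one POST's answers by target table and write each group as a batch."""
    batches = {}
    for item_id, value in answers.items():
        table = item_types[item_id]
        if table not in TYPED_TABLES:  # never interpolate an unknown table name
            raise ValueError(f"unexpected table {table!r}")
        batches.setdefault(table, []).append((user_id, item_id, value))
    with conn:  # one transaction for the whole POST
        for table, rows in batches.items():
            conn.executemany(
                f"INSERT OR REPLACE INTO {table} (userID, itemID, value) VALUES (?, ?, ?)",
                rows,
            )


conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(1, "data_tinyint_1"), (2, "data_tinyint_1"), (3, "data_text")],
)
item_types = load_item_types(conn)
save_page(conn, 7, {1: 4, 2: 1, 3: "a comment"}, item_types)
```

`INSERT OR REPLACE` is SQLite syntax; in MySQL the comparable upsert would be `INSERT ... ON DUPLICATE KEY UPDATE` or `REPLACE INTO`. Whether this layout actually reduces write load is not something the sketch can show; that would need measuring against the real workload.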

My questions:

Will my solution be better for heavy write load?
Do you have a better solution?

2 answers

Since you're getting to the point of abstracting this schema to mimic actual data types, it stands to reason that you could simply create a new set of tables per survey instead. The benefit is that lock contention will lessen, and you can move heavy surveys to separate machines if the load becomes unbearable.

The single-survey database structure can then more accurately reflect your real-world conditions and data input handlers. It ought to make your abstraction headaches go away.

There's nothing wrong with creating tables on the fly. In some configurations, soft sharding is preferable.
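As a minimal sketch of creating a table per survey on the fly (with `sqlite3` standing in for MySQL), something like the following could work. The naming scheme `survey_<id>_data` and both helper functions are assumptions for illustration. Since table and column names cannot be bound as query parameters, the sketch validates them before building the statement.

```python
import re
import sqlite3


def survey_table_name(survey_id):
    """Build a per-survey table name; only non-negative integer ids are accepted."""
    if not isinstance(survey_id, int) or survey_id < 0:
        raise ValueError("survey_id must be a non-negative integer")
    return f"survey_{int(survey_id)}_data"


def create_survey_table(conn, survey_id, item_types):
    """Create a table whose columns mirror one survey's items.

    item_types maps a column name to a SQL type, e.g. {"item_1": "INTEGER"}.
    """
    allowed_types = {"INTEGER", "TEXT"}
    cols = ["userID INTEGER PRIMARY KEY"]
    for name, col_type in item_types.items():
        # Identifiers are interpolated into SQL, so reject anything unexpected.
        if not re.fullmatch(r"item_\d+", name) or col_type not in allowed_types:
            raise ValueError(f"rejected column {name!r} {col_type!r}")
        cols.append(f"{name} {col_type}")
    table = survey_table_name(survey_id)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({', '.join(cols)})")
    return table


conn = sqlite3.connect(":memory:")
table = create_survey_table(conn, 42, {"item_1": "INTEGER", "item_2": "TEXT"})
conn.execute(
    f"INSERT INTO {table} (userID, item_1, item_2) VALUES (?, ?, ?)", (7, 3, "ok")
)
```

With one table per survey, a busy survey only touches its own table, and that table could later be moved to another server without changing the others.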

The obvious solution looks to be using a document database for fast writes and then bulk-inserting the answers into MySQL asynchronously, using cron or something similar. You can create views in the document database for quick statistics, but allow filtering and other complicated stuff only in MySQL if you're not a fan of document DBMSs.
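The write-fast-then-bulk-insert idea might be sketched as follows. The in-process `deque` is only a stand-in for the document database, which this example does not model, and `sqlite3` stands in for MySQL; the `answers` table and the function names are hypothetical.

```python
import json
import sqlite3
from collections import deque

# In-process stand-in for the fast write store; a real setup would use a
# document database or a queue shared between web requests and the batch job.
pending = deque()


def record_answers(user_id, answers):
    """Request path: append one JSON document per POST, no SQL involved."""
    pending.append(json.dumps({"userID": user_id, "answers": answers}))


def flush_to_sql(conn, max_docs=1000):
    """Batch job (e.g. run from cron): move buffered documents into SQL at once."""
    docs = []
    while pending and len(docs) < max_docs:
        docs.append(pending.popleft())
    rows = []
    for doc in docs:
        data = json.loads(doc)
        for item_id, value in data["answers"].items():
            # JSON object keys are strings, so convert item ids back to int.
            rows.append((data["userID"], int(item_id), value))
    try:
        with conn:  # a single transaction for the whole batch
            conn.executemany(
                "INSERT OR REPLACE INTO answers (userID, itemID, value) VALUES (?, ?, ?)",
                rows,
            )
    except sqlite3.Error:
        pending.extendleft(reversed(docs))  # put documents back in original order
        raise
    return len(rows)


conn = sqlite3.connect(":memory:")
# The value column is left untyped so SQLite stores each value as given.
conn.execute(
    "CREATE TABLE answers (userID INTEGER, itemID INTEGER, value, "
    "PRIMARY KEY (userID, itemID))"
)
record_answers(7, {1: 3, 2: "text"})
record_answers(8, {1: 5})
written = flush_to_sql(conn)
```

The trade-off is that answers reach the SQL database with a delay, so any reporting done there lags behind by up to one batch interval.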