
Django database design when you will have too many rows

I have a Django web app with a Postgres database. Every day I get an array of values that needs to be stored in one of the tables. There is no foreseeable need to query the individual values of the array, but I need to be able to plot the values for a specific day. The problem is that this array is pretty big: if I store every value as its own row, I'd have 60 million rows per year, but if I store each array as a single blob object, I'd have 60 thousand rows per year.
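To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. The two row counts are the estimates from the paragraph above; the 8 bytes per element is purely an assumption (one pos and one length, each a 4-byte integer) and ignores per-row overhead, indexes, and compression.

```python
ROWS_PER_YEAR_FLAT = 60_000_000   # one row per value (estimate from the question)
ROWS_PER_YEAR_BLOB = 60_000       # one row per array (estimate from the question)

# Assumption: one element is a (pos, length) pair of two 4-byte integers.
BYTES_PER_ELEMENT = 8

elements_per_blob = ROWS_PER_YEAR_FLAT // ROWS_PER_YEAR_BLOB
raw_blob_bytes = elements_per_blob * BYTES_PER_ELEMENT
raw_bytes_per_year = ROWS_PER_YEAR_FLAT * BYTES_PER_ELEMENT

print(elements_per_blob)         # 1000 elements in each blob
print(raw_blob_bytes)            # 8000 bytes of raw payload per blob
print(raw_bytes_per_year / 1e6)  # 480.0 MB of raw payload per year
```

Under these assumptions the raw payload is the same either way; what changes is the number of rows the database and the admin have to count and page through.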

Is it a good decision to use a blob object to reduce table size when you do not need to query on the individual values? Here are the two options:

Option 1: keeping all rows

group(foreignkey)| parent(foreignkey) | pos(int) | length(int)
  A              |  B                 |  232     |  45
  A              |  B                 |  233     |  45
  A              |  B                 |  234     |  45
  A              |  B                 |  233     |  46
...

Option 2: collapsing the array into a blob:

group(fk)| parent(fk) | mean_len(float)| values(blob)
  A      |  B         |    45          |[(pos=232, len=45),...]
...
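A minimal sketch of how the values column in option 2 could be produced and read back, using only the Python standard library. The helper names (`pack_values`, `unpack_values`, `mean_length`) are hypothetical, and the layout (unsigned 32-bit pos and length, zlib-compressed) is an assumption, not something I have settled on.

```python
import struct
import zlib

# Assumption: each element is a (pos, length) pair of unsigned 32-bit ints.
PAIR = struct.Struct("<II")


def pack_values(pairs):
    """Serialize (pos, length) pairs into one compressed bytes blob."""
    raw = b"".join(PAIR.pack(pos, length) for pos, length in pairs)
    return zlib.compress(raw)


def unpack_values(blob):
    """Inverse of pack_values: return the list of (pos, length) pairs."""
    raw = zlib.decompress(blob)
    return [PAIR.unpack_from(raw, offset)
            for offset in range(0, len(raw), PAIR.size)]


def mean_length(pairs):
    """Value for the mean_len column, computed once at write time."""
    return sum(length for _, length in pairs) / len(pairs)


pairs = [(232, 45), (233, 45), (234, 45), (233, 46)]
blob = pack_values(pairs)
assert unpack_values(blob) == pairs
print(mean_length(pairs))  # 45.25
```

The bytes returned by `pack_values` could go into a binary column, and plotting a given day would only need `unpack_values` on the single row that was fetched.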

So I do NOT want to query pos or length, but I do want to query group or parent. An example of the kind of read query I'm talking about is:

-- "group_id" assumes Django's default column name for the group foreign key
SELECT * FROM "mytable"
LEFT OUTER JOIN "group"
ON ( "mytable"."group_id" = "group"."id" )
ORDER BY "mytable"."pos" DESC LIMIT 100

which is the typical main query of a Django admin list_view page.

I tried loading the data and displaying the table in the Django admin page without any complex query (just a read query). Once I get past 1.5 million rows, the admin page freezes. All it takes is a count query on that table to make the app crash, so I should definitely either keep the data as a blob or not keep it in the database at all and use the filesystem instead.


I want to emphasize that I used Django 1.8 as my test bench, so this is not an evaluation of Postgres alone but of the whole system: Django admin together with Postgres.
