Store a list of integers that is non-relational in a SQL database

First, a note: I have read several articles that all say storing a list of values in a single column in SQL is a bad idea and breaks all design protocols. In fact, they all say to redesign the table so that IT IS relational. So I'm not looking for the easy-out solution here, but the correct one.

Here's the problem. I have two variables:

1) the unique userId that relates to other tables, and

2) a simple HashSet of integers that relate to items in a JSON file.

Thus, these numbers do not relate to any other table in my SQL database. The max value in this list will likely not go over 1000, but who knows. Also, the list could be out of order or skip multiple values in between. I will never query the numbers, but I will load them when the user logs in and re-save them when the user logs out.

The options I have read about are a comma-separated value column, an XML column, or a lookup table (in which case I don't know what I would be looking up, and with 1000 numbers there could be 1E-249 permutations).

Therefore, I ask what would be the correct way to save this list of integers.

Using normalization, you can create table relationships that store nested data. Following DB normalization, you would use three tables:

  • User Table
    • Id (int)
    • UserId (int)
    • Username (varchar)
  • Json File Table
    • Id (int)
    • UserId (int)
  • Json File Item Table
    • Id (int)
    • JsonFileId (int)
    • Value (int)

Then from this structure, you can store the following:

  • User #1
    • Json File #1
      • Json File Value #1
      • Json File Value #2
    • Json File #2
      • Json File Value #3
      • Json File Value #4

This essentially lets you have any number of users, each with any number of JSON files, each of which holds any number of values.
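A minimal sketch of that three-table layout, using Python's built-in sqlite3 module so it can be run as-is. The table and column names follow the outline above, but the exact types, the foreign keys, the sample values (10 to 40), the username, and the helper `values_by_file` are assumptions made for illustration; the redundant UserId column on the user table is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
CREATE TABLE User (
    Id       INTEGER PRIMARY KEY,
    Username TEXT NOT NULL
);
CREATE TABLE JsonFile (
    Id     INTEGER PRIMARY KEY,
    UserId INTEGER NOT NULL REFERENCES User(Id)
);
CREATE TABLE JsonFileItem (
    Id         INTEGER PRIMARY KEY,
    JsonFileId INTEGER NOT NULL REFERENCES JsonFile(Id),
    Value      INTEGER NOT NULL
);
""")

# User #1 owns two JSON files, each holding two values (sample data).
conn.execute("INSERT INTO User (Id, Username) VALUES (1, 'alice')")
conn.executemany("INSERT INTO JsonFile (Id, UserId) VALUES (?, ?)", [(1, 1), (2, 1)])
conn.executemany(
    "INSERT INTO JsonFileItem (JsonFileId, Value) VALUES (?, ?)",
    [(1, 10), (1, 20), (2, 30), (2, 40)],
)

def values_by_file(conn, user_id):
    """Return {json_file_id: [values]} for one user (hypothetical helper)."""
    rows = conn.execute(
        "SELECT f.Id, i.Value FROM JsonFile AS f "
        "JOIN JsonFileItem AS i ON i.JsonFileId = f.Id "
        "WHERE f.UserId = ? ORDER BY f.Id, i.Value",
        (user_id,),
    )
    result = {}
    for file_id, value in rows:
        result.setdefault(file_id, []).append(value)
    return result

files = values_by_file(conn, 1)
```

If each user only ever has one list, the middle table adds little, and the two-column design described further down is the simpler fit.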

In regards to 'saving and loading' the data, you wouldn't want to load all the values and re-save them all on login and logout respectively, as this would be a waste of resources. Instead, you would track the changes and save only the delta between the two states. I.e. you log in, load the data, the user changes value #2, and thus you only update record number 2.
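One way the delta idea could look for a set of integers, sketched with Python's sqlite3 module. Because the data is a set, a "change" is modelled here as values added or removed rather than a row updated in place. The table name `UserValue` and the function `save_delta` are hypothetical, chosen only for this example.

```python
import sqlite3

def save_delta(conn, user_id, loaded, current):
    """Write only the difference between the set loaded at login and the set at logout."""
    added = current - loaded      # values the user gained during the session
    removed = loaded - current    # values the user lost during the session
    with conn:  # one transaction: commits on success, rolls back on error
        conn.executemany(
            "DELETE FROM UserValue WHERE UserId = ? AND Value = ?",
            [(user_id, v) for v in removed],
        )
        conn.executemany(
            "INSERT INTO UserValue (UserId, Value) VALUES (?, ?)",
            [(user_id, v) for v in added],
        )
    return added, removed

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE UserValue ("
    " UserId INTEGER NOT NULL,"
    " Value INTEGER NOT NULL,"
    " PRIMARY KEY (UserId, Value))"
)
with conn:
    conn.executemany("INSERT INTO UserValue VALUES (?, ?)", [(1, 1), (1, 2), (1, 8)])

loaded = {1, 2, 8}    # state read at login
current = {1, 8, 33}  # state at logout: 2 was removed, 33 was added
added, removed = save_delta(conn, 1, loaded, current)
```

For a list of at most around a thousand small integers, deleting and re-inserting everything at logout would probably also be fast enough; the delta mainly saves write volume.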

Given that you do not need to preserve order and do not need to preserve duplicates, if you wish to have your database in first normal form, you can easily store this in a two-column table in any database:

create table MyTable (
  UserId int not null,
  Value int not null,
  primary key (UserId, Value) );

If the user with ID 1 holds values [1, 2, 8, 33, 999], and the user with ID 2 holds values [3, 4], you store that as 7 records:

UserId | Value
     1 | 1 
     1 | 2
     1 | 8
     1 | 33
     1 | 999
     2 | 3
     2 | 4
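The seven records above can be written and read back as sets along these lines. This is a sketch using Python's sqlite3 module with an in-memory database; `load_values` is a hypothetical helper name, and the column types are spelled `INTEGER` as SQLite prefers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE MyTable (
        UserId INTEGER NOT NULL,
        Value  INTEGER NOT NULL,
        PRIMARY KEY (UserId, Value)
    )
""")

# The seven records from the table above.
rows = [(1, 1), (1, 2), (1, 8), (1, 33), (1, 999), (2, 3), (2, 4)]
with conn:
    conn.executemany("INSERT INTO MyTable (UserId, Value) VALUES (?, ?)", rows)

def load_values(conn, user_id):
    """Read one user's values back into a set, e.g. at login."""
    cur = conn.execute("SELECT Value FROM MyTable WHERE UserId = ?", (user_id,))
    return {value for (value,) in cur}

user1 = load_values(conn, 1)
user2 = load_values(conn, 2)
```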

This is similar to your lookup table idea, except that you don't have to look up what the values mean: you can store the values directly in that table.

The benefit of this is that there is exactly one canonical representation for any set of numbers, and the database cannot hold anything other than a set of numbers. In your code, you do not have to worry about a record holding duplicate values that your application would have to ignore ([1, 1]), two users with identical values being represented differently ([1, 2] vs. [2, 1]), or users with completely invalid values ([1, "abc"] or 1.23).
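A short sketch of two of those guarantees, again with Python's sqlite3 module. The helper `stored` is a name assumed for this example. One caveat: SQLite's flexible typing would accept a non-numeric string in an INTEGER column unless the table is declared STRICT, so the invalid-value guarantee is not demonstrated here; engines with strict column types enforce it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable ("
    " UserId INTEGER NOT NULL,"
    " Value INTEGER NOT NULL,"
    " PRIMARY KEY (UserId, Value))"
)

# Two users receive the same values, inserted in a different order.
conn.executemany("INSERT INTO MyTable VALUES (?, ?)", [(1, 1), (1, 2)])
conn.executemany("INSERT INTO MyTable VALUES (?, ?)", [(2, 2), (2, 1)])

def stored(conn, user_id):
    """Return one user's values in a fixed order."""
    cur = conn.execute(
        "SELECT Value FROM MyTable WHERE UserId = ? ORDER BY Value", (user_id,)
    )
    return [v for (v,) in cur]

# Insertion order leaves no trace: both users read back identically.
same_representation = stored(conn, 1) == stored(conn, 2)

# The primary key rejects a duplicate value for the same user.
try:
    conn.execute("INSERT INTO MyTable VALUES (1, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```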

Another benefit is that this can be easily handled with 100% standard SQL. You do not need any non-standard extensions, so you can keep your code portable across DB implementations.

Practical concerns may suggest some other approach. Your suggestions of CSV or XML are valid. Another possibility is JSON. All three have native support in at least one major DB implementation. In most cases the table approach should be good enough, but depending on the size of your data set and your access patterns, it is possible that a denormalised database will allow for better performance. The code to read and update the values may be slightly easier to write as well.
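For comparison, a denormalised variant might keep the whole set as JSON text in a single column. The sketch below uses Python's sqlite3 and json modules; the table `UserValues`, the column `ValuesJson`, and both helper functions are hypothetical, and `INSERT OR REPLACE` is SQLite-specific syntax, so another engine would need its own upsert statement.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE UserValues (UserId INTEGER PRIMARY KEY, ValuesJson TEXT NOT NULL)"
)

def save_values(conn, user_id, values):
    """Overwrite the user's whole list, e.g. at logout."""
    payload = json.dumps(sorted(values))  # sorting keeps the stored text canonical
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO UserValues (UserId, ValuesJson) VALUES (?, ?)",
            (user_id, payload),
        )

def load_values(conn, user_id):
    """Read the user's list back into a set, e.g. at login."""
    row = conn.execute(
        "SELECT ValuesJson FROM UserValues WHERE UserId = ?", (user_id,)
    ).fetchone()
    return set(json.loads(row[0])) if row else set()

save_values(conn, 1, {999, 1, 33})
restored = load_values(conn, 1)
```

With this layout the application, not the database, is responsible for keeping the stored text free of duplicates and invalid entries, which is exactly the trade-off discussed above.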

This is a trade-off you will need to make yourself. I hope you now have enough information to make it.
