Best way to handle duplicated rows
I have an insurance company "dictionary" table in my database, let's say:
+----+-------------------+----------+
| ID | Name | Data |
+----+-------------------+----------+
| 1 | InsuranceCompany1 | SomeData |
+----+-------------------+----------+
But I'm fetching data from another system, and as a result I get duplicates of insurance companies that lack my data:
+----+-------------------+----------+
| ID | Name | Data |
+----+-------------------+----------+
| 1 | InsuranceCompany1 | SomeData |
+----+-------------------+----------+
| 2 | InsuranceCompany1 | |
+----+-------------------+----------+
Both records are referenced by a variety of models, but they refer to the same company. What I want is to pair these records without changing queries or data in other tables, so that no one notices there are two records and both resolve to a single instance, which is
+----+-------------------+----------+
| 1 | InsuranceCompany1 | SomeData |
+----+-------------------+----------+
My question is: is there a proper way to handle situations like this? I've come up with a solution: add a parent_id column, manually set parent_id on the duplicated rows, and then override Eloquent methods such as find in the model so that they return the parent when parent_id is set.
Copying the SomeData column is not an option, because there can be conditions such as insurance_company_id == id.
You can try creating a view of your dict table, something like this:
CREATE VIEW unique_dict AS
SELECT MIN(ID) AS ID,
       Name,
       GROUP_CONCAT(Data) AS Data
FROM dict
GROUP BY Name;
That will give you one row per name.
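To make the effect concrete, here is a small self-contained sketch that rebuilds the two rows from the question and queries the view. The column types are assumptions, as is the choice to store the duplicate's missing Data as NULL.

```sql
-- Hypothetical schema mirroring the tables shown in the question.
CREATE TABLE dict (
    ID   INT AUTO_INCREMENT PRIMARY KEY,
    Name VARCHAR(100) NOT NULL,
    Data VARCHAR(100) NULL
);

INSERT INTO dict (ID, Name, Data) VALUES
    (1, 'InsuranceCompany1', 'SomeData'),
    (2, 'InsuranceCompany1', NULL);   -- duplicate fetched from the other system

CREATE VIEW unique_dict AS
SELECT MIN(ID) AS ID,
       Name,
       GROUP_CONCAT(Data) AS Data
FROM dict
GROUP BY Name;

SELECT ID, Name, Data FROM unique_dict;
-- One row: ID = 1, Name = 'InsuranceCompany1', Data = 'SomeData'
```

The duplicate contributes nothing to Data because GROUP_CONCAT() skips NULL values.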
Then, in your queries requiring one row per name, SELECT from the unique_dict view rather than the dict table.
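One gap worth noting: unique_dict exposes only the lowest ID per name, so a row in another table that still points at ID 2 finds nothing there. A minimal sketch of a companion view that maps every original ID to its surviving row could look like this. The view name dict_canonical and the column canonical_id are hypothetical, and the sketch assumes the dict table and the unique_dict view defined above.

```sql
-- Hypothetical companion view: one row per original dict.ID,
-- carrying the ID and Data of the row that unique_dict keeps.
CREATE VIEW dict_canonical AS
SELECT d.ID   AS ID,
       u.ID   AS canonical_id,
       u.Name AS Name,
       u.Data AS Data
FROM dict AS d
JOIN unique_dict AS u ON u.Name = d.Name;

-- Looking up the duplicate (ID 2) resolves to the canonical row's values.
SELECT canonical_id, Name, Data
FROM dict_canonical
WHERE ID = 2;
```

Existing foreign keys stay untouched with this approach; only the lookups that need the shared Data have to read from the view.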
GROUP_CONCAT() yields a list of values from Data, which helps if more than one duplicated row contains a value: you get them all.
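Two details of GROUP_CONCAT() in MySQL matter here: it skips NULL values but not empty strings, and its result is truncated at the length set by the group_concat_max_len system variable (default 1024). The query below is a sketch of a more defensive aggregate, assuming the blank Data cells in the question might be stored as '' rather than NULL.

```sql
SELECT MIN(ID) AS ID,
       Name,
       -- NULLIF turns '' into NULL so blank duplicates add nothing;
       -- DISTINCT collapses identical values repeated across duplicates.
       GROUP_CONCAT(DISTINCT NULLIF(Data, '') SEPARATOR ', ') AS Data
FROM dict
GROUP BY Name;
```

Without the NULLIF, a duplicate holding an empty string would leave a stray separator in the concatenated result.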
Longer term, you might be smart to treat these duplicates as "dirty data" and clean them up as you INSERT new rows. How to do that?
Create a unique index on Name.
CREATE UNIQUE INDEX unique_name ON dict(Name);
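Note that CREATE UNIQUE INDEX fails with a duplicate-entry error while duplicate names are still present, so the existing rows have to be merged first. A sketch of a query that lists the names needing attention (the aliases copies and keep_id are illustrative only):

```sql
-- List every name that occurs more than once, with the lowest ID
-- as a candidate for the row to keep.
SELECT Name,
       COUNT(*) AS copies,
       MIN(ID)  AS keep_id
FROM dict
GROUP BY Name
HAVING COUNT(*) > 1;
```

Once this query returns no rows, the index can be created.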
Then, when loading new data into dict, use Eloquent's updateOrCreate() function. Here's something to read about that: Laravel 5.1 Create or Update on Duplicate
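updateOrCreate() is called from PHP on the model. For comparison, MySQL's own INSERT ... ON DUPLICATE KEY UPDATE statement expresses a similar load-without-duplicating step in raw SQL once the unique index on Name exists. This is a different mechanism from what Eloquent runs and is shown only as a sketch; the COALESCE rule for keeping the stored Data is an assumption about the desired behavior.

```sql
-- Requires the unique index on dict(Name), so that a repeated name
-- triggers the UPDATE branch instead of inserting a second row.
INSERT INTO dict (Name, Data)
VALUES ('InsuranceCompany1', NULL)
ON DUPLICATE KEY UPDATE
    -- VALUES(Data) is the value the INSERT tried to write.
    -- Assumption: keep the stored Data when the incoming value is NULL.
    Data = COALESCE(VALUES(Data), Data);
```

With the sample rows, importing InsuranceCompany1 again without data leaves row 1 and its SomeData as they are.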