Remove duplicates from an array of objects based on the first 3 words of object property

I have an array of objects that contains many duplicate entries. I can clean the array and get rid of exact duplicates, but the catch is that I need to remove entries that match on the first 3 words of a property.

Suppose this is the array:

let arr = [
     {
         text: "Be good and you will be lonely. But there’s nothing wrong with being lonely.",
         id: 1
     },
     {
         text: "Coffee is a way of stealing time.",
         id: 2
     },
     {
         text: "Be good and you will be lonely. But there’s nothing wrong with being lonely.",
         id: 3
     }
];

I want to compare the first 3 words of each text. If two texts match, one of the matched objects should be removed from the old array and pushed to a new array.

So far I can remove exact duplicates with this piece of code, but I don't know what to do next.

let seenNames = {};

arr = arr.filter(function(currentObject) {
      if (currentObject.text in seenNames) {
           return false;
      } else {
           seenNames[currentObject.text] = true;
           return true;
      }
});

It would be a big help if someone could point me in the right direction.

UPDATE:

I started the whole thing again with a different approach. As @Andreas and @freedomn-m said, I split the items based on the first 3 words and then tried to filter the original array by matching the split items. But right now I'm getting all the values back without any filtering.

let arr = [
    { "text": "Be good and you will be lonely. But there's nothing wrong with being lonely.", "id": 1 },
    { "text": "Coffee is a way of stealing time.", "id": 2 },
    { "text": "Be good and you will be lonely. But there's nothing wrong with being lonely.", "id": 3 }
];
let removedItems = [];

let filtered = arr.filter((item, index) => {
    let splitItem = item["text"].split(" ").slice(0, 3).join(" ").toLowerCase();
    if (item["text"].toLowerCase().startsWith(splitItem, index + 1)) {
        return item;
    } else {
        removedItems.push(item);
    }
});

console.log(filtered);
console.log(removedItems);

For the original question of how to get the first 3 words, one option is to use .split(), .slice() and .join():

var firstWords = item["text"].split(" ").slice(0, 3).join(" ");
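As a minimal sketch, the same chain can be wrapped in a small helper. The name firstWordsKey and its count parameter are hypothetical and not part of the original post; the sketch also assumes you want to ignore case and extra whitespace, which the one-liner above does not do.

```javascript
// Hypothetical helper (not from the original post): builds a comparison key
// from the first `count` words of a string.
function firstWordsKey(text, count = 3) {
  return text
    .trim()            // ignore leading/trailing whitespace
    .split(/\s+/)      // split on any run of whitespace
    .slice(0, count)   // keep only the first `count` words
    .join(" ")
    .toLowerCase();    // make the comparison case-insensitive
}

console.log(firstWordsKey("Be good and you will be lonely."));       // "be good and"
console.log(firstWordsKey("  Coffee   is a way of stealing time.")); // "coffee is a"
```

Splitting on /\s+/ rather than " " is a design choice here: it keeps double spaces from producing empty "words" that would otherwise count toward the three.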

You can then replace currentObject.text with firstWords in the code from the original question:

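let seenNames = {};
arr = arr.filter(function(currentObject) {
    // Key on the first 3 words rather than on the full text
    var firstWords = currentObject.text.split(" ").slice(0, 3).join(" ");
    if (firstWords in seenNames) {
         return false;
    } else {
         seenNames[firstWords] = true;
         return true;
    }
});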

The update attempts this, but has 2 issues:

  • The .filter() callback should return true/false (as it did originally), not the item or nothing.

  • item["text"].toLowerCase().startsWith(splitItem) will always be true, as splitItem is built from item["text"]. The update also passes index + 1 as the second argument, which startsWith treats as a character position within the same string, so the check never looks at any other item.
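The second issue can be seen in isolation with a short sketch. The variable names text and key are only for illustration; the point is that startsWith compares a string against itself here, and that its optional second argument is a character offset.

```javascript
const text = "be good and you will be lonely.";
const key = text.split(" ").slice(0, 3).join(" "); // "be good and"

// The key is cut from the start of the same string, so this is always true.
console.log(text.startsWith(key)); // true

// The second argument is a character position, not an array index:
// the comparison now begins at "e good and ...", so it no longer matches.
console.log(text.startsWith(key, 1)); // false
```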

Adding the additional removedItems list to the original code gives:

let arr = [
    { "text": "Be good and you will be lonely. But there's nothing wrong with being lonely.", "id": 1 },
    { "text": "Coffee is a way of stealing time.", "id": 2 },
    { "text": "Be good and you will be lonely. But there's nothing wrong with being lonely.", "id": 3 }
];
let removedItems = [];
let seenNames = {};

let filtered = arr.filter((item, index) => {
    let splitItem = item["text"].split(" ").slice(0, 3).join(" ").toLowerCase();
    if (splitItem in seenNames) {
        // already exists, so don't include in filtered, but do add to removed
        removedItems.push(item);
        return false;
    }
    // doesn't exist, so add to seen list and include in filtered
    seenNames[splitItem] = true;
    return true;
});

console.log(filtered);
console.log(removedItems);
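If you would rather not push to an outer array from inside the filter callback, one possible variation is a single loop that returns both lists. This is only a sketch: partitionByFirstWords, kept, removed and quotes are hypothetical names, and it swaps the plain object for a Set to track seen keys.

```javascript
// Hypothetical variation (not from the original answer): split items into
// those kept and those removed, keyed on the first `count` words of `text`.
function partitionByFirstWords(items, count = 3) {
  const seen = new Set();
  const kept = [];
  const removed = [];
  for (const item of items) {
    const key = item.text.trim().split(/\s+/).slice(0, count).join(" ").toLowerCase();
    if (seen.has(key)) {
      removed.push(item); // a matching item was already kept
    } else {
      seen.add(key);      // first time this key is seen
      kept.push(item);
    }
  }
  return { kept, removed };
}

const quotes = [
  { text: "Be good and you will be lonely. But there's nothing wrong with being lonely.", id: 1 },
  { text: "Coffee is a way of stealing time.", id: 2 },
  { text: "Be good and you will be lonely. But there's nothing wrong with being lonely.", id: 3 }
];

const { kept, removed } = partitionByFirstWords(quotes);
console.log(kept.map((q) => q.id));    // [1, 2]
console.log(removed.map((q) => q.id)); // [3]
```

A Set is used because the in operator on a plain object also reports inherited property names such as "constructor", which could cause a false match for a text consisting of that single word.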
