
How to find duplicate documents?

It's strange that I couldn't find an answer to such a simple question, either in the documentation or here. How do I find duplicated records in a collection? For example, I need to find the documents that share an id among the following:

{"id": 1, "name": "Mike"},
{"id": 2, "name": "Jow"},
{"id": 3, "name": "Piter"},
{"id": 1, "name": "Robert"}

I need a query that returns the two documents with the same id (id: 1 in my case).

Have a look at the AQL COLLECT operation. It can return the number of documents that share a value for a given attribute, such as your id key.

ArangoDB AQL - COLLECT

You can use LET liberally in AQL to break a query into smaller steps and then work with the intermediate output in later parts of the query.

It may also be possible to collapse it all into one query, but this technique helps break it down.

LET duplicates = (
    FOR d IN myCollection
    COLLECT id = d.id WITH COUNT INTO count
    FILTER count > 1
    RETURN {
        id: id,
        count: count
    }
)

FOR d IN duplicates
FOR m IN myCollection
FILTER d.id == m.id
RETURN m

This will return:

[
  {
    "_key": "416140",
    "_id": "myCollection/416140",
    "_rev": "_au4sAfS--_",
    "id": 1,
    "name": "Mike"
  },
  {
    "_key": "416176",
    "_id": "myCollection/416176",
    "_rev": "_au4sici--_",
    "id": 1,
    "name": "Robert"
  }
]
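
As noted above, the two steps can probably be collapsed into one query. Here is a minimal sketch, assuming the same myCollection and id attribute. The names docs and doc are introduced only for illustration, and the query has not been run against the data in the question:

```aql
FOR d IN myCollection
    // Group by id and keep the full documents of each group in `docs`
    COLLECT id = d.id INTO docs = d
    // Keep only ids that occur more than once
    FILTER LENGTH(docs) > 1
    // Emit every document that belongs to a duplicated id
    FOR doc IN docs
        RETURN doc
```

This version collects the whole documents of each group rather than just a count, so on a large collection it may use more memory than the WITH COUNT INTO approach.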


 