How to count dictionaries in a MongoDB array?

I have a MongoDB collection, where each document is someone's demographic information (a unique identifier, name, address, etc).

As I parse new data into my database using Python/pymongo, I find new entries that correspond to existing identifiers, and I need to count how often each variant occurs so that I can use only the most common one in the end.

For example, if I already have "Jenn Smith" in my collection and then get two new entries for "Jennifer Smith" with the same identifier, it is the same person, so I just use Mongo's $inc to increment a counter. The document eventually looks like: 'names': {'Jenn Smith': 1, 'Jennifer Smith': 2}, and in the end I can use "Jennifer Smith", which is the most common one.
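To make the counter scheme above concrete, here is a minimal sketch. The helper names build_name_increment and most_common_name are hypothetical and not from the question. The update document would be passed to something like pymongo's update_one, which is not called here so that the snippet runs without a database.

```python
def build_name_increment(name):
    """Build a MongoDB update document that bumps the counter for one name.

    Assumption: the name contains no '.' and does not start with '$',
    since it is used as a field name under 'names'.
    """
    return {"$inc": {f"names.{name}": 1}}


def most_common_name(names):
    """Return the name with the highest count from a {'name': count} dict."""
    return max(names, key=names.get)


# Hypothetical usage with pymongo (not executed here):
#   collection.update_one({"_id": identifier},
#                         build_name_increment("Jennifer Smith"),
#                         upsert=True)
names = {"Jenn Smith": 1, "Jennifer Smith": 2}
best_name = most_common_name(names)
```

With the counts from the example, best_name is "Jennifer Smith".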

My problem arises when I have to deal with the exact same issue for the locations that Jenn Smith has associated with herself, because a location is a dictionary, for example: {'street': '123 Maple Street Apt A', 'city': 'Austin', 'state': 'TX'} . Sometimes I get several different locations, each one a dictionary, and so far I simply $push them into a Mongo locations array. In the majority of cases, however, there is one predominant location for each document, and any others are slight variations, e.g.: {'street': '123 Maple Street Apartment A', 'city': 'Austin', 'state': 'TX'} .

I understand that $inc can't work the same way as for names , since Python dictionaries aren't hashable. How should I go about finding the most common element in my locations array?

Since the dictionary is not nested, you can create a frozenset from its items and hash that:

hash(frozenset(location.items()))
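As a rough sketch of how this could be used to find the most common location, the frozenset itself can serve as a collections.Counter key once the locations array has been read back into Python. The function name most_common_location is hypothetical. The sketch assumes flat dictionaries with hashable values, and it treats slight variations such as "Apt A" and "Apartment A" as different locations.

```python
from collections import Counter


def most_common_location(locations):
    """Return (location_dict, count) for the most frequent flat dict.

    Assumes every location is a non-nested dict with hashable values.
    Returns (None, 0) for an empty list.
    """
    if not locations:
        return None, 0
    # frozenset(d.items()) is hashable and ignores key order,
    # so equal dicts map to the same Counter key.
    counts = Counter(frozenset(loc.items()) for loc in locations)
    key, count = counts.most_common(1)[0]
    return dict(key), count


locations = [
    {'street': '123 Maple Street Apt A', 'city': 'Austin', 'state': 'TX'},
    {'street': '123 Maple Street Apartment A', 'city': 'Austin', 'state': 'TX'},
    {'city': 'Austin', 'state': 'TX', 'street': '123 Maple Street Apt A'},
]
best, n = most_common_location(locations)
```

Here best is the 'Apt A' dictionary and n is 2. Using the frozenset directly as the key, rather than storing the integer from hash(), avoids relying on Python's string hashing, which is randomized per process by default and so is not stable between runs.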
