
Counting the number of distinct strings given by a GQL Query in Python

Suppose I have the following GQL database:

from google.appengine.ext import db

class Signatories(db.Model):
    name = db.StringProperty()   # person who signed up
    event = db.StringProperty()  # event they signed up for

This database holds information about the events that people have signed up for. Say I have the following entries in the database in the format (name, event): (Bob, TestEvent), (Bob, TestEvent2), (Fred, TestEvent), (John, TestEvent). The dilemma is that I cannot just aggregate all of Bob's events into one entity, because I'd like to query for all the people signed up for a specific event, and I'd also like to add such entries without having to manually update an existing entry every single time.

How could I count the number of distinct strings given by a GQL Query in Python? In my example, I am specifically trying to see how many people are currently signed up for events.

I have tried using the old mcount = db.GqlQuery("SELECT name FROM Signatories").count() , but this of course returns the total number of strings in the list, regardless of whether each string is unique.

I have also tried using count = len(member) , where member = db.GqlQuery("SELECT name FROM Signatories") , but unfortunately this only raises an error.

You can't, at least not directly. (By the way, you don't have a GQL database: GQL is only the query language, and the data lives in the App Engine datastore.)

If you have a small number of items, fetch them into memory, use a set operation to produce the unique set, and then count it.
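A minimal sketch of the in-memory approach follows. So that it runs outside App Engine, a namedtuple called Signatory stands in for the fetched Signatories entities; that stand-in and the helper name count_distinct_names are assumptions made for illustration, not part of the original code.

```python
from collections import namedtuple

# Stand-in for fetched Signatories entities (assumption, so this runs anywhere).
Signatory = namedtuple("Signatory", ["name", "event"])


def count_distinct_names(entities):
    """Return the number of unique `name` values among the given entities."""
    # A set keeps only one copy of each name, so its length is the distinct count.
    return len({entity.name for entity in entities})


rows = [
    Signatory("Bob", "TestEvent"),
    Signatory("Bob", "TestEvent2"),
    Signatory("Fred", "TestEvent"),
    Signatory("John", "TestEvent"),
]

print(count_distinct_names(rows))  # 3
```

In the real application the same helper could presumably be fed the result of iterating over the query from the question, since it only needs objects with a name attribute.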

If you have so many entities that in-memory filtering and counting become problematic, then your strategy will be to aggregate the count as you create them.

For example:

Each time you create an event, create a separate entity that has the pair of strings as its key. This way you will only ever have one entity in the datastore representing a specific pair. Then you can do a straight count.
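The keyed-entity idea can be sketched as below. Plain dicts stand in for the datastore, and put_keyed and the "|" key format are hypothetical names chosen for illustration. In App Engine the comparable step would likely be creating the entity with a key_name (for example via Model.get_or_insert), but that is an assumption about how you would wire it up. The second dict, keyed by name only, is an extra assumption added here because the question asks for distinct people rather than distinct pairs.

```python
def put_keyed(store, key_parts, value):
    """Write value under a key built from key_parts; a repeat write overwrites."""
    key = "|".join(key_parts)  # hypothetical key format
    store[key] = value
    return key


signups = {}  # stand-in for entities keyed by the (name, event) pair
people = {}   # stand-in for entities keyed by name only (assumption)

entries = [
    ("Bob", "TestEvent"),
    ("Bob", "TestEvent2"),
    ("Fred", "TestEvent"),
    ("John", "TestEvent"),
    ("Bob", "TestEvent"),  # duplicate sign-up, should not create a new entity
]

for name, event in entries:
    put_keyed(signups, (name, event), {"name": name, "event": event})
    put_keyed(people, (name,), {"name": name})

print(len(signups))  # 4 distinct (name, event) pairs
print(len(people))   # 3 distinct people
```

Because the key is derived from the data, writing the same pair twice lands on the same entity, so a plain count of entities is already a distinct count.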

However, as the number of these entities grows, you will need to do some additional work to count them, because a single query.count() will become too expensive. At that point you need to start looking at counting strategies that use the datastore.
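One possible counting strategy, sketched under stated assumptions, is to maintain the count at write time instead of running count() later. The class name DistinctCounter is hypothetical, and the in-memory set and integer are stand-ins for a keyed existence check and a stored counter entity. In the datastore the check and the increment would need to happen together in a transaction, and a heavily written counter is often split across several shards.

```python
class DistinctCounter(object):
    """Keeps a running count of distinct names, updated at write time."""

    def __init__(self):
        self._seen = set()  # stand-in for a keyed existence check in the datastore
        self.count = 0      # stand-in for a stored counter entity

    def record(self, name):
        """Register a sign-up; bump the counter only for a name not seen before."""
        if name not in self._seen:
            self._seen.add(name)
            self.count += 1
        return self.count


counter = DistinctCounter()
for name in ["Bob", "Bob", "Fred", "John"]:
    counter.record(name)

print(counter.count)  # 3
```

Reading the number of people signed up then costs a single lookup of the counter, regardless of how many sign-up entities exist.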
