
Elaborate SQL query, count(*) limiter not working

I'm afraid I'm no great shakes at SQL, so I'm not surprised I'm having trouble with this, but if you could help me get it to work (it doesn't even have to be one query), I'd be grateful. I'm trying to analyze some Twitter data using MySQLdb in Python, and I'm running:

for u_id in list:
    """
    select e.user_id
    from table_entities e
    inner join table_tweets t on e.id = t.id
    where e.type='mention' and t.user_id=%s
    group by e.type having count('hashtag') < 3
    """ % (u_id)

(Python syntax faked slightly to not show the unimportant stuff.)

Now, everything before the "group by" statement works fine. I'm able to extract the user_ids mentioned in a given tweet that matches the current position of my loop (id is the PK for table_tweets, whereas table_entities has a separate row for each mention, hashtag, or URL).

However (and I don't think I'm formatting it anywhere near correctly), the group by statement doesn't do a thing. What I mean to do is exclude all user_ids belonging to tweets (ids) that have 3 or more entries in table_entities with type='hashtag'. I can sort of tell it's not going to work as it is, since it doesn't actually refer to the id column, but every way I've tried to do that (e.g. by making it part of the join clause) throws a syntax error.

Advice is appreciated!

This doesn't really do what you want.

select e.user_id
from table_entities e
inner join table_tweets t on e.id = t.id
where e.type='mention' and t.user_id=%s
group by e.type having count('hashtag') < 3
  • The SELECT and GROUP BY clauses aren't doing what you expect. Because e.user_id is in the SELECT clause but not in the GROUP BY, MySQL will select one arbitrary user_id for each e.type.
  • HAVING COUNT('literalString') is equivalent to HAVING COUNT(*). You can see this yourself by moving the COUNT('hashtag') to the SELECT clause.

Here's a live demo of these points.

The result is that your query will only return records if there are fewer than 3 mentions for the user.
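A minimal sketch of the second point, using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, so that the example runs without a server). The table layout and the rows are made up for illustration. It suggests that COUNT of a string literal counts every row in the group, just like COUNT(*), so it never looks at the hashtag rows specifically.

```python
import sqlite3

# In-memory database standing in for the MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_entities (id INTEGER, type TEXT, user_id INTEGER)")

# Invented sample rows: three mentions and one hashtag.
conn.executemany(
    "INSERT INTO table_entities VALUES (?, ?, ?)",
    [
        (1, "mention", 10),
        (1, "mention", 11),
        (1, "hashtag", None),
        (2, "mention", 12),
    ],
)

# COUNT('hashtag') counts a constant, non-NULL value once per row,
# so within each group it matches COUNT(*).
rows = conn.execute(
    """
    SELECT type, COUNT('hashtag') AS literal_count, COUNT(*) AS star_count
    FROM table_entities
    GROUP BY type
    ORDER BY type
    """
).fetchall()

for row in rows:
    print(row)
```

In this sample both counts agree for every group, which is why the HAVING clause in the question ends up testing the number of mention rows rather than the number of hashtags.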

There are many ways to accomplish what you're trying to do. I chose IN (you could also use EXISTS or an INNER JOIN to a subquery):

SELECT e.user_id
FROM   table_entities e
       INNER JOIN table_tweets t
               ON e.id = t.id
WHERE  e.type = 'mention'
       AND t.user_id = %s
       AND e.user_id IN (SELECT e.user_id
                         FROM   table_entities e
                                INNER JOIN table_tweets t
                                        ON e.id = t.id
                         WHERE  e.type = 'hashtag'
                                AND t.user_id = %s
                         GROUP  BY e.user_id
                         HAVING Count(*) < 3)

The subselect finds all user ids that have fewer than 3 records in table_entities with an e.type of 'hashtag', for the user that matches %s.

The main select filters for mentions and the user id again. This allows you to select for one e.type while filtering on a count of another e.type.
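The answer notes that an INNER JOIN to a subquery would also work. Here is a minimal sketch of that alternative, again run against sqlite3 as a stand-in for MySQL. The sample rows are invented, and counting hashtags per tweet id (rather than per user_id) is an assumption about the intended result. Note that sqlite3 uses ? as its placeholder where MySQLdb uses %s.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table_tweets (id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE table_entities (id INTEGER, type TEXT, user_id INTEGER);
    """
)

# Invented data: user 100 wrote tweets 1 and 2, user 200 wrote tweet 3.
conn.executemany(
    "INSERT INTO table_tweets VALUES (?, ?)",
    [(1, 100), (2, 100), (3, 200)],
)
conn.executemany(
    "INSERT INTO table_entities VALUES (?, ?, ?)",
    [
        # Tweet 1: one mention, one hashtag.
        (1, "mention", 10), (1, "hashtag", None),
        # Tweet 2: one mention, three hashtags.
        (2, "mention", 11), (2, "hashtag", None),
        (2, "hashtag", None), (2, "hashtag", None),
        # Tweet 3: written by a different author.
        (3, "mention", 12), (3, "hashtag", None),
    ],
)

# The derived table h holds the tweet ids with fewer than 3 hashtags;
# joining to it keeps only the mentions from those tweets.
query = """
    SELECT e.user_id
    FROM table_entities e
    INNER JOIN table_tweets t ON e.id = t.id
    INNER JOIN (
        SELECT id
        FROM table_entities
        WHERE type = 'hashtag'
        GROUP BY id
        HAVING COUNT(*) < 3
    ) h ON h.id = e.id
    WHERE e.type = 'mention' AND t.user_id = ?
"""

# Parameters are passed separately instead of being formatted into the string.
mentioned = [row[0] for row in conn.execute(query, (100,))]
print(mentioned)
```

With this data only the mention from tweet 1 survives for author 100, because tweet 2 carries three hashtags.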

I think you mis-parsed one part of my post (my fault for it being a bit muddled): the user_id column is only populated when type='mention'. I'm trying to limit by the id column. That said, I was able to get it to work thanks to your help!

select e.user_id
from table_entities e
inner join table_tweets t on e.id = t.id
where e.type='mention' and
e.id in
(select e.id
from table_entities e
where e.type='hashtag' group by e.id having count(*) < 3)

I decided to move this above the "for u_id in list" loop because the query now takes a while to run, but I can work with the list output here just fine. Thanks!
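One thing that may be worth checking with the query above: the IN subquery only returns ids that have at least one hashtag row, so tweets with no hashtags at all are dropped as well. If those tweets should be kept, NOT IN with the opposite count is one option. A small sketch of the difference, assuming sqlite3 as a stand-in for MySQL and invented rows; the join to table_tweets is left out to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_entities (id INTEGER, type TEXT, user_id INTEGER)")
conn.executemany(
    "INSERT INTO table_entities VALUES (?, ?, ?)",
    [
        # Tweet 1: one hashtag.
        (1, "mention", 10), (1, "hashtag", None),
        # Tweet 2: three hashtags.
        (2, "mention", 11), (2, "hashtag", None),
        (2, "hashtag", None), (2, "hashtag", None),
        # Tweet 3: no hashtags at all.
        (3, "mention", 12),
    ],
)

# Same shape as the final query in the thread: keep ids with 1 or 2 hashtags.
in_query = """
    SELECT e.user_id
    FROM table_entities e
    WHERE e.type = 'mention'
      AND e.id IN (SELECT id FROM table_entities
                   WHERE type = 'hashtag'
                   GROUP BY id HAVING COUNT(*) < 3)
    ORDER BY e.user_id
"""

# Variant: drop only ids with 3 or more hashtags, so hashtag-free tweets stay.
not_in_query = """
    SELECT e.user_id
    FROM table_entities e
    WHERE e.type = 'mention'
      AND e.id NOT IN (SELECT id FROM table_entities
                       WHERE type = 'hashtag'
                       GROUP BY id HAVING COUNT(*) >= 3)
    ORDER BY e.user_id
"""

kept_in = [row[0] for row in conn.execute(in_query)]
kept_not_in = [row[0] for row in conn.execute(not_in_query)]
print(kept_in)      # [10]
print(kept_not_in)  # [10, 12]
```

Which of the two is right depends on whether tweets without hashtags belong in the result; the thread does not say.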
