SQL query to count multiple strings with one output

Question

I have a database containing certain strings, such as '{TICKER|IBM}', which I will refer to as ticker-strings. My goal is to count the number of ticker-strings per day, for multiple strings.

My database table 'tweets' includes the columns 'tweet_id', 'created_at' (dd/mm/yyyy hh/mm/ss) and 'processed_text'. The ticker-strings, such as '{TICKER|IBM}', are within the 'processed_text' column.

At this moment, I have a working SQL query for counting one ticker-string (thanks to the help of other Stack Overflow users). What I would like to have is a SQL query in which I can count multiple strings (for instance, '{TICKER|GOOG}' and '{TICKER|BAC}' in addition to '{TICKER|IBM}').

The working SQL query for counting one ticker-string is as follows:

SELECT d.date, IFNULL(t.count, 0) AS tweet_count
FROM all_dates AS d
LEFT JOIN (
    SELECT COUNT(DISTINCT tweet_id) AS count, DATE(created_at) AS date
    FROM tweets
    WHERE processed_text LIKE '%{TICKER|IBM}%'
    GROUP BY date) AS t
ON d.date = t.date

The eventual output should thus have a column with the date, a column for {TICKER|IBM}, a column for {TICKER|GOOG} and one for {TICKER|BAC}.

I was wondering whether this is possible and whether you have a solution for it. I have more than 100 different ticker-strings. Of course, doing them one by one is an option, but it is a very time-consuming one.

Answer 1

If I understand correctly, you can do this with conditional aggregation:

SELECT d.date, coalesce(IBM, 0) as IBM, coalesce(GOOG, 0) as GOOG, coalesce(BAC, 0) AS BAC
FROM all_dates d LEFT JOIN
     (SELECT DATE(created_at) AS date,
             COUNT(DISTINCT CASE WHEN processed_text LIKE '%{TICKER|IBM}%' then tweet_id
                   END) as IBM,
             COUNT(DISTINCT CASE WHEN processed_text LIKE '%{TICKER|GOOG}%' then tweet_id
                   END) as GOOG,
             COUNT(DISTINCT CASE WHEN processed_text LIKE '%{TICKER|BAC}%' then tweet_id
                   END) as BAC
      FROM tweets
      GROUP BY date
     ) t
     ON d.date = t.date;
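
With more than 100 ticker-strings, writing each CASE expression by hand gets tedious, so the query text can be generated instead. The following is only a sketch under assumptions: build_ticker_count_query and demo are hypothetical names, the cnt_ prefix on the aliases is my own choice to avoid clashes with reserved words, and the sample rows are made up. It runs against an in-memory SQLite database as a stand-in, with timestamps stored as ISO-8601 text; the generated SQL keeps the shape of the query above.

```python
import re
import sqlite3


def build_ticker_count_query(tickers):
    """Build one query with a count column per ticker (hypothetical helper)."""
    inner, outer = [], []
    for ticker in tickers:
        # The symbol is spliced into the SQL text, so accept only plain
        # alphanumeric symbols (this also rules out LIKE wildcards).
        if not re.fullmatch(r"[A-Za-z0-9]+", ticker):
            raise ValueError(f"unexpected ticker symbol: {ticker!r}")
        alias = "cnt_" + ticker.lower()
        pattern = "%{TICKER|" + ticker + "}%"
        inner.append(
            f"COUNT(DISTINCT CASE WHEN processed_text LIKE '{pattern}' "
            f"THEN tweet_id END) AS {alias}"
        )
        outer.append(f"COALESCE(t.{alias}, 0) AS {alias}")
    lines = [
        "SELECT d.date, " + ", ".join(outer),
        "FROM all_dates d",
        "LEFT JOIN (",
        "    SELECT DATE(created_at) AS date,",
        "           " + ",\n           ".join(inner),
        "    FROM tweets",
        "    GROUP BY DATE(created_at)",
        ") t ON d.date = t.date",
        "ORDER BY d.date",
    ]
    return "\n".join(lines)


def demo():
    """Run the generated query on a tiny in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    # Made-up sample data; table and column names follow the question.
    conn.executescript("""
        CREATE TABLE all_dates (date TEXT PRIMARY KEY);
        CREATE TABLE tweets (tweet_id INTEGER, created_at TEXT, processed_text TEXT);
        INSERT INTO all_dates VALUES ('2014-06-29'), ('2014-06-30');
        INSERT INTO tweets VALUES
            (1, '2014-06-30 09:00:00', 'long {TICKER|IBM} and {TICKER|GOOG}'),
            (2, '2014-06-30 17:30:00', 'selling {TICKER|IBM}');
    """)
    rows = conn.execute(build_ticker_count_query(["IBM", "GOOG", "BAC"])).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(build_ticker_count_query(["IBM", "GOOG", "BAC"]))
    print(demo())
```

The validation is deliberately strict because the symbols end up inside the SQL text; symbols containing other characters would need the pattern and the alias handled separately.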

Answer 2

I'd return the specified resultset like this, adding an expression to the SELECT list for each "ticker" I want returned as a separate column:

   SELECT d.date
        , IFNULL(SUM(t.processed_text LIKE '%{TICKER|IBM}%' ),0) AS `cnt_ibm`
        , IFNULL(SUM(t.processed_text LIKE '%{TICKER|GOOG}%'),0) AS `cnt_goog`
        , IFNULL(SUM(t.processed_text LIKE '%{TICKER|BAC}%' ),0) AS `cnt_bac`
        , IFNULL(SUM(t.processed_text LIKE '%{TICKER|...}%' ),0) AS `cnt_...`
     FROM all_dates d
     LEFT
     JOIN tweets t
       ON t.created_at >= d.date
      AND t.created_at < d.date + INTERVAL 1 DAY
    GROUP BY d.date

NOTES: The expressions within the SUM aggregates above are evaluated as booleans, so they return 1 (if true), 0 (if false), or NULL. I'd avoid wrapping the created_at column in a DATE() function, and use a range scan instead, especially if a predicate (WHERE clause) is added that restricts the values of date being returned from all_dates.

As an alternative, expressions like this will return an equivalent result:

     , SUM(IF(t.processed_text LIKE '%{TICKER|IBM}%' ,1,0)) AS `cnt_ibm`
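
To see the boolean SUM and the range join in action, here is a small runnable sketch. It makes several assumptions: SQLite stands in for MySQL, so DATE(d.date, '+1 day') replaces d.date + INTERVAL 1 DAY; timestamps are stored as ISO-8601 text; run_range_join_demo and the sample rows are made up for illustration. Note that SUM counts matching rows, so it only agrees with COUNT(DISTINCT tweet_id) when tweet_id is unique in tweets.

```python
import sqlite3

# SQLite stand-in for the MySQL query: created_at is compared against a
# half-open day range instead of being wrapped in DATE().
RANGE_JOIN_SQL = """
SELECT d.date,
       IFNULL(SUM(t.processed_text LIKE '%{TICKER|IBM}%'), 0) AS cnt_ibm,
       IFNULL(SUM(t.processed_text LIKE '%{TICKER|BAC}%'), 0) AS cnt_bac
FROM all_dates d
LEFT JOIN tweets t
       ON t.created_at >= d.date
      AND t.created_at <  DATE(d.date, '+1 day')
GROUP BY d.date
ORDER BY d.date
"""


def run_range_join_demo():
    """Run the range-join query on made-up rows and return the result."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE all_dates (date TEXT PRIMARY KEY);
        CREATE TABLE tweets (tweet_id INTEGER, created_at TEXT, processed_text TEXT);
        INSERT INTO all_dates VALUES ('2014-06-29'), ('2014-06-30');
        INSERT INTO tweets VALUES
            (1, '2014-06-30 00:00:00', '{TICKER|IBM} opens'),
            (2, '2014-06-30 23:59:59', '{TICKER|IBM} closes'),
            (3, '2014-07-01 00:00:00', '{TICKER|IBM} next day');
    """)
    rows = conn.execute(RANGE_JOIN_SQL).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_range_join_demo():
        print(row)
```

The day with no tweets yields NULL from SUM, which IFNULL turns into 0, and the tweet stamped at midnight of the following day falls outside the half-open range.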
