简体   繁体   中英

Return column names and count of their distinct values in MySQL

I am looking for a dynamic MySQL query which would count the distinct values in each column of a table and which would also tell me which of the columns contain Null values.
Here is the sample table, I used 'db_name' as database name and 'table_name' as table name:

+------+------+------+------+------+
| Col1 | Col2 | Col3 | Col4 | Col5 |
+------+------+------+------+------+
| a    | d    | j    | o    | q    |
| b    | e    | k    | o    | r    |
| c    | f    | l    | o    | NULL |
| a    | g    | NULL | p    | t    |
| b    | h    | m    | NULL | r    |
| a    | i    | n    | p    | s    |
+------+------+------+------+------+

This is the result set that I would like to get:

+----------+---------------+---------------+
| Col Name | Unique values | Contains Null |
+----------+---------------+---------------+
| Col1     |             3 | FALSE         |
| Col2     |             6 | FALSE         |
| Col3     |             6 | TRUE          |
| Col4     |             3 | TRUE          |
| Col5     |             5 | TRUE          |
+----------+---------------+---------------+

So far this is what I have managed to figure out:

-- list all the column names

SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'table_name';

-- count the distinct values in a column

SELECT COUNT(DISTINCT Col1) Col1 
FROM table_name;

-- tell if a column contains any Null

SELECT
  (CASE WHEN (SUM(CASE WHEN Col1 IS NULL THEN 1 ELSE 0 END)) > 0 THEN 'TRUE' 
  ELSE 'FALSE' END) 'Contains Null'
FROM table_name;

-- combining the queries

SELECT
  (SELECT COLUMN_NAME
  FROM INFORMATION_SCHEMA.COLUMNS
  WHERE TABLE_NAME = 'table_name' AND COLUMN_NAME = 'Col1') 'Col Name', 
  (SELECT COUNT(DISTINCT Col1) 
  FROM table_name) 'Unique values',
  (SELECT (CASE WHEN (SUM(CASE WHEN Col1 IS 
  NULL THEN 1 ELSE 0 END)) > 0 THEN 'TRUE' ELSE 'FALSE' END)
  FROM table_name) 'Contains Null';

Now, I assume I would need to build a loop that goes through each column and unifies the records returned by the query or inserts them into a new table. The problem is, I am relatively new to SQL and I am not really familiar with loops and variables yet.

I found a few questions similar to mine, but none of them gave me a clear answer:

SQL Server count number of distinct values in each column of a table

Return column name and distinct values

SQL: count number of distinct values in every column

I just needed thing similiar to this question (get count of all table distinct values and easy way to get it from any table using only SQL) so I've made it on this way. Hopefully it may help somebody in cases like this one.

SET @processedtable := 'myprecioustablename';
SET @columnnames := (
    SELECT GROUP_CONCAT(COLUMN_NAME)
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_NAME = @processedtable);
SET @qrypartcount := REPLACE(@columnnames, ',','), COUNT(DISTINCT ');
SET @validquery := CONCAT("SELECT COUNT(DISTINCT ", @qrypartcount, ") FROM ", @processedtable);
PREPARE stmt FROM @validquery;
EXECUTE stmt;

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM