
How can I find which record in a table has the most fields populated?

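Or, put another way: how can I find which record in the table has the fewest NULL or blank fields?

I would like to avoid counting each field by hand, since the table has 161 fields.

This can be done by building a dynamic query of the following shape: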

SELECT id,
       (IF(col1 IS NULL OR col1 = "", 1, 0) +
        IF(col2 IS NULL OR col2 = "", 1, 0) +
        ...
        IF(coln IS NULL OR coln = "", 1, 0)
       ) AS null_count
FROM table_name
ORDER BY null_count ASC  -- fewest empty fields first
LIMIT 1;
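As a concrete illustration of this pattern, here is a minimal sketch on a hypothetical three-column table. The table `demo_people` and its columns are invented for this example and are not part of the original question. Note that `col = NULL` never evaluates to true in MySQL, so the sketch tests with `IS NULL`, and it sorts ascending so that the row with the fewest empty fields comes first.

```sql
-- Hypothetical demo table; its name and columns are assumptions for illustration only.
CREATE TEMPORARY TABLE demo_people (
    id    INT PRIMARY KEY,
    name  VARCHAR(50),
    email VARCHAR(50),
    phone VARCHAR(50)
);

INSERT INTO demo_people (id, name, email, phone) VALUES
    (1, 'Ann', NULL,              ''),
    (2, 'Bob', 'bob@example.com', '555-0100'),
    (3, '',    NULL,              NULL);

-- One IF(...) term per column: 1 if the field is NULL or blank, else 0.
SELECT id,
       (IF(name  IS NULL OR name  = '', 1, 0) +
        IF(email IS NULL OR email = '', 1, 0) +
        IF(phone IS NULL OR phone = '', 1, 0)
       ) AS null_count
FROM demo_people
ORDER BY null_count ASC
LIMIT 1;
-- Row 2 has no empty fields, so it is returned with null_count = 0.
```

Rows 1 and 3 score 2 and 3 respectively, which is why row 2 wins.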

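This is easy to do by building the query from INFORMATION_SCHEMA.COLUMNS and then executing it as dynamic SQL. You may also need to raise the maximum output length of GROUP_CONCAT by setting the session-level variable group_concat_max_len to a higher value. The MySQL default is 1024 bytes, which the expression list for 161 columns will exceed.

SET SESSION group_concat_max_len = 4294967295;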
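To confirm which limit is actually in effect for the connection that will run GROUP_CONCAT, the variable can be read back. This is only a small check, not part of the original answer. Keep in mind that a value changed with SET GLOBAL applies only to connections opened afterwards, while SET SESSION changes the current connection.

```sql
-- Compare the limit of the current connection with the server-wide default.
SELECT @@SESSION.group_concat_max_len AS session_value,
       @@GLOBAL.group_concat_max_len  AS global_value;
```

If session_value is still 1024, the generated query text is likely to be truncated for a wide table, and PREPARE will then fail with a syntax error.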

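-- Build one IF(...) term per column and join the terms with ' + '.
-- Column names are wrapped in backticks so that reserved words do not break the query.
-- ORDER BY ... ASC puts the row with the fewest empty fields first.
SELECT @query1 := CONCAT('SELECT id, (',
                         GROUP_CONCAT(CONCAT('IF(`', COLUMN_NAME, '` IS NULL OR `',
                                             COLUMN_NAME, '` = "", 1, 0)')
                                      SEPARATOR ' + '),
                         ') AS null_count
                         FROM table_name
                         ORDER BY null_count ASC
                         LIMIT 1')
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = SCHEMA()
      AND TABLE_NAME = 'table_name';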

PREPARE stmt FROM @query1; EXECUTE stmt; DEALLOCATE PREPARE stmt;
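One caveat with the generated query: MySQL compares a number with `""` by converting the string to a number, so a numeric column holding 0 is counted as empty, and comparing date columns with an empty string may raise warnings or errors depending on the server version and SQL mode. A possible variation, sketched below under the assumption that only the listed character types can meaningfully be blank, applies the blank test to string columns and a plain `IS NULL` test to everything else. The names `@query2` and `stmt2` are hypothetical; `table_name` and `id` are the same placeholders as above.

```sql
SET SESSION group_concat_max_len = 1000000;

-- Build the per-column terms, choosing the test by the column's data type.
SELECT CONCAT(
         'SELECT id, (',
         GROUP_CONCAT(
           IF(DATA_TYPE IN ('char', 'varchar', 'tinytext', 'text', 'mediumtext', 'longtext'),
              -- string column: count NULL and '' as empty
              CONCAT('IF(`', COLUMN_NAME, '` IS NULL OR `', COLUMN_NAME, '` = '''', 1, 0)'),
              -- any other type: count only NULL as empty
              CONCAT('IF(`', COLUMN_NAME, '` IS NULL, 1, 0)'))
           SEPARATOR ' + '),
         ') AS null_count FROM table_name ORDER BY null_count ASC LIMIT 1')
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = SCHEMA()
  AND TABLE_NAME = 'table_name'
INTO @query2;

SELECT @query2;  -- inspect the generated statement before running it

PREPARE stmt2 FROM @query2;
EXECUTE stmt2;
DEALLOCATE PREPARE stmt2;
```

Whether ENUM or SET columns should also get the blank test depends on the table, so the type list is best treated as a starting point.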

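Example: SQLFiddle

Disclaimer: I misread the question and thought the OP was looking for the column of the table with the most NULLs. It may still be useful to someone.

Create a procedure like this: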

drop procedure if exists test_most_pop_field;
DELIMITER $$
CREATE PROCEDURE test_most_pop_field(IN tableName varchar(100))
BEGIN

DECLARE done INT DEFAULT 0;
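DECLARE sql_query VARCHAR(1000); /* the generated statement repeats the column name three times, so 255 characters can be too short */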

DECLARE cur CURSOR FOR 
SELECT CONCAT('INSERT INTO tmp_result(columnName, numberOfEmptyRows) SELECT "', COLUMN_NAME, '" AS columnName, SUM(IF(',COLUMN_NAME,' IS NULL OR ', COLUMN_NAME,' = "", 1, 0)) AS numberEmptyRows FROM ', TABLE_NAME)
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = SCHEMA()
AND TABLE_NAME = tableName;

DECLARE CONTINUE HANDLER FOR SQLSTATE '02000' SET done = 1;

DROP TEMPORARY TABLE IF EXISTS tmp_result; /* TEMPORARY ensures a regular table of the same name is never dropped */
CREATE TEMPORARY TABLE tmp_result(columnName varchar(100), numberOfEmptyRows int);

OPEN cur;

REPEAT
  FETCH cur INTO sql_query;
  IF NOT done THEN
    BEGIN
      SET @sql = sql_query; /* PREPARE accepts only a string literal or a user variable, not a local variable, so copy the text into @sql first */
      PREPARE stmt FROM @sql; 
      EXECUTE stmt; 
      DEALLOCATE PREPARE stmt;
    END;
  END IF;
UNTIL done END REPEAT;

CLOSE cur;

SELECT * FROM tmp_result ORDER BY numberOfEmptyRows DESC /*optionally LIMIT 1*/;

END $$
DELIMITER ;

Then call it with the name of the table you want to check:

CALL test_most_pop_field('yourTableName');
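The same per-column counts can also be produced without a cursor or a temporary table. The sketch below is a different technique from the procedure above, not the answerer's method: it concatenates one small SELECT per column with UNION ALL and runs the result as a single prepared statement. The names `@union_query` and `stmt3` are hypothetical, and `yourTableName` is the same placeholder as in the call above.

```sql
SET SESSION group_concat_max_len = 1000000;

-- One "SELECT '<column>' AS columnName, SUM(...) FROM <table>" per column,
-- glued together with UNION ALL.
SELECT GROUP_CONCAT(
         CONCAT('SELECT ''', COLUMN_NAME, ''' AS columnName, ',
                'SUM(IF(`', COLUMN_NAME, '` IS NULL OR `', COLUMN_NAME,
                '` = '''', 1, 0)) AS numberOfEmptyRows ',
                'FROM `', TABLE_NAME, '`')
         SEPARATOR ' UNION ALL ')
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = SCHEMA()
  AND TABLE_NAME = 'yourTableName'
INTO @union_query;

-- A trailing ORDER BY applies to the whole UNION result.
SET @union_query = CONCAT(@union_query, ' ORDER BY numberOfEmptyRows DESC');

PREPARE stmt3 FROM @union_query;
EXECUTE stmt3;
DEALLOCATE PREPARE stmt3;
```

Like the cursor version, this scans the table once per column, so it is no faster on a large table. It mainly saves the procedure and the temporary table.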
