SQL - Select min and max value for each column

Suppose I have a table with a few (actually 107) columns: COLUMN_A, COLUMN_B, COLUMN_C, COLUMN_D, etc.

From each of them I want to extract information such as the minimum/maximum length, the number of null or empty values, and the minimum/maximum value.

To analyze each column individually I use the following code:

DECLARE @col VARCHAR(max) =   'COLUMN_A'

DECLARE @RUN_QUERY AS VARCHAR(MAX)
SET @RUN_QUERY = 'SELECT MIN(LEN(' + @col + ')) AS CHR_MIN, MAX(LEN(' + @col + ')) AS CHR_MAX, MIN(' + @col + ') AS VALUE_MIN, MAX(' + @col + ') AS VALUE_MAX FROM MY_TABLE'
EXEC(@RUN_QUERY)

I can then manually change the variable on the first line to "efficiently" switch the target column.

I also know that by querying INFORMATION_SCHEMA I can easily get a table with every column as a row, using the following script:

SELECT TABLE_NAME, COLUMN_NAME, ORDINAL_POSITION
INTO #TEMP_COLS
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'MY_TABLE'
ORDER BY 3

But I don't know how to run the first query for every row of the #TEMP_COLS table. I feel I need a pivot table, but I don't know where to start. I surely can't pivot MY_TABLE as a whole because it has about half a million rows. Even so, I think pivoting is the way to go, and I am a little scared of it because of the syntax.

If you know any other way around this, please share it. If you know how to pivot the solution, please teach me, lol.

Thanks in advance.

You can loop over the rows of your temporary table and store the results in another temporary table.

IF OBJECT_ID('tempdb..#TEMP_COLS') IS NOT NULL
    DROP TABLE #TEMP_COLS
SELECT TABLE_NAME, COLUMN_NAME, ORDINAL_POSITION, CAST(0 as BIT) as isProcessed
INTO #TEMP_COLS
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'YourTable'

This is your query, with an added isProcessed flag that records when a column has been processed.

DECLARE @RUN_QUERY AS VARCHAR(MAX)
DECLARE @col VARCHAR(max) =  (SELECT TOP 1 COLUMN_NAME FROM #TEMP_COLS WHERE isProcessed = 0)

IF OBJECT_ID('tempdb..#MinMaxValues') IS NOT NULL
    DROP TABLE #MinMaxValues
CREATE TABLE #MinMaxValues (
    COLUMN_NAME VARCHAR(max),
    CHR_MIN int,
    CHR_MAX int,
    VALUE_MIN VARCHAR(max),
    VALUE_MAX VARCHAR(max),
)

WHILE @col IS NOT NULL
BEGIN

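    -- No GROUP BY here: each column must produce exactly one summary row.
    -- QUOTENAME protects column names that contain spaces or reserved words.
    SET @RUN_QUERY = '
    INSERT INTO #MinMaxValues
    SELECT  ''' + @col + ''',
            MIN(LEN(' + QUOTENAME(@col) + ')) AS CHR_MIN,
            MAX(LEN(' + QUOTENAME(@col) + ')) AS CHR_MAX,
            MIN(' + QUOTENAME(@col) + ') AS VALUE_MIN,
            MAX(' + QUOTENAME(@col) + ') AS VALUE_MAX
    FROM YourTable'
    EXEC(@RUN_QUERY)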

    UPDATE #TEMP_COLS SET isProcessed = 1 WHERE COLUMN_NAME = @col
    SET @col = null
    SELECT TOP 1 @col = COLUMN_NAME FROM #TEMP_COLS WHERE isProcessed = 0
END


SELECT * from #MinMaxValues

#MinMaxValues is a temporary table that stores the results for each column while we iterate through the #TEMP_COLS records.

The iteration could use a cursor, but since cursors tend to be slow, I prefer to loop over #TEMP_COLS for as long as a row with isProcessed = 0 exists, which is when @col receives a value. After each column is processed, its row is updated to isProcessed = 1.
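As an alternative to the loop, the per-column queries can be generated in one pass and run as a single UNION ALL statement. The following is only a sketch under some assumptions not stated in the original: SQL Server 2017 or later (for STRING_AGG), a table named YourTable that exists in a single schema, and columns whose types all support LEN, MIN and MAX (types such as bit or xml would have to be filtered out). The NULL_OR_EMPTY column is an illustrative addition for the null/empty count asked about in the question. The table is still scanned once per column, so this mainly simplifies the code rather than reducing the work.

```sql
-- Sketch only: assumes SQL Server 2017+ and that every column of YourTable
-- supports LEN/MIN/MAX. NULL_OR_EMPTY is a hypothetical extra column.
DECLARE @sql NVARCHAR(MAX);

SELECT @sql = STRING_AGG(
    CAST(
        'SELECT ' + QUOTENAME(COLUMN_NAME, '''') + ' AS COLUMN_NAME, '
      + 'MIN(LEN(' + QUOTENAME(COLUMN_NAME) + ')) AS CHR_MIN, '
      + 'MAX(LEN(' + QUOTENAME(COLUMN_NAME) + ')) AS CHR_MAX, '
        -- Cast values to a common type so every UNION ALL branch lines up.
      + 'CAST(MIN(' + QUOTENAME(COLUMN_NAME) + ') AS VARCHAR(MAX)) AS VALUE_MIN, '
      + 'CAST(MAX(' + QUOTENAME(COLUMN_NAME) + ') AS VARCHAR(MAX)) AS VALUE_MAX, '
        -- LEN ignores trailing spaces, so blank-only strings count as empty.
      + 'SUM(CASE WHEN ' + QUOTENAME(COLUMN_NAME) + ' IS NULL '
      + 'OR LEN(' + QUOTENAME(COLUMN_NAME) + ') = 0 THEN 1 ELSE 0 END) AS NULL_OR_EMPTY '
      + 'FROM YourTable'
    AS NVARCHAR(MAX)),          -- NVARCHAR(MAX) avoids the 8000-byte STRING_AGG limit
    ' UNION ALL '
) WITHIN GROUP (ORDER BY ORDINAL_POSITION)
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'YourTable';

EXEC sys.sp_executesql @sql;
```

Printing @sql before executing it is a convenient way to inspect the generated statement.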

What you are looking for is an UNPIVOT. Here is an example, starting with a sample table:

DROP TABLE IF EXISTS yourTable; 

CREATE TABLE yourTable (
    COL_01 INT NULL
  , COL_02 INT NULL
  , COL_03 INT NULL
  , COL_04 INT NULL
  , COL_05 INT NULL
  , COL_06 INT NULL
  , COL_07 INT NULL
  , COL_08 INT NULL
  , COL_09 INT NULL
  , COL_10 INT NULL
  , COL_11 INT NULL
  , COL_12 INT NULL
  , COL_13 INT NULL
  , COL_14 INT NULL
  , COL_15 INT NULL
) ;
GO

INSERT INTO dbo.yourTable (COL_01
                           , COL_02
                           , COL_03
                           , COL_04
                           , COL_05
                           , COL_06
                           , COL_07
                           , COL_08
                           , COL_09
                           , COL_10
                           , COL_11
                           , COL_12
                           , COL_13
                           , COL_14
                           , COL_15
)
VALUES (
   CAST ((RAND()*100) AS INT)
  ,CAST ((RAND()*100) AS INT)
  ,CAST ((RAND()*100) AS INT)
  ,CAST ((RAND()*100) AS INT)
  ,CAST ((RAND()*100) AS INT)
  ,CAST ((RAND()*100) AS INT)
  ,CAST ((RAND()*100) AS INT)
  ,CAST ((RAND()*100) AS INT)
  ,CAST ((RAND()*100) AS INT)
  ,CAST ((RAND()*100) AS INT)
  ,CAST ((RAND()*100) AS INT)
  ,CAST ((RAND()*100) AS INT)
  ,CAST ((RAND()*100) AS INT)
  ,CAST ((RAND()*100) AS INT)
  ,CAST ((RAND()*100) AS INT)
) ;
GO 20

SELECT TOP (100) * FROM dbo.yourTable

Unpivot Code

SELECT
    unpvt.ColumnName
  , MAX( ColumnValue ) AS VALUE_MAX
  , MIN( ColumnValue ) AS VALUE_MIN
  , AVG( ColumnValue ) AS VALUE_AVG
FROM (
    SELECT
        COL_01
      , COL_02
      , COL_03
      , COL_04
      , COL_05
      , COL_06
      , COL_07
      , COL_08
      , COL_09
      , COL_10
      , COL_11
      , COL_12
      , COL_13
      , COL_14
      , COL_15
    FROM dbo.yourTable
) p
    UNPIVOT (
        ColumnValue
        FOR ColumnName IN (COL_01, COL_02, COL_03, COL_04, COL_05, COL_06, COL_07, COL_08, COL_09, COL_10, COL_11
                           , COL_12, COL_13, COL_14, COL_15
        )
    ) AS unpvt
GROUP BY unpvt.ColumnName ;
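With 107 columns, typing the column list by hand is tedious, so it can be generated from INFORMATION_SCHEMA instead. The following is a sketch, assuming SQL Server 2017 or later (for STRING_AGG) and that every column has the same data type, as in the sample table above. UNPIVOT requires a common type, so a table with mixed types would need each column cast to one type in the inner SELECT, and MIN/MAX would then compare the values as text. Note also that UNPIVOT drops NULL values, so this form cannot count nulls.

```sql
-- Sketch only: assumes SQL Server 2017+ and that all columns of dbo.yourTable
-- share one data type (true for the INT-only sample table).
DECLARE @cols NVARCHAR(MAX), @sql NVARCHAR(MAX);

-- Build "[COL_01], [COL_02], ..." in ordinal order.
SELECT @cols = STRING_AGG(CAST(QUOTENAME(COLUMN_NAME) AS NVARCHAR(MAX)), ', ')
               WITHIN GROUP (ORDER BY ORDINAL_POSITION)
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'dbo'
  AND TABLE_NAME = 'yourTable';

-- Reuse the same list for the inner SELECT and for the UNPIVOT IN clause.
SET @sql = N'
SELECT unpvt.ColumnName
     , MAX(unpvt.ColumnValue) AS VALUE_MAX
     , MIN(unpvt.ColumnValue) AS VALUE_MIN
     , AVG(unpvt.ColumnValue) AS VALUE_AVG
FROM (SELECT ' + @cols + N' FROM dbo.yourTable) AS p
UNPIVOT (ColumnValue FOR ColumnName IN (' + @cols + N')) AS unpvt
GROUP BY unpvt.ColumnName;';

EXEC sys.sp_executesql @sql;
```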
