I have a table with a parent-child structure, and I need to unpivot multiple columns that hold different values.
Here is a small example of the table:
Note: I have shown only a few columns here, but the real table has 215 columns that follow this pattern, with different IDs and names. My goal is to flatten all the columns tied to p_id and/or c_id into the expected result table.
I am doing this exercise in Snowflake, but I am familiar with Jupyter notebooks as well. Feel free to provide a solution in SQL or in Python using a Jupyter notebook. You can also suggest any other way to handle this kind of data.
This image is part of the comment section. Please check the highlighted part.
Let's start by creating a table shaped like the one in the description:
CREATE OR REPLACE TABLE weird_table
AS
SELECT 1 AS a, 'b' b, 1 pip1, 2 pip2, 3 pip3, 4 pip4, 11 rip1, 12 rip2, 13 rip3, 14 rip4
UNION ALL SELECT 2 , 'c', 124, 3123, 123, 123, 1231 ,9, 99,999
;
Now we can create a stored procedure inside Snowflake with JavaScript. The procedure takes the name of a table, collects all the columns whose names end with a number, uses them to generate one SELECT statement per numeric suffix, and merges those statements with UNION ALL:
CREATE OR REPLACE PROCEDURE custom_unpivot(TABLE_NAME VARCHAR)
RETURNS STRING
LANGUAGE JAVASCRIPT
AS
$$
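// Read a single row only to discover the column names.
var stmt = snowflake.createStatement({
  sqlText: "SELECT * FROM " + TABLE_NAME + " LIMIT 1;",
});
stmt.execute();
var cols = [];
for (var i = 1; i <= stmt.getColumnCount(); i++) {
  cols.push(stmt.getColumnName(i));
}
// Columns without a trailing number are repeated on every output row.
var idCols = cols.filter(x => !x.match(/[0-9]+$/));
// Columns with a trailing number are the ones to unpivot.
var unpivotCols = cols.filter(x => x.match(/[0-9]+$/));
// Highest numeric suffix found, e.g. 4 for PIP1..PIP4.
var maxUnpivot = Math.max(...unpivotCols.map(x => parseInt(x.match(/[0-9]+$/)[0], 10)));
// Distinct base names with the suffix removed, e.g. PIP and RIP.
var colsSansSuffix = [...new Set(unpivotCols.map(x => x.replace(/[0-9]+$/, '')))];
// One SELECT per suffix; arrays concatenated into a string are comma-joined.
var selectsToUnion = [];
for (var i = 1; i <= maxUnpivot; i++) {
  selectsToUnion.push(
    "SELECT "+idCols+","+colsSansSuffix.map(x=>" "+x+i+" AS "+x)+" FROM "+TABLE_NAME
  );
}
return selectsToUnion.join('\nUNION ALL\n');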
$$
;
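To experiment with the string-building logic outside Snowflake, the same steps can be sketched as a plain JavaScript function. This is only a sketch under assumptions: `buildUnpivotSql` is a hypothetical helper name, and the column list is passed in by hand instead of being read through `stmt.getColumnName`. The whitespace of the generated SQL differs slightly from what the procedure returns, but the statements are equivalent.

```javascript
// Hypothetical helper (not part of Snowflake): builds an equivalent
// UNION ALL statement from a plain array of column names.
function buildUnpivotSql(tableName, cols) {
  const suffix = /[0-9]+$/;
  // Columns without a trailing number are repeated on every output row.
  const idCols = cols.filter(c => !suffix.test(c));
  // Columns with a trailing number are the ones to unpivot.
  const unpivotCols = cols.filter(c => suffix.test(c));
  // Highest numeric suffix, e.g. 2 for PIP1..PIP2.
  const maxUnpivot = Math.max(
    ...unpivotCols.map(c => parseInt(c.match(suffix)[0], 10))
  );
  // Distinct base names, e.g. ["PIP", "RIP"].
  const baseNames = [...new Set(unpivotCols.map(c => c.replace(suffix, "")))];

  const selects = [];
  for (let i = 1; i <= maxUnpivot; i++) {
    const valueCols = baseNames.map(b => `${b}${i} AS ${b}`);
    selects.push(
      `SELECT ${[...idCols, ...valueCols].join(", ")} FROM ${tableName}`
    );
  }
  return selects.join("\nUNION ALL\n");
}

// Example with a shortened column list.
const sql = buildUnpivotSql("weird_table",
  ["A", "B", "PIP1", "PIP2", "RIP1", "RIP2"]);
console.log(sql);
```

Like the procedure, this sketch assumes every base name has a column for every suffix from 1 to the maximum; a gap such as a missing `RIP2` would produce SQL that references a nonexistent column.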
When you call that procedure, it returns a combined SELECT statement that gives you the desired "unpivot":
CALL custom_unpivot('weird_table');
SELECT A,B, PIP1 AS PIP, RIP1 AS RIP FROM weird_table
UNION ALL
SELECT A,B, PIP2 AS PIP, RIP2 AS RIP FROM weird_table
UNION ALL
SELECT A,B, PIP3 AS PIP, RIP3 AS RIP FROM weird_table
UNION ALL
SELECT A,B, PIP4 AS PIP, RIP4 AS RIP FROM weird_table
If you run that generated SQL, it produces the desired results:
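The shape of that result can also be previewed without a warehouse by applying the same unpivot to the two sample rows as plain JavaScript objects. This is a sketch, not Snowflake code: `unpivotRows` and `sampleRows` are hypothetical names, and the row order shown here is only illustrative, since SQL does not guarantee an order for UNION ALL without an ORDER BY.

```javascript
// Hypothetical in-memory equivalent of the generated UNION ALL query.
function unpivotRows(rows) {
  const suffix = /[0-9]+$/;
  const cols = Object.keys(rows[0]);
  const idCols = cols.filter(c => !suffix.test(c));
  const unpivotCols = cols.filter(c => suffix.test(c));
  const maxUnpivot = Math.max(
    ...unpivotCols.map(c => parseInt(c.match(suffix)[0], 10))
  );
  const baseNames = [...new Set(unpivotCols.map(c => c.replace(suffix, "")))];

  const result = [];
  // One pass per numeric suffix mirrors one SELECT in the UNION ALL.
  for (let i = 1; i <= maxUnpivot; i++) {
    for (const row of rows) {
      const out = {};
      for (const c of idCols) out[c] = row[c];       // copy id columns
      for (const b of baseNames) out[b] = row[b + i]; // PIPi -> PIP, RIPi -> RIP
      result.push(out);
    }
  }
  return result;
}

// The two rows loaded into weird_table by the CREATE TABLE statement.
const sampleRows = [
  { A: 1, B: "b", PIP1: 1, PIP2: 2, PIP3: 3, PIP4: 4,
    RIP1: 11, RIP2: 12, RIP3: 13, RIP4: 14 },
  { A: 2, B: "c", PIP1: 124, PIP2: 3123, PIP3: 123, PIP4: 123,
    RIP1: 1231, RIP2: 9, RIP3: 99, RIP4: 999 },
];

const unpivoted = unpivotRows(sampleRows);
console.log(unpivoted.length); // 8 rows: 2 input rows x 4 suffixes
console.log(unpivoted[0]);     // { A: 1, B: 'b', PIP: 1, RIP: 11 }
```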
Once you see how this pattern works within a stored procedure, the same approach can be adapted to other column layouts.
For the follow-up in the comments, try wrapping the resulting query in a filter like this:
SELECT *
FROM (
SELECT A,B, PIP1 AS PIP, RIP1 AS RIP FROM A_weird_table
UNION ALL
SELECT A,B, PIP2 AS PIP, RIP2 AS RIP FROM A_weird_table
UNION ALL
SELECT A,B, PIP3 AS PIP, RIP3 AS RIP FROM A_weird_table
UNION ALL
SELECT A,B, PIP4 AS PIP, RIP4 AS RIP FROM A_weird_table
UNION ALL
SELECT A,B, PIP5 AS PIP, RIP5 AS RIP FROM A_weird_table
)
WHERE pip>0 AND rip>0;
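If the filter is needed regularly, the wrapping step can be generated too. A minimal sketch, assuming the UNION ALL text is already available as a string: `wrapWithFilter` is a hypothetical helper, and the predicate is inserted verbatim, so it should only ever come from trusted input.

```javascript
// Hypothetical helper: wraps a generated UNION ALL query in an outer
// SELECT with a WHERE clause, matching the hand-written query above.
function wrapWithFilter(unionSql, predicate) {
  // Indent the inner query so the generated SQL stays readable.
  const indented = unionSql
    .split("\n")
    .map(line => "  " + line)
    .join("\n");
  return `SELECT *\nFROM (\n${indented}\n)\nWHERE ${predicate};`;
}

const inner = [
  "SELECT A,B, PIP1 AS PIP, RIP1 AS RIP FROM A_weird_table",
  "UNION ALL",
  "SELECT A,B, PIP2 AS PIP, RIP2 AS RIP FROM A_weird_table",
].join("\n");

console.log(wrapWithFilter(inner, "pip>0 AND rip>0"));
```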