Passing a table name as a parameter in a user-defined function

Question

I want to create a UDF that contains a parameterized query. Since I want to reuse this function, I also want to parameterize the table name. Is this possible in Snowflake Data Warehouse (or Oracle, since its syntax is somewhat similar)?

I don't think this can be done using only SQL. Any input on how it can be achieved using Python?

Answer 1

You can't use a table name as a bind variable in Oracle. The query is parsed at compile time, not run time, so the table name has to be part of the SQL text before any bind values are supplied. To vary the table name you have to build the statement dynamically with EXECUTE IMMEDIATE or dbms_sql.
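To illustrate the point about bind variables from the Python side, here is a minimal sketch. It uses the standard-library sqlite3 module purely as a stand-in so the example runs anywhere; it is not Oracle or Snowflake, although the same restriction on placeholders applies there. The helper count_rows and its identifier pattern are hypothetical names introduced for this example.

```python
import re
import sqlite3

# Hypothetical whitelist pattern: plain unquoted identifiers only.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def count_rows(con, table_name):
    """Count rows in table_name, interpolating the name only after validation."""
    if not _IDENTIFIER.fullmatch(table_name):
        raise ValueError("unsafe table name: {0!r}".format(table_name))
    # The table name becomes part of the statement text, not a bind value.
    cur = con.execute("select count(*) from {0}".format(table_name))
    return cur.fetchone()[0]


# In-memory SQLite database used as a stand-in for the real warehouse.
con = sqlite3.connect(":memory:")
con.execute("create table t1 (col1 integer)")
con.executemany("insert into t1 values (?)", [(1,), (2,), (2,)])

# Binding the table name fails: placeholders stand for values, not identifiers.
try:
    con.execute("select count(*) from ?", ("t1",))
    bind_failed = False
except sqlite3.OperationalError:
    bind_failed = True

row_count = count_rows(con, "t1")
```

Because the table name is spliced into the SQL text, checking it against a strict pattern (or a list of known tables) before formatting is one way to avoid SQL injection.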

Answer 2

This is a really basic programming question, and not really related to Snowflake, so you should have investigated more before asking here. But here's a sketch of an answer for you:

#!/usr/bin/env python

# Connect to Snowflake
# See https://docs.snowflake.net/manuals/user-guide/python-connector-example.html#connecting-
...
con = 
...

# Scan the file to build the table/column info
tables = {}
with open("zz.csv") as f:
    for l in f.readlines():
        tname, cname = l.split(",")
        tables.setdefault(tname.strip(), []).append(cname.strip())

# Run a query for each table
# (dict.items() replaces the Python 2-only dict.iteritems())
for tname, clist in tables.items():
    query = """select {columns} from {table} 
               group by {columns} 
               having count(*) > 1
            """.format(columns=",".join(clist), table=tname)
    print("Running query: {0}".format(query))
    cur = con.cursor()
    cur.execute(query)
    for rec in cur:
        print("DUPLICATED RECORD: {0}".format(rec))

Answer 3

@MarcinZukowski Thanks for the info. Yes, I checked, and they are planning to launch it in June. The only query I have in the UDF validates a PK constraint, and it doesn't need to be in a UDF. My CSV looks like this; it lists each table name with its primary key columns:

|  Table Name  |  Primary Key  | 
|    Table 1   |     Col1      |  
|    Table 1   |     Col2      |
|    Table 1   |     Col3      | 
|    Table 2   |     Col11     | 
|    Table 2   |     Col12     |

I want to execute the query below for every table in this CSV. The table name can be a single argument, but the column names have to vary, since a table can have one or more primary key columns.

select Col1, Col2, Col3 from Table1
group by Col1, Col2, Col3
having count(*)>1
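As a worked example, the following sketch builds that query for each table from rows like those in the CSV above. It makes several assumptions: the file is comma-separated with one header row, the table names are written without spaces (Table1, Table2) as in the query, and the function names group_keys_by_table and duplicate_key_query are hypothetical. The sample data is held in memory so the example runs on its own.

```python
import csv
import io

# Hypothetical sample mirroring the CSV described above; real input would be a file.
SAMPLE_CSV = """Table Name,Primary Key
Table1,Col1
Table1,Col2
Table1,Col3
Table2,Col11
Table2,Col12
"""


def group_keys_by_table(csv_file):
    """Map each table name to its list of key columns, in file order."""
    tables = {}
    reader = csv.reader(csv_file)
    next(reader)  # skip the header row
    for row in reader:
        if len(row) != 2:
            continue  # ignore blank or malformed lines
        tname, cname = (field.strip() for field in row)
        tables.setdefault(tname, []).append(cname)
    return tables


def duplicate_key_query(table, columns):
    """Build the group-by/having query that lists duplicated key values."""
    column_list = ", ".join(columns)
    return ("select {cols} from {table} "
            "group by {cols} "
            "having count(*) > 1").format(cols=column_list, table=table)


tables = group_keys_by_table(io.StringIO(SAMPLE_CSV))
queries = [duplicate_key_query(t, cols) for t, cols in tables.items()]
```

Each string in queries can then be passed to cursor.execute() on an open connection. Any row returned means the listed key columns are not unique in that table.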

Answer 1
1 2019-04-10 02:45:52

Answer 2
1 accepted 2019-04-16 21:19:07

Answer 3
0 2019-04-13 01:25:57
