SQLAlchemy using or_ looping over multiple columns (Pandas Dataframes)

SUMMARY: How to query against values from different dataframe columns, matched to table.column_name combinations, in SQLAlchemy using or_().

I'm working on a SQLAlchemy project where I take the valid columns of a dataframe and feed them all into SQLAlchemy's filter. I've got it working for a single column, where every entry of the column is added using the column header, like this:

qry = qry.filter(or_(*[getattr(Query_Tbl,column_head).like(x) \
      for x in (df[column_head].dropna().values)]))

This produced the pattern I was looking for: (tbl.column1 like a OR tbl.column1 like b...) AND (...), etc.
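For readers who want to check the generated clause without a database, here is a minimal sketch of the single-column case. The `QueryTbl` model and the sample dataframe are made up for illustration (they stand in for `Query_Tbl` and `df` above), and it assumes SQLAlchemy 1.4 or later.

```python
import pandas as pd
from sqlalchemy import Column, Integer, String, or_
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class QueryTbl(Base):  # hypothetical stand-in for Query_Tbl
    __tablename__ = "tbl"
    id = Column(Integer, primary_key=True)
    column1 = Column(String)

# Sample data (assumption): one null entry that dropna() should skip.
df = pd.DataFrame({"column1": ["a%", None, "b%"]})
column_head = "column1"

# One LIKE condition per non-null value, all unpacked into a single or_().
conditions = [getattr(QueryTbl, column_head).like(x)
              for x in df[column_head].dropna().values]
clause = or_(*conditions)

# str() renders the clause with bind-parameter placeholders.
sql = str(clause)
print(sql)
```

Printing the clause like this is a cheap way to confirm which conditions end up OR-ed together before running the query.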

However, some groups of dataframe columns need to be placed together: the columns are different, but their conditions still need to sit inside the same or_() clause,

i.e. (the desired result)

(tbl.col1 like a OR tbl.col1 like b OR tbl.col2 like c OR tbl.col2 like d OR tbl.col3 like e...) etc.

My latest attempt was to sub-group the columns that need to go together, then repeat the previous style inside those groups, like:

qry = qry.filter(or_(*[getattr(Query_Tbl, set_id[0]).like(x)
                       for x in df[set_id[0]].dropna().values],
                     *[getattr(Query_Tbl, set_id[1]).like(y)
                       for y in df[set_id[1]].dropna().values],
                     *[getattr(Query_Tbl, set_id[2]).like(z)
                       for z in df[set_id[2]].dropna().values]))

Here set_id is a list of 3 strings corresponding to column1, column2, and column3, which should give me the desired result. However, this simply produces:

(What I'm actually getting)

(tbl.col1 like a OR tbl.col1 like b..) AND (tbl.col2 like c OR tbl.col2 like d...) AND (tbl.col3 like e OR...)

Is there a better way to go about this in SQLAlchemy to get the result I want, or would it be better to find a way of feeding column values from Pandas directly into getattr() so it works with my existing code?

Thank you for reading, and thanks in advance for your help!

It appears the problem was the way the dataframe was formatted: I was reading the column names into groups differently than I thought. The pattern from the question works for anyone who wants to process multiple df columns into the same OR statement.
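To make the difference between the two results concrete, the sketch below compiles both shapes side by side. The `QueryTbl` model, the `likes` helper, and the sample values are assumptions made for illustration; only `or_`, `and_`, and `.like()` come from SQLAlchemy itself (1.4 or later assumed).

```python
import pandas as pd
from sqlalchemy import Column, Integer, String, and_, or_
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class QueryTbl(Base):  # hypothetical stand-in for Query_Tbl
    __tablename__ = "tbl"
    id = Column(Integer, primary_key=True)
    col1 = Column(String)
    col2 = Column(String)
    col3 = Column(String)

# Sample data (assumption): two patterns per column.
df = pd.DataFrame({"col1": ["a", "b"], "col2": ["c", "d"], "col3": ["e", "f"]})
set_id = ["col1", "col2", "col3"]

def likes(col):
    """LIKE conditions for every non-null value in one dataframe column."""
    return [getattr(QueryTbl, col).like(v) for v in df[col].dropna().values]

# All conditions unpacked into a single or_(): one flat OR clause.
grouped = or_(*likes(set_id[0]), *likes(set_id[1]), *likes(set_id[2]))

# One or_() per column joined by and_(), which is what separate
# .filter() calls amount to.
separate = and_(*[or_(*likes(col)) for col in set_id])

print(str(grouped))
print(str(separate))
```

If the printed SQL for a real query looks like the second form, the columns were most likely filtered in separate calls rather than inside one or_().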

I apologize for the confusion. If anyone has comments or questions on the subject, I'm happy to help with this type of issue.

Alternatively, I found a much cleaner answer. Since SQLAlchemy's or_() function can be used with a variable column if you use Python's built-in getattr() function, you only need to create (column, value) pairs, which you can then unpack in a loop.

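from sqlalchemy import or_

# group_2 and group_3 are lists of column names that belong in one OR clause
for group in [group_2, group_3]:
    # keep only the group's columns that are actually present in df
    set_id = list(set(df.columns.values) & set(group))
    if len(set_id) > 1:
        # collect a (column, value) pair for every non-null value
        set_tuple = list()
        for column in set_id:
            for value in df[column].dropna().values:
                set_tuple.append((column, value))
        print(set_tuple)
        # a single or_() over all pairs gives one OR clause spanning the columns
        qry = qry.filter(or_(*[getattr(Query_Tbl, col).like(x) for col, x in set_tuple]))
        # drop the handled columns (only those present, to avoid a KeyError)
        df = df.drop(columns=set_id)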

If you know which columns need to be grouped in the or_() clause, you can put them into lists and iterate through them. Inside that loop, you build a list of tuples holding the (column, value) pairs you need. Then, within the or_() call, you unpack the column and value in a loop and use them accordingly. The code is much easier to read and much more compact. I found this to be a more robust solution than explicitly writing out cases for each group size.
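As a rough end-to-end check of the (column, value) approach, the sketch below runs it against an in-memory SQLite database. The `QueryTbl` model, the `or_like_group` helper, and all the sample rows are hypothetical and exist only to make the example runnable; SQLAlchemy 1.4 or later is assumed.

```python
import pandas as pd
from sqlalchemy import Column, Integer, String, create_engine, or_
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class QueryTbl(Base):  # hypothetical stand-in for Query_Tbl
    __tablename__ = "tbl"
    id = Column(Integer, primary_key=True)
    col1 = Column(String)
    col2 = Column(String)

def or_like_group(model, df, columns):
    """Build one OR clause over every non-null value in the given columns."""
    pairs = [(col, val)
             for col in columns if col in df.columns
             for val in df[col].dropna().values]
    return or_(*[getattr(model, col).like(val) for col, val in pairs])

engine = create_engine("sqlite://")  # in-memory database
Base.metadata.create_all(engine)

# Sample patterns (assumption): one per column, nulls elsewhere.
df = pd.DataFrame({"col1": ["al%", None], "col2": [None, "%x"]})

with Session(engine) as session:
    session.add_all([
        QueryTbl(col1="alpha", col2="none"),   # matches col1 LIKE 'al%'
        QueryTbl(col1="beta", col2="box"),     # matches col2 LIKE '%x'
        QueryTbl(col1="gamma", col2="delta"),  # matches neither pattern
    ])
    session.commit()
    qry = session.query(QueryTbl).filter(
        or_like_group(QueryTbl, df, ["col1", "col2"]))
    matched = sorted(row.col1 for row in qry)

print(matched)
```

Wrapping the pair-building in a small function keeps the filter call short and makes it easy to reuse for each group of columns.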
