SQLAlchemy: using or_() while looping over multiple columns (pandas DataFrames)

SUMMARY: How to combine values from different dataframe columns, each matched against its own table.column_name, into a single SQLAlchemy or_() clause.

I'm working on a SQLAlchemy project where I pull the valid columns of a dataframe and pass their values into SQLAlchemy's filter(). I already have it working for a single column: every non-null entry of the column is added to the filter, using the column header as the attribute name, like this:

qry = qry.filter(or_(*[getattr(Query_Tbl, column_head).like(x)
                       for x in df[column_head].dropna().values]))

This produces the pattern I was looking for: (tbl.column1 like a OR tbl.column1 like b ...), with each column's group joined to the next by AND.
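
For anyone who wants to see that pattern without a database, here is a minimal sketch. It assumes SQLAlchemy 1.4 or newer and uses select()/where() in place of qry.filter() so the statement can be printed without a session. The QueryTbl model and the toy dataframe are made up for illustration; they are not the tables from my project.

```python
import pandas as pd
from sqlalchemy import Column, Integer, String, or_, select
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class QueryTbl(Base):
    """Hypothetical stand-in for Query_Tbl."""
    __tablename__ = "tbl"
    id = Column(Integer, primary_key=True)
    column1 = Column(String)
    column2 = Column(String)


# Toy data (assumption); None marks cells that dropna() skips.
df = pd.DataFrame({
    "column1": ["a%", "b%", None],
    "column2": ["c%", None, "d%"],
})

stmt = select(QueryTbl)
for column_head in df.columns:
    patterns = df[column_head].dropna().values
    # One or_() per column; successive where() calls are joined with AND.
    stmt = stmt.where(
        or_(*[getattr(QueryTbl, column_head).like(x) for x in patterns])
    )

sql = str(stmt)
print(sql)
```

The printed WHERE clause has one OR group per column, and the groups are joined by AND, because every separate where() (or filter()) call adds another AND term.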

However, some groups of dataframe columns need to go together: the columns differ, but their conditions all belong inside the same or_() clause,

i.e. (the desired result)

(tbl.col1 like a OR tbl.col1 like b OR tbl.col2 like c OR tbl.col2 like d OR tbl.col3 like e...) etc.

My latest attempt was to sub-group the columns I needed grouped together, then repeat the previous style inside those groups like:

qry = qry.filter(or_(
    *[getattr(Query_Tbl, set_id[0]).like(x)
      for x in df[set_id[0]].dropna().values],
    *[getattr(Query_Tbl, set_id[1]).like(y)
      for y in df[set_id[1]].dropna().values],
    *[getattr(Query_Tbl, set_id[2]).like(z)
      for z in df[set_id[2]].dropna().values],
))

Here set_id is a list of three strings naming column1, column2, and column3. I expected this to give the desired result, but it simply produces:

(What I'm actually getting)

(tbl.col1 like a OR tbl.col1 like b..) AND (tbl.col2 like c OR tbl.col2 like d...) AND (tbl.col3 like e OR...)

Is there a better way to do this in SQLAlchemy to get the result I want, or would it be better to find a way to feed the column names and values from pandas directly into getattr() so it fits my existing code?

Thank you for reading and in advance for your help!

It turns out the problem was the way the dataframe was formatted: I was reading the column names into groups differently than I thought. The pattern above does work for anyone who wants to combine multiple df columns into the same OR clause.
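
As a quick check that unpacking several per-column lists into one or_() call really gives a single OR group, here is a small sketch. It assumes SQLAlchemy 1.4 or newer; the QueryTbl model, the toy dataframe, and the likes() helper are hypothetical names used only for this illustration.

```python
import pandas as pd
from sqlalchemy import Column, Integer, String, or_, select
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class QueryTbl(Base):
    """Hypothetical stand-in for Query_Tbl."""
    __tablename__ = "tbl"
    id = Column(Integer, primary_key=True)
    col1 = Column(String)
    col2 = Column(String)
    col3 = Column(String)


# Toy data (assumption); the None cell is dropped by dropna().
df = pd.DataFrame({
    "col1": ["a%", "b%"],
    "col2": ["c%", "d%"],
    "col3": ["e%", None],
})
set_id = ["col1", "col2", "col3"]


def likes(column_name):
    """LIKE conditions for every non-null value in one dataframe column."""
    column = getattr(QueryTbl, column_name)
    return [column.like(v) for v in df[column_name].dropna().values]


# All three lists are unpacked into ONE or_() call, so they share one OR group.
stmt = select(QueryTbl).where(
    or_(*likes(set_id[0]), *likes(set_id[1]), *likes(set_id[2]))
)

sql = str(stmt)
print(sql)
```

If the printed WHERE clause still contains AND, the usual cause is that the columns ended up in separate filter() calls, which is what happened to me when the column groups were read in incorrectly.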

I apologize for the confusion. If anyone has comments or questions on the subject, I'm happy to help with this type of issue.

Alternatively, I found a much cleaner answer. Since SQLAlchemy's or_() function can be used with a variable column if you look the column up with Python's built-in getattr() function, you only need to create (column, value) pairs, which you can then unpack in a loop.

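from sqlalchemy import or_

# qry, Query_Tbl, df, group_2 and group_3 are defined earlier in the project.
for group in [group_2, group_3]:
    # Columns of this group that are actually present in the dataframe.
    set_id = list(set(df.columns.values) & set(group))
    if len(set_id) > 1:
        # Build (column, value) pairs for every non-null cell in the group.
        set_tuple = list()
        for column in set_id:
            for value in df[column].dropna().values:
                set_tuple.append((column, value))
        print(set_tuple)  # debug: inspect the pairs
        # One or_() over all pairs; "col" avoids shadowing the built-in id().
        qry = qry.filter(or_(*[getattr(Query_Tbl, col).like(x) for col, x in set_tuple]))
        # Drop the handled columns; ignore group members that df never had.
        df = df.drop(columns=group, errors="ignore")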

If you know which columns need to be grouped in the or_() clause, you can put them into lists and iterate through them. Inside the loop, you build a list of (column, value) tuples. Then, within or_(), you unpack each column and value in a comprehension and use them to build the LIKE conditions. The code is much easier to read and much more compact. I found this to be a more robust solution than explicitly writing out cases for each group size.
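
The same idea can be wrapped in a small helper. This is only a sketch under a few assumptions: SQLAlchemy 1.4 or newer, a made-up QueryTbl model and dataframe, and a hypothetical function name or_like_group() that is not part of SQLAlchemy. It also returns None when the group has no values at all, so the caller never passes an empty or_().

```python
import pandas as pd
from sqlalchemy import Column, Integer, String, or_, select
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class QueryTbl(Base):
    """Hypothetical stand-in for Query_Tbl."""
    __tablename__ = "tbl"
    id = Column(Integer, primary_key=True)
    col1 = Column(String)
    col2 = Column(String)
    col3 = Column(String)


def or_like_group(model, df, group):
    """Return one OR clause over all (column, value) pairs in group, or None."""
    # Keep only the group's columns that exist in df, preserving group order.
    columns = [c for c in group if c in df.columns]
    pairs = [(c, v) for c in columns for v in df[c].dropna().values]
    if not pairs:
        return None  # nothing to filter on
    return or_(*[getattr(model, c).like(v) for c, v in pairs])


# Toy data (assumption); "other" is not part of the group.
df = pd.DataFrame({
    "col1": ["a%", "b%"],
    "col2": ["c%", None],
    "other": ["x%", "y%"],
})
group_2 = ["col1", "col2", "col3"]  # col3 is deliberately missing from df

stmt = select(QueryTbl)
clause = or_like_group(QueryTbl, df, group_2)
if clause is not None:  # compare with None; do not rely on truthiness here
    stmt = stmt.where(clause)

sql = str(stmt)
print(sql)
```

With the Query API from my code above, the last step would be qry = qry.filter(clause) instead of stmt.where(clause).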
