
merge or join pandas dataframe with sqlite table?

I have a large sqlite table with columns ['a','b','c',...] that should not be loaded into memory all at once. It has a composite index on the columns a and b.

a    b      c      ....
1    34     45656  ....
54   175    34323  ....
102  12121  3029   ....

Now I want to extract rows that have particular values of a and b, so I'm thinking of making a query like

SELECT * FROM table WHERE a IN <insert tuple of a vals> AND b IN <insert tuple of b vals>

However, there are thousands of such combinations of a and b that I want to check. I can create a pandas DataFrame that contains these combinations:

>>> df
    a    b
    102  12121
    234  879789
    ...  ...

and it might be simpler to just join or merge the two tables.

However, I don't want to add another table to the sqlite.db file, because I may create many different dfs and I don't want to keep inflating the database file. Is there a way to create a temporary table in the SQLite db for the merge? Or is there a way to do this via pandas?
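For reference, SQLite does support TEMP tables: they live in temporary storage tied to the connection and are dropped when it closes, so they never grow the main .db file. A minimal sketch of that route (table and file names here are placeholders, and an in-memory database stands in for the real sqlite.db):

```python
import sqlite3
import pandas as pd

# Sketch with an in-memory db; in practice connect to sqlite.db
# and skip this sample-table setup. "big" is a placeholder name.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE big (a INTEGER, b INTEGER, c INTEGER)")
con.executemany("INSERT INTO big VALUES (?, ?, ?)",
                [(1, 34, 45656), (54, 175, 34323), (102, 12121, 3029)])

# The (a, b) pairs to look up.
df = pd.DataFrame({"a": [102, 234], "b": [12121, 879789]})

# TEMP tables are kept in SQLite's temporary storage and dropped
# when the connection closes, so the main .db file does not grow.
con.execute("CREATE TEMP TABLE pairs (a INTEGER, b INTEGER)")
con.executemany("INSERT INTO pairs VALUES (?, ?)",
                df.itertuples(index=False, name=None))

result = pd.read_sql_query(
    "SELECT big.* FROM big JOIN pairs USING (a, b)", con)
print(result)  # the single matching row: a=102, b=12121, c=3029
con.close()
```

Because the TEMP table is per-connection, different dfs can be loaded into it on different runs without leaving anything behind in the file.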

Since you have the index on a and b, querying for each combination of a and b should be more efficient than creating new data structures in either Pandas or SQLite and performing a join.

You could loop over each combination of a and b and query, or you could do it in one query, but you're limited by SQLITE_MAX_SQL_LENGTH:

SELECT * FROM table WHERE (a = <a1> AND b = <b1>) OR (a = <a2> AND b = <b2>) ...
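The per-combination loop can be sketched like this (again with placeholder names and an in-memory database standing in for the real file); parameterized queries keep each statement tiny, so SQLITE_MAX_SQL_LENGTH is never an issue:

```python
import sqlite3

# Sample setup; in practice connect to sqlite.db, where the
# composite index on (a, b) makes each lookup cheap.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE big (a INTEGER, b INTEGER, c INTEGER)")
con.executemany("INSERT INTO big VALUES (?, ?, ?)",
                [(1, 34, 45656), (54, 175, 34323), (102, 12121, 3029)])

pairs = [(102, 12121), (234, 879789)]

# One indexed lookup per (a, b) pair; bound parameters avoid both
# SQL injection and building one enormous OR chain.
rows = []
for a, b in pairs:
    rows.extend(
        con.execute("SELECT * FROM big WHERE a = ? AND b = ?", (a, b)))
con.close()
print(rows)  # [(102, 12121, 3029)]
```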
