
How can I insert the odd rows of a table into another table with Python and Postgres

I have a Python program in which I want to read the odd rows from one table and insert them into another table. How can I achieve this?

For example, if the first table has 5 rows in total, I want to insert the first, third, and fifth rows into the other table.

Note that the table may contain millions of rows, so performance is very important.

I found a few methods here. Here are two of them, transcribed to psycopg2.

If you have a sequential primary key, you can just use mod on it:

database_cursor.execute('SELECT * FROM source_table WHERE mod(primary_key_column, 2) = 1')

Otherwise, you can use a subquery to get the row number and use mod on that. Note that with an empty OVER () the numbering follows whatever order the rows happen to be returned in:

database_cursor.execute('''SELECT col1, col2, col3
                             FROM (SELECT row_number() OVER () AS rnum, col1, col2, col3
                                     FROM source_table) AS numbered
                            WHERE mod(rnum, 2) = 1''')

If you have an id-type column that is guaranteed to increment by 1 on every insert (like an auto-increment index), you could simply take that value mod 2 to select the rows. However, this breaks as soon as the ids have gaps, for example once you begin to delete rows from the table you are selecting from.
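To make the failure mode concrete, here is a small illustration in plain Python (no database involved). The id values are made up: they assume a table that held ids 1 to 5 and then lost the row with id 2.

```python
# Hypothetical ids left in the table after the row with id 2 was deleted.
remaining_ids = [1, 3, 4, 5]

# Filtering on the id value keeps every row whose id is odd.
by_id_value = [i for i in remaining_ids if i % 2 == 1]

# Filtering on position keeps the 1st, 3rd, 5th, ... row in id order.
by_position = [i for pos, i in enumerate(remaining_ids, start=1) if pos % 2 == 1]

print(by_id_value)  # [1, 3, 5]
print(by_position)  # [1, 4]
```

The two results differ, so once the ids have gaps, mod on the id no longer means "every other row".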

A more complicated solution would be to use PostgreSQL's row_number() function. The following assumes you have an id column that can be used to sort the rows in the desired order:

select r.*
from (select *, row_number() over (order by id) as rnum
      from <tablename>
) r
where r.rnum % 2 = 1

Note: however you do it, the query cannot be truly efficient, because it necessarily requires a full table scan, and selecting every column from a table with millions of rows via a full table scan is going to be slow.
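The queries in the answers only select the odd rows; the question also asks to insert them into another table. One way to do both, sketched here under assumptions, is a single INSERT ... SELECT statement, so the rows are copied inside the database instead of being fetched into Python and sent back. The names source_table, target_table, id and col1 to col3 are placeholders, and copy_odd_rows is a hypothetical helper, not something from the original answers.

```python
# Placeholder names: source_table, target_table, id, col1..col3 are
# illustrative only and must be replaced with the real schema.
COPY_ODD_ROWS_SQL = """
    INSERT INTO target_table (col1, col2, col3)
    SELECT col1, col2, col3
    FROM (SELECT col1, col2, col3,
                 row_number() OVER (ORDER BY id) AS rnum
          FROM source_table) AS numbered
    WHERE rnum % 2 = 1
"""


def copy_odd_rows(conn):
    """Copy the 1st, 3rd, 5th, ... rows (in id order) into target_table.

    conn is assumed to be a DB-API connection, for example one
    returned by psycopg2.connect().
    """
    cur = conn.cursor()
    try:
        # One statement: the numbering, filtering and insert all run server-side.
        cur.execute(COPY_ODD_ROWS_SQL)
    finally:
        cur.close()
    conn.commit()
```

With psycopg2 this would be called as copy_odd_rows(connection) on an open connection. The source table is still scanned in full, but no row data crosses the network.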
