Python loop to split a dataframe by a particular row/index for INSERT into SQL server
I have a dataframe with a couple thousand rows. I want to create a loop that splits the entire dataframe into sub-dataframes of 90 rows each and INSERTs each subset into SQL Server.
My naive way is to split it by a fixed number of 90 rows, which is not efficient:
df1 = df.loc[0:89, :]
df1.to_sql("tableName", schema='dbo', con=engine, if_exists='append', method='multi')
df2 = df.loc[90:179, :]
df2.to_sql("tableName", schema='dbo', con=engine, if_exists='append', method='multi')
......
Sample data:
df = pd.DataFrame(np.random.randint(0, 100, size=(2000, 4)), columns=['Name', 'Age', 'food', 'tree'])  # size controls how many rows
Because my SQL Server has a limitation, I can only insert 90 rows per bulk insert.
Here's a pretty verbose approach. In this case, taking your sample dataframe, it is sliced in increments of 90 rows. The first block will be rows 0-89, then 90-179, 180-269, and so on.
import pandas as pd
import numpy as np
import math
df = pd.DataFrame(np.random.randint(0, 100, size=(2000, 4)), columns=['Name', 'Age', 'food', 'tree'])  # size controls how many rows

def slice_df(dataframe, row_count):
    num_rows = len(dataframe)
    num_blocks = math.ceil(num_rows / row_count)
    for i in range(num_blocks):
        # Python slicing excludes the end index, so no "-1" is needed;
        # subtracting 1 here would silently drop the last row of every block
        block = dataframe[(i * row_count):((i + 1) * row_count)]
        # Do your insert command here

slice_df(df, 90)
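As a sanity check on the slicing arithmetic, a generator variant (an illustrative helper, not part of the answer above) makes it easy to confirm that the slices cover every row exactly once:

```python
import numpy as np
import pandas as pd

def iter_chunks(dataframe, row_count):
    # Yield successive row_count-row slices; the last one may be shorter.
    for start in range(0, len(dataframe), row_count):
        yield dataframe.iloc[start:start + row_count]

df = pd.DataFrame(np.random.randint(0, 100, size=(2000, 4)),
                  columns=['Name', 'Age', 'food', 'tree'])

chunks = list(iter_chunks(df, 90))
print(len(chunks))      # 2000 rows -> 23 chunks
print(len(chunks[-1]))  # the final chunk holds the 20 leftover rows
```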
np.array_split(arr, indices)

Split an array into multiple sub-arrays using the given indices.
for chunk in np.array_split(df, range(90, len(df), 90)):
    INSERT_sql()
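For the 2000-row sample frame, `range(90, len(df), 90)` produces split points at 90, 180, ..., 1980, so every piece except the last has exactly 90 rows. A quick check of the piece sizes:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(2000, 4)),
                  columns=['Name', 'Age', 'food', 'tree'])

# Split at row positions 90, 180, ..., 1980; each piece covers at most 90 rows.
chunks = np.array_split(df, range(90, len(df), 90))
sizes = [len(c) for c in chunks]
print(len(chunks), sizes[0], sizes[-1])  # 23 pieces; first has 90 rows, last has 20
```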
I believe this should work:
for i in range(0, len(df), 90):
    # if_exists='append' keeps adding batches to the same table
    df.iloc[i:i+90].to_sql("tableName", schema='dbo', con=engine, if_exists='append', method='multi')
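Note that `to_sql` itself accepts a `chunksize` argument that batches the INSERTs, so the manual loop can often be replaced by a single call. A minimal sketch against an in-memory SQLite database (a stand-in for your SQL Server engine; swap in your own `con`, add `schema='dbo'`, and adjust the table name):

```python
import sqlite3
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 100, size=(2000, 4)),
                  columns=['Name', 'Age', 'food', 'tree'])

con = sqlite3.connect(':memory:')  # stand-in for your SQL Server engine
# chunksize=90 makes pandas issue the inserts in 90-row batches.
df.to_sql('tableName', con=con, if_exists='append', index=False, chunksize=90)

count = con.execute('SELECT COUNT(*) FROM tableName').fetchone()[0]
print(count)  # 2000
```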