Split a dataframe evenly into many smaller dataframes

Question

I have the following DataFrame called df, which is 65 rows long.

   Name  Data
0  Name1 Data1
1  Name2 Data2
2  Name3 Data3
....

I want to split it into 30 data frames as evenly as possible.

So with a length of 65, I want there to be 5 frames of length 3 and 25 of length 2 (which adds up to 65).

I use the following function:

def chunk(seq, size):
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))

n = 30  # number of files

length = len(df)

counter = 0

# chunk size is ceil(length / n), i.e. 3 when length is 65 and n is 30
for df_chunk in chunk(df, int(length / n) + (length % n > 0)):
    counter += 1
    df_chunk.to_csv(f"path/to/file{counter}.csv")

But I only get 21 files of length 3 and 1 file of length 2, instead of 5 files of length 3 and 25 files of length 2.

Does anyone have any ideas on how I can achieve what I want?

Answer 1

Use np.array_split. From the documentation:

For an array of length l that should be split into n sections, it returns l % n sub-arrays of size l//n + 1 and the rest of size l//n.

import numpy as np

for counter, df_chunk in enumerate(np.array_split(df, 30), 1):
    df_chunk.to_csv(f"path/to/file{counter}.csv")
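
If you would rather not pass a DataFrame through NumPy (some newer pandas versions may emit a deprecation warning when you do), the same size rule can be sketched in plain Python. This is only an illustration: the helper names even_chunk_sizes and split_evenly are hypothetical, and a list of 65 integers stands in for the real df.

```python
def even_chunk_sizes(length, n):
    """Return n chunk sizes that differ by at most 1 and sum to length."""
    base, extra = divmod(length, n)
    # The first `extra` chunks get one additional item (l % n chunks of size l//n + 1).
    return [base + 1] * extra + [base] * (n - extra)


def split_evenly(seq, n):
    """Yield n consecutive slices of seq, sized by even_chunk_sizes."""
    pos = 0
    for size in even_chunk_sizes(len(seq), n):
        yield seq[pos:pos + size]
        pos += size


rows = list(range(65))  # stand-in for the 65-row df (assumption for the demo)
chunks = list(split_evenly(rows, 30))
sizes = [len(c) for c in chunks]
print(sizes.count(3), sizes.count(2))  # 5 25
```

The function in the question uses one fixed chunk size, ceil(65 / 30) = 3, so it runs out of rows after 21 full chunks plus a remainder of 2. Varying the size per chunk, as above, is what yields exactly 30 pieces. With a real DataFrame, df.iloc[pos:pos + size] would be the explicit positional form of the slice.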

Answer 1: accepted, 2020-06-16 12:14:42
