
Is there a Pandas equivalent to each_slice to operate on dataframes

I am wondering if there is a Python or Pandas function that approximates the Ruby #each_slice method. In the example below, #each_slice takes the array or hash and breaks it into groups of 100.

var.each_slice(100) do |batch|
  # do some work on each batch
end
I am trying to do the same operation on a Pandas dataframe. Is there a Pythonic way to accomplish this?

I have checked out this answer: Python equivalent of Ruby's each_slice(count)

However, it is old and not Pandas-specific, so I am wondering whether there is a more direct method.

There isn't a built-in method as such, but you can use numpy's array_split: you pass it the dataframe and the number of slices.

To get slices of roughly 100 rows, you'll need to calculate the number of slices yourself, which is simply the number of rows divided by 100:

import numpy as np

# df.shape returns the dimensions as a tuple; the first element is the number of rows
np.array_split(df, df.shape[0] // 100)

This returns a list of dataframes, split as evenly as possible.
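For reference, here is a minimal runnable sketch of the above, assuming a toy dataframe of 250 rows (the column name and batch size are purely illustrative) and a pandas/NumPy combination where array_split accepts a DataFrame directly. Note that array_split balances the slice sizes, so each batch is roughly, not exactly, 100 rows unless the row count divides evenly:

import math

import numpy as np
import pandas as pd

# Toy dataframe standing in for the real data (assumption for illustration).
df = pd.DataFrame({"value": range(250)})

batch_size = 100
# Ceiling division keeps every slice at or below batch_size rows.
n_slices = math.ceil(df.shape[0] / batch_size)

# array_split balances the slice sizes, so each batch here has 83-84 rows.
for batch in np.array_split(df, n_slices):
    # do some work on each batch, mirroring Ruby's each_slice block
    print(len(batch))

Whether you round the slice count up or down is a design choice: ceiling division caps each batch at batch_size rows, while floor division (as in the one-liner above) can produce batches somewhat larger than 100.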
