Processing csv iteratively 3 rows at a time in Python

Question

I have a CSV file like the following:

A, B, C, D
2,3,4,5
4,3,5,2
5,8,3,9
7,4,2,6
8,6,3,7

I want to fetch the B values from 3 rows at a time (for the first iteration the values would be 3, 3, 8), save them in variables (value1=3, value2=3, value3=8), and pass them to a function. Once those values are processed, I want to fetch the values from the next 3 rows (value1=3, value2=8, value3=4), and so on.

The CSV file is large. I am a Java developer, so please suggest the simplest possible code.

Answer 1

An easy solution would be the following:

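import pandas as pd

# skipinitialspace=True strips the spaces after the commas in the header
# ("A, B, C, D"), so the column is named "B" rather than " B".
data = pd.read_csv("path.csv", skipinitialspace=True)

# Slide a window of 3 consecutive rows over the data, one row at a time.
for i in range(len(data) - 2):
    value1 = data.loc[i, "B"]
    value2 = data.loc[i + 1, "B"]
    value3 = data.loc[i + 2, "B"]
    function(value1, value2, value3)  # function is your own processing function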

Answer 2

This is a possible solution that reads the file in non-overlapping chunks of 3 rows (I have used the function proposed in this answer):

import csv
import itertools

# Function to iterate the csv file by chunks (of any size)
def grouper(n, iterable):
    it = iter(iterable)
    while True:
        chunk = tuple(itertools.islice(it, n))
        if not chunk:
            return
        yield chunk

# Open the csv file
with open('myfile.csv') as f:
    csvreader = csv.reader(f)
    # Read the headers: ['A', 'B', 'C', 'D']
    headers = next(csvreader, None)
    # Read the rest of the file by chunks of 3 rows
    for chunk in grouper(3, csvreader):
        # do something with your chunk of rows
        print(chunk)

Printed result:

(['2', '3', '4', '5'], ['4', '3', '5', '2'], ['5', '8', '3', '9'])
(['7', '4', '2', '6'], ['8', '6', '3', '7'])
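
The chunks printed here do not overlap, whereas the question asks for a window that advances one row at a time (3, 3, 8, then 3, 8, 4). A minimal sketch of that variant follows. It keeps only the last 3 values in memory, so it should suit a large file. `sliding_windows` is a hypothetical helper name, and the in-memory `sample` stands in for the real file; with a file on disk you would pass `open('myfile.csv', newline='')` instead.

```python
import csv
import io
from collections import deque


def sliding_windows(iterable, size):
    """Yield overlapping tuples of `size` consecutive items."""
    window = deque(maxlen=size)  # the oldest item drops out automatically
    for item in iterable:
        window.append(item)
        if len(window) == size:
            yield tuple(window)


# Stand-in for the CSV file from the question (assumption for illustration).
sample = io.StringIO("A, B, C, D\n2,3,4,5\n4,3,5,2\n5,8,3,9\n7,4,2,6\n8,6,3,7\n")

# skipinitialspace=True strips the spaces after the commas in the header.
reader = csv.reader(sample, skipinitialspace=True)
header = next(reader)
b_index = header.index("B")

# Generator expression: rows are converted lazily, one at a time.
b_values = (int(row[b_index]) for row in reader)

windows = list(sliding_windows(b_values, 3))
for value1, value2, value3 in windows:
    print(value1, value2, value3)
```

In real use you would probably loop over `sliding_windows(...)` directly instead of building the `windows` list, so that nothing beyond the current window is held in memory.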

Answer 3

You can use the csv module:

import csv
with open('data.txt') as fp:
    reader = csv.reader(fp)
    next(reader) #skips the header
    res = [int(row[1]) for row in reader]
    groups = (res[idx: idx + 3] for idx in range(0, len(res) - 2))
for a, b, c in groups:
    print(a, b, c)

Output:

3 3 8
3 8 4
8 4 6
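
The same overlapping triples can also be built by zipping the list with two shifted copies of itself, which avoids the index arithmetic. This is only a sketch: `triples` is a hypothetical name, and like the code above it assumes the whole B column already fits in memory as a list.

```python
def triples(values):
    """Return overlapping (a, b, c) tuples of consecutive items."""
    # zip stops at the shortest input, so the last two items never start a triple.
    return list(zip(values, values[1:], values[2:]))


b_column = [3, 3, 8, 4, 6]  # the B values from the question's sample file
result = triples(b_column)
for a, b, c in result:
    print(a, b, c)
```

A list with fewer than 3 values simply produces no triples.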

Answer 4

You can use pandas to read your CSV with the chunksize argument, as described here (How can I partially read a huge CSV file?):

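import pandas as pd

# Function that you want to apply to your arguments
def fn(A, B, C, D):
    print(sum(A), sum(B), sum(C), sum(D))

# Iterate through the file in non-overlapping chunks of 3 rows.
# skipinitialspace=True strips the spaces in the header ("A, B, C, D"),
# so the column names match fn's parameter names.
for chunk in pd.read_csv('test.csv', chunksize=3, skipinitialspace=True):
    # Convert the DataFrame to a dict of lists,
    # e.g. {'A': [2, 4, 5], 'B': [3, 3, 8], ...}
    chunk_dict = chunk.to_dict(orient='list')
    # Pass the columns to your function as keyword arguments
    fn(**chunk_dict)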
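
If pandas is not available, the same per-chunk column sums can be sketched with the standard library only. This swaps `pd.read_csv(..., chunksize=3)` for `csv.DictReader` plus `itertools.islice`; `column_chunks` is a hypothetical helper name, the in-memory `sample` stands in for the real file, and every cell is assumed to hold an integer.

```python
import csv
import io
from itertools import islice


def column_chunks(reader, size):
    """Yield one dict per chunk, mapping column name -> list of ints."""
    while True:
        rows = list(islice(reader, size))  # next `size` rows, fewer at the end
        if not rows:
            return
        yield {name: [int(row[name]) for row in rows]
               for name in reader.fieldnames}


# Stand-in for the CSV file from the question (assumption for illustration).
sample = io.StringIO("A, B, C, D\n2,3,4,5\n4,3,5,2\n5,8,3,9\n7,4,2,6\n8,6,3,7\n")

# skipinitialspace=True strips the spaces after the commas in the header.
reader = csv.DictReader(sample, skipinitialspace=True)

sums = []
for chunk in column_chunks(reader, 3):
    chunk_sums = {name: sum(values) for name, values in chunk.items()}
    sums.append(chunk_sums)
    print(chunk_sums)
```

As with the pandas version, the chunks do not overlap, and the last chunk holds only the 2 remaining rows.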

4 answers

solution1
2 ACCEPTED 2020-08-07 09:27:00

solution2
1 2020-08-07 09:28:04

solution3
0 2020-08-07 09:28:37

solution4
0 2020-08-07 09:40:39
