
How to read a CSV file subset by subset with Pandas?

I have a data frame with 13000 rows and 3 columns:

('time', 'rawScore', 'label')

I want to read it subset by subset:

[[1..360], [360..712], ..., [12640..13000]]

I tried using a list as well, but it's not working:

import pandas as pd
import math
import datetime

result = "data.csv"
dataSet = pd.read_csv(result)
TP = 0
count = 0
x = 0
df = pd.DataFrame(dataSet, columns=['rawScore', 'label'])
for i, row in df.iterrows():
    data = row.to_dict()

    ScoreX = data['rawScore']
    labelX = data['label']

for i in range(1, 13000, 360):
    x = x + 1
    for j in range(i, 360 * x, 1):
        if (ScoreX > 0.3) and (labelX == 0):
            count = count + 1
print("count=", count)

You can also use the nrows and skiprows parameters of read_csv to break the file up into chunks as it is read. I would recommend against using iterrows, since it is typically very slow. If you split the data while reading it in and keep the chunks separate, you can skip the iterrows step entirely. This covers the file-reading part, if you want to split the data into chunks (which seems to be an intermediate step in what you're trying to do).
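As a rough sketch of the nrows/skiprows idea, something like the following might work. It assumes the CSV has a single header line and that uniform blocks of 360 rows are acceptable. The helper name read_block and the synthetic data standing in for data.csv are made up for illustration. The count is computed with a boolean mask instead of iterrows.

```python
import io

import numpy as np
import pandas as pd


def read_block(source, block, size=360):
    """Read only data rows [block*size, (block+1)*size) of a CSV with one header line.

    `read_block` is a hypothetical helper, not part of pandas.
    """
    start = block * size
    return pd.read_csv(
        source,
        skiprows=range(1, start + 1),  # skip data lines 1..start, keep the header (line 0)
        nrows=size,                    # then read at most `size` rows
    )


# Synthetic stand-in for data.csv (assumption: same three columns as in the question).
rng = np.random.default_rng(0)
n_rows = 1000
demo = pd.DataFrame({
    "time": np.arange(n_rows),
    "rawScore": rng.random(n_rows),
    "label": rng.integers(0, 2, n_rows),
})
csv_text = demo.to_csv(index=False)

counts = []
block = 0
while True:
    # With a real file you would pass the path, e.g. read_block("data.csv", block).
    chunk = read_block(io.StringIO(csv_text), block)
    if chunk.empty:
        break
    # Vectorized replacement for the iterrows loop: rows with rawScore > 0.3 and label == 0.
    hits = (chunk["rawScore"] > 0.3) & (chunk["label"] == 0)
    counts.append(int(hits.sum()))
    block += 1

print(counts)
```

Note that every call scans the file again from the top, so this is mostly useful when you only need a few of the blocks or when the file does not fit in memory.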

Another way is to subset using generators, by checking which set each row index belongs to: [[1..360], [360..712], ..., [12640..13000]]

So write a function that cuts the data at indices divisible by 360 and, if a row's index falls within a given range, selects that particular subset.
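A minimal sketch of the generator idea could look like this. It assumes the data is already loaded into a DataFrame and uses uniform half-open blocks [0, 360), [360, 720), ... rather than the exact boundaries listed in the question. The names iter_blocks and block_containing are hypothetical, and the random data only stands in for the real file.

```python
import numpy as np
import pandas as pd


def iter_blocks(df, size=360):
    """Yield (start, stop, block) for consecutive blocks of at most `size` rows."""
    for start in range(0, len(df), size):
        stop = min(start + size, len(df))
        yield start, stop, df.iloc[start:stop]


def block_containing(df, row, size=360):
    """Return the block whose start is the largest multiple of `size` not above `row`."""
    start = (row // size) * size
    return df.iloc[start:start + size]


# Synthetic stand-in for the 13000-row data set (assumption).
rng = np.random.default_rng(1)
demo = pd.DataFrame({
    "rawScore": rng.random(13000),
    "label": rng.integers(0, 2, 13000),
})

# One count per block, keyed by the block's (start, stop) row positions.
counts = {
    (start, stop): int(((block["rawScore"] > 0.3) & (block["label"] == 0)).sum())
    for start, stop, block in iter_blocks(demo)
}

# Pick out just the block that contains row 400.
one_block = block_containing(demo, 400)
print(len(counts), one_block.index[0], one_block.index[-1])
```

Because the generator yields slices lazily, you can stop iterating as soon as you have the blocks you need.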

I wrote these approaches down as alternative ideas you might want to play around with, since in some cases you may only want one subset, and not all of the chunks, for your calculation.
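If only some of the subsets are needed, one possible sketch is to pass a callable to skiprows so that only the wanted blocks are parsed. This assumes one header line and uniform 360-row blocks numbered from 0. The helper name read_selected_blocks is made up, and the in-memory CSV stands in for data.csv.

```python
import io

import numpy as np
import pandas as pd


def read_selected_blocks(source, wanted, size=360):
    """Read only the data rows that fall in the block numbers listed in `wanted`.

    `read_selected_blocks` is a hypothetical helper, not part of pandas.
    """
    wanted = set(wanted)

    def skip(line_no):
        # Line 0 is the header and is always kept; data row k sits on line k + 1.
        return line_no > 0 and (line_no - 1) // size not in wanted

    return pd.read_csv(source, skiprows=skip)


# Synthetic stand-in for data.csv (assumption: same three columns as in the question).
rng = np.random.default_rng(2)
n_rows = 1000
demo = pd.DataFrame({
    "time": np.arange(n_rows),
    "rawScore": rng.random(n_rows),
    "label": rng.integers(0, 2, n_rows),
})
csv_text = demo.to_csv(index=False)

# Keep blocks 0 and 2 only, i.e. rows 0..359 and 720..999 of this 1000-row example.
subset = read_selected_blocks(io.StringIO(csv_text), wanted=[0, 2])
count = int(((subset["rawScore"] > 0.3) & (subset["label"] == 0)).sum())
print(len(subset), count)
```

The returned frame gets a fresh 0-based index, so a column such as time is the way to tell which original rows were kept.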

