Skipping certain lines of data - Python

I have a CSV file with a ton of data that needs to be plotted, sorted, and so on. An example of the data is below.

10, 50, 60, 74, 19
10, 55, 68, 93, 10
10, 84, 92, 75, 32
10, 58, 39, 82, 12
20, 15, 12, 84, 35
20, 53, 13, 96, 57
20, 53, 32, 64, 67
20, 56, 31, 29, 18
30, 85, 92, 18, 95
30, 75, 12, 92, 12
...
90, 35, 21, 95, 47
100, 67, 96, 73, 47
100, 86, 32, 62, 32
100, 32, 53, 69, 57
100, 34, 64, 72, 34

What I'm looking for is taking one row out of every four (in the sample below: the 1st, 5th, and 9th rows, etc.) and putting their values into lists, so it looks as such:

column1 = ['10', '20', '30', ..., '100']
column3 = ['60', '12', '92', ..., '73']
column5 = ['19', '35', '95', ..., '47']

Note: the first row from the data set should be in the first column of the output, the 2nd row from the data in the 2nd column of the output, etc. Also, I want to be able to control which columns I select to put into the lists (and which rows as well).

I'm also looking for a way to adjust which nth row I want to begin with. For example, if we start with row 2, the output would be as such:

column1 = ['10', '20', '30', ..., '100']
column3 = ['68', '13', '12', ..., '32']
column5 = ['10', '57', '12', ..., '32']

This is the code I have so far:

import numpy as np
import matplotlib.pyplot as plt
import csv

column1 = []
column2 = []
column4 = []

with open('csvFile.csv', newline='') as f:  # text mode, as csv expects in Python 3
    w = csv.reader(f, delimiter = ',')
    for i, line in enumerate(w):
        if i == 0 or i == 1:
            pass # Skip the first two rows
        else:
            column1.append(line[1])
            column2.append(line[2])
            column4.append(line[4])

This gives me ALL the values in the columns, which I don't want. Maybe I'm overthinking this: what I was planning to do afterwards was index the lists and remove the values I don't want. (My dataset is MUCH larger than what is shown here. I have a total of 26 rows per first number, i.e. 26 rows starting with 10, 26 rows starting with 20, 26 starting with 30, etc.)

You can just check whether i is a multiple of four. If it is not a multiple of four, skip that row.

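import csv

column1, column3, column5 = [], [], []

with open("data", newline="") as f:
    # skipinitialspace drops the blank after each comma in the sample data
    w = csv.reader(f, delimiter=",", skipinitialspace=True)
    for i, line in enumerate(w):
        if i % 4 == 0:  # keep rows 0, 4, 8, ...; skip everything else
            column1.append(line[0])
            column3.append(line[2])
            column5.append(line[4])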
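
The question also asks for control over the starting row and over which columns are kept. One possible way to generalize the modulo check is sketched below. The function name `pick_columns` and its parameters `start`, `step`, and `columns` are hypothetical names chosen for illustration, not part of the original answer, and the sample rows are the first ten rows from the question.

```python
import csv
from itertools import islice


def pick_columns(lines, start=0, step=4, columns=(0, 2, 4)):
    """Return one list per requested column, keeping every `step`-th row.

    `lines` is any iterable of CSV text lines (an open file works too).
    `start` is the zero-based index of the first row to keep.
    `columns` holds zero-based column indexes.
    """
    reader = csv.reader(lines, skipinitialspace=True)
    selected = {col: [] for col in columns}
    # islice yields rows start, start + step, start + 2 * step, ...
    for row in islice(reader, start, None, step):
        for col in columns:
            selected[col].append(row[col])
    return [selected[col] for col in columns]


# First ten rows of the sample data from the question.
sample = [
    "10, 50, 60, 74, 19",
    "10, 55, 68, 93, 10",
    "10, 84, 92, 75, 32",
    "10, 58, 39, 82, 12",
    "20, 15, 12, 84, 35",
    "20, 53, 13, 96, 57",
    "20, 53, 32, 64, 67",
    "20, 56, 31, 29, 18",
    "30, 85, 92, 18, 95",
    "30, 75, 12, 92, 12",
]

column1, column3, column5 = pick_columns(sample)
print(column1)  # ['10', '20', '30']
print(column3)  # ['60', '12', '92']
print(column5)  # ['19', '35', '95']

# Start from the second row instead (zero-based index 1).
column1, column3, column5 = pick_columns(sample, start=1)
print(column3)  # ['68', '13', '12']
```

To read from a file, pass the open file object instead of the list, e.g. `with open('csvFile.csv', newline='') as f: pick_columns(f, start=1)`. For the full dataset described in the question, with 26 rows per first number, `step=26` would presumably keep one row per group.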
