
How to read every other row of a CSV file

How do I read only every second row from a CSV file?

For example, if I have a file that looks like this:

  0   1
0 23  34
1 45  45
2 78  16
3 110 78
4 48  14
5 76  23
6 55  33
7 12  13
8 18  76

How can I iterate over it, extract every second row, and store the result in a new dataframe, to get something like this?

0 23  34
2 78  16
4 48  14
6 55  33
8 18  76

Thank you!

Use the skiprows parameter of read_csv:

To keep the even-numbered data rows (0, 2, 4, ...):

pd.read_csv('file.csv', skiprows=lambda x: (x != 0) and not x % 2)

To keep the odd-numbered data rows (1, 3, 5, ...):

pd.read_csv('file.csv', skiprows=lambda x: x % 2)

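Note that the callable passed to skiprows is evaluated against file line numbers, so the header is line 0 and the first data row is line 1. That is why the x != 0 check is needed in the even example: without it, line 0 (the header) would be skipped as well.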

Example:

In [1]: import pandas as pd
   ...: from io import StringIO
   ...:
   ...: data = """A,B
   ...: a,1
   ...: b,2
   ...: c,3
   ...: d,4
   ...: e,5
   ...: """

In [2]: pd.read_csv(StringIO(data))
Out[2]:
   A  B
0  a  1
1  b  2
2  c  3
3  d  4
4  e  5

In [3]: pd.read_csv(StringIO(data), skiprows=lambda x: (x != 0) and not x % 2)
Out[3]:
   A  B
0  a  1
1  c  3
2  e  5

In [4]: pd.read_csv(StringIO(data), skiprows=lambda x: x % 2)
Out[4]:
   A  B
0  b  2
1  d  4
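
The same idea extends to keeping every n-th row. The following is only a sketch: keep_every_nth is a hypothetical helper name (not part of pandas), and it assumes the file has exactly one header line.

```python
import pandas as pd
from io import StringIO

data = """A,B
a,1
b,2
c,3
d,4
e,5
"""

def keep_every_nth(n, offset=0):
    """Hypothetical helper: build a skiprows callable that keeps the header
    (line 0) and every n-th data row, starting at data row `offset`."""
    def should_skip(line_no):
        if line_no == 0:           # never skip the header line
            return False
        data_row = line_no - 1     # 0-based index of the data row
        return data_row % n != offset
    return should_skip

# Keep data rows 0, 2, 4
every_other = pd.read_csv(StringIO(data), skiprows=keep_every_nth(2))
# Keep data rows 0, 3
every_third = pd.read_csv(StringIO(data), skiprows=keep_every_nth(3))
```

Because the skipped lines are never parsed into the dataframe, this can be lighter on memory than loading everything and filtering afterwards.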

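You could also read the whole file into memory with NumPy and keep every other row. This assumes the file contains only numeric data:

import numpy as np
import pandas as pd

filename = 'file.csv'  # placeholder path to your CSV file
# A CSV needs delimiter=','; add skiprows=1 if the file has a header line
data = np.loadtxt(filename, delimiter=',')
data = pd.DataFrame(data[::2])  # keep rows 0, 2, 4, ...

The last bit, [::2], means "take every second row, starting with the first".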
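
As a small illustration of the slice, here is a sketch on an in-memory array built from the first rows of the question's data (the variable names are made up for the example). A start offset of 1 selects the other half of the rows.

```python
import numpy as np

# First five rows of the example data from the question
rows = np.array([[23, 34], [45, 45], [78, 16], [110, 78], [48, 14]])

even_rows = rows[::2]    # rows at positions 0, 2, 4
odd_rows = rows[1::2]    # rows at positions 1, 3
```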

Personally, I think the easiest answer (if you only want even-numbered rows) is to do:

import pandas as pd
df = pd.read_csv('csv_file.csv')
rows_we_want = [row for i,row in enumerate(df.index) if not i % 2]
df_new = df.loc[rows_we_want]

enumerate() pairs each index label with its position i, and "if not i % 2" is True only when i is even. Delete the "not" if you want the odd-numbered rows instead. I think this approach is easier than reading the file line by line, though it loads the entire file into memory before filtering, so it may not scale to extremely large files. Hope this helps.
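
Since the list comprehension selects rows by position, positional slicing with iloc should give the same result more compactly. This is a sketch, with an in-memory string standing in for csv_file.csv:

```python
import pandas as pd
from io import StringIO

# In-memory stand-in for 'csv_file.csv' (assumed contents, for illustration)
csv_text = "A,B\na,1\nb,2\nc,3\nd,4\ne,5\n"
df = pd.read_csv(StringIO(csv_text))

df_even = df.iloc[::2]     # rows at positions 0, 2, 4
df_odd = df.iloc[1::2]     # rows at positions 1, 3

# Optional: renumber the index of the result from 0
df_even_reset = df_even.reset_index(drop=True)
```

Like the version above, this still reads the full file before selecting rows.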
