I am using the pandas module for reading the data from a .csv file.
I can extract the data belonging to an individual column as follows:
import pandas as pd
df = pd.read_csv('somefile.tsv', sep='\t', header=0)
some_column = df.column_name
print(some_column)  # Gives the values of all entries in the column
However, the file that I am trying to read now has more than 5000 columns and writing out the statement
some_column = df.column_name
for every column is not feasible. How can I get all the column values so that I can access them by index?
For example, to extract the value at the 100th row and the 50th column, I should be able to write something like this:
df([100][50])
Use DataFrame.iloc or DataFrame.iat. Python counts from 0, so you need 99 and 49 to select the 100th row and 50th column:
value = df.iloc[99, 49]
Sample: select the 3rd row and 4th column:
df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6],
                   'C': [7, 8, 9],
                   'D': [1, 3, 10],
                   'E': [5, 3, 6],
                   'F': [7, 4, 3]})
print(df)
   A  B  C   D  E  F
0  1  4  7   1  5  7
1  2  5  8   3  3  4
2  3  6  9  10  6  3
print (df.iloc[2,3])
10
print (df.iat[2,3])
10
Selecting by column name and by row position can be combined using Series.iloc or Series.iat:
print (df['D'].iloc[2])
10
print (df['D'].iat[2])
10
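For a wide file like the one in the question, the same positional access works without naming any column. The following is only a minimal sketch: the frame `wide`, its 200 x 5000 shape, and the `col_0` ... `col_4999` labels are made up for illustration and stand in for the real file. It also shows one way to translate a column label into a position with `Index.get_loc`, assuming the column labels are unique.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a wide file: 200 rows x 5000 columns.
n_rows, n_cols = 200, 5000
data = np.arange(n_rows * n_cols).reshape(n_rows, n_cols)
wide = pd.DataFrame(data, columns=[f"col_{i}" for i in range(n_cols)])

# 100th row and 50th column correspond to positions 99 and 49 (0-based).
by_iloc = wide.iloc[99, 49]
by_iat = wide.iat[99, 49]   # scalar-only accessor, same result

# Look up the position of a column label, then index by position.
# get_loc returns an integer only when the label is unique.
col_pos = wide.columns.get_loc("col_49")
by_label = wide.iat[99, col_pos]

print(by_iloc, by_iat, by_label)
```

iat accepts only a single row and column position, while iloc also accepts slices and lists, so iloc is the more general choice when more than one cell is needed.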
pandas supports positional indexing on DataFrames, so you can use
df.iloc[[index]]["column header"]
The index is wrapped in a list because this form lets you pass multiple row positions at once.
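As a small illustration of the difference between passing a list and passing a single position, here is a sketch on a made-up two-column frame (the data is hypothetical and only mirrors the style of the sample in the other answer):

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({'A': [1, 2, 3],
                   'D': [1, 3, 10]})

# A list of row positions keeps the result as a Series,
# even if the list holds only one position.
rows = df.iloc[[0, 2]]['D']

# A single integer position selects one row, so the lookup
# by column name then yields a scalar.
single = df.iloc[2]['D']

print(rows.tolist())
print(single)
```

If only one cell is needed, the single-position form avoids having to unwrap a one-element Series afterwards.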