Python: Using Excel CSV file to read only certain columns and rows

Question

I can read a CSV file, but instead of reading the whole file, how can I print only certain rows and columns?

Imagine as if this is Excel:

  A              B              C                  D                    E
State  |Heart Disease Rate| Stroke Death Rate | HIV Diagnosis Rate |Teen Birth Rate

Alabama     235.5             54.5                 16.7                 18.01

Alaska      147.9             44.3                  3.2                  N/A    

Arizona     152.5             32.7                 11.9                  N/A    

Arkansas    221.8             57.4                 10.2                  N/A    

California  177.9             42.2                  N/A                  N/A    

Colorado    145.3             39                    8.4                 9.25

Here's what I have:

import csv

try:
    risk = open('riskfactors.csv', 'r', encoding="windows-1252").read() #find the file

except:
    while risk != "riskfactors.csv":  # if the file cant be found if there is an error
        print("Could not open", risk, "file")
        risk = input("\nPlease try to open file again: ")
else:
    with open("riskfactors.csv") as f:
        reader = csv.reader(f, delimiter=' ', quotechar='|')

        data = []
        for row in reader:# Number of rows including the death rates 
            for col in (2,4): # The columns I want read   B and D
                data.append(row)
                data.append(col)
        for item in data:
            print(item) #print the rows and columns

I need to only read column B and D with all statistics to read like this:

  A              B                D                    
 State  |Heart Disease Rate| HIV Diagnosis Rate |

 Alabama       235.5             16.7                

  Alaska       147.9             3.2                     

  Arizona      152.5             11.9                     

  Arkansas     221.8             10.2                    

 California    177.9             N/A                     

 Colorado      145.3             8.4

Edited

no errors

Any ideas on how to tackle this? Everything I try isn't working. Any help or advice is much appreciated.

Answer 1

I hope you have heard about Pandas for Data Analysis.

The following code reads only the columns you want. As for reading only certain rows, you would have to explain more about which rows you need.

import pandas
# usecols is zero-based: 0, 1 and 3 are the 1st, 2nd and 4th columns (A, B and D)
io = pandas.read_csv('test.csv', sep=",", usecols=[0, 1, 3])
print(io)

Answer 2

If you're still stuck, there's really no reason you have to read the file with the CSV module, since a simple CSV file is just lines of comma-separated strings (this only breaks down when a field itself contains a quoted comma). So, for something simple you could try this, which gives you a list of tuples of the form (state, heart disease rate, HIV diagnosis rate):

output = []

# Python 3 handles universal newlines by default, so plain 'r' replaces the old 'rU' mode
with open('riskfactors.csv', 'r') as f:
    for line in f:
        cells = line.split(",")
        # keep the first, second and fourth columns (A, B and D)
        output.append((cells[0], cells[1], cells[3]))

print(output)

Just note that you would then have to go through and ignore the header rows if you wanted to do any sort of data analysis.
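To make the header point concrete, here is a minimal sketch that lets csv.DictReader consume the header row and then picks columns by name. It assumes the file is comma-delimited and that the header text matches the table in the question exactly; the sample rows are copied from the question, and the helper names to_float and read_rates are made up for this illustration.

```python
import csv
import io

# Sample rows copied from the question. The comma-delimited layout and the
# exact header spelling are assumptions about the real riskfactors.csv.
SAMPLE = """State,Heart Disease Rate,Stroke Death Rate,HIV Diagnosis Rate,Teen Birth Rate
Alabama,235.5,54.5,16.7,18.01
Alaska,147.9,44.3,3.2,N/A
California,177.9,42.2,N/A,N/A
"""


def to_float(text):
    """Return the cell as a float, or None for 'N/A' or blank cells."""
    text = text.strip()
    return None if text in ("", "N/A") else float(text)


def read_rates(f):
    """Yield (state, heart disease rate, HIV diagnosis rate) per data row."""
    reader = csv.DictReader(f)  # reads the header row and uses it as keys
    for row in reader:
        yield (
            row["State"],
            to_float(row["Heart Disease Rate"]),
            to_float(row["HIV Diagnosis Rate"]),
        )


# io.StringIO stands in for open('riskfactors.csv', newline='')
rates = list(read_rates(io.StringIO(SAMPLE)))
print(rates)
```

Because the header is consumed by the reader, every tuple in rates is real data, and the "N/A" cells come back as None instead of breaking float().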

Answer 3

Try this:

data = []
for row in reader:  # every row from the question's reader, header included
    data.append([row[1], row[3]])  # the columns I want read: B and D (zero-based indexes 1 and 3)
for item in data:
    print(item)  # print the selected columns of each row
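If you also need to limit which rows come back, the same idea can be combined with itertools.islice. This is only a sketch: it assumes a comma-delimited file with a single header row, the sample rows are copied from the question, and the function name select and its parameters are hypothetical.

```python
import csv
import io
from itertools import islice

# Sample rows copied from the question; the comma delimiter is an assumption.
SAMPLE = """State,Heart Disease Rate,Stroke Death Rate,HIV Diagnosis Rate,Teen Birth Rate
Alabama,235.5,54.5,16.7,18.01
Alaska,147.9,44.3,3.2,N/A
Arizona,152.5,32.7,11.9,N/A
Arkansas,221.8,57.4,10.2,N/A
California,177.9,42.2,N/A,N/A
Colorado,145.3,39,8.4,9.25
"""


def select(f, columns, start=0, stop=None):
    """Return the header plus data rows start..stop-1, keeping only `columns`.

    `columns` holds zero-based column indexes; row positions count data rows
    only, so start=0 is the first row after the header.
    """
    reader = csv.reader(f)
    header = next(reader)  # pull the header off before slicing data rows
    picked = [[header[i] for i in columns]]
    for row in islice(reader, start, stop):
        picked.append([row[i] for i in columns])
    return picked


# Columns A, B and D for the first three states.
table = select(io.StringIO(SAMPLE), columns=(0, 1, 3), stop=3)
for line in table:
    print(line)
```

With a real file, pass open('riskfactors.csv', newline='') instead of the io.StringIO object.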

3 answers

solution1
10 2013-03-08 05:42:46

solution2
3 ACCEPTED 2013-03-08 16:03:25

solution3
2 2013-03-08 04:24:36
