I have some data in Microsoft excel that I save them as CSV file for ease of use. the data Structure is like this:
MS Excel format:
L1
0 1 0 0 0 1 1
0 0 1 0 0 1 0
0 0 0 1 0 0 1
0 0 0 0 1 0 0
1 1 1 1 1 1 1
1 1 1 1 1 1 1
1 1 1 1 1 1 1
1 1 1 1 1 1 1
CSV format
L1,,,,,,,,,,,,,,
0,1,0,0,0,1,1,
0,0,1,0,0,1,0,
0,0,0,1,0,0,1,
0,0,0,0,1,0,0,
1,1,1,1,1,1,1,
1,1,1,1,1,1,1,
1,1,1,1,1,1,1,
1,1,1,1,1,1,1,
As you see only the first column has label now I want to read the CSV file (or it's easier the excel file) to get each column and do some bit manipulation operation on them. How can I acheive this? I have read something about pandas But I can't find anything useful in order to fetch each coloumn
Given the .csv file temp.csv
L1x,,,,,,,
0,1,0,0,0,1,1,
0,0,1,0,0,1,0,
0,0,0,1,0,0,1,
0,0,0,0,1,0,0,
1,1,1,1,1,1,1,
1,1,1,1,1,1,1,
1,1,1,1,1,1,1,
1,1,1,1,1,1,1,
read it in as follows:
import pandas
a = pandas.read_csv('temp.csv', names = ["c%d" % i for i in range(8)], skiprows = 1)
a
Output:
c0 c1 c2 c3 c4 c5 c6 c7
0 0 1 0 0 0 1 1 NaN
1 0 0 1 0 0 1 0 NaN
2 0 0 0 1 0 0 1 NaN
3 0 0 0 0 1 0 0 NaN
4 1 1 1 1 1 1 1 NaN
5 1 1 1 1 1 1 1 NaN
6 1 1 1 1 1 1 1 NaN
7 1 1 1 1 1 1 1 NaN
The 'NaN's in the last column come from the pesky trailing commas. The 8 in the range needs to match the number of columns. To access the columns in a
use either
a.c3
or
a[c3]
both of which result in
0 0
1 0
2 1
3 0
4 1
5 1
6 1
7 1
Name: c3
The cool thing about pandas is that if you want to XOR two columns you can very simply.
a.c0^a.c2
Output
0 0
1 1
2 0
3 0
4 0
5 0
6 0
7 0
Name: c0
Assume I have:
Which you can save into a CSV file that looks like so:
L1,,,
L2,0,10,20
L3,1,11,21
L4,2,12,22
L5,3,13,23
L6,4,14,24
L7,5,15,25
L8,6,16,26
L9,7,17,27
L10,8,18,28
To get just any col, use CSV reader and transpose with zip:
import csv
with open('test.csv', 'rU') as fin:
reader=csv.reader(fin)
data=list(reader)
print 'data:', data
# data: [['L1', '', '', ''], ['L2', '0', '10', '20'], ['L3', '1', '11', '21'], ['L4', '2', '12', '22'], ['L5', '3', '13', '23'], ['L6', '4', '14', '24'], ['L7', '5', '15', '25'], ['L8', '6', '16', '26'], ['L9', '7', '17', '27'], ['L10', '8', '18', '28']]
Notice the data is a list of rows. You can transpose that List of Lists using zip to get a list of columns:
trans=zip(*data)
print 'trans:',trans
# trans: [('L1', 'L2', 'L3', 'L4', 'L5', 'L6', 'L7', 'L8', 'L9', 'L10'), ('', '0', '1', '2', '3', '4', '5', '6', '7', '8'), ('', '10', '11', '12', '13', '14', '15', '16', '17', '18'), ('', '20', '21', '22', '23', '24', '25', '26', '27', '28')]
Then just index to get a specific column:
print trans[0]
# ('L1', 'L2', 'L3', 'L4', 'L5', 'L6', 'L7', 'L8', 'L9', 'L10')
Of course if you want to do arithmetic on the cells, you will need to convert the string to ints or floats as appropriate.
import pandas as pd
pd.read_excel("foo.xls", "Sheet 1",
names=["c%d" % i for i in range(7)])
Output:
c0 c1 c2 c3 c4 c5 c6
0 0 1 0 0 0 1 1
1 0 0 1 0 0 1 0
2 0 0 0 1 0 0 1
3 0 0 0 0 1 0 0
4 1 1 1 1 1 1 1
5 1 1 1 1 1 1 1
6 1 1 1 1 1 1 1
7 1 1 1 1 1 1 1
Sample Code returns column as array.:
input = """L1,,,,,,,,,,,,,,
0,1,0,0,0,1,1,
0,0,1,0,0,1,0,
0,0,0,1,0,0,1,
0,0,0,0,1,0,0,
1,1,1,1,1,1,1,
1,1,1,1,1,1,1,
1,1,1,1,1,1,1,
1,1,1,1,1,1,1,
"""
def getColumn(data,column_number):
dump_array=[]
lines=data.split("\n")
for line in lines:
tmp_cell = line.split(",")
dump_array.append(tmp_cell[3])
return dump_array
#for ex. get column 3
getColumn(3,input)
This may give an idea to manuplate your grid...
Note: I dont have an interpreter for testing code now, so sorry if there is typo...
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.