Writing a Python script to scrape Excel data and write to a CSV, how do I get the proper output?

I've got an Excel document with rows named 'foo' and columns named 'bar'. Foo and bar are sometimes associated with an 'x'.

FooBar Tracker Excel Doc (screenshot)

I've written some Python code that searches the document for 'x' and then lists the associated foo and bar values. When I just print the output, all of the values are printed to the console. When I try to store the output in a variable and print the variable, I only get the final valid foo and bar combination.

import xlrd
import csv

###Grab the data 
def get_row_values(workSheet, row):
    # Read every cell of the given row from the worksheet passed in
    # (rather than relying on the global myWorksheet).
    to_return = []
    num_cells = workSheet.ncols - 1
    curr_cell = -1
    while curr_cell < num_cells:
        curr_cell += 1
        cell_value = workSheet.cell_value(row, curr_cell)
        to_return.append(cell_value)
    return to_return

file_path = 'map_test.xlsx'

myWorkbook = xlrd.open_workbook(file_path)
myWorksheet = myWorkbook.sheet_by_name('Sheet1')
num_rows = myWorksheet.nrows - 1
curr_row = 0
column_names = get_row_values(myWorksheet, curr_row)
print len(column_names)
while curr_row < num_rows:
        curr_row += 1 
        row = myWorksheet.row(curr_row)
        this_row = get_row_values(myWorksheet, curr_row)
        x = 0
        while x <len(this_row):
            if this_row[x] == 'x':
                    # Printing directly works: when the next line is un-commented,
                    # foo and bar are printed together in the proper order.
                    #print this_row[0], column_names[x]
                    output = "[%s %s]" % (this_row[0], column_names[x])
            x += 1

print output 
###Using the output variable just outputs the last valid foo/bar 
###combination 

Why is this? How do I fix it?

Second, when I try to write the data to a .csv file, the broken output gets added to the .csv with one character in each cell. I need each unique value to go into its own cell, and I need to control which cells they go into. Here's what I have so far:

myData = [["number", "name", "version", "bar", "foo"]]

myFile = open('test123.csv', 'w')
with myFile:
        writer = csv.writer(myFile)
        writer.writerows(myData)
        writer.writerows(output) ###This just outputs the last valid foo 
###and bar combination
print ("CSV Written")

The output ends up looking like this: Results I'm getting (screenshot)

But I want it to look like this: Results I want (screenshot)

Your output variable (your accumulator) is not accumulating values: each pass through the loop overwrites it with the current row/column pair, so only the last match survives. Your print statement works because it runs on every pass, so you see every match.

To fix this, initialize your output variable as an empty list before your while loops:

output = []

Then change this line:

output = "[%s %s]" % (this_row[0], column_names[x]) 

To this:

output.append([this_row[0], column_names[x]]) 
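
As a minimal sketch of the difference between overwriting and accumulating, assuming a made-up list of matches in place of the worksheet scan (`matches`, `last_only`, and `collected` are hypothetical names used only for illustration):

```python
# Hypothetical foo/bar matches, standing in for the cells that contain 'x'.
matches = [("foo1", "bar1"), ("foo1", "bar3"), ("foo2", "bar2")]

# Overwriting: the name is rebound on every pass, so only the last match remains.
last_only = None
for foo, bar in matches:
    last_only = "[%s %s]" % (foo, bar)

# Accumulating: the list is created once, and every match is appended to it.
collected = []
for foo, bar in matches:
    collected.append([foo, bar])

print(last_only)
print(collected)
```

Each inner list in `collected` is one row, and each element of that inner list is one value that can later go into its own cell.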

The other problem you are having is that your CSV output was coming out funny, with one character per cell. This is because of this line:

output = "[%s %s]" % (this_row[0], column_names[x]) 

That line makes output a single string. writer.writerows expects an iterable of rows, so when it is handed a string it iterates over the string character by character and writes each character as its own row. Once output is a list of [foo, bar] lists, writerows receives one row per match and one value per cell, so the above change to your code fixes this as well.
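
To see the effect without touching a file, here is a small sketch that writes to an in-memory buffer instead of `test123.csv` (`rows_written` is a hypothetical helper, and the sample values are made up):

```python
import csv
import io

def rows_written(data):
    """Pass data to csv.writer.writerows and return the lines it produced."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerows(data)
    return buffer.getvalue().splitlines()

# A string is iterated character by character: one character per row.
as_string = rows_written("[foo1 bar1]")

# A list of lists gives one row per inner list and one cell per element.
as_lists = rows_written([["foo1", "bar1"], ["foo2", "bar2"]])

print(len(as_string))
print(as_lists)
```

The position of a value inside each inner list decides which column it lands in, so padding a row with empty strings is one way to control the target cell.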

On a side note, it would be considered better form to use a for loop here instead of a while loop, e.g.:

for curr_row in range(1, num_rows + 1):
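
A fuller sketch of the for-loop version, assuming the sheet has already been read into a plain list of lists whose first row holds the column names (`sheet` and `marked_pairs` are hypothetical and stand in for the xlrd worksheet and the scanning code):

```python
def marked_pairs(sheet):
    """Return a [foo, bar] pair for every cell in sheet that contains 'x'."""
    column_names = sheet[0]
    output = []
    # Start at 1 to skip the header row, as the original while loop does.
    for curr_row in range(1, len(sheet)):
        this_row = sheet[curr_row]
        # enumerate yields the column index alongside each cell value.
        for x, cell in enumerate(this_row):
            if cell == "x":
                output.append([this_row[0], column_names[x]])
    return output

# Made-up data for illustration only.
sheet = [
    ["", "bar1", "bar2", "bar3"],
    ["foo1", "x", "", "x"],
    ["foo2", "", "x", ""],
]
result = marked_pairs(sheet)
print(result)
```

Compared with the while loops, there are no counters to initialize or increment by hand, which removes a common source of off-by-one errors.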
