
How do I properly write a CSV file within a for loop in Python?

I am using the following code to scrape content from a webpage, with the end goal of writing it to a CSV file. My first version had this part working, but now that my data is formatted differently, the output gets mangled when I view it in Excel.

With the code below, the "heading.text" data is correctly put into one cell when viewed in Excel, whereas the contents of "child.text" are packed into a single cell rather than being split on the commas. You will see I have attempted to clean up the content of "child.text" to check whether that was my issue.

If I remove "heading.text" from "z" and try again, Excel shows one letter per cell. In the end I would like each comma-separated value to display in its own cell when viewed in Excel. I believe I am doing something (many things?) incorrectly in structuring "z" and/or when I write the row.

Any guidance would be greatly appreciated. Thank you.

    csvwriter = csv.writer(csvfile) 
    for heading in All_Heading:
        driver.execute_script("return arguments[0].scrollIntoView(true);", heading)
        print("------------- " + heading.text + " -------------")
        ChildElement = heading.find_elements_by_xpath("./../div/div")
        for child in ChildElement:
            driver.execute_script("return arguments[0].scrollIntoView(true);", child)
            #print(heading.text)
            #print(child.text)
            z = (heading.text, child.text)
            print (z)
            csvwriter.writerow(z)

When I print "z" I get the following:

('Flower', 'Afghani 3.5g Pre-Pack Details\nGREEN GOLD ORGANICS\nAfghani 3.5g Pre-Pack\nIndica\nTHC: 16.2%\n1/8 oz  -  \n$45.00')

When I print "z" with the older code that split the string on "\n" I get the following:

('Flower', "Cherry Limeade 3.5g Flower - BeWell Details', 'BE WELL', 'Cherry Limeade 3.5g Flower - BeWell', 'Hybrid', 'THC: 18.7 mg', '1/8 oz  -  ', '$56.67")

csvwriter.writerow() takes an iterable and writes each of its elements as a separate field, separated by the writer's delimiter, i.e. each element becomes its own cell.

First, let's look at what has been happening so far:

  1. (heading.text, child.text) has two elements, i.e. two cells: heading.text and child.text.
  2. (child.text) is simply child.text, a plain string; it would be a tuple only if written as (child.text,). Iterating over a string yields its characters, so each letter got its own cell.
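
Both cases can be reproduced without Selenium. The sketch below is only an illustration: it writes to an in-memory io.StringIO buffer instead of a real file, the helper name `written` is hypothetical, and the sample strings are made up.

```python
import csv
import io


def written(row):
    """Return the CSV text that csv.writer produces for a single row."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(row)
    return buffer.getvalue()


# A two-element tuple gives two cells. The second cell still contains a
# newline, so the writer wraps it in quotes and it stays one cell.
two_cells = written(("Flower", "Afghani\nIndica"))

# A bare string is iterated character by character: one cell per letter.
per_letter = written("Indica")

print(repr(two_cells))   # 'Flower,"Afghani\nIndica"\r\n'
print(repr(per_letter))  # 'I,n,d,i,c,a\r\n'
```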

To get different cells in a row, we need separate elements in the iterable, so we want something like [heading.text, child.text line 1, child.text line 2, ...]. You were right to split the text into lines, but the lines weren't being added to the row as separate elements. Since tuples are immutable, I'll use a list instead:

  1. We know heading.text is to take a single cell, so we can start with the following:
    row = [heading.text]  # this is what your z is
  2. We want each line to be a separate element, so we split child.text:
    lines = child.text.split("\n")
    # The text doesn’t start or end with a newline, so this should suffice
  3. Now we want each element to be added to the row separately, so we use the extend() method on lists:
    row.extend(lines)
    # extend() modifies the list in place: a = [1, 2]; a.extend([3, 4, 5]) leaves a == [1, 2, 3, 4, 5]

Putting it together:

    row = [heading.text]
    lines = child.text.split("\n")
    row.extend(lines)
    csvwriter.writerow(row)

or, unpacking it in a single line:

    row = [heading.text, *child.text.split("\n")]  # You can also use a tuple here
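
For reference, here is a self-contained sketch of the same idea, run against the sample string printed in the question. The helper name `build_row` is hypothetical, and stripping whitespace and dropping empty lines is an optional extra that goes beyond the steps above. An in-memory buffer stands in for the real file.

```python
import csv
import io


def build_row(heading_text, child_text):
    """Heading first, then one element per non-empty line of the child text."""
    lines = [line.strip() for line in child_text.split("\n")]
    return [heading_text, *(line for line in lines if line)]


# Sample value of child.text, copied from the printed "z" in the question.
sample = (
    "Afghani 3.5g Pre-Pack Details\nGREEN GOLD ORGANICS\n"
    "Afghani 3.5g Pre-Pack\nIndica\nTHC: 16.2%\n1/8 oz  -  \n$45.00"
)
row = build_row("Flower", sample)

# io.StringIO stands in for a real file here (assumption for the demo).
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
print(buffer.getvalue())
```

When writing to a real file, the csv module documentation recommends opening it with newline="", for example open("output.csv", "w", newline="") (the filename is only a placeholder). Otherwise extra blank rows can appear on Windows.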
