
Python-Selenium: How to export list into CSV all in new lines?

I am using Selenium to scrape the "People also ask" questions and answers on Google, and I want to export the output (questions, answers and URLs) to a CSV file with each of them on a separate line.

Everything runs fine, and I can even print everything on separate lines, but when I check the output CSV, each question and its answer end up together in one row.

My code looks like this:

import csv

import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By

query = "nflx"
clicks = 4

def search(query, clicks):
    with webdriver.Firefox() as driver:
        driver.get("https://www.google.com?hl=en")
        driver.find_element(By.XPATH, "//input[@aria-label='Search']").send_keys(query)
        searchbtn = driver.find_elements(By.XPATH, "//input[@aria-label='Google Search']")
        searchbtn[-1].click()

        # Questions with answers. Have to clean a little bit.
        # Re-query the elements after each click.
        paa = driver.find_elements(By.CSS_SELECTOR, 'div.related-question-pair')
        for i in range(clicks):
            paa[i].click()
            paa = driver.find_elements(By.CSS_SELECTOR, 'div.related-question-pair')

        list_paa = []
        for j in paa:
            p = j.text
            print(p)
            list_paa.append(p)
        # Return the collected texts so they can be exported afterwards
        return list_paa

list_paa = search(query, clicks)

To export I tried this:

with open('file1.csv', 'w',newline='\n', encoding='utf-8') as file:
    writer = csv.writer(file)
    for row in list_paa:
        writer.writerow(zip(row))

And this:

#Tried this
df = pd.DataFrame(list_paa, columns=["column"])
df.to_csv('list.csv', index=False)                                     

Current CSV output when executing search(query,clicks) :

[image: current CSV output]

Desired CSV output for all questions:

[image: desired CSV output]

I guess running a for loop to process the data and splitting it with splitlines() would be the easiest way to go about it.

As an example:

        list_paa = []
        for j in paa:
                # Split the element's text into a list of its lines
                p = j.text.splitlines()
                print(p)
                list_paa.append(p)

I'm sure there is more to add to this example for it to actually work as intended, but you get the idea :).
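To flesh that idea out, here is a minimal sketch using only the standard csv module. It assumes each scraped text holds the question, answer and URL separated by newlines, which is what the printed output in the question suggests. The helper names paa_rows and write_rows, the sample strings and the file name paa.csv are made up for illustration. Two things worth noting: writerow(zip(row)) on a string produces one field per character, because zip iterates over the characters, and the csv documentation recommends opening the file with newline=''.

```python
import csv


def paa_rows(texts):
    """Turn each scraped text block into one single-column row per non-empty line."""
    rows = []
    for text in texts:
        for line in text.splitlines():
            line = line.strip()
            if line:  # skip blank lines between question, answer and URL
                rows.append([line])  # a one-element list is a one-column CSV row
    return rows


def write_rows(rows, file):
    """Write the prepared rows to an already opened text file object."""
    writer = csv.writer(file)
    writer.writerows(rows)


# Hypothetical sample standing in for the scraped list_paa
sample = [
    "Is NFLX a buy?\nSome answer text.\nhttps://example.com/a",
    "What does NFLX stand for?\nAnother answer.\nhttps://example.com/b",
]

if __name__ == "__main__":
    # newline="" lets the csv module control line endings itself
    with open("paa.csv", "w", newline="", encoding="utf-8") as f:
        write_rows(paa_rows(sample), f)
```

With the real data you would pass the list_paa collected by the scraper instead of sample. If you would rather have question, answer and URL as three columns of one row, append text.splitlines() as a single row instead of splitting it into separate rows.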
