Write output from for loop to a csv in python

I am opening a csv called Remarks_Drug.csv which contains product names and mapped filenames in consecutive columns. I am doing some operations on the product column to remove all string content after the + character. After stripping everything from the + character onward, I am storing the result in a variable called product_patterns.

Now I am opening a new csv and I want to write the output from the for loop into two columns, the first one containing the product_patterns and the second one containing the corresponding filenames.

What I am getting as output now is only the last row of the output csv that I am looking for. I think I am not looping properly, so not every row of product_patterns and filename gets appended to the output csv file.

Can someone please help me with this?

Attaching code below:

import csv


with open('Remarks_Drug.csv', newline='', encoding ='utf-8') as myFile:
    reader = csv.reader(myFile)
    for row in reader:
        product = row[0].lower()
        #print('K---'+ product)
        filename = row[1]
        product_patterns = ', '.join([i.split("+")[0].strip() for i in product.split(",")])


        #print(product_patterns, filename)

    with open ('drug_output100.csv', 'a') as csvfile:
        fieldnames = ['product_patterns', 'filename']
        print(fieldnames)
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        print(writer)
        #writer.writeheader()
        writer.writerow({'product_patterns':product_patterns, 'filename':filename})

Sample input:

    Film-coated tablet + TERIFLUNOMIDE, 2011-07-18 - Received approval letter_EN.txt
    Film-coated tablet + VANDETANIB,             2013-12-14 RECD Eudralink_Caprelsa II-28 - RSI - 14.12.2017.txt
    Solution for injection + MenQuadTT, 395_EU001930-PIP01-16_2016-02-22.txt
    Solution for injection + INSULIN GLARGINE,  2017-11-4 Updated PR.txt
    Solution for injection + INSULIN GLARGINE + LIXISENATIDE,   2017 12 12 Email Approval Texts - SA1006-.txt

I hope this is the right approach for you; if it is not, tell me and we can check.

import csv

# Read every row first, keeping only the product text before the first '+'.
with open('Remarks_Drug.csv', newline='', encoding='utf-8') as myFile:
    reader = csv.reader(myFile)
    products_list = list()
    filenames_list = list()

    for row in reader:
        products_list.append(row[0].lower().split("+")[0].strip())
        filenames_list.append(row[1])

# Open the output once in append mode and write one row per product.
with open('drug_output100.csv', 'a', newline='', encoding='utf-8') as csvfile:
    fieldnames = ['product_patterns', 'filename']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    for index, product in enumerate(products_list):
        writer.writerow({'product_patterns': product, 'filename': filenames_list[index]})

  1. Open the Remarks_Drug.csv file and create two lists that store the row values, processed as you prefer.
  2. Iterate over the product list with enumerate so you have an index to use on the filename list.
  3. Open the output file in append mode and write each result to it.
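
The steps above can also be collapsed into a single pass that writes each row as soon as it is read, so no intermediate lists are needed. This is only a sketch under assumptions: the function name `strip_after_plus` and its path parameters are hypothetical, rows with fewer than two columns are skipped, and the output is opened in 'w' mode (not 'a') so that rerunning it does not duplicate rows.

```python
import csv


def strip_after_plus(in_path, out_path):
    """Copy (product, filename) rows, keeping only the product text before '+'."""
    # Hypothetical helper: both paths are supplied by the caller.
    with open(in_path, newline='', encoding='utf-8') as src, \
         open(out_path, 'w', newline='', encoding='utf-8') as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        writer.writerow(['product_patterns', 'filename'])  # header row
        for row in reader:
            if len(row) < 2:
                continue  # assumption: skip blank or malformed lines
            product = row[0].lower().split('+')[0].strip()
            # The write happens inside the loop, so every row reaches the output.
            writer.writerow([product, row[1]])
```

It could be called as `strip_after_plus('Remarks_Drug.csv', 'drug_output100.csv')`. Keeping the `writerow` call inside the `for` loop is what makes every input row appear in the output rather than only the last one.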

You can also use pandas to process csv files; it is faster and more convenient.

Here is the pandas solution:

import pandas as pd

def select_real_product(string_to_elaborate):
    return string_to_elaborate.split('+')[0].strip()

df = pd.read_csv("Remarks_Drug.csv", delimiter=',', names=("product", "filename"))

df['product'] = df['product'].apply(select_real_product)

df.to_csv("drug_output100.csv", sep=',', na_rep='empty',index_label=False, index=False)
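
As a variation on the `apply` call above, the pandas string accessor can do the same split on the whole column at once. A minimal sketch, assuming the same two-column layout; the helper name `strip_products` is hypothetical, and it takes a DataFrame (here a tiny made-up one) so it can be tried without the real file.

```python
import pandas as pd


def strip_products(df):
    """Return a copy of df whose 'product' column keeps only the text before '+'."""
    out = df.copy()
    # Split on the literal '+' and keep the first piece, without surrounding spaces.
    out['product'] = out['product'].str.split('+').str[0].str.strip()
    return out


# Small in-memory frame standing in for Remarks_Drug.csv (illustrative data only).
sample = pd.DataFrame({
    'product': ['Film-coated tablet + TERIFLUNOMIDE',
                'Solution for injection + INSULIN GLARGINE + LIXISENATIDE'],
    'filename': ['a.txt', 'b.txt'],
})
print(strip_products(sample))
```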

import csv
import pandas as pd

with open('Remarks_Drug.csv', newline='', encoding='utf-8') as myFile:
    reader = csv.reader(myFile)
    mydrug = []
    for row in reader:
        product = row[0].lower()
        filename = row[1]
        # Keep only the text before '+' in each comma-separated part.
        product_patterns = ', '.join([i.split("+")[0].strip() for i in product.split(",")])
        mydrug.append([product_patterns, filename])

# Build the DataFrame once all rows are collected, then write it out.
df = pd.DataFrame(mydrug, columns=['product_patterns', 'filename'])
print(df)
df.to_csv('drug_output100.csv', sep=',', index=False)

This uses the pandas library. If you have to deal with large csv files, pandas is handy and generally fast, although by default it loads the whole file into memory. This is just an alternative to the solution above.
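
For files too large to load at once, `read_csv` accepts a `chunksize` argument and then yields DataFrames piece by piece. The sketch below illustrates that idea under assumptions: the function name `convert_in_chunks`, its parameters, and the default chunk size are all hypothetical, and the input is assumed to have two columns and no header row.

```python
import pandas as pd


def convert_in_chunks(in_path, out_path, chunksize=10_000):
    """Strip everything after '+' in the product column, one chunk at a time."""
    chunks = pd.read_csv(in_path, names=['product_patterns', 'filename'],
                         chunksize=chunksize)
    for i, chunk in enumerate(chunks):
        chunk['product_patterns'] = (
            chunk['product_patterns'].str.lower().str.split('+').str[0].str.strip()
        )
        # The first chunk creates the file with a header; later chunks append.
        chunk.to_csv(out_path, mode='w' if i == 0 else 'a',
                     header=(i == 0), index=False)
```

Only one chunk is held in memory at a time, at the cost of writing the output in several appends.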
