
Transferring specific data from one Excel file to another using Python

I just started learning Python and I need help with a script that I was asked to write for my internship.

I have a CSV file (sheet1.csv) and I need to extract data from only two of its columns, ReferenceID and PartNumber, whose values correspond to each other row by row. I need to update a separate CSV file called sheet2.csv, which also contains the columns ReferenceID and PartNumber, but many of its PartNumber cells are empty.

Basically, I need to fill in the "PartNumber" field with the values from sheet1. From the research I've done, dictionaries seem like a solid approach to writing this script (I think). So far I have been able to read the files and create two dictionaries with the ReferenceIDs as keys and the PartNumbers as values. Here is what I have, followed by an example of what the dictionaries look like.

import csv

a_dict = {}
b_dict = {}

# newline='' is what the csv module expects; the old 'rU' mode was removed in Python 3.11
with open('sheet1.csv', newline='') as a, open('sheet2.csv', newline='') as b:
    csvReadera = csv.DictReader(a)
    csvReaderb = csv.DictReader(b)

    # map each ReferenceID to its PartNumber
    for line in csvReadera:
        a_dict[line["ReferenceID"]] = line["PartNumber"]
    print(a_dict)

    for line in csvReaderb:
        b_dict[line["ReferenceID"]] = line["PartNumber"]
    print(b_dict)

a_dict = {'R150': 'PN000123', 'R331': 'PN000873', 'C774': 'PN000064', 'L7896': 'PN000447', 'R0640': 'PN000878', 'R454': 'PN000333'}
b_dict = {'C774': '', 'R331': '', 'R454': '', 'L7896': 'PN000000', 'R0640': '', 'R150': 'PN000333'}

How can I compare the two dictionaries, fill in/overwrite the missing values in b_dict, and then write the result to sheet2? There must be more efficient methods than what I have come up with, but I have never used Python before, so please forgive my pitiful attempt!
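The comparison described above can be sketched with plain dictionaries and the csv module. This is only a minimal sketch: it assumes both files have exactly the ReferenceID and PartNumber headers, and that a sheet1 value should be used only where the sheet2 cell is empty. The helper names fill_missing and update_sheet are hypothetical.

```python
import csv

def fill_missing(a_dict, b_dict):
    """Return a copy of b_dict with empty values filled in from a_dict."""
    return {ref: (part or a_dict.get(ref, "")) for ref, part in b_dict.items()}

def update_sheet(source_path, target_path):
    """Fill empty PartNumber cells in target_path using source_path."""
    with open(source_path, newline="") as src:
        a_dict = {row["ReferenceID"]: row["PartNumber"]
                  for row in csv.DictReader(src)}

    # Read all target rows first so the file can be rewritten afterwards.
    with open(target_path, newline="") as tgt:
        reader = csv.DictReader(tgt)
        fieldnames = reader.fieldnames
        rows = list(reader)

    for row in rows:
        if not row["PartNumber"]:
            row["PartNumber"] = a_dict.get(row["ReferenceID"], "")

    with open(target_path, "w", newline="") as tgt:
        writer = csv.DictWriter(tgt, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

a_dict = {'R150': 'PN000123', 'R331': 'PN000873', 'C774': 'PN000064',
          'L7896': 'PN000447', 'R0640': 'PN000878', 'R454': 'PN000333'}
b_dict = {'C774': '', 'R331': '', 'R454': '', 'L7896': 'PN000000',
          'R0640': '', 'R150': 'PN000333'}
filled = fill_missing(a_dict, b_dict)
print(filled)
# update_sheet('sheet1.csv', 'sheet2.csv')  # would rewrite sheet2.csv in place
```

Existing non-empty values in sheet2 (such as PN000000 for L7896) are left alone; to let sheet1 always win, the `or` in fill_missing could be swapped around.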

Have a look at the pandas library.

import pandas as pd

# this is how you read the files
dfa = pd.read_csv("sheet1.csv")
dfb = pd.read_csv("sheet2.csv")
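One thing to watch when the data comes from real files: by default `read_csv` turns empty cells into NaN, and NaN is truthy in Python, so a plain truthiness check on a cell would treat a blank as filled. A minimal sketch of one way around this, assuming every column should be read as text (the in-memory sample and the name dfb_text are made up for illustration):

```python
import io
import pandas as pd

# Hypothetical stand-in for sheet2.csv so the snippet runs on its own.
sample = io.StringIO("ReferenceID,PartNumber\nC774,\nL7896,PN000000\n")

# dtype=str keeps part numbers as text; keep_default_na=False leaves
# empty cells as '' instead of converting them to NaN.
dfb_text = pd.read_csv(sample, dtype=str, keep_default_na=False)
part_numbers = dfb_text["PartNumber"].tolist()
print(part_numbers)  # ['', 'PN000000']
```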

Let's just take the dicts you defined as test data:

a_dict = {'R150': 'PN000123', 'R331': 'PN000873', 'C774': 'PN000064', 'L7896': 'PN000447', 'R0640': 'PN000878', 'R454': 'PN000333'}
b_dict = {'C774': '', 'R331': '', 'R454': '', 'L7896': 'PN000000', 'R0640': '', 'R150': 'PN000333'}
# list(...) avoids relying on the constructor accepting a dict view
dfar = pd.DataFrame(list(a_dict.items()), columns=['ReferenceID', 'PartNumber'])
dfbr = pd.DataFrame(list(b_dict.items()), columns=['ReferenceID', 'PartNumber'])
# copies with suffixed column names, so the two frames can sit side by side
dfa = dfar[['ReferenceID', 'PartNumber']]
dfa.columns = ['ReferenceIDA', 'PartNumberA']
dfb = dfbr[['ReferenceID', 'PartNumber']]
dfb.columns = ['ReferenceIDB', 'PartNumberB']

You get this:

In [97]: dfa
Out[97]: 
  ReferenceIDA PartNumberA
0         R331    PN000873
1         R454    PN000333
2        L7896    PN000447
3         R150    PN000123
4         C774    PN000064
5        R0640    PN000878

In [98]: dfb
Out[98]: 
  ReferenceIDB PartNumberB
0         R331            
1         R454            
2        R0640            
3         R150    PN000333
4         C774            
5        L7896    PN000000

Now join the two frames on the reference ID. Joining by position instead would pair the wrong rows wherever the two frames are ordered differently, as with L7896 and R0640 above:

In [67]: cd = dfb.merge(dfa, left_on='ReferenceIDB', right_on='ReferenceIDA', how='left')

In [68]: cd
Out[68]: 
  ReferenceIDB PartNumberB ReferenceIDA PartNumberA
0         R331                     R331    PN000873
1         R454                     R454    PN000333
2        R0640                    R0640    PN000878
3         R150    PN000333         R150    PN000123
4         C774                     C774    PN000064
5        L7896    PN000000        L7896    PN000447

# keep the sheet2 value when it is non-empty, otherwise take the sheet1 value
cd["res"] = cd.apply(lambda x: x["PartNumberB"] if x["PartNumberB"] else x["PartNumberA"], axis=1)

cd
Out[106]: 
  ReferenceIDB PartNumberB ReferenceIDA PartNumberA       res
0         R331                     R331    PN000873  PN000873
1         R454                     R454    PN000333  PN000333
2        R0640                    R0640    PN000878  PN000878
3         R150    PN000333         R150    PN000123  PN000333
4         C774                     C774    PN000064  PN000064
5        L7896    PN000000        L7896    PN000447  PN000000

This is what you wanted.

Just set

dfbr['PartNumber'] = cd['res']

and dump to CSV (index=False keeps the row index from being written as an extra column):

dfbr.to_csv('sheet2.csv', index=False)
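Putting the steps together, a sketch that fills the blanks by matching rows on ReferenceID might look like the following. It assumes ReferenceID values are unique in sheet1 and that blank cells are either empty strings or NaN; fill_part_numbers is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

def fill_part_numbers(dfa, dfb):
    """Fill empty PartNumber cells in dfb from dfa, matching on ReferenceID."""
    # Series indexed by ReferenceID, used as a lookup table.
    lookup = dfa.set_index("ReferenceID")["PartNumber"]
    from_a = dfb["ReferenceID"].map(lookup)

    out = dfb.copy()
    blank = out["PartNumber"].isna() | (out["PartNumber"] == "")
    # Only overwrite blanks for which sheet1 actually has a value.
    fill = blank & from_a.notna()
    out.loc[fill, "PartNumber"] = from_a[fill]
    return out

a_dict = {'R150': 'PN000123', 'R331': 'PN000873', 'C774': 'PN000064',
          'L7896': 'PN000447', 'R0640': 'PN000878', 'R454': 'PN000333'}
b_dict = {'C774': '', 'R331': '', 'R454': '', 'L7896': 'PN000000',
          'R0640': '', 'R150': 'PN000333'}
dfa = pd.DataFrame(list(a_dict.items()), columns=['ReferenceID', 'PartNumber'])
dfb = pd.DataFrame(list(b_dict.items()), columns=['ReferenceID', 'PartNumber'])

result = fill_part_numbers(dfa, dfb)
print(result)
# result.to_csv('sheet2.csv', index=False)  # would overwrite sheet2.csv
```

Using `map` with a lookup Series avoids building the intermediate combined frame, at the cost of requiring unique ReferenceIDs in sheet1.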
