Pandas: concat multiple new columns to an existing data-frame based on the value of one of the columns
Using Pandas, I have a data-frame in hand:
| | A | B |
|---|---|---|
| 0 | a | 9 |
| 1 | b | 9 |
| 2 | c | 9 |
Iterating through all rows one by one, I use the value in column A to load additional columns (from a ZipFile) in the form of a dictionary. Some of these are missing, and in that case the whole row needs to be dropped. The result should look similar to this:
| | A | B | C | D |
|---|---|---|---|---|
| 0 | a | 9 | a-foo | a-bar |
| 2 | c | 9 | c-foo | c-bar |
What is the best method to use for the iteration? I tried several options, including for-in, apply, and map, but they usually fail with type-related issues (I'm quite new to this).
Any help or pointers would be truly appreciated.
Let's assume your data looks something like this:
import pandas as pd

# The existing data-frame from the question
data = {'A': {0: 'a', 1: 'b', 2: 'c'},
        'B': {0: 9, 1: 9, 2: 9}}
df = pd.DataFrame(data)

# The additional columns, keyed by 'A' (no entry for 'b')
data2 = [{'A': 'a', 'C': 'a-foo', 'D': 'a-bar'},
         {'A': 'c', 'C': 'c-foo', 'D': 'c-bar'}]
df2 = pd.DataFrame(data2)
You can combine `df.merge` with `df.dropna` to merge the two DataFrames and drop the rows (here: only the row with index 1) that end up with NaN values in the process:
df.merge(df2, on='A', how='left').dropna(axis=0, how='any')
   A  B      C      D
0  a  9  a-foo  a-bar
2  c  9  c-foo  c-bar
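If the extra columns arrive one dictionary at a time, as described in the question, one option is to collect them into a list first and build `df2` from that list, rather than modifying `df` row by row. The following is a minimal sketch under stated assumptions: `load_extra` and `_STORE` are hypothetical stand-ins for the ZipFile lookup (not part of the original question or answer), and the lookup is assumed to return `None` when nothing exists for a key.

```python
import pandas as pd

df = pd.DataFrame({'A': ['a', 'b', 'c'], 'B': [9, 9, 9]})

# Hypothetical in-memory store standing in for the ZipFile contents;
# there is deliberately no entry for 'b'.
_STORE = {'a': {'C': 'a-foo', 'D': 'a-bar'},
          'c': {'C': 'c-foo', 'D': 'c-bar'}}


def load_extra(key):
    """Hypothetical loader: return the extra columns for key, or None if missing."""
    return _STORE.get(key)


# Collect one record per key that actually has data.
records = []
for key in df['A'].unique():
    extra = load_extra(key)
    if extra is not None:
        records.append({'A': key, **extra})

df2 = pd.DataFrame(records)

# An inner join keeps only the rows of df whose key also appears in df2.
result = df.merge(df2, on='A', how='inner')
print(result)
```

Using `how='inner'` here drops the unmatched row during the merge itself, so no `dropna` call is needed. Note that the result gets a fresh index (0, 1), whereas the left merge followed by `dropna` keeps the original labels (0, 2). The inner join may also be the safer choice if the loaded columns can legitimately contain NaN values, since `dropna` would remove those rows as well.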