Merge specific rows in a pandas DataFrame
After read_excel I have a df in which some values from one column (containing strings) are split across several rows. How can I merge them back together?
For example, the df I have:
{'CODE': ['A', None, 'B', None, None, 'C'],
'TEXT': ['A', 'a', 'B', 'b', 'b', 'C'],
'NUMBER': ['1', None, '2', None, None,'3']}
The df I want:
{'CODE': ['A','B','C'],
'TEXT': ['Aa','Bbb','C'],
'NUMBER': ['1','2','3']}
I can't find the right solution. I tried importing the data in different ways, but that did not help either.
You can forward fill the missing values (None) in the CODE column to build the groups, then aggregate each group with join for the TEXT column and the first non-None value for the NUMBER column:
import pandas as pd

d = {'CODE': ['A', None, 'B', None, None, 'C'],
     'TEXT': ['A', 'a', 'B', 'b', 'b', 'C'],
     'NUMBER': ['1', None, '2', None, None, '3']}
df = pd.DataFrame(d)

# Group by the forward-filled CODE, join the TEXT pieces, keep the first non-None NUMBER
df1 = df.groupby(df['CODE'].ffill()).agg({'TEXT': ''.join, 'NUMBER': 'first'}).reset_index()
print(df1)
  CODE TEXT NUMBER
0    A   Aa      1
1    B  Bbb      2
2    C    C      3
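To see why the grouping works, it can help to look at the intermediate key on its own. The following is only a minimal sketch, assuming the sample data from the question; the names `key` and `joined` are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'CODE': ['A', None, 'B', None, None, 'C'],
    'TEXT': ['A', 'a', 'B', 'b', 'b', 'C'],
    'NUMBER': ['1', None, '2', None, None, '3'],
})

# Forward fill copies the last seen CODE down into the None rows,
# so each continuation row gets the code of the row it belongs to.
key = df['CODE'].ffill()
print(key.tolist())       # ['A', 'A', 'B', 'B', 'B', 'C']

# Grouping by that key and joining TEXT glues the split pieces together.
joined = df.groupby(key)['TEXT'].agg(''.join)
print(joined.to_dict())   # {'A': 'Aa', 'B': 'Bbb', 'C': 'C'}
```

Note that groupby sorts the group keys by default; if the codes are not already in sorted order and the original order should be kept, passing sort=False to groupby is one option.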
You can also generate the aggregation dictionary:
cols = df.columns.difference(['CODE'])
d1 = dict.fromkeys(cols, 'first')
d1['TEXT'] = ''.join
df1 = df.groupby(df['CODE'].ffill()).agg(d1).reset_index()
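The generated dictionary is mainly useful when there are more columns than in the example. Below is a self-contained sketch of that case, assuming one extra column `QTY` (hypothetical, not in the question) and an illustrative name `agg_map`; every column other than TEXT falls back to 'first'.

```python
import pandas as pd

df = pd.DataFrame({
    'CODE': ['A', None, 'B', None, None, 'C'],
    'TEXT': ['A', 'a', 'B', 'b', 'b', 'C'],
    'NUMBER': ['1', None, '2', None, None, '3'],
    'QTY': ['10', None, '20', None, None, '30'],  # hypothetical extra column
})

# Every column except the grouping column gets 'first' by default ...
cols = df.columns.difference(['CODE'])
agg_map = dict.fromkeys(cols, 'first')
# ... and only TEXT is overridden to concatenate its pieces.
agg_map['TEXT'] = ''.join

result = df.groupby(df['CODE'].ffill()).agg(agg_map).reset_index()

# Index.difference returns the names sorted, so restore the original column order.
result = result[df.columns.tolist()]
print(result)
```

The final reindexing step is optional; it is only there because the columns would otherwise come out in alphabetical order after CODE.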