
Pandas - Concatenating two columns of string lists

I've got an... interesting data frame that comes from a database. It has two columns, each holding lists of strings. I need to concatenate the values in these two lists, element by element, to create a new column of lists. For example:

import pandas as pd

data = [
    {'id': 1, 'l1': ['Luke', 'Han'], 'l2': ['Skywalker', 'Solo']},
    {'id': 2, 'l1': ['Darth', 'Kylo'], 'l2': ['Vader', 'Ren']},
    {'id': 3, 'l1': [], 'l2': []}
]
df = pd.DataFrame(data)

Notice that the third row has no values. You can also assume that l1 and l2 are of the same length. I need to concatenate the values in l1 and l2 (with a space between them), e.g.:

result = [
    {'id': 1, 'name': ['Luke Skywalker', 'Han Solo']},
    {'id': 2, 'name': ['Darth Vader', 'Kylo Ren']},
    {'id': 3, 'name': []}
]
result_df = pd.DataFrame(result)

You can use a list comprehension with ' '.join in combination with zip to iterate over your dataset, for example:

import pandas as pd


data = [ 
    {'id': 1, 'l1': ['Luke', 'Han'], 'l2': ['Skywalker', 'Solo']}, 
    {'id': 2, 'l1': ['Darth', 'Kylo'], 'l2': ['Vader', 'Ren']},
    {'id': 3, 'l1': [], 'l2': []}
]                                                                                                                                                                
df = pd.DataFrame(data) 

result = [
    {
        'id': row['id'], 
        'name': [' '.join(l1_l2) for l1_l2 in zip(row['l1'], row['l2'])]
    } for row in data
]

print(pd.DataFrame(result))

Output:

   id                        name
0   1  [Luke Skywalker, Han Solo]
1   2     [Darth Vader, Kylo Ren]
2   3                          []
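
If the data is already in a DataFrame rather than a list of dicts, the same zip and ' '.join idea can be applied to the columns directly. A minimal sketch, assuming the columns are named l1 and l2 as in the question and that the lists in each row have equal length (the name column and the final column selection are just one way to shape the result):

```python
import pandas as pd

data = [
    {'id': 1, 'l1': ['Luke', 'Han'], 'l2': ['Skywalker', 'Solo']},
    {'id': 2, 'l1': ['Darth', 'Kylo'], 'l2': ['Vader', 'Ren']},
    {'id': 3, 'l1': [], 'l2': []},
]
df = pd.DataFrame(data)

# Outer zip walks the two columns row by row;
# inner zip pairs the list elements by position within a row.
df['name'] = [
    [' '.join(pair) for pair in zip(first, last)]
    for first, last in zip(df['l1'], df['l2'])
]

# Keep only the columns asked for in the question.
result_df = df[['id', 'name']]
```

Rows with empty lists need no special handling here, since zip over two empty lists simply yields nothing.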

This should get you where you want, assuming you only have two list columns (if you have more, append another ' ' + df.iloc[j, 3][i] term for column 3, 4, and so on):

Voila = []
for j in range(len(df)):
    # Positions 1 and 2 are the 'l1' and 'l2' columns; pair their elements by index.
    n = len(df.iloc[j, 1])
    Voila.append([df.iloc[j, 1][i] + ' ' + df.iloc[j, 2][i] for i in range(n)])
df['Voila'] = Voila
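
With more than two list columns, repeating the ' ' + df.iloc[...] term by hand gets unwieldy. One possible generalization is to zip all the lists in a row together and join each resulting tuple. This is only a sketch: the helper name join_list_columns and the third column l3 are hypothetical and not part of the original question.

```python
import pandas as pd

def join_list_columns(df, columns, sep=' '):
    """Join same-length list columns element-wise, row by row (hypothetical helper)."""
    return [
        # zip(*lists) groups the i-th element of every list in this row.
        [sep.join(parts) for parts in zip(*lists)]
        # The outer zip yields one tuple of lists per row.
        for lists in zip(*(df[col] for col in columns))
    ]

# Illustrative data; the 'l3' column is an assumption added for this example.
df = pd.DataFrame([
    {'id': 1, 'l1': ['Luke', 'Han'], 'l2': ['Skywalker', 'Solo'], 'l3': ['(Jedi)', '(Pilot)']},
    {'id': 2, 'l1': [], 'l2': [], 'l3': []},
])
df['name'] = join_list_columns(df, ['l1', 'l2', 'l3'])
```

Note that zip stops at the shortest list, so if the lists in a row could differ in length, the extra elements would be dropped silently.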
