How do I create new columns based off of a list derived from an existing column in python/pandas?

I have a data frame with a column titled "Name" that contains strings in this format: "Group1name / Group2name / Group3name / Group4name"

I want to create 3 new columns based on the "Name" column and the "/" delimiter:

Level 1: "Group1name"
Level 2: "Group1name / Group2name"
Level 3: "Group1name / Group2name / Group3name"

How do I create these new columns in the dataframe?

This solution uses a list comprehension over the rows of the Name column of df. It splits each string on the " / " delimiter, then joins the pieces back together, taking only the first n elements for the corresponding column. Because each row is split and rejoined on its own, it also works when the frame has more than one row.

import pandas as pd

df = pd.DataFrame({'Name': ["Group1name / Group2name / Group3name / Group4name"]})

for n in range(1, 4):  # 1, 2, 3 for column indexing and naming.
    # Keep the first n groups of each row and rejoin them with the delimiter.
    df['col_{0}'.format(n)] = [' / '.join(groups[:n])
                               for groups in df.Name.str.split(' / ')]

>>> df.T
                                                       0
Name   Group1name / Group2name / Group3name / Group4name
col_1                                         Group1name
col_2                            Group1name / Group2name
col_3               Group1name / Group2name / Group3name
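
As a possible alternative, the same split-and-rejoin idea can be sketched with pandas' vectorized .str accessor instead of a Python-level loop over rows. The sample rows and the "Level n" column names below are assumptions made for illustration (the names follow the question's wording), and the delimiter is assumed to always be " / " with surrounding spaces.

```python
import pandas as pd

# Hypothetical sample data; the second row has fewer groups than the first.
df = pd.DataFrame({'Name': ["A / B / C / D", "X / Y"]})

# Split once: a Series holding one list of group names per row.
parts = df['Name'].str.split(' / ')

for n in range(1, 4):
    # .str[:n] slices each row's list; .str.join glues the kept pieces back together.
    df['Level {}'.format(n)] = parts.str[:n].str.join(' / ')

print(df)
```

Rows with fewer than n groups simply keep whatever groups they have, since slicing a short list does not raise an error.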
