
Pandas: separate values in one column into several rows

If a column contains several values separated by commas, how can I split them into separate rows?

The sample data set:

name   age      taskID
----------------------------
AA      20      T01,T02
BB      22      T03,T02,T03
CC      24      T01,T05
DD      21      T02,T06 

Expected output:

name   age      taskID
-----------------------
AA      20      T01
AA      20      T02
BB      22      T03
BB      22      T02
BB      22      T03
CC      24      T01
CC      24      T05
DD      21      T02
DD      21      T06

For pandas 0.25 and above, you can use DataFrame.explode:

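import pandas as pd

df = pd.DataFrame([['AA', '20', 'T01,T02'], ['BB', '22', 'T03,T02,T03'], ['CC', '24', 'T01,T05'], ['DD', '21', 'T02,T06']], columns=('name', 'age', 'taskID'))

# Turn each comma-separated string into a list of task IDs
df["taskID"] = df["taskID"].str.split(",")
# explode returns a new DataFrame with one row per list element
df.explode("taskID")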
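The two steps above can also be wrapped in a small helper. This is only a sketch, not part of the original answer: the function name split_to_rows and its parameters are made up for illustration, and the trailing space in the last sample value is added here to mirror the stray space in the question's table. It assumes pandas 0.25 or later, leaves the input DataFrame unchanged, renumbers the rows (explode repeats the original index labels), and strips whitespace around each value.

```python
import pandas as pd


def split_to_rows(frame, column, sep=","):
    """Return a copy of `frame` with `column` split on `sep`, one value per row.

    Hypothetical helper for illustration; not a pandas API.
    """
    # Split each cell into a list, then give every list element its own row.
    exploded = frame.assign(**{column: frame[column].str.split(sep)}).explode(column)
    # explode repeats the original index labels, so renumber the rows from 0.
    exploded = exploded.reset_index(drop=True)
    # Strip stray whitespace around each value (e.g. "T06 " becomes "T06").
    exploded[column] = exploded[column].str.strip()
    return exploded


df = pd.DataFrame(
    [["AA", "20", "T01,T02"], ["BB", "22", "T03,T02,T03"],
     ["CC", "24", "T01,T05"], ["DD", "21", "T02,T06 "]],
    columns=("name", "age", "taskID"),
)
result = split_to_rows(df, "taskID")
print(result)
```

Because the helper uses assign rather than overwriting the column, df still holds the original comma-separated strings afterwards.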

For pandas versions below 0.25:

from itertools import chain
import numpy as np
import pandas as pd

df = pd.DataFrame([['AA', '20', 'T01,T02'], ['BB', '22', 'T03,T02,T03'], ['CC', '24', 'T01,T05'], ['DD', '21', 'T02,T06']], columns=('name', 'age', 'taskID'))
df["taskID"] = df["taskID"].str.split(",")

# Repeat every non-taskID row once per task in that row's list
arr = np.repeat(df.iloc[:,:-1].values, df["taskID"].apply(len), axis=0)
df2 = pd.DataFrame(arr, columns=df.columns[:-1])
# Flatten the lists in row order so they line up with the repeated rows
df2["taskID"] = list(chain(*df["taskID"]))

df2
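To see why the repeated rows and the flattened task list line up, here is a minimal sketch of the same idea on two rows, without a DataFrame. The variable names (rows, tasks, counts, repeated, flat) are chosen for illustration only.

```python
from itertools import chain

import numpy as np

# The non-task columns of two rows, and the already-split task lists.
rows = np.array([["AA", "20"], ["BB", "22"]], dtype=object)
tasks = [["T01", "T02"], ["T03", "T02", "T03"]]

# One repeat count per row: the number of tasks in that row.
counts = [len(t) for t in tasks]

# np.repeat with axis=0 copies row i counts[i] times, keeping row order.
repeated = np.repeat(rows, counts, axis=0)

# chain.from_iterable flattens the lists in the same row order.
flat = list(chain.from_iterable(tasks))

for (name, age), task in zip(repeated, flat):
    print(name, age, task)
```

Both operations walk the rows in the same order and emit len(tasks[i]) items for row i, so the k-th repeated row always belongs with the k-th flattened task.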

