Pandas: separate values in one column into several rows
If a column contains several values separated by commas (,), how can I split them into separate rows?
The sample data set:
name age taskID
----------------------------
AA 20 T01,T02
BB 22 T03,T02,T03
CC 24 T01,T05
DD 21 T02,T06
Desired output:
name age taskID
-----------------------
AA 20 T01
AA 20 T02
BB 22 T03
BB 22 T02
BB 22 T03
CC 24 T01
CC 24 T05
DD 21 T02
DD 21 T06
For pandas 0.25 and above, you can use DataFrame.explode:
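import pandas as pd
df = pd.DataFrame([['AA', '20', 'T01,T02'], ['BB', '22', 'T03,T02,T03'], ['CC', '24', 'T01,T05'], ['DD', '21', 'T02,T06']], columns=('name', 'age', 'taskID'))
# Turn each comma-separated string into a list of task IDs
df["taskID"] = df["taskID"].str.split(",")
# Expand every list element into its own row (needs pandas >= 0.25)
df.explode("taskID")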
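One detail worth knowing: explode repeats the original row label for every element it produces, so the result has a duplicated index (0, 0, 1, 1, 1, ...). The following is a minimal sketch, assuming you want a fresh 0..n-1 index and do not want to overwrite the original column; the variable name `result` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [["AA", "20", "T01,T02"], ["BB", "22", "T03,T02,T03"],
     ["CC", "24", "T01,T05"], ["DD", "21", "T02,T06"]],
    columns=["name", "age", "taskID"],
)

# Split into lists on a copy, explode to one row per task ID,
# then renumber the rows so the index is unique again.
result = (
    df.assign(taskID=df["taskID"].str.split(","))
      .explode("taskID")
      .reset_index(drop=True)
)
print(result)
```

Using assign leaves df itself untouched, which helps if the comma-separated form is still needed later.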
For pandas below 0.25:
from itertools import chain
import numpy as np
import pandas as pd
df = pd.DataFrame([['AA', '20', 'T01,T02'], ['BB', '22', 'T03,T02,T03'], ['CC', '24', 'T01,T05'], ['DD', '21', 'T02,T06']], columns=('name', 'age', 'taskID'))
df["taskID"] = df["taskID"].str.split(",")
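# Repeat the non-taskID columns of each row once per task ID in that row
arr = np.repeat(df.iloc[:,:-1].values, df["taskID"].apply(len), axis=0)
df2 = pd.DataFrame(arr, columns=df.columns[:-1])
# Flatten the lists of task IDs so they line up with the repeated rows
df2["taskID"] = list(chain.from_iterable(df["taskID"]))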
df2