Splitting a column in a pandas DataFrame

Question

I have a file in which one of the columns is a multi-value field, for example:

Col1|Col2
rec1|xyz#tew
rec2|
rec3|jkl#qwer

I need to split Col2 on the delimiter, and this is the code I am using:

x = ['Col1','Col2']
df[x] = df[x].apply(lambda c: c.str.split('#', expand=True))

With this code I get the following error: "AttributeError: 'Series' object has no attribute 'series'"

I tried using replace and fillna, but had no luck. Can someone please help me correct the code above?

Answer 1

First, replace the NaN values with the delimiter itself, so that the empty rows also split into two parts:

>> df["Col2"] = df["Col2"].fillna("#")

Now, split the strings in the "Col2" column:

>> df["Col2"] = df["Col2"].str.split("#", n=1)  # n=1 splits on the first "#" only, so no list has more than 2 values
>> df
   Col1         Col2
0  rec1   [xyz, tew]
1  rec2         [, ]
2  rec3  [jkl, qwer]
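
To reproduce the two steps above without the original file, here is a minimal sketch. It assumes the file is pipe-delimited as shown in the question; the in-memory buffer `raw` and the variable name `parts` are stand-ins chosen for illustration.

```python
import io

import pandas as pd

# Sample data from the question; the buffer stands in for the real file.
raw = "Col1|Col2\nrec1|xyz#tew\nrec2|\nrec3|jkl#qwer\n"
df = pd.read_csv(io.StringIO(raw), sep="|")

# The empty field in rec2 is read as NaN, so replace it with the bare delimiter.
df["Col2"] = df["Col2"].fillna("#")

# Split on the first "#" only, so no row yields more than two parts.
parts = df["Col2"].str.split("#", n=1)
print(parts.tolist())
```

Filling with "#" rather than an empty string is what makes the rec2 row split into two empty strings instead of a single one.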

Now, join your original dataframe with a new dataframe created from the lists of the previous step:

>> df = df.join(pd.DataFrame(df["Col2"].values.tolist())) # .add_prefix('col_'))

You can add a prefix if you want to name the new columns (for example, append .add_prefix('Col_') to the new dataframe before joining). Finally, drop the old "Col2" column:

>> df = df.drop("Col2", axis=1)
>> df

   Col1    0     1
0  rec1  xyz   tew
1  rec2           
2  rec3  jkl  qwer
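
As an alternative, the split can probably be done in one step by calling `str.split` with `expand=True` on the single column rather than through `apply` on several columns. The sketch below assumes the same sample data; the `Col2_` prefix and the names `split_cols` and `result` are illustrative choices, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "Col1": ["rec1", "rec2", "rec3"],
    "Col2": ["xyz#tew", None, "jkl#qwer"],
})

# expand=True returns a DataFrame with one column per part; n=1 caps it at two.
split_cols = df["Col2"].fillna("#").str.split("#", n=1, expand=True)

# Hypothetical naming: the integer column labels 0 and 1 become Col2_0 and Col2_1.
split_cols = split_cols.add_prefix("Col2_")

# Replace the original multi-value column with the two new columns.
result = df.drop(columns="Col2").join(split_cols)
print(result)
```

This avoids building an intermediate column of lists. If missing values should stay missing rather than become empty strings, the `fillna("#")` call can be left out.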
