Pandas - Split a column of dtype object (string) into a column of lists by a specified delimiter

Question

Given data in a CSV file like the one below:

A,B,C-1,D,BTP,Type C1,Type C2
0,1,0,0,0,,Type B
0,2,1,1,14,Type B,Type B
0,3,2,2,28,Type A,Type B
0,4,3,3,42,"Type A,Type B","Type A,Type B"
0,5,4,4,56,Type A,"Type A,Type B"

I'm reading this into a dataframe df. I need to split the Type C1 column by , and store the result as a list in place, so that I can do membership lookups with the in operator. Here is what I'm doing:

df["Type C1"] = df["Type C1"].str.split(",", n = 1, expand = True)

I was expecting to get a list for column Type C1; however, it was still a string, with the part after the , stripped out, as shown below.

   A  B  C-1  D  BTP Type C1        Type C2
0  0  1    0  0    0     NaN         Type B
1  0  2    1  1   14  Type B         Type B
2  0  3    2  2   28  Type A         Type B
3  0  4    3  3   42  Type A  Type A,Type B
4  0  5    4  4   56  Type A  Type A,Type B

For row #3, I was expecting [Type A,Type B] in column Type C1.

The reference I'm using for this is the Example #1 output from "Pandas Split strings into two List/Columns using str.split()".

Answer 1

You are almost correct. str.split() returns a list for each row by default; passing expand=True spreads the pieces across separate columns instead, so drop that argument (and n=1):

df['Type C1'] = df['Type C1'].str.split(',')
df['Type C2'] = df['Type C2'].str.split(',')

   A  B  C-1  D  BTP           Type C1           Type C2
0  0  1    0  0    0               NaN          [Type B]
1  0  2    1  1   14          [Type B]          [Type B]
2  0  3    2  2   28          [Type A]          [Type B]
3  0  4    3  3   42  [Type A, Type B]  [Type A, Type B]
4  0  5    4  4   56          [Type A]  [Type A, Type B]
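
To make the difference concrete, here is a minimal sketch that rebuilds the question's data from an in-memory string (an assumption, in place of the real CSV file), compares expand=True with the default, and then does the membership lookup the question asks about. The has_type helper is hypothetical and not part of pandas; it treats the missing value in row 0 as "no match".

```python
import io

import pandas as pd

# Sample data copied from the question, read from a string instead of a file.
csv_text = '''A,B,C-1,D,BTP,Type C1,Type C2
0,1,0,0,0,,Type B
0,2,1,1,14,Type B,Type B
0,3,2,2,28,Type A,Type B
0,4,3,3,42,"Type A,Type B","Type A,Type B"
0,5,4,4,56,Type A,"Type A,Type B"
'''
df = pd.read_csv(io.StringIO(csv_text))

# expand=True returns a DataFrame with one column per piece, not lists.
expanded = df["Type C1"].str.split(",", n=1, expand=True)

# The default (expand=False) keeps each row's pieces together as a list.
# Missing values stay missing rather than becoming empty lists.
df["Type C1"] = df["Type C1"].str.split(",")


def has_type(cell, wanted):
    """Hypothetical helper: True if `wanted` is in the cell's list.

    Cells that are not lists (such as the missing value) count as False.
    """
    return isinstance(cell, list) and wanted in cell


# Boolean mask built with the plain Python `in` operator, row by row.
mask = df["Type C1"].apply(lambda cell: has_type(cell, "Type B"))
rows_with_b = df[mask]
print(rows_with_b)
```

The isinstance check is one way to handle the empty Type C1 field in the first row; filling missing values with empty lists before the lookup would be another option.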

Solution 1
0 2020-06-27 05:04:22
