Pandas - split column with dtype object (string) to dtype list by specifying delimiter
Given data in a CSV like below:
A,B,C-1,D,BTP,Type C1,Type C2
0,1,0,0,0,,Type B
0,2,1,1,14,Type B,Type B
0,3,2,2,28,Type A,Type B
0,4,3,3,42,"Type A,Type B","Type A,Type B"
0,5,4,4,56,Type A,"Type A,Type B"
I'm reading this into a dataframe df. I need to split the Type C1 column on the comma delimiter (,) and store the result as a list in place, so that I can do some lookups with an %in%-style membership operator. Here is what is being done.
df["Type C1"] = df["Type C1"].str.split(",", n = 1, expand = True)
I was expecting to get a list for column Type C1; however, it was still a string, with the part from the , onward stripped out, as below.
   A  B  C-1  D  BTP  Type C1        Type C2
0  0  1    0  0    0      NaN         Type B
1  0  2    1  1   14   Type B         Type B
2  0  3    2  2   28   Type A         Type B
3  0  4    3  3   42   Type A  Type A,Type B
4  0  5    4  4   56   Type A  Type A,Type B
For row #3, I was expecting [Type A,Type B] for column Type C1.
The reference I'm using to do this is from Pandas Split strings into two List/Columns using str.split(), Example #1 output.
You are almost correct. str.split() returns a list for each row by default, that is, when expand=True is not passed:
df['Type C1'] = df['Type C1'].str.split(',')
df['Type C2'] = df['Type C2'].str.split(',')
   A  B  C-1  D  BTP           Type C1           Type C2
0  0  1    0  0    0               NaN          [Type B]
1  0  2    1  1   14          [Type B]          [Type B]
2  0  3    2  2   28          [Type A]          [Type B]
3  0  4    3  3   42  [Type A, Type B]  [Type A, Type B]
4  0  5    4  4   56          [Type A]  [Type A, Type B]
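As a rough end-to-end sketch of the above, the snippet below loads the sample data from an in-memory string (an assumption made only to keep the example self-contained; the question reads a file), contrasts expand=True with the default, and then does a membership lookup on the list column. The names CSV_TEXT and has_type are hypothetical helpers introduced for illustration. Note that the empty cell stays NaN after splitting, so the lookup has to guard against non-list values.

```python
import io

import pandas as pd

# Sample data from the question, held in a string for illustration only.
CSV_TEXT = """A,B,C-1,D,BTP,Type C1,Type C2
0,1,0,0,0,,Type B
0,2,1,1,14,Type B,Type B
0,3,2,2,28,Type A,Type B
0,4,3,3,42,"Type A,Type B","Type A,Type B"
0,5,4,4,56,Type A,"Type A,Type B"
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# With expand=True the pieces are spread over separate columns, so the
# result is a DataFrame, not a Series of lists.
expanded = df["Type C1"].str.split(",", n=1, expand=True)

# Without expand, each non-missing cell becomes a Python list.
df["Type C1"] = df["Type C1"].str.split(",")
df["Type C2"] = df["Type C2"].str.split(",")


def has_type(cell, wanted):
    """Hypothetical helper: True if `wanted` is in the list cell.

    Missing cells (NaN) are not lists, so they count as False.
    """
    return isinstance(cell, list) and wanted in cell


# Boolean mask of rows whose Type C1 list contains "Type B".
mask = df["Type C1"].apply(lambda cell: has_type(cell, "Type B"))
rows_with_type_b = df[mask]
```

The isinstance check is one simple way to handle the missing value in the first row; filling missing cells with empty lists before the lookup would be another option.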