Split several columns using pandas
I want to split the strings in several columns. For example, I'd like to extract some information from col2, col3 and col5 in the dataframe below (in reality I have more than a hundred columns to process).
import pandas as pd
d = pd.DataFrame({
'col1' : ['USA', 'AGN'],
'col2' : ['0|0:0.014:0.986,0.013,0', '1|0:0.02:1.936,0.023,1'],
'col3' : ['1|0:0.024:0.9,0.01345,2', '0|2:0.213:0.92,0.1,2'],
'col4' : ['done', 'done'],
'col5' : ['2|0:0.02:1.936,0.023,1', '1|0:0.024:0.9,0.01345,2']
})
col1 col2 col3 col4 .....
0 USA 0|0:0.014:0.986,0.013,0 1|0:0.024:0.9,0.01345,2 done .....
1 AGN 1|0:0.02:1.936,0.023,1 0|2:0.213:0.92,0.1,2 done .....
I only need the first 3 characters of each long string. I expect a result like the one below.
col1 col2 col3 col4 col5 ....
USA 0|0 1|0 done 2|0 ....
AGN 1|0 0|2 done 1|0 ....
Any hints, please?
If I understood your question correctly, you can do it this way:
In [254]: d.replace(r':.*', '', regex=True)
Out[254]:
col1 col2 col3 col4 col5
0 USA 0|0 1|0 done 2|0
1 AGN 1|0 0|2 done 1|0
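Note that the call above runs the regex over every column of the frame. If only some columns should be touched, a minimal sketch could look like the following. It assumes the target column names are known in advance; the `cols` list is introduced here for illustration.

```python
import pandas as pd

d = pd.DataFrame({
    'col1': ['USA', 'AGN'],
    'col2': ['0|0:0.014:0.986,0.013,0', '1|0:0.02:1.936,0.023,1'],
    'col3': ['1|0:0.024:0.9,0.01345,2', '0|2:0.213:0.92,0.1,2'],
    'col4': ['done', 'done'],
    'col5': ['2|0:0.02:1.936,0.023,1', '1|0:0.024:0.9,0.01345,2'],
})

# Assumption: these are the columns that hold the long strings.
cols = ['col2', 'col3', 'col5']

# Drop everything from the first ':' onward, but only in the chosen columns.
d[cols] = d[cols].replace(r':.*', '', regex=True)
print(d)
```

Columns outside `cols` are left exactly as they were, which matters if some other column happens to contain a colon.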
To get the first three characters of each string:
>>> d.col2.str[:3]
0 0|0
1 1|0
Name: col2, dtype: object
To split on ":" and take the first item:
>>> d.col2.str.split(':', expand=True)[0]
0 0|0
1 1|0
Name: 0, dtype: object
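The two approaches agree on the sample data, but they can differ when the part before the first ":" is not exactly three characters long. A small sketch of the difference, using a made-up second value (`10|2...` does not appear in the question and is only an assumption for illustration):

```python
import pandas as pd

# The second value is hypothetical: its prefix before ':' has four characters.
s = pd.Series(['0|0:0.014:0.986,0.013,0', '10|2:0.02:1.936,0.023,1'])

# Fixed-width slice: always the first three characters.
by_slice = s.str[:3]

# Split once on ':' and keep the part before it, whatever its length.
by_split = s.str.split(':', n=1).str[0]

print(by_slice.tolist())  # ['0|0', '10|']
print(by_split.tolist())  # ['0|0', '10|2']
```

If the prefix is always three characters, slicing is the simpler choice; otherwise splitting on ":" is probably safer.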
To apply it to a group of columns:
cols = ['col2', 'col3', 'col5']
d.loc[:, cols] = d.loc[:, cols].apply(lambda s: s.str[:3])
>>> d
col1 col2 col3 col4 col5
0 USA 0|0 1|0 done 2|0
1 AGN 1|0 0|2 done 1|0
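With more than a hundred columns, typing the list by hand may not be practical. One possible way to build it automatically is sketched below. The selection rule (a column qualifies if every value contains ":") is an assumption made for this example and is not something stated in the question, so adjust it to whatever actually distinguishes your columns.

```python
import pandas as pd

d = pd.DataFrame({
    'col1': ['USA', 'AGN'],
    'col2': ['0|0:0.014:0.986,0.013,0', '1|0:0.02:1.936,0.023,1'],
    'col3': ['1|0:0.024:0.9,0.01345,2', '0|2:0.213:0.92,0.1,2'],
    'col4': ['done', 'done'],
    'col5': ['2|0:0.02:1.936,0.023,1', '1|0:0.024:0.9,0.01345,2'],
})

# Hypothetical rule: pick the columns in which every value contains a literal ':'.
cols = [
    c for c in d.columns
    if d[c].astype(str).str.contains(':', regex=False).all()
]

# Keep only the part before the first ':' in each selected column.
d[cols] = d[cols].apply(lambda s: s.str.split(':', n=1).str[0])
print(cols)
print(d)
```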