Pandas: How to split on multiple delimiters?
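My dataframe stores latitude, longitude and altitude together in a single column (Coordinates). I would like to split the Coordinates column into three columns (latitude, longitude and altitude).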
df:
ID Coordinates Region
1 latitude_degrees: 52.00755721100514\nlongitude_degrees: 12.565129548994266\naltitude_meters: 185.23616827199143\n Europe
2 latitude_degrees: 52.00755721100514\nlongitude_degrees: 12.565129548994266\naltitude_meters: 185.23616827199143\n Europe
3 latitude_degrees: 52.00755721100514\nlongitude_degrees: 12.565129548994266\naltitude_meters: 185.23616827199143\n Europe
4 latitude_degrees: 52.00755721100514\nlongitude_degrees: 12.565129548994266\naltitude_meters: 185.23616827199143\n Europe
5 latitude_degrees: 52.00755721100514\nlongitude_degrees: 12.565129548994266\naltitude_meters: 185.23616827199143\n Europe
Expected output:
ID lat lon alt Region
1 52.00755721100514 12.565129548994266 185.23616827199143 Europe
2 52.00755721100514 12.565129548994266 185.23616827199143 Europe
3 52.00755721100514 12.565129548994266 185.23616827199143 Europe
4 52.00755721100514 12.565129548994266 185.23616827199143 Europe
5 52.00755721100514 12.565129548994266 185.23616827199143 Europe
What I tried:
I first tried to split the column on ":", but it did not work:
df.loc[df['Coordinates'].isin(["latitude_degrees", "longitude_degrees"])]= ""
I also tried replacing the text, but that did not work either:
df.Coordinates.replace(to_replace=['latitude_degrees','longitude_degrees'],value='')
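A likely reason neither attempt changes anything: isin and Series.replace (without regex=True) both compare against the whole cell value, so a label that is only a substring of the cell never matches. Series.replace also returns a new Series rather than modifying the column in place. The sketch below illustrates this on a shortened, hypothetical cell value (not the real data) and assumes the \n in the column is a real newline character.

```python
import pandas as pd

# Hypothetical, shortened cell value used only for illustration.
s = pd.Series(["latitude_degrees: 52.0\nlongitude_degrees: 12.5\n"])

# isin tests whole-cell equality, so a substring label never matches.
whole_match = s.isin(["latitude_degrees", "longitude_degrees"])

# Series.replace without regex=True also targets whole cell values only,
# and it returns a new Series instead of changing s in place.
unchanged = s.replace(to_replace=["latitude_degrees", "longitude_degrees"], value="")

# The .str accessor works on substrings; regex=True enables the pattern.
stripped = s.str.replace(r"(?:latitude|longitude)_degrees:\s*", "", regex=True)

print(whole_match.any())
print(unchanged.equals(s))
print(repr(stripped.iloc[0]))
```

Substring-level work therefore goes through the .str accessor (str.replace, str.extract, str.extractall) rather than isin or a plain replace.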
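Let's use extractall to pull lat, long and alt out of the Coordinates column, then unstack to reshape the matches into one row per record, and finally join the result with the ID and Region columns: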
c = df['Coordinates'].str.extractall(r'([\d.]+)')[0].unstack()
d = df[['ID', 'Region']].join(c.set_axis(['lat', 'long', 'alt'], axis=1))
ID Region lat long alt
0 1 Europe 52.00755721100514 12.565129548994266 185.23616827199143
1 2 Europe 52.00755721100514 12.565129548994266 185.23616827199143
2 3 Europe 52.00755721100514 12.565129548994266 185.23616827199143
3 4 Europe 52.00755721100514 12.565129548994266 185.23616827199143
4 5 Europe 52.00755721100514 12.565129548994266 185.23616827199143
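As a variation on the approach above, here is a self-contained sketch that uses str.extract with named groups instead of extractall plus unstack. It is not the answer's original code. It assumes the \n in the column is a real newline, that every cell lists the three labels in the same order, and it adds one hypothetical row with a negative longitude to show that the pattern accepts a leading minus sign (the pattern [\d.]+ would drop it). The values are also converted to float, which the original leaves as strings.

```python
import re

import pandas as pd

# Sample frame: the first two rows mirror the question, the third is hypothetical.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "Coordinates": [
        "latitude_degrees: 52.00755721100514\nlongitude_degrees: 12.565129548994266\naltitude_meters: 185.23616827199143\n",
        "latitude_degrees: 52.00755721100514\nlongitude_degrees: 12.565129548994266\naltitude_meters: 185.23616827199143\n",
        "latitude_degrees: 40.5\nlongitude_degrees: -74.25\naltitude_meters: 10.0\n",
    ],
    "Region": ["Europe", "Europe", "Hypothetical"],
})

# One named group per value; the group names become the column names.
number = r"-?\d+(?:\.\d+)?"
pattern = (
    rf"latitude_degrees:\s*(?P<lat>{number})"
    rf".*?longitude_degrees:\s*(?P<long>{number})"
    rf".*?altitude_meters:\s*(?P<alt>{number})"
)

# re.DOTALL lets ".*?" cross the newline between the labelled values.
parts = df["Coordinates"].str.extract(pattern, flags=re.DOTALL).astype(float)

# join aligns on the index, so each row keeps its own ID and Region.
result = df[["ID", "Region"]].join(parts)
print(result)
```

Naming the groups ties each value to its label rather than to its position among the matches, at the cost of a longer pattern. Rows that do not match the pattern come back as NaN.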