
Pandas: How to split on multiple delimiters?

I have a dataframe that contains latitude, longitude and altitude in a single column (Coordinates), and I want to split that column into three columns (latitude, longitude and altitude).

df:

ID                                                         Coordinates                                                      Region  
1     latitude_degrees: 52.00755721100514\nlongitude_degrees: 12.565129548994266\naltitude_meters: 185.23616827199143\n     Europe   
2     latitude_degrees: 52.00755721100514\nlongitude_degrees: 12.565129548994266\naltitude_meters: 185.23616827199143\n     Europe   
3     latitude_degrees: 52.00755721100514\nlongitude_degrees: 12.565129548994266\naltitude_meters: 185.23616827199143\n     Europe   
4     latitude_degrees: 52.00755721100514\nlongitude_degrees: 12.565129548994266\naltitude_meters: 185.23616827199143\n     Europe   
5     latitude_degrees: 52.00755721100514\nlongitude_degrees: 12.565129548994266\naltitude_meters: 185.23616827199143\n     Europe  

Expected output:

ID           lat                lon                     alt             Region  
1      52.00755721100514  12.565129548994266     185.23616827199143     Europe   
2      52.00755721100514  12.565129548994266     185.23616827199143     Europe   
3      52.00755721100514  12.565129548994266     185.23616827199143     Europe   
4      52.00755721100514  12.565129548994266     185.23616827199143     Europe   
5      52.00755721100514  12.565129548994266     185.23616827199143     Europe 

What I tried:

I first tried to deal with the labels that come before each ':' by blanking them out, but it's not working:

df.loc[df['Coordinates'].isin(["latitude_degrees", "longitude_degrees"])]= ""

I also tried to replace the label text, but that's not working either:

df.Coordinates.replace(to_replace=['latitude_degrees','longitude_degrees'],value='')

Let's use extractall to extract lat, long and alt from the Coordinates column, then unstack the result to reshape it, and finally join it with the ID and Region columns:

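# Extract every run of digits/dots per row, then unstack the matches into columns 0, 1, 2
c = df['Coordinates'].str.extractall(r'([\d.]+)')[0].unstack()
# Name the three columns (axis must be passed by keyword in pandas 2.x) and join on the index
d = df[['ID', 'Region']].join(c.set_axis(['lat', 'long', 'alt'], axis=1))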

   ID  Region                lat                long                 alt
0   1  Europe  52.00755721100514  12.565129548994266  185.23616827199143
1   2  Europe  52.00755721100514  12.565129548994266  185.23616827199143
2   3  Europe  52.00755721100514  12.565129548994266  185.23616827199143
3   4  Europe  52.00755721100514  12.565129548994266  185.23616827199143
4   5  Europe  52.00755721100514  12.565129548994266  185.23616827199143
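
If the values can be negative, or you want floats rather than strings, a variant with str.extract and named groups might be easier to adapt. This is only a sketch under a few assumptions: the "\n" shown in the question is a real newline character, the three labels always appear in this order, and the sample frame (including the second row with made-up values) is just there to make the snippet runnable.

```python
import pandas as pd

# Sample data modelled on the question; the second row is a made-up
# example to show that negative values are handled.
df = pd.DataFrame({
    "ID": [1, 2],
    "Coordinates": [
        "latitude_degrees: 52.00755721100514\n"
        "longitude_degrees: 12.565129548994266\n"
        "altitude_meters: 185.23616827199143\n",
        "latitude_degrees: -33.5\n"
        "longitude_degrees: 151.25\n"
        "altitude_meters: 12.0\n",
    ],
    "Region": ["Europe", "Oceania"],
})

# One named group per value; \s* also consumes the newline between fields.
pattern = (
    r"latitude_degrees:\s*(?P<lat>-?[\d.]+)\s*"
    r"longitude_degrees:\s*(?P<lon>-?[\d.]+)\s*"
    r"altitude_meters:\s*(?P<alt>-?[\d.]+)"
)

# str.extract returns one column per named group; convert the strings to floats.
coords = df["Coordinates"].str.extract(pattern).astype(float)

# Put the new columns between ID and Region, as in the expected output.
result = pd.concat([df[["ID"]], coords, df[["Region"]]], axis=1)
print(result)
```

Rows whose Coordinates string does not match the pattern end up as NaN in all three columns, which makes malformed entries easy to spot.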
