[英]how to pick multiple columns from pandas df and store it in another df?
I have a dataframe having the following keys:我有一个具有以下键的数据框:
tourist_spot,
nearest landmark,
longitute(landmark),
latitude(landmark),
nearest police station,
longitute(PS),
latitude(PS),
nearest bus stop,
longitute(BS),
latitude(BS),
nearest taxi stand,
longitute(TS),
latitude(TS)
and several other columns like this.和其他几个类似的列。 What I want to achieve is something like this:
我想要实现的是这样的:
name,
type,
latitude,
longitude,
nearest_to
The name
will have the name of the landmark or PS or BS etc. Type
will indicate the type of place. name
将包含地标或 PS 或 BS 等的名称。 Type
将指示地点的类型。 For example, PS for police station etc. latitude
and longitude
will be the locations of each place and nearest_to
will be the name of the tourist_spot
.例如,PS为派出所等
latitude
和longitude
将是每个地方的位置和nearest_to
将是名tourist_spot
。
Sample data:样本数据:
tourist_spot ts1
nearest landmark nl1
longitute(landmark) 4
latitude(landmark) 5
nearest police station ps1
longitute(PS) 7
latitude(PS) 8
nearest bus stop bs1
longitute(BS) 9
latitude(BS) 10
nearest taxi stand ts1
longitute(TS) 11
latitude(TS) 12
Please convert into a table like structure.请转换为类似表格的结构。 Its tough to make a big table here.
在这里做一张大桌子很难。
Desired output:期望的输出:
name type longitude latitude nearest_to
nl1 landmark 4 5 ts1
ps1 PS 7 8 ts1
My code so far:到目前为止我的代码:
import pandas as pd
df = pd.read_excel("C:\\Users\\NCOG1\\Desktop\\Meghalaya\\Daribokgre.xlsx")
df1 = pd.DataFrame(columns=['name','type_id','longitude','latitude', 'nearby_to'])
df1['name'] = df['Nearest Landmark Name']
df1['type_id'] = df['Nearest Landmark Name']
df1['longitude'] = df['Longitude (of Nearest Landmark)']
df1['latitude'] = df['Latitude (of Nearest Landmark)']
df1['nearby_to'] = df['Name of Tourist Spot']
First is necessary parse columns with Latitude
and Longitude
with one column before:首先是必要的解析列与
Latitude
和Longitude
与前一列:
df = pd.read_excel('Daribokgre.xlsx', sheet_name='Sheet2', index_col=0)
c = df.columns[df.columns.str.startswith('Latitude')]
idx = df.columns.get_indexer(c)
df1 = df.iloc[:, np.sort(np.r_[idx-1, idx, idx + 1])[1:]].reset_index()
#print (df1)
Then extract data to new columns in helper DataFrame:然后将数据提取到助手 DataFrame 中的新列:
df2 = df1.columns.to_frame(index=False)
m = df2[0].str.contains('Latitude|Longitude')
df2['type'] = df2[0].mask(m).ffill()
df2['b'] = np.where(m, df2[0].str.split(r'\s+\(', n=1).str[0], 'name')
Last create MultiIndex
and reshape by stack
:最后创建
MultiIndex
并按stack
重塑:
df1.columns = [df2['type'], df2['b']]
df1 = (df1.stack(0)
.reset_index(level=0, drop=True)
.reset_index().rename(columns={1:'type'})
.rename_axis(None, axis=1))
print (df1)
type Latitude \
0 Bus Stop 25⁰33'05" N
1 Landmark 25⁰33'05" N
2 Name of Tourist Spot 25⁰29'24" N
3 Nearest Hospital Name 25⁰36'01" N
4 Nearest Taxi Stand Name 25⁰30'45" N
5 Police Station 25⁰35'53" N
6 Restaurants or other eating joints nearby 2529'22' N
7 Staying Options nearby (Hotels, Resorts, Homes... 25⁰29'22" N
Longitude name
0 90⁰19'42" E Oragitok Jnc. Old Williamnagar-Tura road
1 90⁰19'42" E Oragitok Williamnagar-Tura old road
2 90⁰19'22" E Daribokgre
3 90⁰16'25" E Asanang PHC
4 90⁰13'01" E Tura Bazaar, Tura
5 90⁰15'14" E Rongram Beat House whoever Daribokgre falls un...
6 9019'21" E Daribokgre Homestay
7 90⁰19'21" E 1. Daribokgre Homestay,
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.