Summarize df by unique values from two different columns
Say I have the following df:
| Origin | Lat | Long | Destination | Lat | Long |
|---|---|---|---|---|---|
| A | 1 | 3 | B | 5 | 3 |
| A | 1 | 3 | C | 7 | 3 |
| B | 5 | 3 | A | 1 | 3 |
| B | 5 | 3 | C | 7 | 3 |
I need to get the df into the following shape:
| Unique Location | Lat | Long |
|---|---|---|
| A | 1 | 3 |
| B | 5 | 3 |
| C | 7 | 3 |
Is there a quick way to do that using NumPy/pandas? I tried splitting the data into two dfs and then joining them back together, but that seems inefficient.
Use pd.concat and drop_duplicates:
>>> pd.concat([df.iloc[:, :3].rename(columns={'Origin': 'Unique Location'}),
...            df.iloc[:, 3:].rename(columns={'Destination': 'Unique Location'})]) \
...     .drop_duplicates().reset_index(drop=True)
  Unique Location  Lat  Long
0               A    1     3
1               B    5     3
2               C    7     3
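
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the frame really has duplicated `Lat`/`Long` column names, as the table in the question suggests, and the names `origins`, `destinations`, and `unique_locations` are only illustrative. Slicing by position with `iloc` avoids the ambiguity that selecting `Lat` or `Long` by label would cause.

```python
import pandas as pd

# Sample data from the question. Assumption: the real frame has the
# duplicated 'Lat' and 'Long' column names shown in the question's table.
df = pd.DataFrame(
    [["A", 1, 3, "B", 5, 3],
     ["A", 1, 3, "C", 7, 3],
     ["B", 5, 3, "A", 1, 3],
     ["B", 5, 3, "C", 7, 3]],
    columns=["Origin", "Lat", "Long", "Destination", "Lat", "Long"],
)

# Take each half by position, since the labels 'Lat' and 'Long' are ambiguous,
# and give both halves the same column names so they stack cleanly.
origins = df.iloc[:, :3].rename(columns={"Origin": "Unique Location"})
destinations = df.iloc[:, 3:].rename(columns={"Destination": "Unique Location"})

# Stack the halves, drop repeated (location, lat, long) rows, renumber the index.
unique_locations = (
    pd.concat([origins, destinations])
    .drop_duplicates()
    .reset_index(drop=True)
)
print(unique_locations)
```

Note that `drop_duplicates()` compares whole rows, so a location that appears with two different coordinate pairs would be kept twice. That does not happen in the sample data, but it may be worth checking on real data.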