Subset a pandas DataFrame with a corresponding NumPy array
I have a pandas DataFrame with the following columns.
order_id latitude
0 519 19.119677
1 519 19.119677
2 520 19.042117
3 520 19.042117
4 520 19.042117
5 521 19.138245
6 523 19.117662
7 523 19.117662
8 523 19.117662
9 523 19.117662
10 523 19.117662
11 524 19.137793
12 525 19.119372
13 526 0.000000
14 526 0.000000
15 526 0.000000
16 527 19.133430
17 528 0.000000
18 529 19.118284
19 530 0.000000
20 531 19.114269
21 531 19.114269
22 532 19.136292
23 533 19.119075
24 533 19.119075
25 533 19.119075
26 534 19.119677
27 535 19.119677
28 535 19.119677
29 535 19.119677
The order_id values are repeated. I want the unique order_id values, which I can get with:
unique_order_id = pd.unique(tsp_data['order_id'])
array(['519', '520', '521', '523', '524', '525', '526', '527', '528',
'529', '530', '531', '532', '533', '534', '535'], dtype=object)
This returns the correct unique values, which I store in the unique_order_id variable. Now I want only the latitude value corresponding to each unique order_id.
I am doing something like this:
tsp_data['latitude'][tsp_data['order_id'].isin(unique_order_id)]
But it returns all 30 rows. Where am I going wrong? Please help.
You could use pd.pivot_table, which aggregates latitude by order_id. Its default aggregation is the mean, which equals the first value here because latitude is the same for every row of a given order_id:
In [184]: tsp_data.pivot_table(index='order_id', values='latitude')
Out[184]:
order_id
519 19.119677
520 19.042117
521 19.138245
523 19.117662
524 19.137793
525 19.119372
526 0.000000
527 19.133430
528 0.000000
529 19.118284
530 0.000000
531 19.114269
532 19.136292
533 19.119075
534 19.119677
535 19.119677
Name: latitude, dtype: float64
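If you would rather not rely on the default mean happening to match, you can ask for the first value explicitly. A minimal sketch, assuming a shortened copy of the question's data (tsp_small is a hypothetical name, and order_id is kept as strings to match the dtype=object output above):

```python
import pandas as pd

# Shortened copy of the question's data; tsp_small is a hypothetical name.
tsp_small = pd.DataFrame({
    "order_id": ["519", "519", "520", "520", "520", "521"],
    "latitude": [19.119677, 19.119677, 19.042117,
                 19.042117, 19.042117, 19.138245],
})

# aggfunc="first" takes the first latitude seen for each order_id
# instead of averaging the group.
first_lat = tsp_small.pivot_table(
    index="order_id", values="latitude", aggfunc="first"
)
print(first_lat)
```

The result is indexed by order_id with a single latitude column, so first_lat["latitude"] gives one value per order.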
Or you could use drop_duplicates:
In [185]: tsp_data.drop_duplicates(subset=['order_id'])
Out[185]:
order_id latitude
0 519 19.119677
2 520 19.042117
5 521 19.138245
6 523 19.117662
11 524 19.137793
12 525 19.119372
13 526 0.000000
16 527 19.133430
17 528 0.000000
18 529 19.118284
19 530 0.000000
20 531 19.114269
22 532 19.136292
23 533 19.119075
26 534 19.119677
27 535 19.119677
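As for why the original attempt returns all 30 rows: every order_id in the column is, by construction, one of the values in unique_order_id, so the isin mask is True for every row. A rough sketch of that, again on a shortened copy of the data (tsp_small is a hypothetical name), together with a mask that keeps only the first row per order_id:

```python
import pandas as pd

# Shortened copy of the question's data; tsp_small is a hypothetical name.
tsp_small = pd.DataFrame({
    "order_id": ["519", "519", "520", "520", "520", "521"],
    "latitude": [19.119677, 19.119677, 19.042117,
                 19.042117, 19.042117, 19.138245],
})

unique_order_id = pd.unique(tsp_small["order_id"])

# Every row's order_id is in the unique set, so this mask selects everything.
all_true_mask = tsp_small["order_id"].isin(unique_order_id)

# duplicated() flags repeats after the first occurrence; negating it
# keeps exactly one row per order_id.
first_row_mask = ~tsp_small["order_id"].duplicated()
unique_lat = tsp_small.loc[first_row_mask, "latitude"]
print(unique_lat)
```

This selects the same rows as drop_duplicates(subset=['order_id']), but returns only the latitude column.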
Or groupby, as @EdChum suggested.
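The exact groupby call suggested is not shown here, so the following is only one plausible form, assuming the first latitude per order_id is what is wanted. It uses a shortened copy of the data (tsp_small is a hypothetical name) and also produces a NumPy array lined up with unique_order_id:

```python
import pandas as pd

# Shortened copy of the question's data; tsp_small is a hypothetical name.
tsp_small = pd.DataFrame({
    "order_id": ["519", "519", "520", "520", "520", "521"],
    "latitude": [19.119677, 19.119677, 19.042117,
                 19.042117, 19.042117, 19.138245],
})

unique_order_id = pd.unique(tsp_small["order_id"])

# sort=False keeps groups in order of first appearance, which is the
# same order pd.unique returns.
first_lat = tsp_small.groupby("order_id", sort=False)["latitude"].first()

# Align explicitly with unique_order_id and convert to a NumPy array.
lat_array = first_lat.reindex(unique_order_id).to_numpy()
print(lat_array)
```

Here lat_array[i] is the latitude for unique_order_id[i], so the two arrays can be used side by side.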