Sorting issue with pandas DataFrame (or Python list/tuple)
I have a pandas DataFrame that looks like this:
import pandas as pd
data = [
(638009197035522, 655784141500417), # 0
(693075572527105, 693075572527105), # 1
(655784141500417, 693668642918400), # 2
(693075572527105, 694397537353729), # 3
(694397537353729, 695737600794624), # 4
(695737600794624, 700168400654337), # 5
(693075572527105, 929811762360322), # 6
(929811762360322, 931830115979265), # 7
(931830115979265, 951912745500672), # 8
(951912745500672, 965073687117824)] # 9
pd.DataFrame(data, columns=['reference', 'uid'])
It is sorted by the second column (uid). What I want to achieve, however, is to sort (or rebuild) the DataFrame in the following way:
[(638009197035522, 655784141500417), # 0->0
(655784141500417, 693668642918400), # 2->1
(693075572527105, 693075572527105), # 1->2
(693075572527105, 694397537353729), # 3->3
(694397537353729, 695737600794624), # 4->4
(693075572527105, 929811762360322), # 6->5
(695737600794624, 700168400654337), # 5->6
(929811762360322, 931830115979265), # 7->7
(931830115979265, 951912745500672), # 8->8
(951912745500672, 965073687117824)] # 9->9
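In other words, the value in the second column (uid) determines which row comes next in the DataFrame/list, but not always, as you can see. In its original shape the data is sorted by the uid column, until there is a row whose reference key points to that uid.
The solution does not have to use pandas/DataFrame; a pure Python solution is fine too.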
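To make the "uid determines the next row" relationship concrete, here is a minimal pure Python sketch that lists, for each row, the rows whose reference equals that row's uid. The names `rows_by_reference` and `followers` are made up for illustration, and the sketch only exposes the links between rows; it does not by itself produce the target ordering.

```python
from collections import defaultdict

data = [
    (638009197035522, 655784141500417),  # 0
    (693075572527105, 693075572527105),  # 1
    (655784141500417, 693668642918400),  # 2
    (693075572527105, 694397537353729),  # 3
    (694397537353729, 695737600794624),  # 4
    (695737600794624, 700168400654337),  # 5
    (693075572527105, 929811762360322),  # 6
    (929811762360322, 931830115979265),  # 7
    (931830115979265, 951912745500672),  # 8
    (951912745500672, 965073687117824),  # 9
]

# Map each reference value to the row numbers that carry it.
rows_by_reference = defaultdict(list)
for i, (reference, uid) in enumerate(data):
    rows_by_reference[reference].append(i)

# For every row, collect the rows whose reference equals this row's uid.
# .get() is used so that looking up a missing uid does not add a key.
followers = {
    i: rows_by_reference.get(uid, [])
    for i, (reference, uid) in enumerate(data)
}

for i, next_rows in followers.items():
    print(i, "->", next_rows)
```

With this data, the uid of row 1 is referenced by rows 1, 3 and 6, so the uid alone does not pick a unique next row, while rows 2, 5 and 9 have no follower at all.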
df = pd.DataFrame(data, columns=['reference', 'uid'])
df.sort_values(by="reference", inplace=True)
df
         reference              uid
0  638009197035522  655784141500417
2  655784141500417  693668642918400
1  693075572527105  693075572527105
3  693075572527105  694397537353729
6  693075572527105  929811762360322
4  694397537353729  695737600794624
5  695737600794624  700168400654337
7  929811762360322  931830115979265
8  931830115979265  951912745500672
9  951912745500672  965073687117824
Then sort further, using:
df['uid'].isin(df['reference'])
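As a rough illustration of what that mask contains, the sketch below sorts by reference and then evaluates the `isin` expression. Two choices here are assumptions and not part of the original: `kind="mergesort"` is used so that rows sharing a reference keep their original relative order, and the names `by_ref`, `has_follower` and `chain_ends` are invented for this example. Note that sorting by reference alone does not reach the target order; rows 4 and 6 come out swapped relative to it.

```python
import pandas as pd

data = [
    (638009197035522, 655784141500417),  # 0
    (693075572527105, 693075572527105),  # 1
    (655784141500417, 693668642918400),  # 2
    (693075572527105, 694397537353729),  # 3
    (694397537353729, 695737600794624),  # 4
    (695737600794624, 700168400654337),  # 5
    (693075572527105, 929811762360322),  # 6
    (929811762360322, 931830115979265),  # 7
    (931830115979265, 951912745500672),  # 8
    (951912745500672, 965073687117824),  # 9
]
df = pd.DataFrame(data, columns=["reference", "uid"])

# mergesort is a stable sort, so tied reference values keep their input order.
by_ref = df.sort_values(by="reference", kind="mergesort")

# True where the row's uid is used as the reference of some row.
has_follower = by_ref["uid"].isin(by_ref["reference"])

# Rows whose uid nothing points to: the ends of the chains.
chain_ends = by_ref.index[~has_follower].tolist()

print(by_ref.index.tolist())
print(chain_ends)
```

The mask is False only for rows 2, 5 and 9, so it marks where a chain stops; how to use that to interleave the chains is left open in the original.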