Creating a sorted list of unique numbers from a DataFrame

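I write keywords and their corresponding page numbers to a text file via LaTeX and then process the file with Python. How can I create a sorted list of page numbers for each keyword?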

The following code gives me a list of unique page numbers, but it is not sorted.

import pandas as pd

def unique(liste):
    a = liste.split(',')
    a = [int(numeric_string) for numeric_string in a]
    a = sorted(a)
    a = map(str,a)
    b = set(a)
    return ','.join(b)

df = pd.DataFrame({'keyword': ["foo","foo","foo","foo","foo","foo","foo","foo","bar","bar","bar"], "page": [1,2,3,3,4,5,6,7,7,9,10]})
df['page'] = df['page'].astype(str)
print(df)

grouped = df.groupby('keyword',as_index=False).agg(lambda col: ','.join(col))
grouped = pd.DataFrame(grouped)
grouped['unique'] = grouped['page'].apply(unique)
print(grouped)

This produces:

   keyword page
0      foo    1
1      foo    2
2      foo    3
3      foo    3
4      foo    4
5      foo    5
6      foo    6
7      foo    7
8      bar    7
9      bar    9
10     bar   10
  keyword             page         unique
0     bar           7,9,10         9,7,10
1     foo  1,2,3,3,4,5,6,7  3,7,6,4,5,2,1

import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'keyword': ["foo","foo","foo","foo","foo","foo","foo","foo","bar","bar","bar"], 
     "page": [1,2,3,3,4,5,6,7,7,9,10]})

# df['page'] = df['page'].astype(int)
result = df.groupby(['keyword'])['page'].agg(lambda x: ','.join(np.unique(x).astype(str)))

print(result)

yields

keyword
bar           7,9,10
foo    1,2,3,4,5,6,7
Name: page, dtype: object

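  • np.unique returns a sorted array of the unique values. We want the page values to be sorted as integers, not as strings, so keep the page values as integers. After calling np.unique, you can convert the result to strings with astype(str) and then join them with ','.join.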
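
To illustrate the point about sorting integers rather than strings, here is a small sketch. The helper name unique_sorted and the shortened sample data are made up for this example and are not part of the original question or answer. It also shows one way the question's helper could be repaired: a set has no defined order, so deduplication has to happen before sorting, not after.

```python
import numpy as np
import pandas as pd


def unique_sorted(pages):
    """Hypothetical helper: deduplicate first, then sort numerically."""
    numbers = {int(p) for p in pages.split(',')}
    return ','.join(str(n) for n in sorted(numbers))


# Sorting strings is lexicographic, so "10" comes before "7".
as_strings = sorted({"7", "9", "10"})

# np.unique on integers deduplicates and sorts numerically.
as_ints = np.unique([7, 9, 10, 7]).tolist()

# Made-up sample data, deliberately out of order and with a duplicate.
df = pd.DataFrame({
    'keyword': ["foo", "foo", "foo", "bar", "bar", "bar"],
    'page': [3, 1, 3, 10, 7, 9],
})
result = df.groupby('keyword')['page'].agg(
    lambda x: ','.join(np.unique(x).astype(str)))

print(as_strings)
print(as_ints)
print(unique_sorted("1,2,3,3,4,5,6,7"))
print(result)
```

If NumPy is not wanted, the same lambda could call sorted(set(x)) on the integer column and convert each value with str before joining.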

