How can I sort my links in descending order (I have a value for each link, (num_to_words(v)))

Question

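I'm building a web crawler, and now I need a sorting algorithm that sorts my links in descending order, so I can see which link appears most often on this web page. This is the code I've written in Python: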

import requests
from bs4 import BeautifulSoup
from collections import defaultdict

all_links = defaultdict(int)

def webpages():
    url = 'http://www.hm.com/lv/department/MEN'
    source_code = requests.get(url)
    text = source_code.text
    # Name the parser explicitly to avoid BeautifulSoup's "no parser specified" warning.
    soup = BeautifulSoup(text, 'html.parser')
    for link in soup.findAll('a', {'class': ' ', 'rel': 'nofollow'}):
        href = link.get('href')
        print(href)
        get_single_item_data(href)
    return all_links

def get_single_item_data(item_url):
    source_code = requests.get(item_url)
    text = source_code.text
    soup = BeautifulSoup(text, 'html.parser')
    for link in soup.findAll('a'):
        href = link.get('href')
        # Count only absolute links; href is None when the attribute is missing.
        if href and href.startswith('http://www.'):
            all_links[href] += 1
            print(href)



units = ["", "one", "two", "three", "four", "five",
         "six", "seven", "eight", "nine"]
teens = ["", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
tens = ["", "ten", "twenty", "thirty", "forty",
        "fifty", "sixty", "seventy", "eighty", "ninety"]
thousands = ["", "thousand", "million", "billion", "trillion",
             "quadrillion", "quintillion", "sextillion", "septillion", "octillion",
             "nonillion", "decillion", "undecillion", "duodecillion", "tredecillion",
             "quattuordecillion", "quindecillion", "sexdecillion", "septendecillion",
             "octodecillion", "novemdecillion", "vigintillion"]


def num_to_words(n):
    words = []
    if n == 0:
        words.append("zero")
    else:
        num_str = "{}".format(n)
        groups = (len(num_str) + 2) // 3
        num_str = num_str.zfill(groups * 3)
        for i in range(0, groups * 3, 3):
            h = int(num_str[i])
            t = int(num_str[i + 1])
            u = int(num_str[i + 2])
            g = groups - (i // 3 + 1)  # index into thousands for this group
            if h >= 1:
                words.append(units[h])
                words.append("hundred")
                if t or u:  # add "and" when tens or units follow, e.g. "one hundred and ten"
                    words.append("and")
            if t > 1:
                words.append(tens[t])
                if u >= 1:
                    words.append(units[u])
            elif t == 1:
                if u >= 1:
                    words.append(teens[u])
                else:
                    words.append(tens[t])
            else:
                if u >= 1:
                    words.append(units[u])

            if g >= 1 and (h + t + u) > 0:
                words.append(thousands[g])
    return " ".join(words)

for k, v in webpages().items():
    print(k, num_to_words(v))

Answer 1

If they are stored in an array (a Python list), you can sort it.
For example:

# A list
a = [6, 2, 9, 3]
# Sort the list in place (ascending by default)
a.sort()

This link may also help: link about sorting
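To apply this idea to the link counts from the question, one option is to copy the dictionary into a list of (count, link) tuples and sort that list. The following is only a minimal sketch: `link_counts` and its example.com URLs are hypothetical sample data standing in for the dictionary returned by `webpages()`.

```python
# Hypothetical sample data standing in for the dict returned by webpages().
link_counts = {
    "http://www.example.com/a": 3,
    "http://www.example.com/b": 7,
    "http://www.example.com/c": 1,
}

# Build a list of (count, link) tuples; tuples compare element by element,
# so sorting this list orders primarily by count.
pairs = [(count, link) for link, count in link_counts.items()]
pairs.sort(reverse=True)  # in place, largest count first

for count, link in pairs:
    print(link, count)
```

Putting the count first in each tuple is what makes a plain `sort()` work here; with (link, count) tuples the list would be ordered by URL instead.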

Answer 2

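Use the sort function in Python.

Help on the built-in sort method (copied from Python's help; this is the Python 2 signature, Python 3 no longer accepts the cmp argument):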

sort(...)
    L.sort(cmp=None, key=None, reverse=False) -- stable sort *IN PLACE*;
    cmp(x, y) -> -1, 0, 1

To sort in reverse order, use this:

>>> L = [1, 2, 3, 4]
>>> L.sort(reverse=True)
>>> L
[4, 3, 2, 1]
>>>

You can also supply a custom function (the key argument) to control how items are compared.
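As an illustration of a custom key, the sketch below sorts (link, count) pairs by count in descending order and breaks ties alphabetically by link. The `pairs` list and its example.com URLs are hypothetical sample data, not output from the question's crawler.

```python
# Hypothetical sample data: (link, count) pairs, two of them tied on count.
pairs = [
    ("http://www.example.com/b", 2),
    ("http://www.example.com/a", 5),
    ("http://www.example.com/c", 5),
]

# Sort by count descending; for equal counts, by link ascending.
# Negating the count lets a single ascending sort express both rules.
pairs.sort(key=lambda pair: (-pair[1], pair[0]))

print(pairs)
```

Using `reverse=True` instead would also reverse the alphabetical tie-break, which is why the count is negated in the key.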

sort sorts the list in place; if you don't want that, use sorted:

>>> L=[1,2,3,4]
>>> sorted(L,reverse=True)
[4, 3, 2, 1]
>>> L
[1, 2, 3, 4]
>>>

Answer 3

dct = webpages()
for k in sorted(dct, key=dct.get, reverse=True):
    print(k, num_to_words(dct[k]))

Or sort the items using itemgetter:

from operator import itemgetter
for k, v in sorted(webpages().items(), key=itemgetter(1), reverse=True):
    print(k, num_to_words(v))
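A possible alternative, not part of the answers above, is to count with collections.Counter instead of defaultdict(int); its most_common() method returns the entries already ordered from highest to lowest count. This is a sketch under that assumption, and the `hrefs` list is hypothetical sample data standing in for the links the crawler collects.

```python
from collections import Counter

# Hypothetical sample data standing in for the hrefs collected by the crawler.
hrefs = [
    "http://www.example.com/a",
    "http://www.example.com/b",
    "http://www.example.com/a",
    "http://www.example.com/c",
    "http://www.example.com/a",
    "http://www.example.com/b",
]

link_counts = Counter(hrefs)  # counts each distinct link

# most_common() returns (link, count) pairs, highest count first.
ranked = link_counts.most_common()
for link, count in ranked:
    print(link, count)
```

A Counter also supports `link_counts[href] += 1`, so it could replace `all_links` in the question's code without changing the counting loop.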
