TypeError: unorderable types: int() < str()

An error occurs when I apply the 5W1H extractor (an open-source library hosted on Git) to my JSON news dataset.

The error is raised in _evaluate_locations when it tries to run:

raw_locations.sort(key=lambda x: x[1], reverse=True)

The console then reports:

TypeError: unorderable types: int() < str()

My question is: does this mean something is wrong with my dataset format? But if so, shouldn't the extractor treat all the news data as one long string when it works on this corpus? I'm eagerly looking for a solution to this problem.

This is one of the JSON news records:

{
"title": "Football: Van Dijk, Ronaldo and Messi shortlisted for FIFA award",
"body": "ROME: Liverpool centre-back Virgil van Dijk is on the shortlist to add FIFA's best player award to his UEFA Men's Player of the Year honour.The Dutch international denied Cristiano Ronaldo and Lionel Messi for the European title last week and the same trio are in the running for the FIFA accolade to be announced in Milan on September 23.    Van Dijk starred in Liverpool's triumphant Champions League campaign.England full-back Lucy Bronze won UEFA's women's award and is on FIFA's shortlist with the United States' World Cup-winning duo Megan Rapinoe and Alex Morgan.Manchester City boss Pep Guardiola is up against Liverpool's Jurgen Klopp and Mauricio Pochettino of Tottenham for best men's coach.Phil Neville, who led England's women to a World Cup semi-final, is up for the women's coach award with the USA's Jill Ellis and Sarina Wiegman who guided European champions the Netherlands to the World Cup final.    FIFA Best shortlistsMen's player:Cristiano Ronaldo (Juventus/Portugal), Lionel Messi (Barcelona/Argentina), Virgil van Dijk  player:Lucy Bronze (Lyon/England), Alex Morgan (Orlando Pride/USA), Megan Rapinoe (Reign FC/USA)Men's coach:Pep Guardiola (Manchester City), Jurgen Klopp (Liverpool), Mauricio Pochettino (Tottenham)Women's coach:Jill Ellis (USA), Phil Neville (England), Sarina Wiegman (Netherlands)Women's goalkeeper:Christiane Endler (Paris St-Germain/Chile), Hedvig Lindahl (Wolfsburg/Sweden), Sari van Veenendaal (Atletico Madrid/Netherlands)Men's goalkeeper:Alisson (Liverpool/Brazil), Ederson (Manchester City/Brazil), Marc-Andre ter Stegen (Barcelona/Germany)Puskas award (for best goal):Lionel Messi (Barcelona v Real Betis), Juan Quintero (River Plate v Racing Club), Daniel Zsori (Debrecen v Ferencvaros)",
"published_at": "2019-09-02"
} 

Code:

import json
import logging
import re

# Giveme5W1H imports (module paths assumed from the package layout shown in the traceback)
from Giveme5W1H.extractor.document import Document
from Giveme5W1H.extractor.extractor import MasterExtractor

if __name__ == '__main__':
    # load the news dataset (a JSON list of articles)
    with open("./Labeled.json", "r", encoding="utf-8") as json_file:
        data = json.load(json_file)

    # logger setup
    log = logging.getLogger('GiveMe5W')
    log.setLevel(logging.DEBUG)
    sh = logging.StreamHandler()
    sh.setLevel(logging.DEBUG)
    log.addHandler(sh)

    # giveme5w setup - with defaults
    extractor = MasterExtractor()

    for i in range(0, 1000):
        body = str(data[i]["body"])
        # use the first line of the body as the lead paragraph
        head = re.search("^[^\n]*", body).group(0)
        title = str(data[i]["title"])
        published_at = str(data[i]["published_at"])

        doc1 = Document(title, head, body, published_at)
        doc = extractor.parse(doc1)

Instead of returning the extracted time and location results, it gave me this error:

Traceback (most recent call last):
  File "/usr/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.5/dist-packages/Giveme5W1H/extractor/extractor.py", line 20, in run
    extractor.process(document)
  File "/usr/local/lib/python3.5/dist-packages/Giveme5W1H/extractor/extractors/abs_extractor.py", line 41, in process
    self._evaluate_candidates(document)
  File "/usr/local/lib/python3.5/dist-packages/Giveme5W1H/extractor/extractors/environment_extractor.py", line 75, in _evaluate_candidates
    locations = self._evaluate_locations(document)
  File "/usr/local/lib/python3.5/dist-packages/Giveme5W1H/extractor/extractors/environment_extractor.py", line 224, in _evaluate_locations
    raw_locations.sort(key=lambda x: x[1], reverse=True)
TypeError: unorderable types: int() < str()

raw_locations is built in the same file, at line 219:

raw_locations.append([parts, location.raw['place_id'], location.point, bb, area, 0, 0, candidate, 0])
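To see which types actually land in that second slot, a small diagnostic along these lines could help. It is only a sketch: the two rows are hypothetical stand-ins shaped like the append call above, not real extractor output.

```python
from collections import Counter

# Hypothetical stand-in rows; only index 1 (the place_id slot) matters here.
raw_locations = [
    [["Rome"], 12345, None, None, 0, 0, 0, None, 0],
    [["Milan"], "67890", None, None, 0, 0, 0, None, 0],
]

# Tally the Python type found in the place_id slot of every row.
type_counts = Counter(type(row[1]).__name__ for row in raw_locations)
print(type_counts)
```

If the tally shows more than one type, the sort key is mixing int and str values.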

The second element of each entry is the place_id, so the sort call orders the locations by place_id. Python 3 refuses to compare an int with a str, which is exactly what the error message says. Check whether the place_id values in your data include both strings and numbers. If so, convert all entries to a single type before sorting.
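One way to do the conversion is in the sort key itself. The following is a minimal sketch, assuming every place_id is either an int or a string of digits; place_id_key is a hypothetical helper, not part of Giveme5W1H, and the rows are made-up samples.

```python
def place_id_key(row):
    """Return the place_id slot (index 1) as an int so all keys are comparable."""
    return int(row[1])


# Hypothetical sample rows: the place_id is sometimes an int, sometimes a str.
raw_locations = [
    [["Rome"], 12345],
    [["Milan"], "67890"],
    [["Turin"], 500],
]

# Same call as in the traceback, but with a key that normalizes the type first.
raw_locations.sort(key=place_id_key, reverse=True)
print([row[1] for row in raw_locations])
```

If some identifiers are not numeric, using str(row[1]) as the key would also avoid the TypeError, at the cost of ordering lexicographically instead of numerically.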
