How to fix: TypeError: normalize() argument 2 must be str, not list
I'm making an API call that pulls the desired endpoints from ...url/articles.json and transforms them into a CSV file. My problem is that the ['label_names'] endpoint is a string with multiple values (an article might have multiple labels). How can I pull the multiple values without getting this error?

File "articles_labels.py", line 40, in <module>
    decode_3 = unicodedata.normalize('NFKD', article_label)
TypeError: normalize() argument 2 must be str, not list

import requests
import csv
import unicodedata
import getpass

url = 'https://......./articles.json'
user = ' '
pwd = ' '
csvfile = 'articles_labels.csv'

output_1 = []
output_1.append("id")
output_2 = []
output_2.append("title")
output_3 = []
output_3.append("label_names")
output_4 = []
output_4.append("link")

while url:
    response = requests.get(url, auth=(user, pwd))
    data = response.json()
    for article in data['articles']:
        article_id = article['id']
        decode_1 = int(article_id)
        output_1.append(decode_1)
    for article in data['articles']:
        title = article['title']
        decode_2 = unicodedata.normalize('NFKD', title)
        output_2.append(decode_2)
    for article in data['articles']:
        article_label = article['label_names']
        decode_3 = unicodedata.normalize('NFKD', article_label)
        output_3.append(decode_3)
    for article in data['articles']:
        article_url = article['html_url']
        decode_3 = unicodedata.normalize('NFKD', article_url)
        output_3.append(decode_3)
    print(data['next_page'])
    url = data['next_page']

print("Number of articles:")
print(len(output_1))

with open(csvfile, 'w') as fp:
    writer = csv.writer(fp,dialect = 'excel')
    writer.writerows([output_1])
    writer.writerows([output_2])
    writer.writerows([output_3])
    writer.writerows([output_4])

"My problem here is that the ['label_names'] endpoint is a string with multiple values (an article might have multiple labels). How can I pull multiple values of a string?"
It's a list, not a string. You don't have "a string with multiple values"; you already have a list of multiple strings, as-is.
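To see this concretely, here is a minimal sketch using a made-up payload. Only the label_names key comes from the question; the other values are hypothetical. It shows that the JSON decoder returns a list of str, and that normalize() accepts each element but not the list itself:

```python
import json
import unicodedata

# Hypothetical payload shaped like one entry of data['articles'].
payload = '{"id": 1, "title": "Example", "label_names": ["billing", "faq"]}'
article = json.loads(payload)

labels = article['label_names']
print(type(labels).__name__)               # the container is a list
print([type(x).__name__ for x in labels])  # each element is a str

# normalize() works on each string element...
normalized = [unicodedata.normalize('NFKD', label) for label in labels]

# ...but raises TypeError when handed the whole list.
try:
    unicodedata.normalize('NFKD', labels)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```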
The question is what you want to do with them. A CSV cell holds a single value, so CSV won't handle a list for you: you must decide how to serialise a list of strings into a single string, e.g. by joining them together (with a separator such as a space or comma) or by picking just the first one (taking care to handle the case where there is none). Either way, the issue is not really technical.
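The two options just described could be sketched roughly as follows. The helper names join_labels and first_label are made up for illustration, and the choice of ', ' as separator and '' as the fallback value are assumptions, not something the question specifies:

```python
import unicodedata

def join_labels(labels, sep=', '):
    """Normalize each label and join them into one string ('' for an empty list)."""
    return sep.join(unicodedata.normalize('NFKD', label) for label in labels)

def first_label(labels, default=''):
    """Return the first normalized label, or `default` when the list is empty."""
    return unicodedata.normalize('NFKD', labels[0]) if labels else default

print(join_labels(['billing', 'faq']))
print(first_label(['billing', 'faq']))
print(repr(first_label([])))
```

If the separator contains a comma, csv.writer will quote the cell when writing it, so the joined value still ends up in a single column.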
unicodedata.normalize takes a Unicode string (str), not a list, as the error says. The correct way to use unicodedata.normalize is shown here (example taken from "How does unicodedata.normalize(form, unistr) work?"):
from unicodedata import normalize
print(normalize('NFD', u'\u00C7'))
print(normalize('NFC', u'C\u0327'))
#Ç
#Ç
Hence you need to make sure that the second argument of every unicodedata.normalize('NFKD', ...) call, such as title or article_label, is a Unicode string (str) and not a list.
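Putting this together, one possible way to restructure the script is sketched below. It rests on several assumptions: the sample articles are invented stand-ins for data['articles'], the labels are joined with ', ', the output is one CSV row per article rather than one row per field as in the question, and the link column is assumed to be meant for html_url (the original code appends the URL to output_3, which looks unintended). The article_to_row helper is hypothetical.

```python
import csv
import io
import unicodedata

def article_to_row(article):
    """Build one CSV row; label_names is a list, so join it into one string."""
    labels = article.get('label_names') or []
    return [
        int(article['id']),
        unicodedata.normalize('NFKD', article['title']),
        ', '.join(unicodedata.normalize('NFKD', label) for label in labels),
        unicodedata.normalize('NFKD', article['html_url']),
    ]

# Hypothetical sample standing in for data['articles'] from the API.
articles = [
    {'id': 1, 'title': 'Intro', 'label_names': ['faq', 'billing'],
     'html_url': 'https://example.com/1'},
    {'id': 2, 'title': 'Setup', 'label_names': [],
     'html_url': 'https://example.com/2'},
]

# In-memory stand-in for open(csvfile, 'w', newline='').
buffer = io.StringIO()
writer = csv.writer(buffer, dialect='excel')
writer.writerow(['id', 'title', 'label_names', 'link'])
writer.writerows(article_to_row(a) for a in articles)
print(buffer.getvalue())
```

In the real script, the same article_to_row call would sit inside the while url: loop, once per article on each page.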