Error while assigning function to the values of a dictionary

I am trying to apply a function to the values of my dict with the command below:

x_text = [clean_str(v) for k, v in answer.items()]

The function clean_str:

import re


def clean_str(string):
    # remove stopwords
    # string = ' '.join([word for word in string.split() if word not in cachedStopWords])
    string = re.sub(r"[^A-Za-z0-9(),!?\'\`]", " ", string)
    string = re.sub(r"\'s", " \'s", string)
    string = re.sub(r"\'ve", " \'ve", string)
    string = re.sub(r"n\'t", " n\'t", string)
    string = re.sub(r"\'re", " \'re", string)
    string = re.sub(r"\'d", " \'d", string)
    string = re.sub(r"\'ll", " \'ll", string)
    string = re.sub(r",", " , ", string)
    string = re.sub(r"!", " ! ", string)
    # Raw replacement strings avoid invalid-escape warnings; the output is unchanged.
    string = re.sub(r"\(", r" \( ", string)
    string = re.sub(r"\)", r" \) ", string)
    string = re.sub(r"\?", r" \? ", string)
    string = re.sub(r"\s{2,}", " ", string)
    return string.strip().lower()

But I am getting the following error:

File "C:\\ProgramData\\Anaconda3\\lib\\re.py", line 191, in sub
    return _compile(pattern, flags).sub(repl, string, count)
TypeError: expected string or bytes-like object

A snippet of the first two key-value pairs of my dict (answer) is below:

In[45]:{k: answer[k] for k in list(answer)[:2]}
Out[45]: 
{b'B00308CJ12': [b'Bulletproof Salesman (2008)'],
 b'189138922X': [b'Classical Mechanics']} 

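The values of your dict are lists of bytes objects, not strings. Your comprehension passes each whole list to clean_str, and re.sub raises this TypeError when it receives a list. Even a single bytes item would fail, because the patterns in clean_str are str patterns and cannot be applied to bytes.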
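To make the cause concrete, here is a minimal sketch that passes three kinds of values to re.sub with a str pattern. The sample value comes from the dict in the question; the helper name try_sub is hypothetical and exists only for this illustration. The exact wording of the first error message may vary slightly between Python versions.

```python
import re


def try_sub(value):
    """Return the re.sub result, or the TypeError message if the call fails."""
    try:
        return re.sub(r"\s{2,}", " ", value)
    except TypeError as exc:
        return f"TypeError: {exc}"


# A list, exactly as stored in the dict: fails with the error from the question.
as_list = try_sub([b'Classical Mechanics'])

# A single bytes item: still fails, because the pattern is a str pattern.
as_bytes = try_sub(b'Classical Mechanics')

# The same item decoded to str: works.
as_str = try_sub(b'Classical Mechanics'.decode())

print(as_list)
print(as_bytes)
print(as_str)
```

Only the decoded str reaches re.sub successfully, which is why both the inner loop over the list and the decode() call are needed.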

Iterate over each list and convert every bytes item to a string with the decode() method before calling clean_str:

x_text = [clean_str(i.decode()) for k, v in answer.items() for i in v]
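The following is a self-contained sketch of that fix applied to the two sample entries from the question. It makes two assumptions: the bytes are UTF-8 encoded (the default for decode()), and clean_str is replaced here by a shortened stand-in that keeps only a few of the original substitutions. The cleaned dict is an optional extra, not part of the original answer, for cases where the link between each key and its titles should be kept.

```python
import re


def clean_str(string):
    # Shortened stand-in for the question's clean_str (assumption: only the
    # character filter and whitespace collapse are kept for brevity).
    string = re.sub(r"[^A-Za-z0-9(),!?\'\`]", " ", string)
    string = re.sub(r"\s{2,}", " ", string)
    return string.strip().lower()


answer = {
    b'B00308CJ12': [b'Bulletproof Salesman (2008)'],
    b'189138922X': [b'Classical Mechanics'],
}

# Flat list of cleaned titles: loop over each list, decode each bytes item.
x_text = [
    clean_str(item.decode("utf-8"))
    for titles in answer.values()
    for item in titles
]

# Optional: keep each (decoded) key associated with its cleaned titles.
cleaned = {
    key.decode("utf-8"): [clean_str(item.decode("utf-8")) for item in titles]
    for key, titles in answer.items()
}

print(x_text)   # ['bulletproof salesman (2008)', 'classical mechanics']
print(cleaned)
```

If the data might not be valid UTF-8, decode() accepts an explicit encoding and an errors argument (for example errors="replace"), so a different choice may fit better there.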
