Is there any way to decode a str in Python? AttributeError: 'str' object has no attribute 'decode'

Question

I need to decode text read from a file

from b'i know how you feel mba mba seperjuangan \\xf0\\x9f\\x98\\x90'

to "b i know how you feel mba mba seperjuangan"

but I get "b i know how you feel mba mba seperjuangan xf xf x x"

I tried to decode it, but got the error AttributeError: 'str' object has no attribute 'decode'

```python
import re
from bs4 import BeautifulSoup
from nltk.tokenize import WordPunctTokenizer

tok = WordPunctTokenizer()
pat1 = r'@[A-Za-z0-9]+'
pat2 = r'https?://[A-Za-z0-9./]+'
combined_pat = r'|'.join((pat1, pat2))
def tweet_cleaner(tweet):
    soup = BeautifulSoup(tweet)
    souped = soup.get_text()
    stripped = re.sub(combined_pat, '', souped)
    # The next line raises the AttributeError: stripped is already a str
    clean = stripped.decode("utf-8","strict").replace(u"\ufffd", "?")
    letters_only = re.sub("[^a-zA-Z]", " ", clean)
    lower_case = letters_only.lower()
    # The letters_only step two lines above creates unnecessary white space,
    # so tokenize and join the words again to remove it
    words = tok.tokenize(lower_case)
    return (" ".join(words)).strip()
# df is not defined in the post; presumably a pandas DataFrame with a tweet column
testing = df.tweet[:100]
test_result = []
for t in testing:
    test_result.append(tweet_cleaner(t))
test_result
```

Answer 1

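The string is already decoded. You can't decode it again.

You can only encode it.

In fact, a string is a sequence of Unicode characters.

A byte string is a sequence of bytes.

Bytes can be decoded into strings.
Strings can be encoded into bytes.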
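A minimal sketch of that round trip in Python 3, using only built-in types. The sample text is made up for illustration; the four bytes at the end are the ones from the question, which happen to be the UTF-8 encoding of the emoji U+1F610.

```python
# A str holds Unicode code points; bytes holds raw byte values.
text = "mba \U0001F610"           # str ending with the emoji U+1F610

# str -> bytes: encoding
raw = text.encode("utf-8")
print(raw)                         # b'mba \xf0\x9f\x98\x90'

# bytes -> str: decoding
round_trip = raw.decode("utf-8")
print(round_trip == text)          # True

# In Python 3 only bytes has a decode method, which is what the error reports.
print(hasattr(text, "decode"))     # False
print(hasattr(raw, "decode"))      # True
```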

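If you have a string from BeautifulSoup, then it has either already decoded some bytes for you, or it was given a string in the first place.

Perhaps you can give a really small example string or HTML file that illustrates your exact problem.

Then we can try to solve your specific problem.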
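Going only by the sample in the question, one possibility is that the file stores the printed form of a bytes object, that is, the characters `b'...'` with literal backslash escapes inside. This is an assumption, not something the post confirms. Under that assumption, a sketch like the following turns the text back into real bytes with `ast.literal_eval` and then decodes it. The helper name `decode_bytes_repr` is hypothetical.

```python
import ast
import re

def decode_bytes_repr(text):
    """Turn the text "b'...'" back into a str (hypothetical helper)."""
    value = ast.literal_eval(text)   # parses the literal b'...' into bytes
    if isinstance(value, bytes):
        return value.decode("utf-8", errors="replace")
    return value                     # already a str literal, nothing to decode

# Sample modelled on the question; the backslashes are literal characters here.
stored = r"b'i know how you feel mba mba seperjuangan \xf0\x9f\x98\x90'"

decoded = decode_bytes_repr(stored)
letters_only = re.sub(r"[^a-zA-Z]+", " ", decoded).strip().lower()
print(letters_only)   # i know how you feel mba mba seperjuangan
```

With this approach the leading `b` and the stray `xf ... x` fragments both disappear, because they were never part of the text, only of its printed bytes form. `ast.literal_eval` raises an exception on input that is not a valid Python literal, so plain tweets that were not stored this way would need a separate code path.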

Solution 1
2 2019-10-23 13:10:03