
Difference between set of strings doesn't work

First of all, thank you for your help. I have been trying to solve this problem for days.

The file myStopWords.txt:

è
ad
più
a
b
c
17

My code:

stopWord = set(open("<...>/myStopwords.txt").read().split("\n"))
oldWords = set(["a","b","ad", "è", "più","17","horse"])

print( oldWords.difference(stopWord) )

Result:

{'horse', 'ad', 'più', 'è'}

Why are "ad", "è" and "più" not subtracted from the set?

The result should be {horse}

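Thanks. As suggested in the earlier comments, this is the solution:

1) Convert the text file to UTF-8.

2) Read the file with an explicit UTF-8 encoding and strip each line: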

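fname = '<...>/myStopwords.txt'

# Decode the file explicitly as UTF-8 instead of relying on the platform default.
with open(fname, encoding='utf-8') as f:
    content = f.readlines()

# Strip the trailing newline and any surrounding whitespace from each line.
stopWord = [x.strip() for x in content]

oldWords = set(["a","b","ad", "è", "più","17","horse"])
print( oldWords.difference(stopWord) )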
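To see why both steps matter, here is a minimal sketch that reproduces the symptom with a temporary file. The actual contents of the original file are not known, so two things here are assumptions made purely for illustration: that the line "ad" carried a trailing space, and that the file was decoded with a single-byte codec (Latin-1 here) rather than UTF-8. The names `raw_lines`, `naive_diff` and `fixed_diff` are hypothetical.

```python
import os
import tempfile

# Hypothetical file contents: note the trailing space after "ad" (an assumption).
raw_lines = ["è", "ad ", "più", "a", "b", "c", "17"]

# Write the stop words as UTF-8 bytes to a temporary file.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write("\n".join(raw_lines).encode("utf-8"))
    path = tmp.name

old_words = {"a", "b", "ad", "è", "più", "17", "horse"}

# Naive read: wrong codec (assumed Latin-1) and no stripping.
# "è" is decoded as two characters, and "ad " keeps its trailing space.
with open(path, encoding="latin-1") as f:
    naive_stop = set(f.read().split("\n"))
naive_diff = old_words.difference(naive_stop)

# Fixed read: explicit UTF-8 and strip() on every line.
with open(path, encoding="utf-8") as f:
    fixed_stop = {line.strip() for line in f}
fixed_diff = old_words.difference(fixed_stop)

# repr() makes hidden whitespace and mis-decoded characters visible.
print(sorted(repr(w) for w in naive_stop))
print(naive_diff)
print(fixed_diff)

os.remove(path)
```

Printing the `repr()` of each stop word, as in the sketch, is a quick way to check whether the strings read from a file really match the ones typed in the source code.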
