
Difference between set of strings doesn't work

First of all, thank you for your help. I have been trying to solve this problem for days.

The file myStopWords.txt:

è
ad
più
a
b
c
17

My code:

stopWord = set(open("<...>/myStopwords.txt").read().split("\n"))
oldWords = set(["a","b","ad", "è", "più","17","horse"])

print( oldWords.difference(stopWord) )

Result:

{'horse', 'ad', 'più', 'è'}

Why aren't "ad", "è" and "più" removed from the set?

The result should be {'horse'}.

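Thanks. As suggested in the earlier comments, here is the solution:

1) Convert the text file to UTF-8.

2) Open the file with an explicit UTF-8 encoding and strip the whitespace from each line: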

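fname = '<...>/myStopwords.txt'

# Open with an explicit encoding so "è" and "più" decode the same way on every platform.
with open(fname, encoding='utf-8') as f:
    content = f.readlines()

# strip() removes the trailing newline and any stray spaces around each word.
stopWord = [x.strip() for x in content]

oldWords = set(["a","b","ad", "è", "più","17","horse"])
# difference() accepts any iterable, so stopWord can stay a list.
print( oldWords.difference(stopWord) )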
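
The thread does not say exactly what was wrong with the file, so the following is only a sketch of one plausible scenario. It assumes the file holds UTF-8 bytes that get decoded as cp1252 (a common default on Windows) and that the "ad" line carries a trailing space. Both are assumptions made for illustration, and the temporary file is a hypothetical stand-in for the real one. Under those assumptions the sketch reproduces the reported result and shows that the explicit encoding plus strip() yields {'horse'}:

```python
import os
import tempfile

# Hypothetical file contents: note the trailing space after "ad" (an assumption).
raw_lines = ["è", "ad ", "più", "a", "b", "c", "17"]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "myStopwords.txt")
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write("\n".join(raw_lines) + "\n")

    # Simulated mismatch: the UTF-8 bytes are decoded as cp1252 and the text
    # is only split on newlines, so accents and the trailing space survive.
    with open(path, encoding="cp1252") as f:
        naive = set(f.read().split("\n"))

    # Explicit encoding plus strip(), skipping blank lines.
    with open(path, encoding="utf-8") as f:
        cleaned = {line.strip() for line in f if line.strip()}

old_words = {"a", "b", "ad", "è", "più", "17", "horse"}
naive_result = old_words - naive
fixed_result = old_words - cleaned

# repr() makes stray spaces and mis-decoded characters visible.
print(sorted(repr(word) for word in naive))
print(naive_result)
print(fixed_result)
```

Printing the stop words with repr() is a quick way to check which of the two issues, if either, affects a real file.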

