Selecting Unicode characters from a text file or webpage
I am able to syllabify Devanagari words, as shown on the following page.
https://gist.github.com/950405
But what I want to do is find the words that start with "ह" on the following webpage.
http://www.sacred-texts.com/hin/mbs/mbs12030.htm
How can this be done using Python?
If your words are Unicode strings collected in a list named words, then the following snippet prints all words beginning with "x":
# Print every word whose first character is "x"
for word in words:
    if word.startswith(u"x"):
        print(word)
Or, if you want a list of all words starting with u"x", you can use a list comprehension:
selected_words = [w for w in words if w.startswith(u"x")]
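To tie this back to the question, here is a minimal sketch of the whole path from raw bytes to the words starting with "ह". It assumes Python 3 and that the input is UTF-8 encoded, which is an assumption about the file or page, not something confirmed here. The helper name words_starting_with, the separator set, and the sample line are made up for illustration. Bytes read from a file or an HTTP response would take the place of raw.

```python
import re

def words_starting_with(text, prefix):
    """Return the words in `text` that begin with `prefix` (hypothetical helper)."""
    # Split on whitespace, the danda (।) and double danda (॥) verse marks,
    # some ASCII punctuation, and ASCII digits. This separator set is an
    # assumption; adjust it to whatever the real text uses.
    tokens = re.split(r"[\s।॥|,.;:!?0-9]+", text)
    # startswith compares code points, so "हि" (ह plus a vowel sign) also matches.
    return [t for t in tokens if t.startswith(prefix)]

# Stand-in for bytes read from a file or webpage (made-up sample line).
raw = "हरि ॐ तत्सत् । हस्तिनापुर गत्वा हि राजा ॥".encode("utf-8")

# Decode to a Unicode string before doing any matching.
text = raw.decode("utf-8")

selected = words_starting_with(text, "ह")
print(selected)
```

Decoding first matters because matching on undecoded bytes would compare individual UTF-8 bytes rather than characters. If the page turns out to use a different encoding, only the decode call needs to change.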