
Selecting Unicode characters from a text file or webpage

I am able to syllabify Devanagari words, as shown on the following page.

https://gist.github.com/950405

What I want to do now is find the words that start with "ह" on the following webpage.

http://www.sacred-texts.com/hin/mbs/mbs12030.htm

How can this be done using Python?

If your words are Unicode strings collected in a list named words, the following snippet prints every word beginning with "x" (where "x" stands for the character you are looking for, here "ह"):

for word in words:
    # startswith compares the leading character(s) of the Unicode string
    if word.startswith(u"x"):
        print(word)

Or, if you want a list of all words starting with u"x", you can use a list comprehension:

selected_words = [w for w in words if w.startswith(u"x")]
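The snippets above assume the list words already exists. As a rough sketch of how it might be built from page text, the following Python 3 code pulls runs of Devanagari characters out of a string and keeps those starting with a given prefix. The function name words_starting_with and the regular expression are illustrative assumptions, not part of the original answer; the character range is the Unicode Devanagari block (U+0900 to U+097F) with the danda punctuation marks (U+0964, U+0965) left out so they do not stick to the words.

```python
import re

# Assumption: a "word" is a maximal run of Devanagari code points.
# U+0964 (danda) and U+0965 (double danda) are excluded because they
# are punctuation, not letters.
DEVANAGARI_WORD = re.compile("[\u0900-\u0963\u0966-\u097F]+")


def words_starting_with(text, prefix):
    """Return the Devanagari words in text that begin with prefix."""
    words = DEVANAGARI_WORD.findall(text)
    return [w for w in words if w.startswith(prefix)]


sample = "हरि ने कहा। हस्तिनापुर में राजा था॥"
print(words_starting_with(sample, "ह"))
```

Because the pattern only matches Devanagari characters, HTML tags and other ASCII markup in a downloaded page are skipped without a separate parsing step. The text itself could come from something like urllib.request.urlopen(url).read().decode("utf-8"), assuming the page really is UTF-8 encoded, which has not been checked here. Note that startswith compares code points, so words that begin with a conjunct or vowel sign built on "ह" (for example "हृदय") also match.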

