
How to get a specific range of words from a raw corpus?

import nltk

y = nltk.corpus.brown.raw()
print(y)

When I do print(y) it shows me all of the raw data in this corpus, but I want to get only 10,000 words from this raw corpus. How can I achieve this?

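You could do this, which draws 10,000 words at random:

import random
import nltk

# Copy the lazy corpus view into a list so random.sample gets a real sequence
words = list(nltk.corpus.brown.words())
random_words = random.sample(words, 10000)  # 10,000 randomly chosen word tokens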
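
If a contiguous range is wanted rather than a random sample (for example the first 10,000 words), something along these lines may be closer to what the question asks. This is only a sketch: `word_range` is a hypothetical helper, not part of NLTK, and the short `sample` list stands in for the corpus so the snippet runs without the Brown data installed.

```python
from itertools import islice


def word_range(words, start, stop):
    """Return the words at positions start..stop-1 as a list.

    `words` can be any iterable; islice stops reading once it reaches `stop`.
    """
    return list(islice(words, start, stop))


# Stand-in data (an assumption for illustration, not the real corpus).
sample = "The Fulton County Grand Jury said Friday an investigation".split()

print(word_range(sample, 0, 5))
print(word_range(sample, 5, 8))
```

With the corpus available, the same call would presumably be `word_range(nltk.corpus.brown.words(), 0, 10000)`. NLTK corpus views also accept slice syntax, so `nltk.corpus.brown.words()[:10000]` should give the same range without a helper.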
