How to pass a column from a data frame into wordnet.synsets() in NLTK Python

I have a dataframe in which one of the columns contains English words. I want to pass each element of that column through NLTK's synsets() function. My issue is that synsets() only takes a single word at a time.

e.g. wordnet.synsets('father')

Now, if I have a dataframe like:

import pandas as pd
from nltk.corpus import wordnet

dc = {'A': [0, 9, 4, 5], 'B': ['father', 'mother', 'kid', 'sister']}
df = pd.DataFrame(dc)
df
   A       B
0  0  father
1  9  mother
2  4     kid
3  5  sister

I want to pass column B through the synsets() function and store its output in another column. I want to do this without iterating through the dataframe.

How do I do that?

You could use the apply method:

In [4]: df['C'] = df['B'].apply(wordnet.synsets)

In [5]: df
Out[5]: 
   A       B                                                  C
0  0  father  [Synset('father.n.01'), Synset('forefather.n.0...
1  9  mother  [Synset('mother.n.01'), Synset('mother.n.02'),...
2  4     kid  [Synset('child.n.01'), Synset('kid.n.02'), Syn...
3  5  sister  [Synset('sister.n.01'), Synset('sister.n.02'),...
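
Note that apply still calls the function once per row internally; it only hides the loop. If the column repeats words, one option is to look up each distinct word once and map the results back. The following is a minimal sketch of that idea, not part of the original answer: the helper names add_lookup_column and wordnet_synsets are hypothetical, and the WordNet import is deferred so the helper can be tried with any per-word function.

```python
import pandas as pd


def add_lookup_column(df, source, target, lookup):
    """Return a copy of df with lookup(word) stored in column `target`.

    `lookup` is called once per distinct value of df[source]; the cached
    results are then mapped back onto every row.
    """
    cache = {word: lookup(word) for word in df[source].unique()}
    out = df.copy()
    out[target] = out[source].map(cache.get)
    return out


def wordnet_synsets(word):
    # Hypothetical wrapper: imports NLTK lazily, so it is only needed
    # when this function is actually called. Assumes the WordNet corpus
    # has already been downloaded.
    from nltk.corpus import wordnet
    return wordnet.synsets(word)
```

With the dataframe from the question, this would be called as add_lookup_column(df, 'B', 'C', wordnet_synsets). For a column of unique words it does the same work as apply.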

However, a column of lists is usually not a very useful data structure. It might be better to put each synset in its own column. You can do that by making the callback function return a pd.Series:

In [29]: df.join(df['B'].apply(lambda word: pd.Series([w.name() for w in wordnet.synsets(word)])))
Out[29]: 
   A       B            0                1            2                   3  \
0  0  father  father.n.01  forefather.n.01  father.n.03  church_father.n.01   
1  9  mother  mother.n.01      mother.n.02  mother.n.03         mother.n.04   
2  4     kid   child.n.01         kid.n.02     kyd.n.01          child.n.02   
3  5  sister  sister.n.01      sister.n.02  sister.n.03           baby.n.05   

             4                     5             6         7           8  
0  father.n.05           father.n.06  founder.n.02  don.n.03  beget.v.01  
1  mother.n.05           mother.v.01    beget.v.01       NaN         NaN  
2     kid.n.05  pull_the_leg_of.v.01      kid.v.02       NaN         NaN  
3          NaN                   NaN           NaN       NaN         NaN  

(I've chosen to display just the name of each Synset, which is a method call, w.name(), in NLTK 3 and later and a plain attribute, w.name, in older releases; you could of course use

df.join(df['B'].apply(lambda word: pd.Series(wordnet.synsets(word))))

if you want the Synset objects themselves.)
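
The wide layout above pads shorter rows with NaN. An alternative, not mentioned in the original answer, is a long layout with one row per (word, synset) pair. The sketch below assumes pandas 0.25 or later (for DataFrame.explode) and a dataframe that already holds a column of lists, such as column C from the apply example; the helper name to_long_format is hypothetical.

```python
import pandas as pd


def to_long_format(df, list_column, value_name):
    """Expand a column of lists into one row per list element.

    The other columns are repeated for each element, and the expanded
    column is renamed to `value_name`.
    """
    long_df = df.explode(list_column)
    long_df = long_df.rename(columns={list_column: value_name})
    # explode repeats the original index labels; renumber the rows.
    return long_df.reset_index(drop=True)
```

For example, to_long_format(df, 'C', 'synset') would give one row per synset of each word, which tends to be easier to filter or group than a variable number of columns.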

