
Replacing specific values within a dataframe column

I am running the following code in a Jupyter notebook. It checks the strings of text in nametest_df['text'] and returns people's names. I managed to get this working and would like to push these names to the corresponding rows of nametest_df['name'], where all values are currently NaN.

I tried the Series.replace() method, but every entry in the 'name' column ends up showing the same name.

Any clue how I can do this efficiently?

for word in nametest_df['text']:

    for sent in nltk.sent_tokenize(word):
        tokens = nltk.tokenize.word_tokenize(sent)
        tags = st.tag(tokens)

        for tag in tags:
            if tag[1]=='PERSON':
                name = tag[0]
                print(name)

    nametest_df.name = nametest_df.name.replace({"NaN": name})

Sample nametest_df

      **text**                    **name**
0   His name is John                NaN
1   I went to the beach             NaN
2   My friend is called Fred        NaN

Expected output

      **text**                    **name**
0   His name is John                John                
1   I went to the beach             NaN
2   My friend is called Fred        Fred      

Don't try to fill series values one by one. This is inefficient and prone to error. A better idea is to create a list of names and assign it directly.

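import nltk
import numpy as np

# `st` is the NER tagger from the question; it is assumed to be defined already.
names = []
for text in nametest_df['text']:
    # Collect every token tagged PERSON in this row's text.
    persons = [token
               for sent in nltk.sent_tokenize(text)
               for token, label in st.tag(nltk.tokenize.word_tokenize(sent))
               if label == 'PERSON']
    # Append exactly one entry per row (first name found, else NaN) so the
    # list has the same length as the dataframe.
    names.append(persons[0] if persons else np.nan)

# All 'name' values are NaN in the question, so the whole column is assigned.
nametest_df['name'] = names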
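
To try the list-then-assign pattern without setting up a tagger, here is a minimal sketch on the sample data from the question. `toy_tag` and `first_person` are hypothetical helpers, not part of nltk: `toy_tag` stands in for `st.tag`, and plain whitespace splitting stands in for the nltk tokenizers. The sketch assumes at most one name is wanted per row and records NaN when none is found, so the list always has one entry per dataframe row.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for st.tag(): labels a small fixed set of names as
# PERSON and everything else as O. A real NER tagger would replace this.
KNOWN_NAMES = {"John", "Fred"}


def toy_tag(tokens):
    return [(tok, "PERSON" if tok in KNOWN_NAMES else "O") for tok in tokens]


def first_person(text):
    """Return the first PERSON token in text, or NaN if there is none."""
    for token, label in toy_tag(text.split()):
        if label == "PERSON":
            return token
    return np.nan


nametest_df = pd.DataFrame({
    "text": [
        "His name is John",
        "I went to the beach",
        "My friend is called Fred",
    ],
    "name": [np.nan, np.nan, np.nan],
})

# One result per row, assigned to the whole column in a single step.
nametest_df["name"] = [first_person(t) for t in nametest_df["text"]]
print(nametest_df)
```

Building one value per row keeps the list aligned with the dataframe index, which avoids a length mismatch when some rows contain no name.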


 