How to add another column in dataframe with calculated values

I have a news dataset and I am running NLP on it. I currently have two pieces of code: one calculates similarity and the other calculates sentiment. Both take their input from the data frame. What I am trying to do is add new columns to the data frame that hold the calculated values, i.e. similarity and sentiment (pos/neg).

The code is as follows:

# Compare each description with its title for the first 9 rows.
for i in range(0, 9):
    text1 = df.description[i]
    text2 = df.title[i]

    # Tokenize both texts before computing the Jaccard similarity.
    token1 = similarity.tokenize(text1)
    token2 = similarity.tokenize(text2)

    jaccard = similarity.jaccard_similarity(token1, token2)
    print ('Jaccard Similarity:', jaccard)

Output:

('Jaccard Similarity:', 0.07142857142857142)
('Jaccard Similarity:', 0.125)
('Jaccard Similarity:', 0.03225806451612903)
('Jaccard Similarity:', 0.07692307692307693)
('Jaccard Similarity:', 0.2)
('Jaccard Similarity:', 0.07407407407407407)
('Jaccard Similarity:', 0.12)
('Jaccard Similarity:', 0.043478260869565216)
('Jaccard Similarity:', 0.0)
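
The `similarity` module itself is not shown here. As a rough sketch only, the Jaccard similarity of two texts is the size of the intersection of their token sets divided by the size of their union. The `tokenize` and `jaccard_similarity` functions below are hypothetical stand-ins written for illustration, not the actual helpers from that module:

```python
import re

def tokenize(text):
    """Hypothetical tokenizer: lowercase the text and collect its word tokens as a set."""
    return set(re.findall(r"\w+", text.lower()))

def jaccard_similarity(tokens_a, tokens_b):
    """Return |A & B| / |A | B|, or 0.0 when both token sets are empty."""
    a, b = set(tokens_a), set(tokens_b)
    union = a | b
    if not union:
        return 0.0
    # float() keeps the division exact on Python 2 as well as Python 3.
    return len(a & b) / float(len(union))

score = jaccard_similarity(tokenize("the cat sat"), tokenize("the cat ran"))
```

Two texts that share no tokens score 0.0, and identical token sets score 1.0.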

Code:

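from textblob import TextBlob
from textblob.sentiments import NaiveBayesAnalyzer

# Classify the sentiment of each description for the first 9 rows.
for i in range(0, 9):
    blob = TextBlob(df.description[i], analyzer=NaiveBayesAnalyzer())
    y = blob.sentiment.classification
    print ('Result', y)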

Output:

('Result', 'pos')
('Result', 'neg')
('Result', 'pos')
('Result', 'pos')
('Result', 'pos')
('Result', 'neg')
('Result', 'pos')
('Result', 'pos')
('Result', 'neg')

Wrap each calculation in a function that returns its value, then assign the results to new columns:

def jaccard(text1, text2):
    # Tokenize both texts and return their Jaccard similarity.
    token1 = similarity.tokenize(text1)
    token2 = similarity.tokenize(text2)
    return similarity.jaccard_similarity(token1, token2)

def result(t1):
    blob = TextBlob(t1, analyzer=NaiveBayesAnalyzer())
    y = blob.sentiment.classification
    return y

df['result'] = df['description'].map(result)

# axis=1 passes each row to the function, so both columns are available.
df['jaccard'] = df.apply(lambda row: jaccard(row['description'], row['title']), axis=1)
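
To try the pattern without the `similarity` module or TextBlob installed, here is a minimal self-contained sketch. The sample data and the `word_overlap` and `label` functions are made-up stand-ins for `jaccard` and `result`; only the `map` and `apply(..., axis=1)` calls are the point:

```python
import pandas as pd

def word_overlap(text1, text2):
    """Hypothetical stand-in for jaccard(): Jaccard similarity of whitespace-split word sets."""
    a, b = set(text1.lower().split()), set(text2.lower().split())
    union = a | b
    return len(a & b) / float(len(union)) if union else 0.0

def label(text):
    """Hypothetical stand-in for result(): a toy keyword check, not a real classifier."""
    return 'neg' if 'bad' in text.lower() else 'pos'

# Made-up sample rows with the same column names as in the question.
df = pd.DataFrame({
    'title': ['Good news today', 'Markets fall'],
    'description': ['Good news arrived today', 'A bad day as markets fall'],
})

# One input column: Series.map calls the function once per value.
df['result'] = df['description'].map(label)

# Two input columns: DataFrame.apply with axis=1 calls the function once per row.
df['jaccard'] = df.apply(
    lambda row: word_overlap(row['description'], row['title']), axis=1)

print(df[['result', 'jaccard']])
```

Use `map` when the function needs a single column and `apply` with `axis=1` when it needs several. Without `axis=1`, `apply` passes whole columns to the function instead of rows.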

