How to use a user-defined function in pandas to return a value based on column values and a Timestamp

Question

I have two DataFrames, which I have joined. On the joined DataFrame, I'm writing a user-defined function that, based on the Timestamp and the value counts of the column, should return a value according to the conditions below and create a new column called "Day_Sentiment". However, I'm getting the error shown below. Please let me know how to go about it.

Input:

                  Date          Content    Cleaned-content   Sentiment   
                  11/12/2020    abb        bbc abb           Bad         
                  12/10/2020    xyz        xxy               Good        
                  11/24/2020    tyu        yuu               Neutral     
                  12/16/2020    iop        yui               Bad

Output:

               Date          Content    Cleaned-content   Sentiment   Day_Sentiment
               11/12/2020    abb        bbc abb           Bad         Bad
               12/10/2020    xyz        xxy               Good        Bad
               11/24/2020    tyu        yuu               Neutral     Bad
               12/16/2020    iop        yui               Bad         Bad

So far I have tried the following:

df = input_data.join(results)

def compare_def(df):

    no.bad_senti= df.loc[df['Sentiment'] == 'Bad']
    no.neut_senti = df.loc[df['Sentiment'] == 'Neutral']
    no.good_senti= df.loc[df['Sentiment'] == 'Good']

    if ((no.bad_senti> no.good_senti) & (no.bad_senti> no.neut_senti)):
       output = 'Bad'
    elif ((no.good_senti> no.bad_senti) & (no.good_senti> no.neut_senti)):
       output= 'Good'
    elif ((no.neut_senti> no.bad_senti) & (no.neut_senti> no.good_senti)):
       output= 'Neutral'
    elif no.good_senti== no.bad_senti:
       output= 'Neutral'
    elif no.bad_senti== no.neut_senti:
       output= 'bad'
    elif no.good_senti== no.neut_senti:
       output= 'good'
    else:
       output= 'Neutral'

    return output

df['Day_Sentiment'] = output

Alternate:

 output = compare_def(df)
 df['Day_Sentiment'] = output

Error:

     ValueError: Can only compare identically-labeled DataFrame objects

Example 1: predicted sentiment counts are 2 bad, 1 good, 1 neutral.

In the function, 2 > 1 and 2 > 1, so it returns Bad.

Example 2: sentiment counts are 2 bad, 5 good, 5 neutral.

Function:

2 > 5: false
5 > 2 and 5 > 5: false
5 > 2 and 5 > 5: false
5 == 2: false
2 == 5: false
5 == 5: true, so return good

Answer 1

There are several issues with your code. To begin with, the variables bad, good, and neut are not counts: each one is a pandas DataFrame holding the rows that match one Sentiment value, so they have different lengths and contain strings. You then attempt several conditional tests, for example if ((bad > good) & (bad > neut)), which raises your ValueError. I am not quite sure what logic you are attempting to implement, but the following template may help:

def compare_data(row):
    value = 'Good'
    # The logic here escapes me
    # Evaluate the row contents of row[Sentiment] and modify value
    return value  

df["Day Sentiment"]= df.apply(lambda row: compare_data(row), axis= 1)

Yields:

    Date        Content  Cleaned-content  Sentiment  Day Sentiment
0   11/12/2020  abb      bbc abb          Bad        Good
1   12/10/2020  xyz      xxy              Good       Good
2   11/24/2020  tyu      yuu              Neutral    Good
3   12/16/2020  iop      yui              Bad        Good
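
If the intent is to compare how often each label occurs, one possible way to fill in the template is to count the labels with value_counts and compare plain integers instead of DataFrames. The sketch below is only a guess at the intended logic: the function name day_sentiment is hypothetical, the tie-breaking order is copied from the elif chain in the question (with the labels capitalized consistently), and the per-date grouping at the end assumes that "Day_Sentiment" is meant to be computed separately for each Date.

```python
import pandas as pd


def day_sentiment(sentiments: pd.Series) -> str:
    """Return one overall label for a Series of 'Bad'/'Good'/'Neutral' values."""
    counts = sentiments.value_counts()
    # Series.get falls back to 0 for a label that never occurs
    bad = counts.get("Bad", 0)
    good = counts.get("Good", 0)
    neut = counts.get("Neutral", 0)

    # A strict majority wins
    if bad > good and bad > neut:
        return "Bad"
    if good > bad and good > neut:
        return "Good"
    if neut > bad and neut > good:
        return "Neutral"
    # Ties, checked in the same order as the question's elif chain
    if good == bad:
        return "Neutral"
    if bad == neut:
        return "Bad"
    if good == neut:
        return "Good"
    return "Neutral"


# Sample data taken from the question (only the columns needed here)
df = pd.DataFrame({
    "Date": ["11/12/2020", "12/10/2020", "11/24/2020", "12/16/2020"],
    "Sentiment": ["Bad", "Good", "Neutral", "Bad"],
})

# One label for the whole frame, broadcast to every row
df["Day_Sentiment"] = day_sentiment(df["Sentiment"])

# Assumed variant: one label per Date, broadcast within each group
per_day = df.groupby("Date")["Sentiment"].transform(day_sentiment)
```

With the sample input there are 2 Bad, 1 Good and 1 Neutral rows, so every row of Day_Sentiment is set to Bad, which matches the expected output in the question. Because each Date occurs only once in the sample, the per-date variant simply repeats each row's own Sentiment.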

Solution 1
0 2020-12-22 14:46:45
