Applying DataFrame values to a DataFrame when using unstack()

Question

There seems to be a lot written about this, but I can't find what I need. I am using unstack() to build a DataFrame that I can plot.

Sample of the starting DataFrame:

Date        word       tf_idf
2015-01-02  grout      0.0016774129439329863
2015-01-02  rhrsw      0.0015957287173067212
2015-01-02  county     0.001501862322171032
2015-01-02  limestone  0.001501862322171032
2015-01-02  lgt        0.0014079959270353424

A better example shows a word's tf_idf value changing over time (this is what I need to plot):

Date        word    tf_idf
2015-01-02  grout   0.0016774129439329863
2015-01-02  rhrsw   0.0015957287173067212
2015-01-17  bfn0eq  0.0026125536132961145
2015-01-17  rhrsw   0.001473748192115757

Following an online tutorial, I got it working with:

plotFrame = plotDf.groupby(['Date','word']).count()['tf_idf'].unstack()

This produces:

      word    word2   word3
date    1       nan     nan
date2   nan     1        1
date3   nan     nan      1

However, this gives me word counts. I need the actual tf_idf value for each word. I tried:

plotFrame = plotDf.groupby(['Date','word']).apply(plotDf.loc['Index'].at['tf_idf'])['tf_idf'].unstack()

and

plotFrame = plotDf.groupby(['Date','word']).apply(plotDf.loc['tf_idf'])['tf_idf'].unstack()

and

plotFrame = plotDf.groupby(['Date','word']).apply(plotDf.at['tf_idf'])['tf_idf'].unstack()

along with a few other apply()/.loc combinations, and got nothing but "Series not hashable", KeyError, and "NumPy array is not callable" type errors.
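A likely reason for those errors, offered as a sketch rather than a diagnosis of each traceback: apply() expects a callable, while expressions such as plotDf.loc['tf_idf'] are evaluated before apply() runs, so either the lookup itself fails or apply() receives data instead of a function. The example below uses the sample rows from the question; selecting the tf_idf column before apply() and taking the first value of each group are assumptions made for illustration.

```python
import pandas as pd

# Sample rows copied from the question.
plotDf = pd.DataFrame({
    "Date": ["2015-01-02", "2015-01-02", "2015-01-17", "2015-01-17"],
    "word": ["grout", "rhrsw", "bfn0eq", "rhrsw"],
    "tf_idf": [0.0016774129439329863, 0.0015957287173067212,
               0.0026125536132961145, 0.001473748192115757],
})

# apply() is handed a function; pandas calls it once per (Date, word) group,
# passing that group's tf_idf values as a Series.
def first_value(values):
    # Assumption: each (Date, word) pair has one row, so the first value is the value.
    return values.iloc[0]

plotFrame = plotDf.groupby(["Date", "word"])["tf_idf"].apply(first_value).unstack()
print(plotFrame)
```

Here unstack() moves the word level of the index into columns, leaving NaN wherever a word does not occur on a date.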

How do I retrieve the tf_idf value for a specific word and apply it to my new DataFrame?

I want:

      word       word2   word3
date    0.012      nan     nan
date2   nan     0.019     0.03
date3   nan     nan       0.01
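For comparison, a minimal sketch of how a table of that shape could be produced while staying with groupby() and unstack(): count() reports how many rows fall in each group, whereas an aggregation such as first() returns the stored value. The data are the sample rows from the question; choosing first() assumes each (Date, word) pair appears only once.

```python
import pandas as pd

# Sample rows copied from the question.
plotDf = pd.DataFrame({
    "Date": ["2015-01-02", "2015-01-02", "2015-01-17", "2015-01-17"],
    "word": ["grout", "rhrsw", "bfn0eq", "rhrsw"],
    "tf_idf": [0.0016774129439329863, 0.0015957287173067212,
               0.0026125536132961145, 0.001473748192115757],
})

grouped = plotDf.groupby(["Date", "word"])["tf_idf"]

# count() gives the number of rows per (Date, word) pair: 1 where the word occurs.
counts = grouped.count().unstack()

# first() gives the tf_idf value itself (assumes one row per pair).
values = grouped.first().unstack()

print(counts)
print(values)
```

If a pair can occur more than once, first() silently keeps one row; mean() or sum() may be a better fit depending on what the plot should show.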

Answer 1

Here is my approach. It is not perfect, but it may solve your problem:

"How do I retrieve the tf_idf value for a specific word and apply it to my new DataFrame?"

from datetime import datetime
import pandas as pd

# Use today's date (as a dd/mm/YYYY string) for every row
today = datetime.now()
today = today.strftime("%d/%m/%Y")
c = {'Date': today, 'word': ['grout','rhrsw','county','limestone','lgt'], "SITEID": ['SQN','BFN','BFN','BFN','BFN'],
    'tf_idf': [0.0016774129439329863,0.0015957287173067212,0.001501862322171032,0.001501862322171032,0.0014079959270353424]}
dframe = pd.DataFrame(data=c)
# Index by tf_idf, then stack the remaining columns (Date, word, SITEID) into one long column
df = pd.DataFrame(dframe.set_index('tf_idf').stack())

Output:

Answer 2

Solved, using pivot:

plotDf.pivot(index='Date', columns='word', values='tf_idf').plot(ax=ax)
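To make the pivot call above concrete, here is a small self-contained sketch using the sample rows from the question. The conversion with pd.to_datetime and the pivot_table fallback are assumptions added for illustration; they are not part of the accepted answer. pivot() only reshapes, so it raises a ValueError if a (Date, word) pair occurs more than once, whereas pivot_table() aggregates duplicates.

```python
import pandas as pd

# Sample rows copied from the question.
plotDf = pd.DataFrame({
    "Date": ["2015-01-02", "2015-01-02", "2015-01-17", "2015-01-17"],
    "word": ["grout", "rhrsw", "bfn0eq", "rhrsw"],
    "tf_idf": [0.0016774129439329863, 0.0015957287173067212,
               0.0026125536132961145, 0.001473748192115757],
})
# Assumption: real dates are convenient for a time axis when plotting.
plotDf["Date"] = pd.to_datetime(plotDf["Date"])

# One row per date, one column per word, tf_idf as the cell value.
wide = plotDf.pivot(index="Date", columns="word", values="tf_idf")
print(wide)

# Hypothetical case: the same (Date, word) pair appears twice.
dup = pd.concat([plotDf, plotDf.iloc[[0]]], ignore_index=True)
try:
    dup.pivot(index="Date", columns="word", values="tf_idf")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table() aggregates the duplicates instead (mean chosen here as an example).
wide_mean = dup.pivot_table(index="Date", columns="word", values="tf_idf", aggfunc="mean")
```

Calling wide.plot() then draws one line per word; passing ax=ax, as in the answer, requires an existing matplotlib Axes object.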

Answer 1 (0, 2019-12-18 18:07:29)

Answer 2 (0, accepted, 2019-12-18 20:00:10)