What is the best data format for a time series in a Python Visualization in Power BI?
As of today, August 9 2018, Power BI supports Python Visualizations. They've supported R Visualizations for a while, but I still find these integrations a bit awkward. Let me show you what I mean:
Let's say that you have a table with time series data, where the top row contains the names 'Date' and 'Value', and the contents are dates of the form yyyy-mm-dd and a number, respectively:
Date,Value
2017-01-12,1
2017-01-13,4
2017-01-14,2
2017-01-15,4
2017-01-16,2
2017-01-17,2
2017-01-18,2
2017-01-19,5
2017-01-20,5
2017-01-21,5
2017-01-22,5
2017-01-23,6
2017-01-24,3
2017-01-25,6
2017-01-26,6
2017-01-27,5
2017-01-28,8
2017-01-29,4
2017-01-30,2
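For reference, the same table can be loaded and checked with pandas outside Power BI. This is only a minimal sketch: it assumes pandas is installed, inlines just the first five rows above, and uses illustrative variable names (csv_text, df) that are not part of Power BI.

```python
import io

import pandas as pd

# First five rows of the table above, inlined so the sketch is self-contained
csv_text = """Date,Value
2017-01-12,1
2017-01-13,4
2017-01-14,2
2017-01-15,4
2017-01-16,2
"""

# parse_dates turns the yyyy-mm-dd strings into datetime64 values
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# Inspect the column types and the parsed rows
print(df.dtypes)
print(df.head())
```

Checking the dtypes this way shows whether the dates were read as real dates or left as plain strings.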
If you store that dataset as a text file like timerseries.csv and import it using Get Data | Text/CSV, you get a table under VISUALIZATIONS | FIELDS, like this:
You can inspect your table using VISUALIZATIONS | Table and get:
With this setup, you would think you were all set to unleash the power of a Py VISUALIZATION using this beautiful new feature:
If you click that, you get this:
And you're told to
Drag fields into the Values area in the Visualization pane to start scripting
If you start with Value, you get this default setup in the editor:
And if you follow the instructions given by the Power BI team in the August 2018 feature summary, you should be able to make a matplotlib plot quite easily.
But this is where it ends for me for the time being.
If the default dataframe in the editor shares the features of a standard pandas dataframe, you should be able to reference a column in that dataframe and easily make a plot with this snippet:
import matplotlib.pyplot as plt
plt.plot(dataset['Value'])
plt.show()
But when you run it, it only returns an error:
And the details are elaborate, to say the least.
I've also tried to import both Date and Value, and I've tried plotting the dataframe directly with dataset.plot(), but nothing seems to be working. I've also tried stripping the date hierarchy down to simple dates this way:
So, any ideas on the data format, import method and/or the snippet?
Thank you for any suggestions!
EDIT 1 - Following the answer from Foxan Ng:
Add both columns in the Values field:
This still returns an error ending with:
TypeError: from_bounds() takes 4 positional arguments but 6 were given
I didn't encounter the errors that you've mentioned. Have you dropped both columns into Values?
import matplotlib.pyplot as plt
plt.plot(dataset['Date'], dataset['Value'])
plt.show()
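If the Date column reaches the script as text rather than as dates (an assumption; it depends on how the column is typed in Power BI), converting and sorting it before plotting is a cheap safeguard. Below is a minimal sketch, not a confirmed fix for the error in the question. The dataset frame is a hand-built stand-in for the one Power BI provides, prepare is a hypothetical helper, and the Agg backend line is only there so the sketch runs outside Power BI without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: only needed when running outside Power BI

import matplotlib.pyplot as plt
import pandas as pd

# Hand-built stand-in for the `dataset` DataFrame that Power BI passes to the script
dataset = pd.DataFrame({
    "Date": ["2017-01-14", "2017-01-12", "2017-01-13"],
    "Value": [2, 1, 4],
})


def prepare(df):
    """Hypothetical helper: return a copy with real datetimes, sorted by date."""
    out = df.copy()
    out["Date"] = pd.to_datetime(out["Date"])
    return out.sort_values("Date").reset_index(drop=True)


prepared = prepare(dataset)

# Plot the values against the parsed dates
fig, ax = plt.subplots()
ax.plot(prepared["Date"], prepared["Value"])
fig.autofmt_xdate()  # tilt the date labels so they do not overlap
# Inside Power BI, finish with plt.show() as in the snippet above
```

Sorting matters because a line plot connects points in row order, so unsorted dates would produce a zig-zag line.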
UPDATED with M query: 用M查询更新:
let
    Source = Csv.Document(File.Contents("C:\your-directory..\timerseries.csv"),[Delimiter=",", Columns=2, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    #"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    #"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Date", type date}, {"Value", Int64.Type}})
in
    #"Changed Type"