I am trying to run a validated Python script to impute data in Power BI. The data is originally consolidated in Power BI, then exported to Excel, and then imputed and analysed with Python.
Now I would like to run the Python code in Power BI's query editor, so that I get the imputed data directly in Power BI and can use its visualizations, but I get errors.
I tried pasting the same code I use in Python into Power BI. I think there might be an issue with the syntax.
dataset=#"PreviousStep"
import pandas as pd
byISO = dataset.groupby(['country ISO'])
byIG = dataset.groupby(['WBG Income Group'])
bytIG = dataset.groupby(['WBG Income Group','Year'])
bytR = dataset.groupby(['UN Sub-Region','Year'])
# Country-level
# Filling up and down
dataset[['col1','col2']] = byISO[['col1','col2']].fillna(
    method='ffill')
dataset[['col1','col2']] = byISO[['col1','col2']].fillna(
    method='bfill')
# Interpolation
dataset[['col1','col2']] = byISO[['col1','col2']]\
    .apply(lambda i: i.interpolate(method='linear', limit_area='inside'))
# Extrapolation (FILLING DOWN CURRENTLY)
dataset[['col1','col2']] = byISO[['col1','col2']]\
    .apply(lambda i: i.interpolate(method='linear', limit_area='outside'))
# Median
dataset[['col1','col2']] = byISO[['col1','col2']]\
    .transform(lambda i: i.fillna(i.median()))
# Group-level
# Median
dataset[['col1','col2']] = byIG[['col1','col2']]\
    .transform(lambda i: i.fillna(i.median()))
# Yearly median
dataset[['col1','col2']] = bytIG[['col1','col2']]\
    .transform(lambda i: i.fillna(i.median()))
# Region-level
# Yearly median
dataset[['col1','col2']] = bytR[['col1','col2']]\
    .transform(lambda i: i.fillna(i.median()))
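# No level (All)
# Fill anything still missing with 0 (assign the result back, otherwise it is discarded)
dataset[['col1','col2']] = dataset[['col1','col2']].fillna(0)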
I expect a table with imputed values, but I get this error as a result instead:
DataSource.Error: ADO.NET: Python script error.
Traceback (most recent call last):
File "PythonScriptWrapper.PY", line 2, in <module>
import os, pandas, matplotlib.pyplot
File "C:\Users\GEscamilla\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\__init__.py", line 19, in <module>
"Missing required dependencies {0}".format(missing_dependencies))
ImportError: Missing required dependencies ['numpy']
Details:
DataSourceKind=Python
DataSourcePath=Python
Message=Python script error.
Traceback (most recent call last):
File "PythonScriptWrapper.PY", line 2, in <module>
import os, pandas, matplotlib.pyplot
File "C:\Users\GEscamilla\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\__init__.py", line 19, in <module>
"Missing required dependencies {0}".format(missing_dependencies))
ImportError: Missing required dependencies ['numpy']
ErrorCode=-2147467259
ExceptionType=Microsoft.PowerBI.Scripting.Python.Exceptions.PythonScriptRuntimeException
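The last line of the error output tells you what is wrong:
ImportError: Missing required dependencies ['numpy']
pandas cannot load numpy. Note that the traceback points at line 2 of PythonScriptWrapper.PY, so the failure happens when Power BI's wrapper script imports pandas, before your own code runs. As @prathik says in the comment, you can try importing numpy along with your other import statements (Microsoft's documentation has an example):
import numpy
If that does not work, make sure numpy is installed in the Python environment that Power BI is using (here, the Anaconda installation shown in the traceback):
pip install numpy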
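To check whether the packages are visible to a given interpreter, something like the following minimal sketch can help. It uses only the standard library; the helper name `is_importable` is made up for this example, and it assumes you run it with the same Python installation that Power BI is configured to use.

```python
import importlib.util
import sys


def is_importable(name):
    """Return True if a top-level module called `name` can be located."""
    return importlib.util.find_spec(name) is not None


# Show which interpreter is running, so it can be compared with the
# Python home directory configured in Power BI.
print("Interpreter:", sys.executable)

# These are the packages imported by the wrapper script in the traceback.
for package in ("numpy", "pandas", "matplotlib"):
    status = "found" if is_importable(package) else "MISSING"
    print(package, status)
```

Note that this only checks whether a package can be located, not whether it loads without errors. If numpy is reported as missing here but works in your usual Python session, the two are probably different installations.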
The bigger picture
You should consider running the script upstream of the dashboard, so that the transformed data can be used by other dashboards as well.
Usually I would recommend doing all data transformations in a data warehouse, or in a data mart built for a specific purpose. However, this depends on whether this is a one-time exercise or something you are going to use in production.
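As a rough illustration of keeping the transformation separate from the dashboard, the imputation can be wrapped in a plain function that any upstream job could call. This is only a sketch, not the asker's full pipeline: the function name `impute` and the toy data are invented for the example, and it covers just three of the steps from the question (fill within group, group median, then 0).

```python
import pandas as pd


def impute(df, group_col, value_cols):
    """Return a copy of df with missing values in value_cols filled in."""
    out = df.copy()
    # Step 1: fill down, then up, within each group.
    out[value_cols] = out.groupby(group_col)[value_cols].transform(
        lambda s: s.ffill().bfill())
    # Step 2: fill what is left with the group median.
    out[value_cols] = out.groupby(group_col)[value_cols].transform(
        lambda s: s.fillna(s.median()))
    # Step 3: anything still missing (groups with no data at all) becomes 0.
    out[value_cols] = out[value_cols].fillna(0)
    return out


# Toy data: country A has one gap, country B has no observations.
df = pd.DataFrame({
    "country ISO": ["A", "A", "A", "B", "B"],
    "col1": [1.0, None, 3.0, None, None],
})
result = impute(df, "country ISO", ["col1"])
print(result)
```

The result of such a function could then be written to a table or file that Power BI and other reports read, instead of being recomputed inside each dashboard.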