In our previous post, we installed and set up Python to use with Power BI and used Python as a data source. Let’s look at how Python can be used to filter data inside Power Query.
Let’s filter records where the number of Employees is greater than 5000. We will use the query:
# 'dataset' holds the input data for this script
import pandas as pd
dataset_filtered = dataset.query('Employees > 5000')
It should look like below. Click OK:
You may get this message regarding privacy levels. Per the Power BI documentation, “For the Python scripts to work properly in the Power BI service, all data sources need to be set to public”. Click Save:
The script returns a list of the data frames it created. Click on dataset_filtered:
Your dataset will now be filtered accordingly:
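If you want to check the filter logic outside Power BI first, a minimal sketch like the one below can help. It assumes a small made-up data frame (the Company names and Employees values are hypothetical, not from the dataset used in this post) standing in for the 'dataset' variable that Power Query normally supplies. It also shows a boolean-mask filter that should give the same rows as the query call.

```python
import pandas as pd

# Hypothetical sample data; inside Power BI, 'dataset' is supplied
# automatically by the previous query step.
dataset = pd.DataFrame({
    "Company": ["Alpha", "Beta", "Gamma", "Delta"],
    "Employees": [1200, 5000, 7500, 64000],
})

# Same filter as in the post: keep rows with more than 5000 employees.
dataset_filtered = dataset.query("Employees > 5000")

# Equivalent boolean-mask form of the same filter.
dataset_masked = dataset[dataset["Employees"] > 5000]

print(dataset_filtered)
```

Note that the comparison is strict, so a row with exactly 5000 employees is excluded. Use >= in the expression if you want to keep it.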
In the next post, we will look at using the Python visualization in Power BI.
CARL
Thanks a lot for sharing this, it has been really helpful! I’ve followed these steps and it works for most of my dashboards. However, for one of them, whenever I try to apply the changes from the query editor, I always get the following error:
DataSource.Error: ADO.NET: A problem occurred while processing your Python script.
Here are the technical details: Process must exit before requested information can be determined.
Details:
DataSourceKind=Python
DataSourcePath=Python
Message=A problem occurred while processing your Python script.
Here are the technical details: Process must exit before requested information can be determined.
ErrorCode=-2147467259
ExceptionType=Microsoft.PowerBI.Scripting.Python.Exceptions.PythonUnexpectedException
I have made sure all my data source privacy levels are ‘Public’. Do you know what might be causing this error? Thanks a lot.