Power BI dataflows (PBI dataflows) is a powerful data prep tool for you to transform data and reuse them in other places downstream. At the same time, while Excel is very popular among business users, there is still, at the time of writing this post, no out-of-the-box connector to import PBI dataflows to Excel. This post aims to solve that pain point.
This post also proposes an alternative to the existing PBI dataflows connector to Power BI Desktop to push the users’ access control to the table level, instead of workspace level like the current connector.
As mentioned, unlike Power BI, Excel doesn’t have the PBI dataflows connector yet. The good news is that you can overcome this limitation. Most of the configurations needed are already arranged in the M scripts hosted on my GitHub repo. Besides, you need to make sure that:
- Your PBI dataflows are stored in an ADLS Gen2 account of your organization. Details about the integration can be found here;
- You are the owner of that ADLS Gen2 account, so you can modify the ACLs for end-users;
- You have already created one dataflow, and the data is already stored in the ADLS Gen2 account. Refer to this Microsoft document for more guidance.
That’s it! Let’s begin with how to make it done.
THE POWER QUERY PART
You can download the Excel template on my GitHub repo. When you open the Power Query Editor, you will see the Power Query structure screenshot above. There are several elements that you should know:
- prADLSURL is the URL of ADLS Gen2’s folder including the data of the dataflow. I will explain where to find it later in the post;
- ColumnsAttribute is a staging table which contains the metadata about the column names and data types of the Table1, Table2, etc.;
- Table1, Table2 contain the data in the tables of PBI dataflow;
- LatestDateRefreshed shows the latest refreshed date and time of the underlying PBI dataflow, so the end-users know if the data is up-to-date as the screenshot below.
When you download the “PBI dataflow in Excel.xlsx” file, all sheets are blank at the beginning. You need to add your own values in the Power Query. There may be some warning/error pop-up windows, but just close them all.
To my surprise, although Power Query is a very powerful tool in Excel, opening the tool is not always obvious. The screenshot below shows one of some ways to open it in the Excel file that you just downloaded.
- Select the LatestDateRefreshed sheet
- Choose cell A2, which is part of the loaded table from Power Query
- The Query tab appears.
- Choose the Query tab and click Edit
- The Power Query Editor window will be opened.
The next step is to find the URL of your PBI dataflow inside your ADLS Gen2 account.
- Option 1: One way is to open Microsoft Azure Storage Explorer. As you see in the screenshot below, you have to open the container called “powerbi” (default name by Microsoft and is unchangeable). Next, open your PBI Workspace name’s folder, then your PBI dataflow’s folder. You right-click and choose Copy URL.
- Option 2: Another way is to input your own values into this URL pattern: https://yourADLSGen2AccountName.blob.core.windows.net/powerbi/yourWorkspaceName/yourDataflowName/. If there are spaces in your values, the URL is valid only if you replace those spaces with “%20” escape codes. But good news is that you can ignore it, as it is already taken into account in the M script.
After you have the URL, you can replace it with the current value in the prADLSURL parameter. You also have to replace the names of your tables in the varTableName step (see screenshot below).
If you have more tables, simply repeat the steps. You can add as many tables in the PBI dataflow as possible just by copying Table1 and adjusting the value for the varTableName variable as in the previous step. You can also rename Table1 and Table2 in the Queries and Sheet names to make them more meaningful.
After all these steps, click “Close & Load” in the top left corner to load the data from PBI dataflow to Excel. So now half of the task is done! You can already send this file to the end-users. They can already use the Excel file to create pivot tables to perform their own analyses. However, it’s more useful if they can refresh the Excel file by themselves to get the latest data while updating their pivot tables and reports in Excel. To achieve that, there is an extra task to do.
THE PERMISSION PART
ADLS Gen2 provides many ways to share the data. It’s out of the scope of this article, but you can read more here. In this article, I will focus only on Access Control Lists (ACLs), which is very useful in this use case. That’s because we need to apply fine-grained access to folders and files under the powerbi container, which contains the data and metadata of the PBI dataflow in question. With ACLs, you don’t even need to assign any Azure Role-based access control role (Azure RBAC) to the users. Thus, ACLs are more than enough.
You can grant permission for end-users to refresh (read) the data via Microsoft Azure Storage Explorer. One of the steps is to “Propagate Access Control Lists”, and it seems to me that you cannot do it in Azure Portal yet.
Within the ADLS Gen2 account, you right-click on the powerbi default container and choose “Manage Access Control Lists…” as shown below.
In the next window, you need to add a security principal. It can be a user, security group, managed identity,… but it’s good practice not to add individual users as it may be difficult to maintain later on. The purpose of this is to grant Execute permission, so it can list (not read or write) all the folders and subfolders under the powerbi container. We don’t grant the Read/Write permission here, as there can be several folders (which are workspaces), and you may not want this security principal to read/write the data of all workspaces.
The Access box should be checked. Tick the Default box, if you want that security principal to get the same Execute permission when new folders are created.
The permission you have just granted only affects the powerbi container, but not its children directories. To propagate the same Execute permission to all existing children, right-click the powerbi container again, and choose “Propogate Access Control Lists…”, tick the box “I understand that propagating ACLs cannot easily be reversed.”, and click OK.
The next step is to choose the workspace folder containing the PBI dataflow in question. Then, right-click again to open the “Manage Access” window. This time, add to the same security principal as in the previous step, but with more permission. As you can see in the screenshot below, I add the Read permission for both Access and Default (the Execute permission was already ticked from the previous “Propagate Access Control Lists” step). At this step, you have already allowed end-users to read all PBI dataflows within that workspace, which is equivalent to the PBI dataflows connector in PBI Desktop.
However, this solution is even more flexible than the existing connector in PBI Desktop, with which you can limit the end-users to read data of a specific PBI dataflow, or even just a table in that PBI dataflow. You do that by adding the extra Read permission like mentioned above, this time not for the workspace folder, but for the folder/subfolder related to that PBI dataflow/table in question. You can even grant the permission to read only one file, which is the finest grain level the ACLs can reach.
As you may guess, after you add extra Read permission to the folder/subfolder, you have to repeat the “Propagate Access Control Lists”, so it applies to all of its existing children.
Even when Microsoft introduces the out-of-the-box PBI dataflows connector in Excel, this method is still relevant thanks to the 2 justifications below:
- The access control offered by the current out-of-the-box PBI dataflows connector in PBI Desktop is not fine-grained enough. This means the end-users can see all dataflows if they have Viewer access to that workspace. It can be predicted that the future solution in Excel will have this limitation as well.
- The technique proposed in this article can replace the existing out-of-the-box PBI dataflows connector in PBI Desktop. With this technique, you can give the users access to even a specific table of a specific PBI dataflow, instead of everything included in the workspace.
In case, for some reason, you cannot see the latest refreshed date, you can open the “Queries & Connections” window and refresh that “LatestDateRefreshed” query manually, as instructed in the screenshot below.
One thought on “How to connect Power BI Dataflows to Excel”
Reblogged this on ERP and BI.