Powerful Web Scraping in Power BI for Financial Analysis

For this project I wanted to pull a large financial dataset from a public resource so I could build Dax formulas for financial analysis. I wanted five years of stock price history and five years of financial reporting to work with from multiple companies. The project went through several stages before I was satisfied with the dataset. All data for this report has been loaded using Power BI web scrapping.

Research

I found Chang Yue Sin’s article posted at The Babylonians very helpful for pulling the financial Income Statement, Balance Sheet, and Cash Flow out of Reuters. His sample dashboard is well organized and his suggested list of Dax formulas is pretty complete for the task. However, when it came to stock options his suggestion wasn’t dynamic enough for me to post the solution and let it run without any addition maintenance required.

I created Chang’s report and published it in my workspace, below, if you want to check it out.

Alternate approaches

My own effort to pull stock price history first met with failure. Yahoo limited me to one hundred lines of data when pulling stock history. I wanted to be able to pull over five thousand lines. Obviously, this wasn’t going to be enough. I turned to R script and tested several solutions before abandoning this path as using R requires refreshing the data from the client side in RStudio. Again, not a fully automated outcome. I have the sample report that pulls five years stock history, below.

The Solution

Fortunately, through dogmatic goo-gamming I found this great video from Conservational Business Intelligence that demonstrated an alternate way of scraping stock price history- and capable of going back five years. I followed his link back to Steve Campbell and his article Historical Stock Price Function in Power Query. After analyzing the query I decided to extend it’s capability by adding open price, high, low, close and volume to the adjclose and scaling up the query to pull from multiple companies at once. Then I enhanced the power queries pulling the financial data to scrape the same company list.

EDIT* The final step once I published the report was to schedule an automated daily refresh. Here, I discovered that currently dynamic queries cannot be refreshed. So, to update the report the dataset in Power BI will have to be manually updated. Looking for dynamic dataset refresh from Microsoft now…

Outcome

Finally, I built my own version of the dashboard.

Last, and certainly the most fun- I set up the portfolio page for this project and tested some video interactive features that comes with this WordPress theme that I think are just slick.

Conclusion

Web scraping is definitely a feature you can lean on if you need to pull in some public data to help your analysis and you don’t want to have to manage excel sheets, or involve IT in pulling data into your Data Warehouse or DataMart.

Additional resources:

Leave a Reply