Right now, in most companies, there is a crisis of Business Intelligence dashboards. For good reasons, BI practitioners use the dashboard as a way to deliver data and analytics to pretty much everyone. Dashboards are the hammer for every nail, and as a result most companies have hundreds or thousands of dashboards. Once you have the dashboard you need, using it can be a joyful experience. But navigating the swamp of dashboards is a mess. It is not easy for someone seeking to use data to find what a company knows about itself.
There are several ways that this challenge is being attacked through catalogs and search and other brute force approaches. Two of the most interesting are Metric Insights and Roambi. Metric Insights has created a KPI warehouse and a collaborative filtering system that makes it easy to consume lots of dashboards, track what is happening on them, and get suggestions about new dashboards that may be helpful. Roambi moves dashboards to an optimized mobile user experience that organizes them and makes them easier for beginners to find and consume.
While these approaches certainly provide a way to manage the mess, they also give up on the central question I want to ask: Do we really need so many dashboards?
My answer is no. In fact, if we can figure out a way to have fewer dashboards, each of which answers more questions (which is essentially the victory of QlikView and Tableau), we can not only reduce the number of dashboards but also bring a whole new population of users to BI.
In two previous articles I’ve explained key ideas related to how to expand BI adoption. In “How Semantics Can Make Data Analysis Work Like Google Search” I explained the key role that natural language can play in combination with semantic models. In “Why Top Of The Funnel BI Will Drive The Next Wave of Adoption”, I set forth what needs to happen for BI to expand to tackle the next wave of adoption.
This article will focus on why combining natural language and semantics can create dashboards that can answer many questions so that we can get by with many fewer of them.
Here’s the logic in 7 statements:
- Most of the ways that BI is consumed are based on static queries or models of information that are created by hand.
- Even when these models are configurable, such as parameterized reports or dashboards, severe limits are placed on the amount of data included and the way it can be displayed.
- Systems such as QlikView and Tableau have dramatically expanded the size of the model that can be represented using a single interface, but even these systems are still based on static models.
- The amount of available data will become overwhelming, and manual modeling just won't be able to keep up.
- Therefore, the next generation of BI will be powered by automatically created semantic models.
- Natural language provides the simplest way for users to express their desires in terms of a question or a request.
- Many systems (IBM Watson, Microsoft’s Q&A in Power BI for Office 365, DataRPM) have shown that natural language can be parsed and connected to semantic models and lead to visualizations that can then be refined.
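The chain in the statements above can be sketched in miniature. This toy example (all names and the model are hypothetical, not any vendor's actual API) matches a natural-language question against a small semantic model and emits a visualization spec:

```python
import re

# A toy semantic model: entities with their dimensions and metrics.
# Entirely hypothetical -- a real model would be discovered or hand-built.
SEMANTIC_MODEL = {
    "sales": {"dimensions": ["region", "product", "month"],
              "metrics": ["revenue", "units"]},
}

def answer(question):
    """Map a question to (entity, metric, dimension) and pick a chart type."""
    words = set(re.findall(r"[a-z]+", question.lower()))
    for entity, model in SEMANTIC_MODEL.items():
        if entity not in words:
            continue
        metric = next((m for m in model["metrics"] if m in words), None)
        dim = next((d for d in model["dimensions"] if d in words), None)
        if metric and dim:
            # A time-like dimension suggests a line chart; others a bar chart.
            chart = "line" if dim == "month" else "bar"
            return {"entity": entity, "metric": metric,
                    "dimension": dim, "chart": chart}
    return None

print(answer("show sales revenue by region"))
# {'entity': 'sales', 'metric': 'revenue', 'dimension': 'region', 'chart': 'bar'}
```

Real systems do far more (synonyms, intent ranking, refinement), but the shape is the same: question, semantic model, visualization.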
My view is that statements 1 and 2 are not controversial. Statement 3 is important because it shows that when you expand the amount of information that a dashboard provides, people can more easily find and discover their own semantic relationships. In other words, when you can see lots of data and how it is connected, you are going to have a higher probability of seeing important patterns. QlikView achieves this through its associative model. Tableau enables it by making so much data visual and explorable, using its VizQL technology. But in both of these cases the semantic model is mostly in the mind of the user until it is captured.
“In one well-constructed dashboard, QlikView can meet the needs of users with dozens of different perspectives,” said Donald Farmer, VP of Product Management of QlikView. “We accelerate the pace at which the meaning of the data can be understood. When dashboards are built to answer one question the feeling is like having to move from app to app to app on a mobile phone. It can be frustrating.”
But as statement 3 points out, the models used by QlikView and Tableau and pretty much every other dashboard are hand crafted. The question central to statement 4 is: Can semantics be discovered?
We know that at some level this is true. Google is able to discover semantic relationships statistically and through patterns. I'm not sure what deep magic goes on in Watson, but it clearly has a massive semantic model, one too big to be completely hand crafted.
I spoke with Quentin Clark, Corporate Vice President, Data Platform Group, Microsoft, about this and he suggested that hand-crafting can go a long way. He pointed out that Excel allows semantic relationships between data to be captured in a portable way. Clark explained that the work that has gone into Power BI for Office 365 allows very large data sets to be processed. In turn, the data sets can be wide as well. Clark said that the analysis of Microsoft's corporate financials is powered by an Excel-based semantic model.
The semantic models can be uploaded from Excel to Microsoft’s Analysis Server where they can be used in data warehouse and other applications and made more of a central resource.
But if Microsoft’s hand-crafted approach is successful, it means that semantic modeling will become the bottleneck. Is this really the best we can do?
Another approach to semantic models is to discover them. QlikView's approach to master data management is designed to examine the way data is used around a company and unearth inconsistencies in fundamental definitions. It can be a problem if revenue, or income, or days sales outstanding is computed differently by separate parts of a company unless everyone is aware of the variations. But such research, which is enabled by QlikView's Expressor product, can also find where large hand-crafted semantic models have been created that can be turned into shared resources.
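The inconsistency hunting described above can be illustrated in a few lines. This is a sketch, not Expressor's actual mechanism, and the inventory of definitions is invented: collect the formula each group uses for a named metric, then flag any name that is defined more than one way.

```python
from collections import defaultdict

# Hypothetical inventory of metric definitions gathered from around a company:
# (group, metric name, formula as used by that group).
definitions = [
    ("finance",   "revenue", "SUM(invoiced)"),
    ("sales",     "revenue", "SUM(booked)"),
    ("marketing", "revenue", "SUM(invoiced)"),
    ("finance",   "dso",     "receivables / (revenue / 365)"),
]

def find_conflicts(defs):
    """Return metric names that are computed differently by different groups."""
    formulas = defaultdict(set)
    for _group, name, formula in defs:
        formulas[name].add(formula)
    return {name: sorted(fs) for name, fs in formulas.items() if len(fs) > 1}

print(find_conflicts(definitions))
# {'revenue': ['SUM(booked)', 'SUM(invoiced)']}
```

Here "revenue" is flagged because sales books it differently than finance and marketing do, which is exactly the kind of variation everyone needs to be aware of.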
DataRPM has bet its company’s future on its ability to achieve victory with respect to statement 5 and automatically create semantic models that collect the data that represent a useful context. It is not likely that in the near- or medium-term future a master semantic modeler will emerge that makes sense of all data, but that isn’t required.
DataRPM and others who are attempting this feat don’t have to succeed 100 percent to provide a huge amount of value. It is safe to say that people will be hand-crafting semantic models for a long time. But if it is possible to create useful models faster, either automatically or with a machine learning assist, then lots more data can be brought to the attention of users in a single interface.
This is even more important in the age of Big Data: data sources multiply every day, and each new source brings new data formats. If those formats can be automatically understood and mapped to existing data using algorithms, via an automatically created semantic model, the end user gains tremendous power of instant data discovery and analysis. At the same time, it cuts down the time, effort, and cost of manually defining data relationships. The challenge, of course, is quality. DataRPM will be the next QlikView or Tableau if its automatic models allow users to find the answers they need.
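One crude way to automatically map a new data source onto an existing model, as the paragraph above envisions, is name similarity between columns. This sketch uses string similarity only; the columns and the 0.6 threshold are assumptions, and a production system would also profile values and types:

```python
from difflib import SequenceMatcher

# Columns already known to the semantic model, and columns from a new source.
known_columns = ["customer_id", "order_date", "revenue", "region"]
new_columns = ["cust_id", "orderdate", "revenue_total", "flavor"]

def map_columns(new, known, threshold=0.6):
    """Pair each new column with its closest known column, if similar enough."""
    mapping = {}
    for col in new:
        best = max(known, key=lambda k: SequenceMatcher(None, col, k).ratio())
        score = SequenceMatcher(None, col, best).ratio()
        if score >= threshold:
            mapping[col] = best          # confident enough to link them
    return mapping

print(map_columns(new_columns, known_columns))
```

Here "cust_id", "orderdate", and "revenue_total" land on their obvious counterparts, while "flavor" stays unmapped for a human to resolve, which is where the quality challenge lives.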
It is important to recognize that companies like QlikView and Tableau are extremely well positioned to take advantage of such automated modeling when it arrives in high quality form from DataRPM or IBM or whoever makes it available. The output of such automated modelers could easily be associative data models and VizQL. It is also true that Microsoft and traditional BI players could use such models, for example, to create an SAP Business Objects Universe.
So let’s say we have the ability to automatically or easily create a semantic model. We would still need a way for users to express their desires. Statement 6 asserts that a natural language interface would be the simplest and most natural way to meet this need. Given how many people are familiar with the way that Internet search works through a natural language interface, it is likely that such an approach would be popular and would be a huge step forward for TOFU BI, if it produced satisfying results.
DataRPM and Microsoft’s Q&A in Power BI for Office 365 are enabling TOFU BI with their natural language question and answer interface powered by semantic data modeling. DataRPM is focused on automatically creating models and supporting search with computational search. Microsoft’s approach, which relies more on hand-crafted semantic models, is embedded in Office 365.
But the approach taken is far less important than the quality of the experience in guiding people to answers. The key to producing satisfying results is to determine the "BI intent," as Clark says, from the language, map that to the semantic model, and create a visualization that is a starting point. Then, as statement 7 asserts, the key is to provide suggestions for how to improve that visualization based on the semantic model.
“It is key to automatically and in real-time discern which are the representative entities in the data and which attributes in those are dimensions, which are metrics, and which are time series from across the different available data sources to be able to make instant suggestions and enable data discovery based on the algorithmically identified entity relationships”, says Ruban Phukan, Chief Product Officer and Co-Founder of DataRPM.
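The distinction Phukan draws between dimensions, metrics, and time series can be illustrated with simple type heuristics. This is a rough sketch of what such column profiling might look like; the sample rows are invented, and real profilers use much richer statistics:

```python
from datetime import date

def classify_columns(rows):
    """Heuristically label each column as 'time', 'metric', or 'dimension'."""
    labels = {}
    for col in rows[0]:
        values = [r[col] for r in rows]
        if all(isinstance(v, date) for v in values):
            labels[col] = "time"        # candidate time series axis
        elif all(isinstance(v, (int, float)) for v in values):
            labels[col] = "metric"      # numeric, can be aggregated
        else:
            labels[col] = "dimension"   # categorical, used for grouping
    return labels

# Invented sample rows from a hypothetical data source.
rows = [
    {"day": date(2014, 1, 1), "region": "east", "revenue": 120.0},
    {"day": date(2014, 1, 2), "region": "west", "revenue": 95.5},
]
print(classify_columns(rows))
# {'day': 'time', 'region': 'dimension', 'revenue': 'metric'}
```

Once columns are labeled this way, the suggestions Phukan describes fall out naturally: metrics plotted over the time column, or broken down by each dimension.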
Clark points out that it is vital to be able to tune the vocabulary to match the way a company naturally thinks about certain issues.
With a semantically driven, natural language interface, TOFU BI is realized. As long as the data and semantic model have some fit to the context, users should be able to enter their desires and the data will come to them in a form that will answer more questions than any other approach. While this won't work perfectly right away, my prediction is that just getting it partially right will have a huge impact.
Dan Woods is CTO and editor of CITO Research, a publication that seeks to advance the craft of technology leadership. For more stories like this one visit www.CITOResearch.com. Dan has done research for QlikView and Tableau and many other Business Intelligence, big data, and analytics companies.