(User) Stories for Analytics Projects – Part 2

You can’t spend long in agile project management without encountering the term “user story”. I want to use this two-part article series to summarize how the concept, which originated in software development, can be applied to analytics projects, also known as business intelligence and data warehousing. In the first part, I introduced the basics of stories. Now I would like to explain how we can structure or tailor stories to suit our circumstances in analytics systems and development projects and meet these three principles:

  1. The story is the smallest planning unit I use to plan a project.
  2. The story covers a timespan limited to one or two working days.
  3. A story always has an end-to-end characteristic. In analytics, this often means that a story has a data source as its start and a form of data evaluation as its end.

From feature to story

Let’s take a look at this graphic and go through it point by point:

On the left, we see the simplified architecture stack of an analytics solution. For requirements elicitation and subsequent agile project structuring, the three levels shown are sufficient in most cases:

  • BI application: These are the most visible aspects of an analytics solution. They incorporate information products of all kinds, from reports to dashboards and data interfaces to other systems. Often, the finished product at this level is composed of data from several entities on the level below, such as data from a sales data mart and a logistics data mart.
  • DWH or data basis: Here I deliberately combine the core data warehouse and data mart levels. Even if they are technically different things, a core DWH alone hardly adds any value when it comes to cutting practical stories. It’s only through the implementation of initial key figures and the application of case-specific business rules that the data really becomes useful.
  • Connectivity and infrastructure: Developing a new data source can be a costly business. The same applies to providing the initial infrastructure. Clearly, a maximum implementation time of one to two days per story requires an established infrastructure. But this also means that we need our own stories to develop this foundation.

In many cases, an analytics solution consists of all three levels. At this point, the technology plays a subordinate role. For instance, the DWH aspect might be mapped purely virtually in the BI front-end tool. Nevertheless, it must be considered equally in terms of content (cf. also the model of T-shirt size architecture). These three levels represent for me the macro variant of end to end, from the actual source system to the finished information product. At this macro level, I find it useful to speak of “features”. A feature is formulated analogously to a user story; you can find an example in the upper right corner of the graphic. A feature is often implemented within two to three weeks. But this is too long for the daily monitoring of success in a project. A feature must then be cut further into “epics” and stories on the three levels of BI application, DWH, and connectivity and infrastructure. Epics are merely a means of bracketing stories together. Let’s assume that Feature 1 in the fictitious club dashboard introduced in Part 1 can be outlined in four stories at the BI application level. Then I would gather these four stories into an epic to emphasize the coherence of the stories.

The central aspect of the division into the three architectural levels is that each level also has end-to-end characteristics, but this time at the micro level. Moreover, these three levels overlap somewhat. Let’s look at the club dashboard example and the story there: “As a TDWI back-office employee, I’d like a diagram showing the participants registered per round table in a city that I can select”. To implement this story, the underlying data model is our data source. If necessary, this data model can be adapted or extended as part of the story. The final product is part of the finished dashboard and potentially generates immediate benefits for the end user.

At the DWH level, we build on an established interface to the source system and a functioning data infrastructure. But what does the other end look like, where we also expect some form of evaluation? We don’t need to build a complete dashboard here; that is covered by a separate story at the BI application level. Instead, we can generate a simple analysis on top of the data model, for instance with Excel, and make the data visible. And immediately the product owner can start a conversation with the DWH developer about whether these initial results meet expectations.

This approach can even be applied at the connectivity level. Here, the starting point is the source system. In our example, it’s a fictitious, web-based system that allows either a CSV export or a web service query. In a first user story, a CSV export is loaded into the load layer of the DWH. The result the product owner sees is not just a table definition but a simple evaluation of the raw data. This enables the product owner to detect possible data quality problems early.
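
To make this connectivity story more tangible, here is a minimal sketch in Python of such a first load-and-evaluate step. The file name, table name and column names are invented for illustration, since the source system in the example is fictitious anyway.

# Minimal sketch: load the CSV export into a load-layer table and give the
# product owner a first evaluation of the raw data. All names are illustrative.
import sqlite3
import pandas as pd

raw = pd.read_csv("registrations.csv")              # hypothetical export from the source system

con = sqlite3.connect("dwh_load_layer.db")          # hypothetical load layer database
raw.to_sql("load_registrations", con, if_exists="replace", index=False)

# Simple evaluation instead of just a table definition
print("Rows loaded:", len(raw))
print("Missing values per column:")
print(raw.isna().sum())
print("Duplicate rows:", int(raw.duplicated().sum()))
print("Registrations per city:")
print(raw["city"].value_counts())                   # assumed column name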

As you can see from these example stories, I like to use the traditional story pattern “I (end user) in my role as <role> want <goal> so that <benefit>” at the feature and BI application levels. At the DWH and connectivity and infrastructure levels, I prefer an alternative pattern: <action> <result> <after/for/from/in> <object>.

Summary and outlook

In this second article on using stories in analytics, I showed how small stories can be reconciled with the demand for end-to-end stories. It’s important not to lose sight of the big picture, the feature, and to always work towards a concrete, visible analysis at the epic and story levels.
Even with this structure, it is still a challenge to tailor stories so that they can be implemented in one or two days. There are also many practical approaches to this—and material for another blog 😉

(This article was first published in German on my IT-Logix blog. Translated to English by Simon Milligan)

(User) Stories for Analytics Projects – Part 1

You can’t spend long in agile project management without encountering the term “user story”. I want to use this two-part article series to summarize how the concept, which originated in software development, can be applied to analytics projects, also known as business intelligence and data warehousing.

User stories, or stories in general, are a tool for structuring requirements. Roman Pichler summarizes their use like this:

 “(User) stories are intended as a lightweight technique that allows you to move fast. They are not a specification, but a collaboration tool. Stories should never be handed off to a development team. Instead, they should be embedded in a conversation: The product owner and the team should discuss the stories together. This allows you to capture only the minimum amount of information, reduce overhead, and accelerate delivery.”

To keep my explanations a bit less abstract, I’ll introduce a simple, fictitious case study. Let’s assume that a club like TDWI, of which I am an active member, wants a new club dashboard. For this purpose, the project manager has already sketched their first ideas as a mockup:

Basics

All user stories follow a particular pattern, of which there are different basic variations. The most common pattern is as follows: “I (end user) in my role as <role> want <goal> so that <benefit>”.

Some projects involve systems that aren’t designed for end users. Let’s assume we build a persistent staging area and a data interface from it that delivers integrated data to another system. In such cases, it seems a bit artificial to speak of “user stories”. Here’s an alternative pattern for formulating a story (source): <action> <result> <after/for/from/in> <object>.

One of the central questions when using stories is always how they are tailored. How broad should a story be? I follow three principles:

  1. The story is the smallest planning unit I can use to plan a project. Of course, you can also divide a story into tasks and technical development steps. But that’s up to the people who implement it. As long as we are at the level where the product owner, and thus a specialist, is involved, we remain at the story level.
  2. The story covers a timespan limited to one or two working days. This includes the specification of requirements (a reminder to have a conversation about it), their technical implementation, and integrated testing. This forces us to create small stories of a more or less uniform size. This principle makes it possible to measure the flow of implementation daily (by now, it should be clear that I don’t think much of story points, but that’s another story).
  3. A story always has an end-to-end characteristic, and it should provide something usable in the implementation of the overall system. In analytics, this often means that a story has a data source as its start and a form of data evaluation as its end.

Let’s take a more concrete look at this using the case study. First, we focus on a specific part of the dashboard, let’s call it Feature 1, where the number of participants is evaluated:

Independently of technological implementation, we can deduce two things here: a simple, analytical data model, such as a star schema, and the data sources required. From the principles formulated above, the question now arises as to what “end-to-end” means. In simple cases, it may well be that the scope of a story extends from the data source to the data model, including the actual dashboard, and that this scope can easily be implemented in one or two working days. But what if the technical development of the data source connectivity alone means several days of work? Or the calculation of a complex key figure requires several workshops?
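
Before turning to those questions, here is a minimal sketch in Python of what the deduced “simple, analytical data model” for Feature 1 could look like. All table names, column names and values are invented for illustration and are not part of the actual case study.

# Sketch of a minimal star schema for Feature 1: one fact table for
# registrations and one event dimension. Names are purely illustrative.
import pandas as pd

dim_event = pd.DataFrame({
    "event_key":  [1, 2],
    "event_name": ["Round Table Zurich", "Round Table Bern"],
    "city":       ["Zurich", "Bern"],
    "event_date": pd.to_datetime(["2020-03-05", "2020-03-12"]),
})

fact_registration = pd.DataFrame({
    "event_key":  [1, 1, 1, 2, 2],
    "member_key": [10, 11, 12, 10, 13],
})

# The key figure for the dashboard: participants per round table and city
participants = (fact_registration
                .merge(dim_event, on="event_key")
                .groupby(["city", "event_name"])
                .size()
                .rename("participants"))
print(participants)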

Particularly in larger organizations, I hear people say: “This end-to-end idea is nice, but it doesn’t work for us. We prefer to write ‘technical’ stories”. This means, for example: as a data architect, I model fact table XY and the event dimension; as an ETL developer, I load fact table XY and the event dimension; or as a data protection officer, I define the row-level security for fact table XY. Although each of these examples contributes to the overall system, on its own none of them delivers value to the end user, and their success is difficult to test. Another common response is: “This end-to-end story makes perfect sense, but it doesn’t work with a lead time of one to two working days. We just work on a story for two or three weeks”. That’s just as problematic. If a story takes several weeks to implement, then monitoring success and progress becomes very difficult. In the worst case, we don’t realize that we’re on the wrong track after one or two days but only after three or even four weeks.
Both technical stories and long stories should be avoided. How to do this exactly is explained in the second part of this series.

(This article was first published in German on my IT-Logix blog. Translated to English by Simon Milligan)

Positioning Architecture T-Shirt Sizes

In my previous post, I introduced the idea of architecture T-shirt sizes to express the idea that your BI architecture should grow with your requirements. In this blog post, I position the four example T-shirt sizes on Damhof’s Data Management Quadrants.

T-Shirt Sizes in the Context of Data Management Quadrants

If you haven’t read my previous blog post, you should do so first.

In [Dam], Ronald Damhof describes a simple model for the positioning of data management projects in an organization. Here, he identifies two dimensions and four quadrants (see also Figure 2). On the x-axis, Damhof uses the push-pull terminology known from business economics. It expresses how strongly the production process is controlled and individualized by demand. On the right or pull side, topic-specific data marts, and from them information products such as reports and dashboards, are developed in response to purely business-driven requirements. Agility and domain knowledge are the key here. The first two T-shirt sizes, S and M, can be categorized as belonging on this side. On the left or push side, the BI department connects various source systems and prepares the data in a data warehouse. The focus here is on economies of scale and deploying a stable basic infrastructure for BI in the company. Here we can see the other two T-shirt sizes, L and XL.

Figure2
Figure 2: Damhof’s Data Quadrant Model

On the y-axis, Damhof shows how an information system or product is produced. In the lower half, development is opportunistic. Developers and users are often identical here. For example, a current problem with data in Excel or with other tools is evaluated directly by the business user. This corresponds to the S-size T-shirt. As can be seen in my own case, the flexibility gained for research, innovation, and prototyping, for example, is at the expense of the uniformity and maintainability of results. If a specialist user leaves the company, knowledge about the analysis and the business rules applied is often lost.

In contrast, development in the upper half is systematic: developers and end users are typically different people. The data acquisition processes are largely automated and so do not depend on the presence of a specific person in daily operations. The data is highly reliable due to systematic quality assurance, and key figures are uniformly defined. The L- and XL-size T-shirts can be placed here in most cases.

The remaining T-shirt, the M-size, is somewhere “on the way” between quadrants IV and II. This means it is certainly also possible for a business user without IT support to implement a data mart. If the solution’s development and operation is also systematized, this approach can also be found in the second quadrant. This also shows that the architecture sizes are not only growing in terms of the number of levels used.

The positioning of the various T-shirt sizes in the quadrant model (see Figure 3) indicates two further movements.

  • The movement from bottom to top: We increase systematization by making the solution independent of the original professional user. In my own dashboard, for example, this was expressed by the fact that at some point data was no longer accessed using my personal SAP user name but using a technical account. Another aspect of systematization is the use of data modeling: while my initial dashboard simply imported a wide table, in the tabular model the data was already dimensionally modelled.
  • The movement from right to left: While the first two T-shirt sizes are clearly dominated by business requirements and the corresponding domain knowledge, the further left you move, the more technical skills are required, for example to manage different data formats and types and to automate processes.
Figure3
Figure 3: T-Shirt Sizes in the Data Quadrant Model
Summary and Outlook

Let’s recap: BI solutions have to grow with their requirements. The architectural solutions shown as T-shirt sizes illustrate how this growth path can look in concrete terms. The DWH solution is built, so to speak, from top to bottom – we start with the pure information product and then build step by step towards the complete data warehouse architecture. The various architectural approaches can also be positioned in Ronald Damhof’s quadrant model: A new BI solution is often created in the fourth quadrant, where business users work exploratively with data and create the first versions of information products. If these prove successful, it is particularly important to systematize and standardize the approach. At first, a data mart serves as a guarantor of a common language across the various information products. Decoupling the data from the source systems then allows further scaling of the development work. Finally, a data warehouse can be added to the previous levels to permanently merge data from different sources and, if required, make it permanently available together with its history.

Organizations should aim to institutionalize the growth process of a BI solution. Business users can’t wait for every new data source to be integrated across multiple layers before it’s made available for reporting. On the other hand, individual solutions must be continuously systematized, gradually placed on a stable data foundation, and operated properly. The architecture approaches shown in T-shirt sizes provide some hints as to what this institutionalization could look like.

This article was first published in TDWI’s BI-SPEKTRUM 3/2019

Growing a BI Solution Architecture Step by Step

The boss needs a new evaluation based on a new data source. There isn’t time to load the data into an existing data warehouse, let alone into a new one. In this article I introduce the idea of architectural T-shirt sizes. Of course, different requirements lead to different architectural approaches, but at the same time, the architecture should be able to grow as the BI system expands.

T-shirt sizes for BI solutions

The boss needs a new evaluation based on a new data source. There isn’t time to load the data into an existing data warehouse, let alone into a new one. It seems obvious to reach for “agile” BI tools such as Excel or Power BI. After several months and more “urgent” evaluations, a maintenance nightmare threatens. But there is another way: In this article I introduce the idea of architectural T-shirt sizes, illustrated using our own dashboard. Of course, different requirements lead to different architectural approaches, but at the same time, the architecture should be able to grow as the BI system expands.

The summer holidays were just over, and I was about to take on a new management function within our company. It was clear to me: I wanted to make my decisions in a data-driven way and to offer my employees this opportunity. From an earlier trial, I already knew how to extract the required data from our SAP system. As a BI expert, I would love to have a data warehouse (DWH) with automated loading processes and all the rest. The problem was that, as a small business, we ourselves had no real BI infrastructure. And alongside attending to ongoing customer projects, my own time was pretty tight. So did the BI project die? Of course not. I just helped myself with the tool I had to hand, in this case Microsoft Power BI. Within a few hours the first dashboard was ready, and it was published in our cloud service even faster.

The Drawbacks of Quick and Dirty

My problem was solved in the short term. I could make daily-updated data available to my employees. Further requirements followed, so I copied the Power BI file and began adapting it here and there. Over the next few weeks, I added new key figures and made more copies. However, I was only partly able to keep the various dashboard copies “in sync”. In addition, operational problems came up. Of course, I had connected the SAP system under my personal username, whose password has to be changed regularly. This in turn led to interruptions to the data refresh and required manual effort to reconfigure the new password in Power BI.

T-Shirt Sizes for Step-by-Step Development of BI Architecture

I guess a lot of professional users have been in my shoes. You have to find a solution quickly – and before you know it, you’re in your own personal maintenance nightmare. At the same time, a fully developed BI solution is a distant prospect, usually for organizational or financial reasons.

As an expert in the adaptation of agile methods for BI projects, I have been dealing with this problem for a long time: How can we both address the short-term needs of the professional user and create the sustainability of a “clean” solution? As a father of two daughters, I remembered how my children grew up – including the fact that you regularly have to buy bigger clothes. The growth process is continuous, but from time to time we have to get a new dress size. It is exactly this principle that can also be applied to BI solutions and their architecture. Figure 1 shows four example T-shirt sizes, which are explained in more detail below.

Figure1
Figure 1 Architectural solutions in T-shirt sizes at a glance
Think Big, Start Small: The S-Size T-Shirt

This approach corresponds to my first set of dashboards. A BI front-end tool connects directly to a source. All metadata, such as access data for the source systems, business rules, and key figure definitions, are developed and stored directly in the context of the information products created.

This T-shirt size is suitable for a BI project that is still in its infancy. It is often at this stage that a broader aim is formulated, covering everything you would like to analyze and evaluate at some point – this is where “thinking big” begins. Practically, however, not much more is known than the data source. At the same time, you would like to work exploratively and produce first results promptly, so it makes sense to begin small and technically undemanding.

However, this approach reaches its limits very quickly. Here is a list, by no means complete, of criteria for deciding when the time has come to think about the next architectural dress size:

  • There exist several similar information products or data queries that use the same key figures and attributes over and over again.
  • Different users regularly access the information products, but access to the data source is through the personal access account of the original developer.
  • The source system suffers from multiple, and mostly very similar, data queries of the various information products.
Developing a Common Language: The M-Size T-Shirt

In this size, we try to store commonly useful metadata, such as key figure definitions and the access information for the source systems, at a dedicated level rather than in the information product itself. This level is often called the data mart or the semantic layer. In the case of my own dashboard, we developed a tabular model for it in Azure Analysis Services (AAS). The various versions and copies of the dashboards largely remained as they were – only the substructure changed. However, all variants now rested on the same central foundation. The advantages of this T-shirt size in comparison to the previous one are clear: Your maintenance effort is considerably reduced, because the shared metadata is defined centrally once and does not need to be maintained for every single information product. At the same time, you bring consistency to the definition and naming of the key figures. In a multilingual environment, the added value becomes even more apparent, because translations are stored centrally and maintained uniformly in the semantic layer. All your dashboards thus speak a common language.
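
Our actual semantic layer was a tabular model in Azure Analysis Services, which I won’t reproduce here. The following tiny Python sketch only illustrates the principle behind this T-shirt size: a key figure is defined once, centrally, and every information product reuses the same definition. The column name and the business rule are invented.

# Principle only: one central key figure definition, reused by all dashboards.
import pandas as pd

def active_members(members: pd.DataFrame) -> int:
    """Central definition of the key figure 'active members' (assumed rule)."""
    return int((members["status"] == "active").sum())

members = pd.DataFrame({"status": ["active", "active", "lapsed"]})
print("Active members:", active_members(members))   # every dashboard calls the same function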

In this M-size T-shirt, we still do not store any data permanently outside the source. Even if the source data is transferred to the tabular model in AAS, it must be re-imported for larger adjustments to the model. Other vendors’ products manage without any data storage at all, for example the universes in SAP BusinessObjects. This means that a high load is sometimes placed on the source systems, especially during the development phase. Here is a list of possible reasons to give your “BI child” the next largest dress size:

  • The load on the source systems is still too large despite the semantic layer and should be further reduced.
  • The BI solution is also to be used as an archive for the source data, for example in the case of web-based data sources where the history is only available for a limited period of time.
  • If the source system itself does not historize changes to data records, this can be interpreted as another reason for using the BI solution as a data archive.
Decoupling from the Source: The L-Size T-Shirt

The next larger T-shirt for our BI architecture, the L-size, replaces the data mart’s direct data access to the source. To do this, the data is extracted from the source and permanently stored in a separate database. This level corresponds to the concepts of a persistent staging area (PSA) (see also [Kim04] pp. 31ff. and [Vos]) or an actively managed data lake (see also [Gor19], for example p. 9). They all have one thing in common: the data is taken from the source with as little change as possible and stored permanently. This procedure means that the existing data mart can be repointed to this new source relatively easily. In my own dashboard example, we’re not at this stage yet. But as the next step, we plan to extract the SAP data using Azure Data Factory and store it permanently in an Azure SQL database. For people who, like us, use the ERP as a cloud solution, this layer reduces the lock-in effect of the ERP cloud provider. Other advantages of this persistent data storage outside the source systems include the archiving and historization function it brings with it: Both new and changed data from the source are continuously stored, and deleted records can be marked accordingly. Technically, our data model stays very close to the data model of the source; at most, we harmonize some data types. While this layer can be implemented pragmatically and quickly, there are again indicators of when to jump to the next T-shirt size (a small sketch of such a PSA load follows after the list):

  • The desired information products require integrated and harmonized data from several data sources.
  • The data quality of the source data is not sufficient and can’t simply be improved in the source system.
  • Key calculations should be saved permanently, for example for audit purposes.
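
As mentioned above, here is a minimal sketch in Python of what such a persistent staging area load could look like. It assumes a full extract per load; keys, columns and dates are invented, and our own implementation is planned with Azure Data Factory and an Azure SQL database rather than this simplified in-memory version.

# Minimal PSA sketch: keep source data unchanged, add load metadata, and mark
# records that disappeared from the source as deleted. Names are illustrative.
import pandas as pd

def load_into_psa(psa, extract, load_ts, key="registration_id"):
    extract = extract.copy()
    extract["load_ts"] = load_ts
    extract["deleted_flag"] = False
    # Records present in the PSA but missing from the current extract are flagged
    gone = ~psa[key].isin(extract[key]) & ~psa["deleted_flag"]
    psa.loc[gone, "deleted_flag"] = True
    # For brevity the whole extract is appended; a real load would append only
    # new and changed records, so that history builds up without duplication
    return pd.concat([psa, extract], ignore_index=True)

day1 = pd.DataFrame({"registration_id": [1, 2], "member": ["Anna", "Ben"]})
day2 = pd.DataFrame({"registration_id": [1], "member": ["Anna"]})   # record 2 deleted in the source

psa = day1.assign(load_ts=pd.Timestamp("2019-11-01"), deleted_flag=False)
psa = load_into_psa(psa, day2, pd.Timestamp("2019-11-02"))
print(psa)
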
Integrated, Harmonized and Historized Data: The XL-Size T-Shirt

The next, and for the time being last, T-shirt, the XL-size, extends the existing architecture with a classical data warehouse layer between the PSA and the data marts. Its focus is the integration and harmonization of master data from the various source systems. This is done using a central data model that exists independently of the sources used (for instance, a data vault model or a dimensional model). This enables systematic data quality controls to be carried out when initially loading and processing the data. Historization concepts can also be applied to the data here as required. The persistent storage of data in this DWH layer means it is also permanently available for audit purposes.
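
To illustrate what “systematic data quality controls” during loading can mean in practice, here is a small, hedged sketch in Python; the rules and column names are invented, and a real DWH load would log such findings and handle bad records according to defined policies.

# Sketch of systematic data quality checks during the DWH load.
# The rules and column names are invented for illustration.
import pandas as pd

quality_rules = {
    "missing member key": lambda df: df["member_key"].isna(),
    "negative amount":    lambda df: df["amount"] < 0,
}

def check_quality(staged: pd.DataFrame) -> dict:
    """Return the number of rule violations per quality rule."""
    return {name: int(rule(staged).sum()) for name, rule in quality_rules.items()}

staged = pd.DataFrame({"member_key": [10, None, 12], "amount": [100.0, 50.0, -5.0]})
print(check_quality(staged))   # e.g. {'missing member key': 1, 'negative amount': 1}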

The various T-shirt sizes don’t only differ in the number of levels they use. To characterize the four architectural approaches more comprehensively, it’s worth taking a brief look at Damhof’s model of Data Management Quadrants [Dam]. I’ll do this in the next blog post.

Literature

[Dam] Damhof, Ronald: “Make data management a live issue for discussion throughout the organization”. https://prudenza.typepad.com/files/english—the-data-quadrant-model-interview-ronald-damhof.pdf , accessed on 22.11.2019

[Gor19] Gorelik, Alex: The Enterprise Big Data Lake. O’Reilly 2019

[Kim04] Kimball, Ralph: The data warehouse ETL toolkit: practical techniques for extracting, cleaning, conforming, and delivering data. Wiley 2004

[Vos] Vos, Roelant: “Why you really want a Persistent Staging Area in your Data Vault architecture”. 25.06.2016, http://roelantvos.com/blog/why-you-really-want-a-persistent-staging-area-in-your-data-vault-architecture/, accessed on 22.11.2019

This article was first published in TDWI’s BI-SPEKTRUM 3/2019

BI-specific analysis of BI requirements

Problems of requirement analysis

Practically every BI project is about requirements, because requirements communicate “what the client wants”. There are essentially two problems with this communication: the first is that clients often do not end up with what they really need. This is illustrated in the famous drawing in Figure 1: What the customer really needs.

 

Figure 1: What the customer really needs (Source: unknown, with additional material by Raphael Branger)

The second problem is that requirements can change over time. Thus, it can be that, especially in the case of long implementation cycles, the client and the contractor share a close consensus about what is wanted at the time of the requirement analysis. By the time the solution goes into operation, however, essential requirements may have changed.

Figure 2: Requirements can change over time

Of course, there is no simple remedy for these challenges in practice. Various influencing factors need to be optimized. In particular, the demand for speed calls for an agile approach, especially in BI projects. I have already written various articles about this, including Steps towards more agility in BI projects. In that article, among other things, I describe the importance of standardization. This also applies to requirements analysis. Unfortunately, the classic literature on requirements management is not very helpful; it is either too general or too strongly focused on software development. At IT-Logix, we have developed a framework over the last ten years that helps us and our customers in BI projects to standardize requirements and generate BI-specific results. Every child needs a name, and our framework is called IBIREF (the IT-Logix Business Intelligence Requirements Engineering Framework).

Overview of IBIREF

IBIREF is divided into three areas:

Figure 3: Areas of IBIREF
  • The area of requirement topics addresses the question of what subjects should be considered at all as requirements in a BI project. I’ll go into a little more detail about this later in this article.
  • In the requirements analysis process, the framework defines possible procedures for collecting requirements. Our preferred form is an iterative-incremental (i.e. agile) process; I have dealt with the subject of an agile development process based on user stories here. It is, of course, equally possible to gather the requirements upfront in a classic waterfall process.
  • We have also created a range of tools to simplify and speed up the requirement collection process, depending on the process variant. This includes various checklists, forms and slides.

Overview of requirement topics

Now I would like to take a first look at the structuring of possible requirement topics.

Figure 4: Overview of possible requirement topics

Here are a few points about each topic:

  1. The broad requirements that arise from the project environment need to be considered to integrate a BI project properly. Which business processes should be supported by the BI solution to be developed? What are the basic professional, organizational or technical conditions? What are the project aims and the project scope?
  2. If the BI solution to be created includes a data warehouse (DWH), the requirements for this system component must be collected. We split the data requirements into two groups: The target perspective provides information about the key figures, dimensions and related requirements, such as historization or the need for hierarchies. This is all well and good, but the source perspective should not be forgotten either. Many requirements for the DWH arise from the nature of the source data. In addition, requirements for metadata and security in the DWH have to be clarified.
  3. The BI application area includes all front-end requirements. This starts with the definition of the information products required (reports, dashboards, etc.), their target audience, purpose and data content. One can then consider how the users navigate to and within the information products and what logic the selection options follow. One central consideration is the visualization of the data, whether in the form of tables or of diagrams. In this area, advanced standards such as the IBCS provide substantial support for the requirements analysis process (read an overview of my blog contributions to IBCS and Information Design here). The functionalities sub-item concerns requirements such as exporting and commenting. When it comes to distribution, it is interesting to know the channels through which the information products are made available to the users. And it is important to ask what security is required in the area of the BI application too.
  4. The issue of requirement metadata is often neglected; however, it is useful to clarify this as early as possible in the project. This concerns the type of additional information to be collected about a requirement: Does one know who is responsible for a requirement? When was it raised, and when was it changed again? Are acceptance criteria also being collected as part of the requirement analysis?
  5. Lastly, requirements need to be collected for the documentation and training required for the use and administration of the BI system.

Summary

In this article, I have indicated that requirement analysis presents a challenge, both in general and especially in BI projects. Our IBIREF framework enables us to apply a standardized approach with the help of BI-specific tools. This allows both our customers and us to capture requirements more precisely, more completely and more quickly, thus enhancing the quality of the BI solution to be created.

Upcoming event: Please visit my team and me at our workshop at the TDWI Europe Conference in Munich in late June 2017. The theme is “Practice Makes Perfect: Practical Analysis of Requirements for a Dashboard” (though the workshop will be held in German). We will use the IBIREF framework, focusing on the BI application part, in roleplays and learn how to apply it. Register now—the number of seats for this workshop is limited!

(This article was first published by me in German on http://blog.it-logix.ch/bi-anforderungen-bi-spezifisch-erheben/)

My life as a BI consultant: Update Spring 2017

Spring 2017 in Provence (France)

Obviously I haven’t had time to write much on my blog during the last nine months. Let me share with you what topics have kept me busy:

For the upcoming months I’ll be visiting and speaking at various events:

  • IBCS Annual Conference Barcelona: June 2nd, Discussion of the next version of the International Business Communication Standards.
  • TDWI 2017 Munich: June 26-28, Half-day workshop about practical gathering of requirements for a dashboard.
  • MAKEBI 2017 Zurich: July 3rd, I’ll be presenting a new keynote around alternatives to traditional estimation practices
  • BOAK 2017 Zurich: September 12th, same as with MAKEBI, I’ll be presenting a new keynote around alternatives to traditional estimation practices
  • WhereScape Test Drive / AgileBI introduction Zurich: September 13th: During this practical half-day workshop you learn hands-on how to use a DWH automation tool and you’ll get an introduction to the basics of Agile BI.
  • My personal highlight: I’ll be speaking at Agile Testing Days 2017, where I’ll run a 2.5-hour workshop on introducing Agile BI in a sustainable way.

It would be a pleasure to meet you during one of these events – in case you’ll join, send me a little heads-up!

Last but not least, let me mention the Scrum Breakfast Club, which I’m visiting on a regular basis. We gather once a month, using the OpenSpace format, to discuss practical issues around the application of agile methods in all kinds of projects, including Business Intelligence and Data Warehousing. The Club has chapters in Zurich and Bern as well as in Milan and Lisbon.

Steps towards more agility in BI projects

“We now do Agile BI too” – we often hear statements like this at conferences and in discussions with customers and prospects. But can you really just “do” agility in Business Intelligence (BI) and data warehouse (DWH) projects? Is it sufficient to introduce bi-weekly iterations and let your employees read the Agile BI Memorandum [BiM]? In my own experience, at least, this doesn’t work in a sustainable way. In this post I’ll try to show the basic cause-and-effect relationships that ultimately lead to the desired agility.

DWHAutomation

If at the end of the day we want more agility, the first step towards it is “professionalism”. Neither an agile project management model nor an agile BI toolset is a replacement for “good people” in project and operations teams. “Good” in this context means that the people who work in the development and operation of a BI solution are masters of what they do, review their own work critically and don’t make beginner’s mistakes.

Yet professionalism alone isn’t enough to reach agility in the end. The reason for this is that different experts often apply different standards. Hence the next step is the standardization of design and development procedures. The goal here is to use common standards for the design and development of BI solutions – not only within one team, but ideally across team and project boundaries within the same organization. An important aid for this are design patterns, e.g. for data modeling, for the design and development of ETL processes, and for information products (like reports, dashboards etc.).

Standardization in turn is a prerequisite for the next and, I’d say, the most important step towards more agility: the automation of as many process steps as possible in the development and operation of a BI solution. Automation is a key element – “Agile Analytics” author Ken Collier even dedicates multiple chapters to this topic [Col12]. Only if we reach a high degree of automation can we work with short iterations in a sustainable way. Sustainable means that short iterations don’t lead to an increase in technical debt (cf. [War92] and [Fow03]). Without automation, e.g. in the area of testing, this isn’t achievable in practice.
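
As one concrete example of the kind of automation meant here, the following is a minimal sketch of an automated DWH test written with pytest. The database file, table names and the reconciliation rule are invented for illustration; the point is only that such checks run automatically in every iteration.

# Minimal sketch of an automated reconciliation test for a DWH load (pytest).
# Database and table names are invented for illustration.
import sqlite3
import pytest

@pytest.fixture
def con():
    connection = sqlite3.connect("dwh.db")   # hypothetical DWH database
    yield connection
    connection.close()

def test_no_rows_lost_between_staging_and_core(con):
    staged = con.execute("SELECT COUNT(*) FROM stg_registrations").fetchone()[0]
    core = con.execute("SELECT COUNT(*) FROM core_registrations").fetchone()[0]
    assert core == staged, "rows were lost between the staging and core layers"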

Now we are close to the actual goal: more agility. If we can release new and changed features to UAT every two weeks, say, these can be released to production in the same rhythm if needed. And this – the fast and frequent delivery of new features in your BI solution – is what sponsors and end users perceive as “agility”.

(this blog was originally posted in German here)

Literature:

[BiM] Memorandum for Agile Business Intelligence: http://www.tdwi.eu/wissen/agile-bi/memorandum/

[Col12] Collier Ken: Agile Analytics, Addison-Wesley, 2012

[War92] Cunningham Ward: The WyCash Portfolio Management System, http://c2.com/doc/oopsla92.html, 1992

[Fow03] Fowler Martin: Technical Debt, http://martinfowler.com/bliki/TechnicalDebt.html, 2003

A (Webi) dashboard built by a business (power) user

This blog post is inspired by a recent customer request to challenge their decision to use Design Studio for some “dashboard requirements”. Showing how you can create a dashboard in Webi doesn’t mean I told the customer not to use Design Studio. Rather, it is to show that a dashboard, like every other type of BI front-end solution, is defined by its requirements and not primarily by the tool you build it with. Please refer to my Generic Tool Selection Process for more details, as well as my post regarding BI specific requirements engineering.

Having said this, let’s have a look at how we can use the latest Webi 4.1 features to quickly build an interactive dashboard without the need for (much) scripting. First of all, here is what the final result looks like:

01_DashboardOverview1

You can select values from the left side bar (Product Lines), you can select States by directly clicking into the table and you can switch from the bar chart to a line chart. Here you see it in action:

The first step to achieve this is to create the basic table and the two charts. Until the dynamic switch is implemented, I placed them side by side. Next, add a simple input control in the left side bar:

02_SimpleInputControl 03_SimpleInputControlDepend

The next thing is to define the table as an additional input control – right-click the table, choose “Linking” and “Add Element Link”, and choose the two chart objects as dependencies:

04_TableAsInputControl 05_TableAsInputControlDepend

Next we need to create the “switch” to toggle the two charts. As I would like to position this switch at the top right corner of the chart, I again use a table input control. To generate the two necessary table values (namely “Bar Chart” and “Line Chart”) I prepared a simple Excel spreadsheet:

08_ExcelContent

In 4.1 you can now finally upload this sheet directly into the BO repository:

07_UploadExcel

If you need to update the Excel sheet later on, this is now feasible as well:

09_UploadExcelReplace

Finally, in Webi add the Excel sheet as a second query:

10_ExcelQuery    10_ExcelQueryDetails

In the report we now need two tables: a visible one to represent the chart switch and a hidden dummy table (see the “Hide always” option) to act as a dependency for the first:

13_HiddenDummyTable  12_HideDummyTable

The trickiest part is to create a variable to retrieve the selected value:

15_VarSelectedChartType

Here is the formula for copy / paste:

=If(Pos(ReportFilterSummary("Dashboard");"Chart Type Equal ") > 0)
Then Substr(ReportFilterSummary("Dashboard");Pos(ReportFilterSummary("Dashboard");"Chart Type Equal ") + Length("Chart Type Equal ");999)
Else "Bar Chart"

(I grabbed the idea for this formula from David Lai’s blog here)

Finally you need to configure the hide formula for both charts:

16_DynamicallyHideChart

That’s it.

Conclusion

Positive: I’m not too technical anymore (I do more paperwork than I’d sometimes wish…). Therefore I don’t consider myself a “developer” and I like solutions for the so-called “business (power) user” more and more. Therefore I like Webi. It took me about 60 minutes to figure out how to create this kind of interactive dashboard. I didn’t need to install anything – I could do everything web-based. Except for one single formula (which I didn’t need to write myself), I could click together the above sample. And I dare to say it looks like some kind of a dashboard 🙂 In addition, I have all the basic features of Webi, like a broad range of data source support, plenty of export possibilities, Office integration and so on. Even integrating an Excel spreadsheet as a data source is now finally a no-brainer.

Negative: Clearly, Webi is not a “design tool”. For example, I wasn’t able to show icons for my chart switch instead of the text labels. Putting a background image on the table doesn’t work well if the table is used as an input control. When I discussed this prototype with the customer, they also mentioned that there are still too many options end users might get confused by (e.g. there is a “filter” section showing whether the Bar Chart or the Line Chart value is chosen). In Webi you can’t change that. Toolbars, tabs etc. are just where they are. Live with it or choose a different tool.

Bottom line: Have a look at my Generic Tool Selection Process and the hands-on test mentioned there. The above example is exactly what I mean by this: Create a functional prototype in one or two tools and then make a fact-based decision based on your requirements and end user expectations.

Important remark: This post focused on the technical aspects of the dashboard. The visual representation doesn’t yet follow the best practices mentioned in my earlier articles (e.g. about SUCCESS). In a future blog post I will outline how to optimize the existing dashboard in this regard.

Join my teammate Kristof Gramm during sapInsider’s BI2015 conference in Nice (June 16-18): He will go into much more detail about how you can use Web Intelligence as a dashboard tool for business users. Use this link to see more info and save 300€ on your conference registration!

BI Picture Books (BI specific requirements engineering – part 2)

You’ll find Part 1 of this article here.

Illustrate available options using a BI Picture Book

A BI Picture Book is a structured collection of “pictures” aka screenshots of features illustrating one or multiple products. It describes and illustrates the available options in a compact and easy to handle manual. It should help the user to identify what options they have in a given BI front end application.
Referring to scenario A and B above, in an ideal world one would create a BI Picture Book during the initial tool selection process (scenario B). In this context, the BI Picture Book helps to illustrate the available features of the different tools under consideration. Some (or all) of these tools will become “strategic” and therefore the preferred tools to be used during subsequent BI projects. In the same way, the corresponding parts of the original BI Picture Book will also be included in the “daily business” BI Picture Book, which only contains the available options regarding the strategic tool set.
One main characteristic of a BI Picture Book is that we compare feature (or requirement) categories one after another and not a tool (with all its different features) after another tool. This helps to clarify specific differences between the tools for each category.

Figure2

Based on the previously described structure, the BI Picture Book should contain notes which highlight unique features of one tool compared to the rest of available (or evaluated) tools, e.g. a specific chart type which is only available in one tool. On the other hand, one should highlight limitations regarding specific features that are initially “not obvious”, e.g. in cases where the color palette of charts cannot be customized. Another example is to specifically highlight a tool which does not contain an Excel export (because end users might assume that there is an Excel export for every imaginable BI tool, so that they think they do not have to specify this).

How to build a BI Picture Book

Building a BI Picture Book is primarily about taking screenshots and arranging them in a structured manner, e.g. following the seven feature categories introduced above. As with every other project, certain points need to be planned and clarified before you start:

  • What is the primary purpose of the BI Picture Book? This refers to either scenario A) requirements engineering or scenario B), creating a front end tool strategy.
  • Which BI tool vendors are to be taken into consideration? Which concrete tools of these vendors are to be integrated into the BI Picture Book? For scenario A) this is defined by the available strategically defined BI toolset. For scenario B) it depends on the procedure for evaluating and selecting tools for your front end tool strategy.
  • Once you know which tools you want to take screenshots of you need to define which software version to use. Depending on the release cycle of the BI vendor, the software version can make quite a difference regarding available features. Therefore a BI Picture Book is mostly specific to a certain version.
  • For cars, there are tuning shops which provide extra features not offered by the car manufacturer. Similarly, in the BI world, there are many add-on providers who extend the available features of BI products. If such add-ons are already in place, it is important to include their features in the BI Picture Book. Nevertheless, one shouldn’t forget to label features from add-on products specifically as they might be charged additionally.
  • Do not show options which are not applicable in practice, e.g. system-wide customizations on a multi-tenant BI platform. An example is customizing the look and feel of the BI portal by modifying the portal’s CSS style sheet. Although, in theory, this option might exist, depending on your organizational and technical setup, changing the style sheet might not be allowed because many other stakeholders would be affected.

After having answered these questions, you can start: Take whatever screen capture program you like and start taking the screenshots. Use a tool like Microsoft PowerPoint or Word to collect and lay out the screenshots in a meaningful way. Keep in mind that the BI Picture Book’s main characteristic is comparing a specific feature across multiple tools. Therefore, put the screenshots of a given feature for multiple tools side by side on the same page or slide.
The subsequent paragraphs will illustrate how a concrete BI Picture Book might look. Screenshots are taken from various SAP Business Intelligence front end tools.

1. Content Options

For scenario A), content options are difficult to illustrate using screenshots. For scenario B) we can, for example, compare the different available data connectivity options:

Figure4
Connectivity Options in Crystal Reports

Figure3
Connectivity Options in SAP Lumira

2. Navigation & Selection Options

For navigation options outside of information products, typically screenshots of a BI portal are taken. This can be either based on a vendor-specific portal or your company’s intranet site (or both, if end users have a choice and need to decide which one to use).

Figure5
SAP BusinessObjects BI Launchpad

On the other hand, a tool provides navigation and selection features inside information products. We usually take screenshots for at least the following elements:

  • Parameter & Prompts
  • Input Controls
  • Groups / Hierarchy View and Navigation
  • Drill Down features
  • Tabs

Some of these elements are illustrated as follows:

Figure6
Prompts in SAP BusinessObjects WebIntelligence

Figure7
Selectors in SAP BusinessObjects Dashboards (aka Xcelsius)

Figure9
Drill-Down in Web Intelligence

Figure8
Drill-Down in Crystal Reports

The drill-down example, in particular, shows that it is not enough for an end user to simply specify “we need drill-down functionality” as a requirement. End users need to specify requirements in alignment with the different options of drill-down available.

3. Layout Options

Figure10
Excerpt of Chart Picture Book for some SAP BusinessObjects front end tools

We suggest taking screenshots for the following elements:

  • Charts
  • Tables
  • Cross tables
  • Speedometers
  • Maps
  • Conditional formatting

Make sure you list all important features and highlight the unique ones as well as limitations that are not obvious. This helps end users to compare the different options. In some cases, it is important to shed more light on the settings of features such as charts. By way of example, specify whether it is possible to change the colors of a pie chart.

4. Functional Options

Next up are functional options, for example export formats. It is quite simple to show the available options, and therefore it is easy for end users to choose from them. It is pointless, for example, to let someone specify that they want a PowerPoint export from a front-end tool if no such export exists. Of course it would be nice, but it is simply not part of the catalog.

Figure11
Different export formats for different tools

Another category of functions is printing. Usually it is not precise enough if end users simply specify that they need to print a document. Given a picture book, they can easily find out the available printing options. The BI Picture Book should clarify points such as whether you can mix landscape and portrait page modes or choose «Fit to page». Below is our list of typical functions which could be integrated into the BI Picture Book:

  • Export formats
  • Printing options
  • Alerts
  • MS Office Integration
  • Commentary features
  • Multi-language support
  • Search options

 

5. Delivery Options

An up-to-date topic which falls into the category of delivery options is mobile-device compatibility. This is becoming increasingly important at a time when all information should be available independently of the end user’s geographical location. Depending on the BI vendor and the BI tool itself, mobile device support can differ considerably. Some tools serve the information products 1:1 to mobile devices. Others transform existing information products into specific mobile versions, which might have quite a different look and feel compared to the original information product.

Figure13
Crystal Reports document being viewed on a desktop and on an iPad

Figure12
Web Intelligence document being viewed on a desktop and on an iPad

6. Security Options

Figure14
Different security options for Crystal Reports and Web Intelligence documents

As with content options, it is somewhat difficult to visualize security options using screenshots in a meaningful way. Try to focus on the comparison aspect between different tools and highlight unique features and limitations that are not obvious. The example above illustrates the available access rights for two different tools. One tool can only restrict the export functionality in general, whereas the other tool can control the different export formats.

7. Qualitative Options

It is hard to illustrate this category using screenshots. Yet, as indicated in a previous paragraph, you can try to find other illustrations to guide your end users in specifying qualitative requirements.

Final Words

As with my other blog posts, this article doesn’t aim to be a complete treatment of the topic. A BI Picture Book is neither the only way to define BI-specific requirements, nor is it enough on its own to define a complete BI front-end tool strategy. It shows you a particular idea, and it is up to you to apply it in your organization in combination with other appropriate methods.

Please share your experience – I’m looking forward to reading your comment just below!

BI specific requirements engineering – part 1

(Thanks to my co-author Alexander van’t Wout for supporting me writing this blog post!)

Collecting requirements for BI front end tools is often frustrating.

Imagine a sales conversation at your local car dealer. After some small talk you tell the salesperson about your interest in buying a brand new car. Nothing could be easier, you might think. But suddenly you are confused. The friendly salesperson asks you to please write down exactly what you want and draw a sketch of what you have in mind. As if this were not strange enough, he hands you a sketch board with a blank sheet of paper on it.

This is how many Business Intelligence (BI) experts deal with their customers today. End users are often left alone to «design» their requirements. A car is a «commercial off-the-shelf product» and therefore very similar to a BI toolset, which is «off-the-shelf software». A common characteristic of both product types is the standardization of features, and therefore a limited set of features. On one hand, this might limit your flexibility; on the other hand, it simplifies the process of requirements’ definition drastically because you do not need to consider each and every detail to build a system.

We can distinguish two major scenarios where a business user community needs to specify requirements for BI front end tools: scenario A) is an organizational environment where the business intelligence software suite is already predefined. This means that for a regular project the project team is not free to choose from all available tools on the market, but only within the limited frame of what are usually called strategic vendors. In most organizations this means no free choice of vendor at all. Typically, organizations limit themselves to one or two strategic BI vendors, although each vendor provides a suite of tools and therefore still leaves project teams with a choice.

Scenario B) takes place when a company is about to choose their strategic BI vendors, or when it is about to define a front-end tool strategy based on a given toolset available. The difference to scenario A) is that there are no concrete requirements or previous use cases to do this. Decisions involving, for example, choosing strategic BI vendors, or building a front end tool strategy usually have to be derived from corporate requirements, which may mean some high-level requirements that are influenced by end users only in an indirect way.

In Scenario A, the main task is to map requirements to concrete features and specify detailed requirements (which take into consideration the chosen features). In scenario B, the main task is to get to know multiple tools and multiple tool suites of different BI vendors and make them comparable in an easy and quick way. For both cases, the authors suggest the visual approach of BI Picture Books as an analogy to a car catalog. In subsequent paragraphs, “end user” is used as a synonym for the party who is in charge of defining requirements for the BI front end tools.

Figure 1: Negotiation based on off-the-shelf software

As outlined in the introduction, working with business intelligence software nowadays means working with off-the-shelf software. This means that not every imaginable requirement can be fulfilled anymore. In particular in scenario A) end users cannot have everything they want; their requirements need to be aligned with the available features of a given tool set. Still, the first step is collecting business requirements to compare with the technical features of the standard software. This process can be very frustrating for business users after they have noted their requirements on a blank sheet of paper and tried to picture themselves using a solution that fits their needs. The necessary negotiations regarding technical feasibility then often feel more like a surrender of the business user’s initial requirements.

The question that arises, therefore, is: how can we show end users in advance which options are available, and therefore feasible, as a solution to their requirements? To answer this question, we take another look at the automotive sector.

Today, car manufacturers provide web-based car configurators where customers can «build» their own car. Customers walk through several steps, e.g. choosing the color, the wheels, the engine and the accessories. We can learn two things from such car configurators: first, guide the end user in defining the necessary (and feasible) requirements; second, provide visual support to the end user by showing what the different available options look like.

Structuring BI front-end requirements

To «build» a BI front-end solution, we have identified seven crucial categories that need to be addressed during the requirements engineering process. This corresponds to scenario A) above. For scenario B), one can still use the same categories, but instead of defining requirements along these categories, you structure the available features and thus make the comparison of the different tools much easier. The following sections outline the seven categories in more detail; a short sketch after the list illustrates how they might be captured as a structured checklist:

  1. Content options: In this first step, end users have to roughly define which information products they want to receive in the end, and the approximate content of these products. (The term information product is an umbrella term for all the various BI front-end types such as report, statistics, cockpit, dashboard, analysis etc.) For scenario A), end users are relatively free to note down everything they want, except for data content that is a priori not available within the project time frame. For scenario B), the BI expert might list and compare all the available data connectivity options of a certain toolset.
  2. Navigation and selection options: In this second step, end users need to think about how to navigate to or between the defined information products (e.g. using a folder structure in a given BI portal). Whereas navigation takes place outside information products, interactively selecting data usually takes place inside an information product. In either case, the available options are limited by the software.
  3. Layout options: This third step is about collecting requirements regarding page layout, chart and table options. A common pitfall for end users is to assume that BI front ends work like Microsoft Excel or Word. Trivial-looking items such as a table of contents or specific chart options that are available in Office products might not be available in the BI front-end tools. In addition, if the end users’ organization adheres to notation standards such as the International Business Communication Standards (IBCS; http://www.ibcs-a.org/), this might further restrict the allowed layout options, in particular for charts and tables.
  4. Functional options: Whereas the third step addressed mainly the static elements of a report, this fourth step is about defining requirements for the functions of a BI front-end solution (in addition to the functions already defined in the navigation and selection category). Typical examples are (interactive) alerts, export to various formats, printing, search, multi-language support, commentary features and so on. This category depends even more on the available features of a given BI front-end tool than the previous ones.
  5. Delivery options: Step number five addresses how an information product is delivered to end users. Besides defining the delivery channel (e.g. web browser, mobile, email), one must define how and when the report is refreshed. One option is viewing an information product on demand (the refresh is triggered directly by the end user). Scheduling the information product to be run at night is another option. Scheduling can be further divided into single information product scheduling and information product bursting, where, based on one main product, a personalized instance of the information product is created and distributed to each specific recipient. Requirements in this category usually depend not only on the front-end tool itself, but also on the underlying BI platform or available third-party extensions, e.g. for bursting.
  6. Security options: Finally, end users have to think about security. In the context of BI front-end solutions, there are two main security aspects to consider: access restrictions at the information product level on one hand, and data-level security on the other. For the first aspect, an end user has to define who, in which role, is allowed to see the report, and which features should be available, e.g. one user might access and refresh the report but must not export it. Similar to the previous category of delivery options, the access restrictions are highly dependent on the underlying BI platform and the available security options.
    The second aspect, data-level security, is addressed either at the database level or in some kind of semantic layer of the BI front-end tool. Again, the available technology determines the available options.
  7. Qualitative options: Last but not least, this final category summarizes requirements of a qualitative nature, such as performance or usability requirements. For this category, it is more difficult to define which requirements are allowed. Nevertheless, one can guide end users in defining realistic requirements, e.g. instead of asking an end user to define the maximum report refresh duration, provide predefined performance classes such as “< 30 sec”, “30 – 60 sec” and so on. This way an end user won’t define an unrealistic value like “every report must refresh in under 3 seconds”.
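To make the checklist character of these seven categories more tangible, here is a minimal, hypothetical sketch in Python of how the requirements for a single information product could be captured along the categories. The class name, fields and example values are purely illustrative and not tied to any particular BI tool.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical sketch: one requirements record per information product,
# structured along the seven categories. Names and values are
# illustrative only and not taken from any specific BI tool.

@dataclass
class FrontEndRequirements:
    content: List[str] = field(default_factory=list)      # 1. content options
    navigation: List[str] = field(default_factory=list)   # 2. navigation
    selection: List[str] = field(default_factory=list)    #    and selection
    layout: List[str] = field(default_factory=list)       # 3. layout options
    functions: List[str] = field(default_factory=list)    # 4. functional options
    delivery: List[str] = field(default_factory=list)     # 5. delivery options
    security: List[str] = field(default_factory=list)     # 6. security options
    qualitative: List[str] = field(default_factory=list)  # 7. qualitative options

# Example: rough first-round requirements for a monthly sales report
sales_report = FrontEndRequirements(
    content=["sales per region", "year-to-date comparison"],
    navigation=["folder: Sales / Monthly Reports"],
    selection=["filter by region", "filter by month"],
    layout=["IBCS-style bar charts", "one-page table"],
    functions=["export to PDF", "commentary"],
    delivery=["scheduled nightly refresh", "email bursting per region"],
    security=["sales managers may view and refresh, no export"],
    qualitative=["refresh time class: < 30 sec"],
)
```

A structured record of this kind also lends itself to scenario B): instead of requirements, the same fields can hold the available features of each candidate tool, which makes a side-by-side comparison straightforward.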

Using these seven categories either to structure your end user requirements (scenario A) or to structure and thus compare the available features of multiple tools in an evaluation process (scenario B), you will be able to capture at least 80% of typical BI front-end requirements. Depending on the concrete project, you will most probably have to extend the list with your own items. Still, the basic principle of guiding end users while they define requirements remains the same.

Another way of working with the seven categories is to outline dependencies between them. Similar to web-based car configurators, certain requirements in one category have a direct impact on the allowed (or needed) requirements in another category, e.g. choosing mobile devices as a delivery channel will most probably have an impact on the desired (or available) layout options, as well as on certain security options. In such a case, one needs to cycle back or forward through the categories and adjust previously defined requirements. In sum, the typical procedure is to run through the seven categories iteratively, starting with a rough idea of the requirements in the first round and refining them (also considering newly discovered dependencies) in subsequent rounds.
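As a rough illustration of such dependency checks, the following hypothetical sketch flags two combinations of the kind mentioned above. The rules and the dictionary-based requirements draft are invented purely for illustration and would have to be adapted to the actual toolset and organization.

```python
from typing import Dict, List

# Hypothetical dependency rules between requirement categories, in the
# spirit of a car configurator: a choice in one category constrains
# another. The concrete rules below are invented for illustration only.

def check_dependencies(requirements: Dict[str, List[str]]) -> List[str]:
    warnings = []
    delivery = requirements.get("delivery", [])
    layout = requirements.get("layout", [])
    security = requirements.get("security", [])

    # A mobile delivery channel usually restricts the available layouts.
    if (any("mobile" in item for item in delivery)
            and any("table" in item for item in layout)):
        warnings.append("Mobile delivery chosen: revisit layout options "
                        "(large tables rarely work on small screens).")

    # Report bursting typically requires data-level security to be defined.
    if any("bursting" in item for item in delivery) and not security:
        warnings.append("Bursting chosen: define data-level security so "
                        "each recipient only sees their own data slice.")
    return warnings

# Example pass over a rough first-round requirements draft
draft = {
    "delivery": ["mobile (tablet)", "email bursting per region"],
    "layout": ["one-page table"],
}
for warning in check_dependencies(draft):
    print(warning)
```

Running such checks after each round of requirements gathering mirrors the iterative procedure described above: collect rough requirements, detect conflicts, then refine the affected categories.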

However, one question remains: what does a non-technical user understand by these categories? A simple feature list is usually not enough, in particular for people whose daily business is not building BI front-end solutions. The authors suggest building and using a visual catalog of available options, just like the car configurator. We call this a BI Picture Book. (More about this in part 2.)