BI-specific analysis of BI requirements

Problems of requirement analysis

Practically every BI project is about requirements, because requirements communicate “what the client wants”. There are essentially two problems with this communication: the first is that clients often do not end up with what they really need. This is illustrated in the famous drawing in Figure 1: What the customer really needs.

 

Figure 1: What the customer really needs (Source: unknown, with additional material by Raphael Branger)

The second problem is that requirements can change over time. Especially with long implementation cycles, the client and the contractor may agree closely on what is wanted at the time of the requirement analysis. By the time the solution goes into operation, however, essential requirements may have changed.

Figure 2: Requirements can change over time

Of course, there is no simple remedy for these challenges in practice. Various influencing factors need to be optimized. In particular, the demand for speed calls for an agile approach, especially in BI projects. I have already written various articles on this, including “Steps towards more agility in BI projects”. In that article, among other things, I describe the importance of standardization. This also applies to requirement analysis. Unfortunately, the classic literature on requirement management is not very helpful; it is either too general or too strongly focused on software development. At IT-Logix, we have developed a framework over the last ten years that helps us and our customers in BI projects to standardize requirements and generate BI-specific results. Every child needs a name, and our framework is called IBIREF (the IT-Logix Business Intelligence Requirements Engineering Framework).

Overview of IBIREF

IBIREF is divided into three areas:

Figure 3: Areas of IBIREF

  • The area of requirement topics addresses the question of what subjects should be considered at all as requirements in a BI project. I’ll go into a little more detail about this later in this article.
  • In the requirements analysis process, the framework defines possible procedures for collecting requirements. Our preferred form is an iterative-incremental (i.e. agile) process; I have dealt here with the subject of an agile development process through some user stories. It is, of course, equally possible to raise the requirements upfront in a classic waterfall process.
  • We have also created a range of tools to simplify and speed up the requirement collection process, depending on the process variant. This includes various checklists, forms and slides.

Overview of requirement topics

Now I would like to take a first look at the structuring of possible requirement topics.

Figure 4: Overview of possible requirement topics

Here are a few points about each topic:

  1. The broad requirements that arise from the project environment need to be considered to integrate a BI project properly. Which business processes should be supported by the BI solution to be developed? What are the basic professional, organizational or technical conditions? What are the project aims and the project scope?
  2. If the BI solution to be created includes a data warehouse (DWH), the requirements for this system component must be collected. We split the data requirements into two groups. The target perspective provides information about the key figures, dimensions and related requirements, such as historization or the need for hierarchies. The source perspective should not be forgotten either, because many requirements for the DWH arise from the nature of the source data. In addition, requirements for metadata and security in the DWH have to be clarified.
  3. The BI application area includes all front-end requirements. This starts with the definition of the information products required (reports, dashboards, etc.), their target audience, purpose and data contents. One can then consider how the users navigate to and within the information products and what logic the selection options follow. One central consideration is the visualization of the data, whether in the form of tables or of diagrams. In this area, advanced standards such as the IBCS provide substantial support for the requirement analysis process (read an overview of my blog contributions to IBCS and Information Design here). The functionalities sub-item concerns requirements such as exporting and commenting. When it comes to distribution, it is interesting to know the channels through which the information products are made available to the users. And it is important to ask what security is required in the area of BI application too.
  4. The issue of requirement metadata is often neglected; however, it is useful to clarify this as early as possible in the project. This concerns the type of additional information to be collected about a requirement: Does one know who is responsible for a requirement? When was it raised, and when was it changed again? Are acceptance criteria also being collected as part of the requirement analysis?
  5. Lastly, requirements need to be collected for the documentation and training required for the use and administration of the BI system.
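
The requirement metadata from point 4 can be pictured as a small record type. The following is only a minimal sketch in Python, not part of IBIREF itself; the class name, the field names and the helper methods are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Requirement:
    """One requirement plus its metadata (all field names are hypothetical)."""
    identifier: str
    topic: str                      # e.g. "DWH" or "BI application"
    description: str
    owner: str                      # who is responsible for the requirement
    raised_on: date                 # when it was raised
    last_changed_on: Optional[date] = None
    acceptance_criteria: List[str] = field(default_factory=list)

    def change(self, new_description: str, changed_on: date) -> None:
        """Store a changed description together with the date of the change."""
        self.description = new_description
        self.last_changed_on = changed_on

    def is_testable(self) -> bool:
        """True if at least one acceptance criterion has been collected."""
        return len(self.acceptance_criteria) > 0
```

Deciding early which of these fields are mandatory makes it easier to answer later who owns a requirement and when it last changed.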

Summary

In this article, I have indicated that requirement analysis presents a challenge, both in general and especially in BI projects. Our IBIREF framework enables us to apply a standardized approach with the help of BI-specific tools. This allows both our customers and us to capture requirements more precisely, more completely and more quickly, thus enhancing the quality of the BI solution to be created.

Upcoming event: Please visit my team and me at our workshop at the TDWI Europe Conference in Munich in late June 2017. The theme is “Practice Makes Perfect: Practical Analysis of Requirements for a Dashboard” (though the workshop will be held in German). We will use the IBIREF framework, focusing on the BI application part, in role plays and learn how to apply it. Register now: the number of seats for this workshop is limited!

(This article was first published by me in German on http://blog.it-logix.ch/bi-anforderungen-bi-spezifisch-erheben/)

My life as a BI consultant: Update Spring 2017

Spring 2017 in Provence (France)


Obviously, I haven’t had much time to write on my blog during the last nine months. Let me share with you what topics kept me busy:

For the upcoming months I’ll be visiting and speaking at various events:

  • IBCS Annual Conference Barcelona: June 2nd, Discussion of the next version of the International Business Communication Standards.
  • TDWI 2017 Munich: June 26-28, Half-day workshop about practical gathering of requirements for a dashboard.
  • MAKEBI 2017 Zurich: July 3rd, I’ll be presenting a new keynote on alternatives to traditional estimation practices.
  • BOAK 2017 Zurich: September 12th, same as at MAKEBI, I’ll be presenting a new keynote on alternatives to traditional estimation practices.
  • WhereScape Test Drive / AgileBI introduction Zurich: September 13th, during this practical half-day workshop you’ll learn hands-on how to use a DWH automation tool and get an introduction to the basics of Agile BI.
  • My personal highlight: I’ll be speaking at Agile Testing Days 2017, where I’ll run a 2.5-hour workshop on introducing Agile BI in a sustainable way.

It would be a pleasure to meet you at one of these events. If you plan to join, send me a little heads-up!

Last but not least, let me mention the Scrum Breakfast Club, which I’m visiting on a regular basis. We gather once a month using the OpenSpace format to discuss practical issues around the application of agile methods in all kinds of projects, including Business Intelligence and data warehousing. The Club has chapters in Zurich and Bern as well as in Milan and Lisbon.

AgileBI workshop London

Just a quick note for my UK-based readers who are interested in AgileBI: I’m proud to have been selected as a speaker for the upcoming Enterprise Data & Business Intelligence (EDBI) conference in London, taking place on November 7 to 10. I’ll lead a half-day workshop on Monday afternoon, November 7, on my Agile BI Maturity Model and of course would be happy to welcome you there too!

Just have a look now: Introducing Agile Business Intelligence Sustainably: Implement the Right Building Blocks in the Right Order

If you are interested in participating in this event, drop me a note, either by leaving a comment or by contacting me on LinkedIn, and I can send you a voucher to save £200 on the registration fee.

In addition, follow #IRMEDBI on Twitter!

 

AgileBI: How Corporate culture influences the development approach

In this blog post I discussed various building blocks which can help to establish and manifest an agile approach to Business Intelligence within an organisation. In this article I will focus on the aspect “Agile Mindset & Organisation”. The probing question is, which approach is best suited to develop a data warehouse (= DWH; not just the BI Frontend) incorporating the agile principles. Closely linked to this matter is the question of cutting user stories. Is it sensible to size a user story “end to end”, i.e. from the connection of the source, staging area and core DWH all the way to the output (e.g. a dashboard)? As you can imagine, the initial answer is “it depends”.

When looking at the question more closely, we can identify multiple factors that will have an impact on our decision.

Organizational structure

First of all, I would like to distinguish between the following two ways of organizing development:

  • Organizational Structure A) DWH and BI Frontend are considered one application and are developed by the same team.
  • Organizational Structure B) DWH and BI Frontend are considered separate applications (while being loosely coupled systems) and are developed by different teams.

Characteristics of the Organizational Culture

As a next step we need to differentiate between two possible characteristics of organizational culture, which are taken from Michael Bischoff’s book “IT Manager, gefangen in Mediokristan” (in English: IT Manager, trapped in Mediokristan). A nice review of the book can be found in this blog entry (in German).

  • “Mediokristan”: The “country” Mediokristan is described as a sluggish environment where hierarchies and frameworks are predetermined and risk and management concepts are the dominating factors. It stands symbolically for the experiences I have had in large corporate IT organizations. Everything moves at a slow pace, cycle times tend to be long, and mediocrity is the highest standard.
  • “Extremistan”: The “country” Extremistan is best described as the opposite of Mediokristan. New, innovative solutions are developed and implemented quickly, fostered by individual responsibility, self-organization and a start-up atmosphere. Its citizens strive for the extreme: extremely good, extremely flexible, regardless of the consequences. What works will be pursued further, what does not will be discontinued. Any form of regulation or framework is rejected just as radically if it is seen to hinder innovation.

Needless to say, the two cultures described are extremes; there are various other characteristics between the two ends of the spectrum.

Alternative agile development approaches

The third distinction I would like to point out is between the following two alternative approaches to agile development:

  • Development Approach A) Single Iteration Approach (SIA): A number of user stories are selected at the beginning of an iteration (called a Sprint in Scrum jargon), with the goal of having a potentially deployable result at the end of that iteration. With Organizational Structure A in place (see above), the user story is cut “end to end” and has to encompass all aspects: connecting the required source system (if not available already), data modeling, loading the data into the staging, EDWH and data mart layers and, last but not least, creating a usable information product (e.g. a report or a dashboard). Processing the user story also includes developing and carrying out tests, writing the appropriate documentation, etc. It is a very challenging approach and will generally require a team of T-skilled people, where each member possesses the skills to handle any of the upcoming tasks over time.
    Folie4_SIA
  • Development Approach B) Pipelined Delivery Approach (PDA): The PDA exists because of the assumption that the SIA is illusory in many cases. One reason is the lack of T-skilled people in our specialist-driven industry or, in extreme cases, the involvement of multiple teams (e.g. separate teams for business analysis and testing) in the development process. Another reason is the sheer complexity we often see in DWH solutions. An iteration cycle of two to four weeks is already quite ambitious when, figuratively speaking, doing business in Mediokristan.
    As an alternative, the PDA describes the creation of a DWH/BI solution based on a production line (see also the book by Ralph Hughes: “Agile Data Warehousing Project Management: Business Intelligence Systems Using Scrum”, Morgan Kaufmann, 2012). The production line (= pipeline) consists of at least three work stations: 1. Analysis & Design, 2. Development, 3. Testing. In the illustrations below (taken from a concrete customer project) I added a fourth station dedicated to the BI Frontend development. A user story spends at most one iteration in each work station. Ideally, the entire production line is run by the same team, which we will assume in the following example:

    • The user stories that are to be tackled in the early stages are defined and prioritized during a regular story conference at the outset of a production cycle. They will thereafter be worked on in the first work station “Analysis & Design”. Evidently, in this initial phase, the pipeline as well as some members of the team are not used to full capacity. In accordance with the Inception Phase in the Disciplined Agile approach, these gaps in capacity can be used optimally for other tasks necessary at the beginning of a project.
    • After the first iteration, the next set of user stories that will be worked on will be defined at the story conference and passed on to the first work station. Simultaneously, the user stories that were worked on in the first work station in the previous iteration will be passed on to the next one, the development station.
    • After the second iteration, more user stories will be chosen out of the backlog and passed on to the first work station, while the previous sets will move further along the pipeline. Thus, the user stories of the first iteration will now be worked on in the testing work station. (A word about testing: Of course we test throughout the development too! But why would we need a separate testing iteration then? The reason lies in the nature of a DWH, namely data: During the development iteration a developer will work and test with a limited set of (test) data. During the dedicated testing iteration the full data set, ideally production data, will be processed over multiple days; something you can hardly do during development itself.)
    • Consequently, the pipeline will have produced “production ready” solutions for the first time after three iterations have passed. Once the pipeline has been filled, it will deliver working increments of the DWH solution after each iteration – similar to the SIA.
      Folie5_PDA1

      Pipelined Delivery Approach with four work stations

      Folie7_PDA3

      Testing inside one pipe cycle
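
The timing of the pipeline can be made concrete with a few lines of Python. This is a simplified sketch under stated assumptions, not a planning tool: the function names are hypothetical, every batch of user stories is assumed to enter the first work station one iteration after the previous batch, and every batch is assumed to spend exactly one iteration in each station.

```python
from typing import Dict, List, Optional

# The three minimum work stations named in the text.
STATIONS = ["Analysis & Design", "Development", "Testing"]


def station_of(batch_index: int, iteration: int,
               stations: List[str] = STATIONS) -> Optional[str]:
    """Return the station a batch occupies in a given iteration (1-based).

    Batch 0 enters the first station in iteration 1, batch 1 in iteration 2,
    and so on. None means the batch has not started yet or is already done.
    """
    position = iteration - (batch_index + 1)
    if 0 <= position < len(stations):
        return stations[position]
    return None


def delivery_schedule(batches: List[str],
                      stations: List[str] = STATIONS) -> Dict[int, List[str]]:
    """Map each iteration number to the batches that are production ready at its end."""
    schedule: Dict[int, List[str]] = {}
    for index, batch in enumerate(batches):
        done_after = index + len(stations)  # last iteration the batch is worked on
        schedule.setdefault(done_after, []).append(batch)
    return schedule


if __name__ == "__main__":
    for iteration, ready in sorted(delivery_schedule(["Set 1", "Set 2", "Set 3"]).items()):
        print(f"After iteration {iteration}: {', '.join(ready)} production ready")
```

With three stations the first set is ready after iteration 3, and from then on one set is delivered per iteration. Passing a four-station list (for example with an added BI Frontend station) shifts the first delivery to iteration 4.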

The two approaches differ in the overall lead time of a user story between story conference and the ready-made solution, which tends to be shorter using the SIA in comparison to the PDA. What they have in common is the best practice of cutting user stories: They should always be vertical to the architecture layers, i.e. “end-to-end” from the integration of a source system up to the finished report (cf. organizational structure A) or at least the reporting layer within the DWH (cf. organizational structure B). While the PDA does incorporate all values and basic principles (this topic might be taken up in another article) of agile development, in case of doubt, the SIA is more flexible. However, put in practice, the SIA implementation is much more challenging and the temptation of cutting user stories per architectural layer (e.g. “Analysis Story”, “Staging Story”, “Datamart Story”, “Test Story”) rather than end-to-end is ever present.

When is which approach best suited?

Finally, I would like to show how the aspects mentioned above relate to each other.

Let’s have a look at “Organizational Structure A) DWH and BI Frontend are considered one application”. If the project team is working out of Extremistan and consists mostly of T-skilled team members, chances are that they will be able to implement a complete DWH/BI solution end-to-end working with the SIA. Success with this approach is less likely if large parts of the team are located in Mediokristan. Internal clarifications and dependencies will require additional time for Analysis and Design. Furthermore, internal and legal governance regulations will have to be considered. Due to these factors, in my experience, the PDA has a better chance of success than the SIA when developing a DWH/Frontend in Mediokristan. Or, to put it more positively: Yes, Agile Development is possible even in Mediokristan.

The situation I have come across more often is “Organizational Structure B) DWH and BI Frontend are considered two separate applications”. As a matter of principle, agile development with the SIA is simpler in the BI Frontend than in the DWH backend. That being said, the environment (Mediokristan vs. Extremistan) also has a great impact. It is possible to combine the two approaches (PDA for the DWH Backend, SIA for the BI Frontend), especially if the BI Frontend is already connected to an existing DWH and there is no need to adapt the DWH for every user story in the BI Frontend. Another interesting question in Organizational Structure B) is how to cut the user stories in the DWH Backend. Does it make sense to formulate a user story when there is no concrete user at hand and the DWH is, in a practical sense, developed to establish a “basic information supply”? And if yes, how do we best go about it? An interesting approach in this case is Feature Driven Development (FDD), described in this blog article by agile visionary Mike Cohn. The adaptation of the FDD approach to DWH development might be interesting material for a future article…

As you can see, the answer “It depends” mentioned at the beginning of this blog post is quite valid. What do you think? What is your experience with Agile BI in either Mediokristan or Extremistan? Please feel free to get in touch with me personally or respond with a comment to this blog post. I look forward to your responses and feedback.

(A preliminary version of this blog has been posted in German here. Many thanks to my teammate Nadine Wick for the translation of the text!)

Teradata & WhereScape Test Environment in the Cloud

In this post I outline how I managed to get a cloud-based training environment ready in which WhereScape RED, a data warehouse automation tool, connects to a Teradata database test machine.

A few weeks ago I had to organize a so-called “testdrive” for a local WhereScape prospect. The prospect uses a Teradata database appliance, so they wanted to evaluate WhereScape RED on Teradata too. As a local Swiss-based WhereScape partner we received a virtual machine containing a SQL Server based WhereScape RED environment. The training had to be run onsite at the customer’s location, and IT-Logix provided its set of training laptops, each with 4GB of RAM. These were my starting conditions.

First of all I thought about how to deal with Teradata for a training setup. Fortunately, Teradata provides a set of preconfigured VMs here. You can easily download them as zipped files and run them using the free VM Player.

Based on my previous experience with organizing hands-on sessions, e.g. during our local Swiss SAP BusinessObjects user group events, I wanted to use Cloudshare. This makes it much easier (and faster!) to clone an environment for multiple training participants compared to copying tons of gigabytes to multiple laptops. In addition, the 4GB of RAM wouldn’t be enough to run Teradata and WhereScape with acceptable performance. So I had two base VMs (one from WhereScape, one from Teradata), a perfect use case for trying the VM upload feature in Cloudshare for the first time.

I started with this support note, which explains how to prepare your local VM and upload it to your Cloudshare FTP folder. From there you can simply add it to an environment:

01_UploadVM1

After having uploaded both VMs it looks like this in Cloudshare:

02_CloudshareEnvironment

I increased the RAM and CPU power a bit and, more importantly, configured the network between the two machines:

Go to “Edit Environment” -> “Edit Networks”:

03_NetworkSettings

Here I had to specify the virtual network to which I’d like to connect the VMs. Please keep in mind that this doesn’t provide an automatic DHCP server or similar. Either you set one up within one of your machines or, as in my case, you set static IPs within the individual VMs (both had been delivered using a dynamic IP provided by the VM Player). Changing the IP wasn’t a big deal, neither on Windows nor on Linux.

04_TD_Setting1

But I quickly found out that the Teradata service didn’t run properly anymore afterwards.

First of all I had to create a simple test case to check whether I could connect from the WhereScape VM to the Teradata machine. Besides a simple ping (which worked) I installed the Teradata Tools & Utilities on the WhereScape machine. As I couldn’t establish a proper connection, I had to google a bit. The following article gave me the hint to add a “cop” entry to the hosts file:

04_TD_Setting2
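
A connection test that goes one step beyond ping can also be scripted. The following Python sketch is an illustration only and not the procedure used in the original setup. It assumes that the Teradata database listens on its default port 1025 and that a “cop” entry follows the usual form of an IP address followed by the system name with the suffix “cop1”; the IP address and the system name in the usage note are made up.

```python
import socket

# Assumption: 1025 is the default port of the Teradata database.
TERADATA_PORT = 1025


def cop_hosts_entry(ip_address: str, system_name: str, cop_number: int = 1) -> str:
    """Build a hosts-file line such as '10.0.0.5<TAB>tdtestcop1' (values are examples)."""
    return f"{ip_address}\t{system_name}cop{cop_number}"


def can_connect(host: str, port: int = TERADATA_PORT, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False
```

Called as `can_connect("10.0.0.5")` from the WhereScape machine, this only tells you whether the port is reachable. A successful ping combined with a failing port check hints at a service problem on the database side rather than a network problem.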

After a restart of the machine, Teradata was up and running again. By the way, you can verify this with the command “pdestate -a”:

04_TD_Setting3

The next step in WhereScape was to create a new metadata repository on the Teradata database. For this I created a new schema and user in Teradata first and then created the metadata repository using the WhereScape Administrator:

06_WhereScapeSetup

In WhereScape RED I created a connection to point to the new Teradata database:

05_WhereScapeConnection

… and finally loaded a few tables from the SQL Server to Teradata:

07_WhereScape_Data

Once I had finished the work, the most important step was to create a snapshot:

08_Snapshot

Based on this snapshot I finally cloned the environment for the number of participants in the testdrive with just a few clicks. In the end, every participant had their own (and isolated) environment consisting of a full stack of source database (SQL Server), WhereScape and the target DWH database (Teradata).

Agile Business Intelligence Maturity Model

As outlined in my previous blog post, agility in business intelligence projects can’t be produced directly. Instead you should invest in professionalism, standardization and automation. In this post I give an overview of concrete building blocks to support you on this path.

In my Agile Business Intelligence Maturity Model (ABIMM) I’ve collected many building blocks and arranged them in a practical sequence. You can find an overview in the following illustration:

Agile Business Intelligence Maturity Model

We can extract a few key messages from this model:

  1. You can’t increase agility directly; you can only reduce the amount of upfront design needed. By doing this, agility is increased automatically.
  2. A reduction of upfront design inevitably leads to higher risks, risks you need to deal with actively, e.g. by using a version control system or solutions for test automation (cf. my blog post here). As long as such basic infrastructure elements aren’t available, you should be very cautious about introducing iterative, incremental procedures such as Scrum. (You can find a very illustrative presentation on “Agility requires Safety” here.)
  3. All beginnings are difficult: The building block “Agile Basics & Mindset” represents an enormous hurdle in many cases. As long as an organization doesn’t experience a top-down transformation towards agile values and principles (cf. e.g. the Agile Manifesto), it doesn’t make much sense to start with it bottom-up.
  4. The gulf can be overcome by buying the necessary tools for test automation and version control and by training employees. This can typically happen within the boundaries of the already existing infrastructure. But today’s often heterogeneous, multi-layered BI tool landscapes aren’t very well suited to overcoming the chasm. That’s one reason why I’ve become a big fan of data warehouse automation and tools like WhereScape. Products like WhereScape RED institutionalize the usage of design patterns in an integrated development environment. Only for this reason does refactoring on the level of the data model, and hence iterative data modeling, become feasible with realistic effort. At the same time, tools like WhereScape provide you with a very high degree of automation for the deployment of new and changed artefacts.

A more detailed explanation of the Agile BI Maturity Model can be found in my recent article in the German TDWI journal “BI-Spektrum”.

Here you find the English translation of my article!

Many thanks to my company IT-Logix and all the great staff working with me. You are the indispensable foundation for all my BI related work. Don’t forget, you can hire me as a consultant 😉

Below you’ll find the literature on which the different building blocks and the model itself are based:

[AmL12] Ambler Scott W., Lines Mark: Disciplined Agile Delivery: A Practitioner’s Guide to Agile Software Delivery in the Enterprise, IBM Press, 2012

[AmS06] Ambler Scott W., Sadalage Pramod J.: Refactoring Databases: Evolutionary Database Design, Addison-Wesley Professional, 2006

[Bel] Belshee Arlo: Agile Engineering Fluency http://arlobelshee.github.io/AgileEngineeringFluency/Stages_of_practice_map.html

[BiM] Memorandum for Agile Business Intelligence: http://www.tdwi.eu/wissen/agile-bi/memorandum/

[Col12] Collier Ken: Agile Analytics, Addison-Wesley, 2012

[CoS11] Corr Lawrence, Stagnitto Jim: Agile Data Warehouse Design: Collaborative Dimensional Modeling, from Whiteboard to Star Schema, DecisionOne Press, 2011

[Hug12] Hughes Ralph: Agile Data Warehousing Project Management: Business Intelligence Systems Using Scrum, Morgan Kaufmann, 2012

[HuR09] Humble Jez, Russell Rolf: The Agile Maturity Model – Applied to Building and Releasing Software, http://info.thoughtworks.com/rs/thoughtworks2/images/agile_maturity_model.pdf, 2009

[Kra14] Krawatzeck Robert, Zimmer Michael, Trahasch Stephan, Gansor Tom: Agile BI ist in der Praxis angekommen, in: BI-SPEKTRUM 04/2014

[Sch13] Schweigert Tomas, Vohwinkel Detlef, Korsaa Morten, Nevalainen Risto, Biro Miklos: Agile maturity model: analysing agile maturity characteristics from the SPICE perspective, in Journal of Software: Evolution and Process, 2013 (http://www.sqs.com/de/_download/agile_maturity_wiley_2013_final.pdf)

Parts of this blog have first been published in German here.

Steps towards more agility in BI projects

“We now do Agile BI too”: we often hear such statements at conferences and in discussions with customers and prospects. But can you really “do” agility in Business Intelligence (BI) and data warehouse (DWH) projects directly? Is it sufficient to introduce bi-weekly iterations and let your employees read the Agile BI Memorandum [BiM]? At least in my own experience this doesn’t work in a sustainable way. In this post I’ll try to show the basic cause-and-effect relationships that finally lead to the desired agility.

DWHAutomation

If at the end of the day we want more agility, the first step towards it is “professionalism”. Neither an agile project management model nor an agile BI toolset is a replacement for “good people” in project and operation teams. “Good” in this context means that the people who work in the development and operation of a BI solution are masters of what they do, review their own work critically and don’t make beginner’s mistakes.

Yet professionalism alone isn’t enough to reach agility in the end. The reason for this is that different experts often apply different standards. Hence the next step is the standardization of the design and development procedures. The goal is to use common standards for the design and development of BI solutions, not only within one team but ideally across team and project boundaries within the same organization. An important aid for this are design patterns, e.g. for data modeling, the design and development of ETL processes as well as of information products (like reports, dashboards etc.).

Standardization in turn is a prerequisite for the next and, I’d say, most important step towards more agility: the automation of as many process steps as possible in the development and operation of a BI solution. Automation is a key element; “Agile Analytics” author Ken Collier even dedicates multiple chapters to this topic [Col12]. Only if we reach a high degree of automation can we work with short iterations in a sustainable way. Sustainable means that short iterations don’t lead to an increase in technical debt (cf. [War92] and [Fow03]). Without automation, e.g. in the area of testing, this isn’t achievable in reality.
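
To give an idea of what automated testing can mean for a DWH load, here is a minimal sketch of a reconciliation check in Python. It is only an illustration under assumptions: it uses an in-memory SQLite database as a stand-in for the real platform, and the table names (stage_sales, dm_sales) and function names are made up.

```python
import sqlite3


def row_count(connection: sqlite3.Connection, table: str) -> int:
    """Count the rows of a table. Table names cannot be bound as SQL
    parameters, so only pass trusted, hard-coded names here."""
    return connection.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def load_is_complete(connection: sqlite3.Connection,
                     source_table: str, target_table: str) -> bool:
    """A load counts as complete if source and target hold the same number of rows."""
    return row_count(connection, source_table) == row_count(connection, target_table)


def build_demo_database() -> sqlite3.Connection:
    """Create a tiny staging table and a data mart table loaded from it."""
    connection = sqlite3.connect(":memory:")
    connection.executescript(
        """
        CREATE TABLE stage_sales (id INTEGER, amount REAL);
        CREATE TABLE dm_sales (id INTEGER, amount REAL);
        INSERT INTO stage_sales VALUES (1, 10.0), (2, 20.0), (3, 30.0);
        INSERT INTO dm_sales SELECT id, amount FROM stage_sales;
        """
    )
    return connection


if __name__ == "__main__":
    demo = build_demo_database()
    print("Load complete:", load_is_complete(demo, "stage_sales", "dm_sales"))
```

A check like this is cheap to run after every load in every iteration, which is exactly what makes short iterations sustainable. Real projects would add further checks, such as comparing sums of key figures.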

Now we are close to the actual goal, more agility. If one can release new and changed features to UAT e.g. every two weeks, these can be released to production in the same manner if needed. And this, the fast and frequent enhancement of features in your BI solutions, is what sponsors and end users perceive as “agility”.

(This blog post was originally published in German here.)

Literature:

[BiM] Memorandum for Agile Business Intelligence: http://www.tdwi.eu/wissen/agile-bi/memorandum/

[Col12] Collier Ken: Agile Analytics, Addison-Wesley, 2012

[War92] Cunningham Ward: The WyCash Portfolio Management System, http://c2.com/doc/oopsla92.html, 1992

[Fow03] Fowler Martin: Technical Debt, http://martinfowler.com/bliki/TechnicalDebt.html, 2003