Teradata & WhereScape Test Environment in the Cloud

In this post I outline how I got a cloud-based training environment ready in which WhereScape RED, a data warehouse automation tool, connects to a Teradata test database machine.

A few weeks ago I had to organize a so-called "test drive" for a local WhereScape prospect. The prospect uses a Teradata database appliance, so they wanted to evaluate WhereScape RED on Teradata as well. As the local Swiss WhereScape partner, we received a virtual machine containing a SQL Server based WhereScape RED environment. The training had to be run onsite at the customer's location, and IT-Logix provided its set of training laptops, each with 4 GB of RAM. These were my starting conditions.

First of all I thought about how to deal with Teradata for a training setup. Fortunately, Teradata provides a set of preconfigured VMs here. You can easily download them as zipped files and run them using the free VMware Player.

Based on my previous experience organizing hands-on sessions, e.g. during our local Swiss SAP BusinessObjects user group events, I wanted to use Cloudshare. This makes it much easier (and faster!) to clone an environment for multiple training participants compared to copying tons of gigabytes to multiple laptops. In addition, 4 GB of RAM wouldn't have been enough to run Teradata and WhereScape with decent performance. So I had two base VMs (one from WhereScape, one from Teradata) – a perfect use case to try the VM upload feature in Cloudshare for the first time.

I started with this support note, which explains how to prepare your local VM and upload it to your Cloudshare FTP folder. From there you can simply add it to an environment:

01_UploadVM1
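Just for illustration, the FTP upload itself can be done with any FTP client; a minimal sketch with curl might look like this (host, credentials and file name are placeholders, not the actual Cloudshare details from the support note):

    # hypothetical example - replace host, user, password and file name
    # with the FTP details from your Cloudshare account / the support note
    curl -T wherescape_red_vm.zip ftp://ftp.example-cloudshare.com/ --user "myuser:mypassword"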

After having uploaded both VMs it looks like this in Cloudshare:

02_CloudshareEnvironment

I increased the RAM and CPU power a bit and, more importantly, configured the network between the two machines:

Go to “Edit Environment” -> “Edit Networks”:

03_NetworkSettings

Here I had to specify to which virtual network I'd like to connect the VMs. Please keep in mind that this doesn't automatically provide a DHCP server or similar. Either you set one up within one of your machines or – as in my case – you set static IPs within the individual VMs (both were delivered with dynamic IPs provided by the VMware Player). Changing the IP wasn't a big deal, neither on Windows nor on Linux.
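On the Linux side (the Teradata Express VMs are SUSE-based), a minimal sketch of a static IP configuration using the traditional ifcfg files might look like this – interface name, addresses and gateway are made-up examples, not the values from my environment:

    # /etc/sysconfig/network/ifcfg-eth0   (hypothetical values)
    BOOTPROTO='static'
    STARTMODE='auto'
    IPADDR='192.168.10.20/24'

    # /etc/sysconfig/network/routes       (default gateway, also hypothetical)
    default 192.168.10.1 - -

    # afterwards restart networking, e.g.:
    # service network restart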

04_TD_Setting1

But I quickly found out that afterwards the Teradata service no longer ran properly.

First of all I had to create a simple test case to check whether I could connect from the WhereScape VM to the Teradata machine. Besides a simple ping (which worked) I installed the Teradata Tools & Utilities on the WhereScape machine. As I couldn't establish a proper connection, I had to google a bit. The following article gave me the hint to add a "cop" entry to the hosts file:

04_TD_Setting2
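For reference: Teradata resolves a TDPID by appending cop1, cop2, … to the system name, so the hosts file entry (here in /etc/hosts on the Teradata VM; the client machine needs a matching entry as well) looks roughly like this – the IP address and name are example values, not necessarily yours:

    # /etc/hosts (example values - use the static IP and system name of your VM)
    192.168.10.20   tdexpress   tdexpresscop1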

After a restart of the machine, Teradata was up and running again. By the way, you can verify this with the command "pdestate -a":

04_TD_Setting3
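In plain text, for reference (the exact wording of the output can differ between Teradata releases):

    # run on the Teradata VM, e.g. as root
    pdestate -a
    # on a healthy system the output typically contains something like:
    #   PDE state is RUN/STARTED.
    #   DBS state is 5: Logons are enabled - The system is quiescent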

The next step on the WhereScape side was to create a new metadata repository on the Teradata database. For this I first created a new schema and user in Teradata and then created the metadata repository using the WhereScape Administrator:

06_WhereScapeSetup
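A minimal sketch of what this looks like in Teradata SQL – the user name, password and space sizes are made-up examples, not WhereScape's official recommendations:

    -- dedicated database/user for the WhereScape metadata repository (hypothetical values)
    CREATE USER ws_meta FROM dbc AS
        PASSWORD = ws_meta_pwd,
        PERM = 2e9,
        SPOOL = 1e9;

    -- make sure the repository owner has the usual DDL/DML rights in its own database
    GRANT ALL ON ws_meta TO ws_meta WITH GRANT OPTION;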

In WhereScape RED I created a connection pointing to the new Teradata database:

05_WhereScapeConnection

… and finally loaded a few tables from the SQL Server to Teradata:

07_WhereScape_Data

Once I had finished the work, the most important step was to create a snapshot:

08_Snapshot

Based on this snapshot I finally cloned the environment for the number of participants in the test drive with just a few clicks. In the end, every participant had their own (and isolated) environment consisting of the full stack: source database (SQL Server), WhereScape and the target DWH database (Teradata).

Agile Business Intelligence Maturity Model

As outlined in my previous blog post, agility in business intelligence projects can't be produced directly. Instead you should invest in professionalism, standardization and automation. In this post I give an overview of concrete building blocks to support you on this path.

In my Agile Business Intelligence Maturity Model (ABIMM) I've collected many building blocks and arranged them in a practical sequence. You can find an overview in the following illustration:

Agile Business Intelligence Maturity Model

We can extract a few key messages from this model:

  1. You can’t increase agility directly – you can only reduce the amount of needed upfront design. By doing this agility is increased automatically.
  2. A reduction of upfront design inevitably leads to higher risks – risks you need to deal with actively, e.g. by using a version control system or solutions for test automation (cf. my blog post here). As long as such basic infrastructure elements aren't available, you should be very cautious about introducing iterative, incremental procedures like Scrum. (A very illustrative presentation on "Agility requires Safety" can be found here.)
  3. All beginnings are difficult: the building block "Agile Basics & Mindset" represents an enormous hurdle in many cases. As long as an organization doesn't experience a top-down transformation towards agile values and principles (cf. e.g. the Agile Manifesto), it doesn't make much sense to start with it bottom-up.
  4. The gulf can be overcome by buying the necessary tools for test automation and version control and by training employees. This can typically happen within the boundaries of the existing infrastructure. To cross the chasm, however, today's often heterogeneous, multi-layered BI tool landscapes aren't well suited. That's one reason why I've become a big fan of data warehouse automation and tools like WhereScape. Products like WhereScape RED institutionalize the use of design patterns in an integrated development environment. Only because of this does refactoring at the level of the data model – and hence iterative data modeling – become feasible with realistic effort. At the same time, tools like WhereScape provide an ultra-high degree of automation for the deployment of new and changed artefacts.

A more detailed explanation of the Agile BI Maturity Model can be found in my recent article in the German TDWI journal “BI-Spektrum“.

Here you find the English translation of my article!

Many thanks to my company IT-Logix and all the great staff working with me. You are the indispensable foundation for all my BI related work. Don’t forget, you can hire me as a consultant😉

Below you'll find the literature on which the different building blocks and the model itself are based:

[AmL12] Ambler Scott W., Lines Mark: Disciplined Agile Delivery: A Practitioner’s Guide to Agile Software Delivery in the Enterprise, IBM Press, 2012

[AmS06] Ambler Scott W., Sadalage Pramod J.: Refactoring Databases: Evolutionary Database Design, Addison-Wesley Professional, 2006

[Bel] Belshee Arlo: Agile Engineering Fluency http://arlobelshee.github.io/AgileEngineeringFluency/Stages_of_practice_map.html

[BiM] Memorandum für Agile Business Intelligence: http://www.tdwi.eu/wissen/agile-bi/memorandum/

[Col12] Collier Ken: Agile Analytics, Addison-Wesley, 2012

[CoS11] Corr Lawrence, Stagnitto Jim: Agile Data Warehouse Design: Collaborative Dimensional Modeling, from Whiteboard to Star Schema, DecisionOne Press, 2011

[Hug12] Hughes Ralph: Agile Data Warehousing Project Management: Business Intelligence Systems Using Scrum, Morgan Kaufmann, 2012

[HuR09] Humble Jez, Russell Rolf: The Agile Maturity Model – Applied to Building and Releasing Software, http://info.thoughtworks.com/rs/thoughtworks2/images/agile_maturity_model.pdf, 2009

[Kra14] Krawatzeck Robert, Zimmer Michael, Trahasch Stephan, Gansor Tom: Agile BI ist in der Praxis angekommen, in: BI-SPEKTRUM 04/2014

[Sch13] Schweigert Tomas, Vohwinkel Detlef, Korsaa Morten, Nevalainen Risto, Biro Miklos: Agile maturity model: analysing agile maturity characteristics from the SPICE perspective, in Journal of Software: Evolution and Process, 2013 (http://www.sqs.com/de/_download/agile_maturity_wiley_2013_final.pdf)

Parts of this blog have first been published in German here.

Steps towards more agility in BI projects

"We now do Agile BI too" – we often hear such statements at conferences and in discussions with customers and prospects. But can you really "do" agility in Business Intelligence (BI) and data warehouse (DWH) projects directly? Is it sufficient to introduce bi-weekly iterations and let your employees read the Agile BI Memorandum [BiM]? In my own experience, at least, this doesn't work in a sustainable way. In this post I'll try to show the basic cause-and-effect relations which eventually lead to the desired agility.

DWHAutomation

If at the end of the day we want more agility, the first step towards it is professionalism. Neither an agile project management model nor an agile BI toolset is a replacement for good people in project and operations teams. "Good" in this context means that the people who develop and operate a BI solution are masters of what they do, review their own work critically and don't make beginner's mistakes.

Yet professionalism alone isn't enough to reach agility in the end. The reason is that different experts often apply different standards. Hence the next step is the standardization of design and development procedures. The goal is to use common standards for the design and development of BI solutions – not only within one team, but ideally across team and project boundaries within the same organization. An important aid for this are design patterns, e.g. for data modeling, for the design and development of ETL processes, and for information products (like reports, dashboards etc.).

Standardization in turn is a prerequisite for the next and, I'd say, most important step towards more agility: the automation of as many process steps as possible in the development and operation of a BI solution. Automation is a key element – "Agile Analytics" author Ken Collier even dedicates multiple chapters to this topic [Col12]. Only if we reach a high degree of automation can we work with short iterations in a sustainable way. Sustainable means that short iterations don't lead to an increase in technical debt (cf. [War92] and [Fow03]). Without automation, e.g. in the area of testing, this isn't achievable in reality.

Now we are close to the actual goal: more agility. If you can release new and changed features to UAT e.g. every two weeks, these can be released to production in the same rhythm if needed. And this – the fast and frequent delivery of features in your BI solution – is what sponsors and end users perceive as "agility".

(this blog was originally posted in German here)

Literature:

[BiM] Memorandum for Agile Business Intelligence: http://www.tdwi.eu/wissen/agile-bi/memorandum/

[Col12] Collier Ken: Agile Analytics, Addison-Wesley, 2012

[War92] Cunningham Ward: The WyCash Portfolio Management System, http://c2.com/doc/oopsla92.html, 1992

[Fow03] Fowler Martin: Technical Debt, http://martinfowler.com/bliki/TechnicalDebt.html, 2003

Testing for BI & DWH

Testing has always been part of every IT project plan – that's true for Business Intelligence (BI) & Data Warehouse (DWH) projects as well. Yet the practical implementation of testing in the BI / DWH environment has confronted me with trouble again and again. I've often had the impression that the BI / DWH world is still in the Stone Age regarding development processes and environments – at least significantly behind the maturity level I know from the software engineering domain. The chart below illustrates this gap:

rbra_testing1 (1)

Cultural differences between the software development and BI community (Source: http://agiledata.org/essays/culturalImpedanceMismatch.html)

If anything is tested at all, things in the BI frontend area are typically tested manually. In the DWH backend we see – besides manual tests – self-coded test routines, e.g. in the form of stored procedures or dedicated ETL jobs. However, integration into a test case management tool and systematic evaluation of the test results don't happen. This contrasts heavily with the software engineering domain, where automated regression testing is combined with modern development approaches like test-driven design. For some time now there have been first publications on BI-specific testing (cf. the (German) TDWI book here). Concepts and paper are patient, though. Where do we stand with regard to tool support, particularly in the area of regression tests?

Since summer 2014 we at IT-Logix have been actively looking for better (tool-based) solutions for BI-specific testing. We do this together with the Austrian company Tricentis. Tricentis develops the Tosca product suite, one of the world's leading software solutions for test automation. In a first step we ran a proof of concept (POC) for regression tests of BI frontend artefacts, namely typical reports. One of the architectural decisions was to use Excel and PDF export files as the basis for our tests. With this choice of a generic file interface, the effort of developing BI-product-specific test support could be avoided, and we reduced the implementation effort in the POC to about two days. The goal was to run "before-after" tests in batch mode. We took 20 reports for the POC (these happened to be SAP BusinessObjects Web Intelligence reports, but you can imagine whatever tool you like as long as it can export to PDF and / or Excel). A current version of the PDF or Excel output of a report is compared with a corresponding reference file. Typical real-life situations where you can use this scenario are:

  • Regularly scheduled regression tests to monitor side effects of ongoing DWH changes: the reference files are created at some point, e.g. after a successful release of the DWH. Imagine there are ongoing change requests at the level of your DWH. Then you want to make sure these changes only impact the reports where a change is expected. To make sure the rest of your reports aren't affected by any side effects, you run your regression tests e.g. every weekend and compare the newly produced files with the reference files.
  • BI platform migration projects: if you run a migration project, for example to migrate your SAP BusinessObjects XI 3.1 installation to 4.1, you'll want to make sure reports still work and look the same in 4.1 as they did in XI 3.1. In this case you create the reference files in XI 3.1 and compare them with the ones from 4.1. (As the export drivers differ between the two versions, the Excel exports in particular are not very useful for this use case. Still, PDF worked pretty well in my experience.)
  • Database migration projects: if you run a database migration project, for example migrating all your Oracle databases to Teradata or SAP HANA, you want to make sure all of your reports still show the correct data (or at least the same data as was shown with the original data source…).
rbra_testing1 (3)

Sample configuration of a test case template using the GUI of Tosca (Source: IT-Logix POC)

Tosca searches for the differences between the two files. For Excel this happens on a cell-by-cell basis; for PDF we used a text-based approach as well as an image comparison approach.
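Just to illustrate the underlying idea of such a cell-by-cell "before-after" comparison, here is a minimal Python sketch using openpyxl. This is not how Tosca works internally – file names and the sheet handling are simplified assumptions:

    # Minimal sketch of a cell-by-cell comparison of two Excel exports.
    from openpyxl import load_workbook

    def compare_excel(reference_path, current_path):
        """Return a list of human-readable differences between two Excel files."""
        ref_wb = load_workbook(reference_path, data_only=True)
        cur_wb = load_workbook(current_path, data_only=True)
        differences = []
        for sheet_name in ref_wb.sheetnames:
            ref_ws = ref_wb[sheet_name]
            cur_ws = cur_wb[sheet_name]  # assumes the sheet exists in both files
            max_row = max(ref_ws.max_row, cur_ws.max_row)
            max_col = max(ref_ws.max_column, cur_ws.max_column)
            for row in range(1, max_row + 1):
                for col in range(1, max_col + 1):
                    ref_val = ref_ws.cell(row=row, column=col).value
                    cur_val = cur_ws.cell(row=row, column=col).value
                    if ref_val != cur_val:
                        differences.append(
                            "%s!R%dC%d: reference=%r, current=%r"
                            % (sheet_name, row, col, ref_val, cur_val)
                        )
        return differences

    # Hypothetical usage:
    # diffs = compare_excel("reference/report_01.xlsx", "current/report_01.xlsx")
    # print("\n".join(diffs) or "No differences found")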

rbra_testing1 (2)

Depending on the chosen test mode the differences can be visualized differently (Source: IT-Logix POC)

Using the solution implemented during the POC we could see very quickly which reports were different in their current state compared to the reference state.

Another important aspect of the POC was the scalability of the approach, as I work primarily with (large) enterprise customers. If I have not just 20 but hundreds of reports (and therefore test cases), I have to prioritize and manage the creation, execution and error analysis of these test cases somehow. Tosca helps here with the ability to model business requirements and connect them with the test cases. Based on that, we can derive and report classic test metrics like test case coverage or test execution rate.

rbra_testing1 (1)

Requirements and test cases are tightly related (Source: IT-Logix POC)

In my eyes, an infrastructure like Tosca is a basic requirement for systematically increasing and maintaining the quality of BI / DWH systems. In addition, advanced methods like test-driven development can only be adopted in BI / DWH undertakings if the necessary infrastructure for test automation is available.

In this blog post I've shown a first, rudimentary solution for regression tests of BI frontend tools. In a next article I'll show the possibilities for implementing regression tests for DWH backend components.

Event recommendation: Learn about a real-life scenario for running an SAP BusinessObjects migration project in an agile manner. Test automation is key here and is explained in some more detail. Join me at sapInsider's BI2015 in Nice in mid-June. Find more information here.

(This blog post was first published by me in German here)

Agile (BI) in the context of large enterprises & corporate governance

It has been quite a while since my last blog post. The major reason for this is that my company IT-Logix decided to start its own blog. So I wrote a few articles, all of them in German. Therefore if you are interested in this post in German, you’ll find it here.

At IT-Logix we've been tackling the Agile topic for more than three years now. The main question is how you can apply agile principles in a BI/DWH environment. You cannot "create" agility directly; it is rather a side effect of a combination of methods, concepts and technologies which are all agility-oriented. The starting point is often – this was true for me too – the Agile Manifesto. The values and principles written down in the Agile Manifesto describe the mindset, which is the most important component on your journey to more agility. No method, concept or tool solves any problem if the people involved don't get the "sense" behind it.

If someone hears "agile" they typically think of Scrum. Scrum is a project method which has been around in the software development space for quite some time and is now emerging more and more in other domains – including BI and DWH projects. The fact that many people directly associate Agile with Scrum shows that the inventors of Scrum established pretty good marketing around it. Still, you should consider three things here:

  1. What is called Scrum in certain organisations is often far away from the basic idea of Scrum, namely the agile mindset. Michael James recently described this on Twitter as follows:
    Anti-Agile
    This often leads to frustration – an amusing blog post puts it in a nutshell: http://antiagilemanifesto.com/
  2. Alongside the strong marketing of Scrum we find proprietary terminology. An iteration is called a sprint, a daily coordination meeting a "daily scrum", a to-do list becomes a backlog etc. This brings the danger of comprehension and hence communication problems between stakeholders inside and outside the project organisation, especially senior management.
  3. Scrum is only one of multiple agile practices – and on its own not always the most suitable method for application in the BI/DWH environment.

Looking at my customers, I experience again and again aspects which contradict an iterative project approach. Even just the project setup requires a project request with estimated efforts, a project plan and milestones. From an architectural perspective, one needs to consider an already existing overall architecture, typically data models and systems which are already in place. During implementation, multiple teams of the organisation are often involved; there are a lot of dependencies and therefore (unexpected) waiting times. How can you incorporate these surrounding conditions into a specific project method and still embrace the values of the Agile Manifesto?

In autumn 2014 I got to know the Disciplined Agile Delivery (DAD) framework of Scott Ambler. In my eyes, Scott and his co-authors manage very well to bridge the gap between Scrum as a pure development method and the needs of an enterprise environment. DAD doesn't reinvent things but provides a structure to put different methods into a bigger, overarching context.

DAD provides different project lifecycles. The basic "Agile" lifecycle contains many elements from Scrum but refrains from using proprietary language. An important additional aspect is the inception phase, where the necessary coordination with the "big picture" can happen.

Iterations are primarily based on a time-box approach. That means you fix the time frame and vary the scope of an iteration so that it fits into this time slot. Depending on your environment this can become pretty tricky, especially if other teams on which you depend do not follow this (time-boxed) approach. That's the reason why DAD provides further lifecycles, e.g. the Lean lifecycle:

The basic structure remains the same, but the time-boxed iterations are gone. This increases flexibility, but equally the danger of getting bogged down. This is where a certain discipline is needed – the name "Disciplined" Agile Delivery has its roots in this reality.

To get to know DAD better, we invited Scott Ambler for a two-day workshop at IT-Logix at the end of January 2015. As with Scrum, you can take a first certification level after such a workshop. Therefore I can proudly call myself a Disciplined Agilist Yellow Belt😉

To sum it up: once more, what's needed is a good portion of common sense rather than methodological dogmatism. What works in practice counts. The DAD framework gives me and our customers a meaningful guard rail for realising the added value of an agile mindset. At the same time, DAD is open enough to accommodate individual customer situations. My own contribution is to fill the described generic processes with BI-specific aspects and apply them in practice.

Event recommendation 1: As a member of the Zurich Scrum Breakfast Club I meet with other agile practitioners and interested people. The event is organized in the Open Space format; more info about it here. Interested in joining us? Let me know! I can bring along a guest for free.

Event recommendation 2: Join me at the sapInsider conference BI2015 in Nice in mid-June: I'll explain how I applied the DAD framework to an SAP BusinessObjects migration project. More information about my session here.

To end this blog post, here are a few impressions from our DAD workshop with Scott Ambler:

Foto 1, Foto 2, Foto 4, Foto 5 (impressions from the DAD workshop)

How to speed up connection between Translation Manager, Universe Designer and CMS

On a recent customer project my teammate and I were facing an issue when importing Webi documents and universes into Translation Manager (all on BO 4.1 SP4), getting the following error message:

org.apache.axis2.AxisFault: Unable to find servers in CMS, servers could be down or disabled by the administrator (FWN 01014)

There is a KB article describing this error and a solution: http://service.sap.com/sap/support/notes/1879675

On the one hand we followed the instructions in the KB article and specified the hostname for each server. Although our server is multihomed (i.e. it has more than one network interface), we hadn't thought about this beforehand, because the second network interface goes into a backup LAN and is never used for communication with e.g. the client tools. In addition, everything had worked fine so far – until we got the issue with Translation Manager. Still, just setting the hostname was not enough. On the server side the Windows firewall is enabled, therefore we had to assign static request ports to several services. Before our issue with Translation Manager we had already done so for the CMS, the Input & Output FRS and all Webi Processing Servers. We used TCPView to analyze on which ports Translation Manager was opening connections. As long as there were still requests on ports which we hadn't specified in the CMC (some random port number, e.g. 56487), we kept narrowing down the services to which Translation Manager establishes a connection. We had to specify a port for all of the following servers:

  • The Adaptive Processing Server (APS) hosting the DSL-Bridge Service
  • The APS hosting the Client Auditing Proxy Service
  • The APS hosting the Search Index Service
  • The APS hosting the Translation Service
  • The WACS (Web Application Container Server)

Besides the issue with Translation Manager we had another issue with creating new universe connections in the Universe Design Tool. When creating a new connection we had to wait up to 30 minutes (!) to get to the wizard page where you can select from the list of available database drivers. Still, after 30 minutes everything worked fine and we could create the connection successfully. Based on our experience with Translation Manager we ran TCPView again and found that we needed to assign a port number to both Connection Servers (32 and 64 bit) in the CMC. Having done this, creating a new connection now works without any waiting time.

Bottom line: if you have firewalls between your BO server and the client tools, just assign a static port to every server available and open that port on the firewall (the only exception might be the Event Server, as I'm really not aware of any communication between this server and a component outside the BO server).
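Just as a sketch of the firewall part (the rule name and port number below are made-up examples, not BO defaults – use whatever static request ports you assigned in the CMC):

    rem Example: open an inbound TCP port for one of the BO services on the Windows firewall
    rem (6411 and the rule name are hypothetical)
    netsh advfirewall firewall add rule name="BO APS DSL-Bridge" dir=in action=allow protocol=TCP localport=6411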

Optimizing my dashboard: Creating a visual draft

Remember my last post about the Webi dashboard? As mentioned at the end of that post, it aimed to show a technical trick for putting an interactive kind of button onto a Webi report. This post is now the first in a series on how to optimize the layout of the initial dashboard. Let's start by creating a draft of our optimized dashboard layout. The advantage of such a draft is that it is not implemented with the actual BI tool yet, but with either pen and paper or a wireframing tool. I did the latter and chose the cloud edition of Balsamiq. To get a quick start you can use a 30-day trial account.

Let me explain briefly how the tool works. After that I’ll explain some of my thoughts behind the chosen layout.
Within the editor you can drag and drop sketched objects like a window container, rectangles, text, buttons etc.

Balsamiq_Editor

My dashboard in the Balsamiq editor

Looking at the available chart objects, I’m not really satisfied:

Balsamiq_Charts

Default chart elements in Balsamiq

Therefore I added my own images representing typical IBCS chart types (IBCS stands for International Business Communication Standards; I wrote a short blog post about IBCS here). The images are based on the graphomate add-on for SAP Design Studio and BusinessObjects Dashboards.

IBCS_charts

Typical IBCS chart types (created with graphomate)

After all this, my initial dashboard draft looks like this:

Page01.jpeg

My dashboard draft (chart view 1)

At the top you can see some reserved space for an appropriate title. Providing such a title is a major requirement stated by IBCS as one of its top ten proposals.

I adapted this to a BI-specific title concept where I distinguish between general title elements (like the organization or global query filters) and object-specific titles, e.g. for the table or a chart. For the table I used the default element of Balsamiq; maybe I will update this later on with an IBCS-optimized one. For now it is just a placeholder.

The charts in this first view will show current year and previous year values for revenue and margin (whereas the current year will be indicated within the global filters area). To make the analysis of the available data more straightforward, I decided to add two variance charts, one with absolute values and one with percentage values. Again, this is one of the top ten elements in IBCS.

You might have noticed the symbolic button to switch the chart view. The second view looks like this:

Page02.jpeg

My dashboard draft (chart view 2)

The header and table areas stay the same. In the lower chart area I now show historical revenue values with actuals and plan figures. Instead of putting everything into one big chart, I decided to use small multiples for the top 5 product lines (based on total revenue over time) as well as one chart for all other product lines. Depending on how it looks in Webi, we might decide to show more product lines or add another topic to the dashboard (as we still have space left in the bottom right corner).

In this blog post I showed how to create a simple dashboard draft using a wireframing tool like Balsamiq. In addition, I pointed out how to apply two of the top ten IBCS proposals in this conceptual phase.

Update May 5, 2015: Internally at IT-Logix we continued the implementation of the mockup shown above, using both Design Studio and Web Intelligence. Unfortunately I haven't found the time to write a blog post about it so far. Still, if you are interested in the Web Intelligence part, have a look at the following sapInsider BI2015 session held by my teammate Kristof Gramm: http://bit.ly/1EGKFtA He will show how you can do pretty amazing things with Webi (by the way, using this link you'll be granted a 300€ discount on the conference registration😉)
