
Data Migration - For BI applications

  • 10 October 2013
  • Author: badriram

INTRODUCTION


“Knowledge is Power” - something I heard back in my school days. But today's version of this saying in the real world is - “Information is Power; provided it is utilized & managed in the right way”.

Any skill or information available only as a dump of data or knowledge is of no use unless it is put to a constructive, progressive purpose, particularly where its utility is for decision support. Decision Support Systems are applications with critical time & resource requirements, because the information they hold is only useful for so long. Business & Finance Managers rely heavily on fast & highly available Decision Support Systems.

In today's world, decision support takes the shape of a Business Intelligence solution, which involves

  • Gathering enterprise-wide data,

  • Storing it & generating patterns to get an insight into how things are working across the organization, and

  • Formulating the strategy going forward.

Technology changes very rapidly, and this gives birth to newer & more robust BI / DSS applications, letting us do away with the older / legacy applications & their shortcomings altogether. But the painful part of a technology change is the data that needs to be migrated from the old application to the new. Migration exercises involve

  • Data profiling

  • Complete understanding of the new applications' data requirements

  • Outstanding data quality issues [if any] with the existing information

  • A massive effort to load data through to the new application.

While this may sound like an opportunity for every IT service organization to cash in on, the pain still remains. This paper deals with the background of what every data migration exercise faces, where the pitfalls of traditional migration projects lie, and how we can make this whole process less painful and more constructive.


BACKGROUND

Change has become as inevitable as breathing in today's fast-paced software world. Adapting to this changing environment can lead to a resource crunch or, on the other hand, to costs spiralling out of control. When BI legacy applications are given up for newer and more robust ones, we are left with terabytes of information to migrate across these applications. Data migration plans normally don't factor in unforeseen issues like the ones below –

  • Outstanding data quality issues that exist, unresolved, in the legacy application, which could lead to incorrect patterns being reported out to users / business. This is by far the biggest risk during migration.

  • Data profiling to “fit” the old data to work with the new application, which could end up in re-work on the design of the new application OR in “hanging” data, i.e. data which can neither be used nor thrown away.

  • Business-critical reporting delayed or, even worse, completely unavailable – the worst nightmare an IT team could face would be users / business managers escalating delays in reporting, especially reports that are critical to business compliance requirements.

  • Lost or un-analyzed data could lead to misleading insights and trends, which can result in lost or missed opportunities in the market.

This paper outlines a generic Data migration process that will mitigate the risks arising out of the above problems and a quick way of migrating with minimal business / end user impact. The process makes only one assumption –  we have a database / data warehouse as an integral part of the BI application.

 

MIGRATION MADE EASY

Data Migration is as important as developing a new application. It needs to be given parallel attention and worked alongside the new application development. Migration as a project could be planned starting with any of the below phases. The sequence can be re-ordered based on project / business priorities.

  1. Critical Outputs or Pre-Output data / KPI migration, & optionally [but recommended] Data feeds from the legacy to the new BI application.

  2. A quick Reconciliation process between the outputs of both applications and the respective warehouses.

  3. Plugging enterprise-wide source systems into the new application to catch up and regularize feeds into the new warehouse, re-using the analysis from (1) where the Data feeds were developed.

 

CRITICAL OUTPUTS / KPI DATA MIGRATION & Optional Data Feeds

Activities involved

Every Decision support system is meant to provide some meaningful insights into tons of data and every type of business has its own set of KPIs which get generated / stored and reported in some form. The only variation here would be to either store the KPIs themselves or a Pre-KPI output [i.e. a simplified version of source data to collect KPIs]. There could be rare cases where this information is never stored and raw data is sometimes picked and complex transformation logic is used to generate these KPIs. We can deal with that too.

Every KPI is formed by a single or a set of rules / formulae that are applied to raw data. Normally these rules would not change radically when migrating to a new application. But when they do change this step becomes necessary. Every piece of raw data goes through the following data life cycle and each KPI definitely has at least one of the below outputs stored somewhere in some form.

INPUT                            OUTPUT
Source System Feeds              Raw data
Consolidation & Validation       Formatted & cleansed data
Grouping                         Reference & Transactional data
Summarization                    KPI / Formatted Report

Depending on the form in which the output is available and its granularity, i.e. raw data or summarized pre-KPI output, we will create a one-time throwaway prototype that “upgrades” this output to look like the new application's output, based on the new application's requirements, and segment the old data by time.
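As a rough illustration, such a throwaway “upgrade” prototype could be sketched as below (Python with pandas; the column names, KPI codes and target layout are assumptions made purely for this example, not any particular application's schema). It renames the legacy KPI output to the new application's layout and tags each row with a time segment.

```python
import pandas as pd

# Hypothetical legacy KPI output extract (column names are illustrative only).
legacy_kpi = pd.DataFrame({
    "kpi_cd":  ["PROFIT", "PROFIT", "RESERVES"],
    "rpt_dt":  ["2012-12-31", "2013-03-31", "2013-03-31"],
    "kpi_val": [1200000.0, 950000.0, 4300000.0],
})

# Mapping rule: legacy column -> assumed column name in the new application.
column_map = {"kpi_cd": "kpi_name", "rpt_dt": "reporting_date", "kpi_val": "kpi_value"}

def upgrade_legacy_output(df: pd.DataFrame) -> pd.DataFrame:
    """One-time 'upgrade' of the legacy KPI output into the new application's shape."""
    out = df.rename(columns=column_map).copy()
    out["reporting_date"] = pd.to_datetime(out["reporting_date"])
    # Segment the old data by time so it can be loaded (or fed) period by period.
    out["time_segment"] = out["reporting_date"].dt.to_period("Q").astype(str)
    return out

print(upgrade_legacy_output(legacy_kpi))
```

Because the prototype is throwaway, it only needs to be good enough to re-shape and time-segment the historical outputs once; it is not part of the new application's regular load.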

Benefits

  • Reduced dependency on large volumes of data which need to be cleansed / validated again for it to be set for the new application.

  • Downtime reduced for business / users i.e. outputs will be available instantly without any dependency on migration process itself.

  • Developing this throwaway prototype to convert the old outputs to the new provides some more insight into existing data quality issues and can help resolve outstanding ones, without having to go through problem / incident management.

Migrating KPIs can be prioritized from the most business-critical to the least and worked upon quickly to reduce downtime for business reporting. However, the volume of this KPI / Pre-KPI data itself would determine whether we do a one-time feed or a regularized feed [small bits at a time]. But the process remains the same! Creating a periodic feed is the more prudent solution, and it will become more evident why as you read on.


An Example

For an Insurance company in the UK, Profit by Product is a KPI that the top Execs in the organization need in order to forecast where they are headed next. But Total Reserves for Claims is a KPI that is mandatory for Solvency reporting to the FSA in the UK. We would choose to report Claims Reserves first, to keep the company running, and then provide Profit by Product to forecast earnings later.

To get the profit by product, we would collect the regional sales / revenue & expenses information and consolidate it to summarize the numbers. But for Claims Reserves we would additionally need the projected claims payments for the year ahead.

Profit for the prior year would not change much even if the application has been scaled up. So it would be simple to bring in the overall sales & expenses information directly, without any hassle, and store it in whatever form the new application accepts. The prototype to handle this would create a mapping rule between the storage format of the sales figures in the old application and the sales numbers in the new one. Summarizing this information provides the profit.

Claims Reserves depend on both the old and the new data because they are more of a projection. We need the legacy data and also the new data for this, which means we would do a time-bound mapping of the old data [which is already segmented by time] to select only a subset of it and combine it with the new data to get what we need.
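Continuing the sketch above (again with made-up table and column names), the time-bound mapping for a projected KPI like Claims Reserves could look roughly like this: take only the legacy subset up to a cut-off, append the new application's data, and project forward.

```python
import pandas as pd

# Legacy claims payments, already segmented by time during the KPI migration step.
legacy_claims = pd.DataFrame({
    "claim_month": pd.to_datetime(["2013-01-31", "2013-02-28", "2013-03-31"]),
    "paid_amount": [150000.0, 175000.0, 160000.0],
})

# Claims captured by the new application after the cut-off.
new_claims = pd.DataFrame({
    "claim_month": pd.to_datetime(["2013-04-30", "2013-05-31"]),
    "paid_amount": [180000.0, 190000.0],
})

cutoff = pd.Timestamp("2013-04-01")

# Time-bound mapping: only the legacy subset before the cut-off is selected,
# then combined with the new data to feed the projection.
subset = legacy_claims[legacy_claims["claim_month"] < cutoff]
combined = pd.concat([subset, new_claims], ignore_index=True)

# Deliberately naive projection for illustration: average monthly payment x 12.
projected_reserve = combined["paid_amount"].mean() * 12
print(f"Projected claims reserve: {projected_reserve:,.0f}")
```

The actual reserving calculation would of course be an actuarial one; the point here is only the time-bound selection and combination of old and new data.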


Data Feeds from Legacy Application

This stage is optional unless the complete data set needs to be migrated as a compliance requirement, for example to comply with the Data Protection Act. Otherwise the previous step takes care of the outputs from history data and we can do away with the legacy data. Taken another step forward, the effort below can also be used to create a feed from the Legacy sources to the new application. This step is the most expensive part of this process, but it can be done easily if the requirements for the new application are well placed. What needs to be done here is –

  • Create a simple mapping rule between the data on the old warehouse and the new one. Leave out Master data & Reference data keyed in by business & we are left with a handful of transactional data tables for the data profiling exercise. 

  • Interact closely with the Business Analysts to analyze which attributes will flow through, which of these are not needed and which ones need to be derived afresh.

  • Chances are that Source systems would also have migrated and provided new meanings to their attributes or done away with old ones as well. So the mapping rules need to consider these changes as well.

Chances are that 90% of data migration projects don't need a Legacy data feed because they will only need the KPIs from this data. And if the Source systems have become obsolete then this will not be required.
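As a sketch of the first bullet above, creating a simple mapping rule between the old and the new warehouse, the attribute-level rules could be captured roughly as follows (plain Python; the attribute names, actions and the derivation rule are all hypothetical). Each rule records whether a legacy attribute flows through, is dropped, or is derived afresh for the new warehouse.

```python
# Attribute-level mapping rules agreed with the Business Analysts
# (attribute names and the derivation logic are illustrative only).
mapping_rules = [
    {"legacy": "pol_ref",    "new": "policy_id", "action": "flow_through"},
    {"legacy": "cust_title", "new": None,        "action": "drop"},
    {"legacy": None,         "new": "risk_band", "action": "derive"},
]

def apply_mapping(legacy_row: dict) -> dict:
    """Apply the mapping rules to one legacy transactional record."""
    new_row = {}
    for rule in mapping_rules:
        if rule["action"] == "flow_through":
            new_row[rule["new"]] = legacy_row.get(rule["legacy"])
        elif rule["action"] == "derive":
            # Derivation logic comes from the new application's requirements;
            # this is just a placeholder rule based on another attribute.
            new_row[rule["new"]] = "HIGH" if legacy_row.get("sum_insured", 0) > 500000 else "STANDARD"
        # "drop" rules simply never reach the new warehouse.
    return new_row

print(apply_mapping({"pol_ref": "P-1001", "cust_title": "Mr", "sum_insured": 750000}))
```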




QUICK RECONCILIATION


Activities Involved

The absence of appropriate data reconciliation mechanisms is what leads to Data quality issues in the first place. The best practice is to prudently reconcile the data back to its previous state every time it is transformed or summarized. In our case a reconciliation mechanism needs to be put in place to make sure the previous step has run successfully.

Again, like all necessary evils, we have to spend time just to prove that our transformation has worked, even though we have tried and tested it. We can blame this on the Data quality measures across the board, from the moment the data was keyed into digital form right until it gets reported.

Just like the previous process, the reconciliation needs to be a regularized check which happens on a periodic basis and reports any anomalies or differences between the old and the new KPI / transactional data obtained from the previous step.
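As an illustration, a periodic reconciliation check of this kind might look like the sketch below (Python with pandas; the KPI codes, period labels and values are invented for the example). It joins the old and new KPI outputs on KPI and period, flags any differences as anomalies, and pivots the result into the KPI-by-time matrix discussed next.

```python
import pandas as pd

# KPI values reported by the legacy application and by the new application,
# keyed by KPI and reporting period (all figures are illustrative).
old_kpis = pd.DataFrame({
    "kpi":    ["K1", "K1", "K2", "K2"],
    "period": ["T1", "T2", "T1", "T2"],
    "value":  [100.0, 110.0, 50.0, 55.0],
})
new_kpis = pd.DataFrame({
    "kpi":    ["K1", "K1", "K2", "K2"],
    "period": ["T1", "T2", "T1", "T2"],
    "value":  [100.0, 108.0, 50.0, 55.0],
})

# Join on KPI + period and compute the difference; anything non-zero is an anomaly.
recon = old_kpis.merge(new_kpis, on=["kpi", "period"], suffixes=("_old", "_new"))
recon["difference"] = recon["value_new"] - recon["value_old"]
anomalies = recon[recon["difference"] != 0]

# Pivot into a KPI-by-time matrix with row and column summaries.
matrix = recon.pivot_table(index="kpi", columns="period", values="difference",
                           aggfunc="sum", margins=True, margins_name="Summary")

print(anomalies)
print(matrix)
```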

There are 2 forms of reconciliation that can work really well for BI applications –

  • Reconcile based on time

  • Reconcile based on KPI

I would say: go for both, a time-based, KPI-wise reconciliation. We could draw up a simple reconciliation matrix as below.


 

            T1        T2        T3        T4        Summary
K1
K2
K3
Summary


Benefits

  • Since this is an ongoing and periodic process, it is best to share these reconciliation results with the team which represents the users and request a review of the data loaded that day. This will help keep track of any issues with the migration and how they were fixed in due course, the Agile way!

  • Publishing this to the business users / stakeholders from time to time will increase their confidence in the migration as well. It would also reinforce the need to gather additional resources to improve Data quality measures.

  • In due course the KPI migration will have completed and we will need to publish a similar result between the Legacy sources and the migrated data. So this will be a forerunner to the main reconciliation process that we need to set up later.


DIRECT FEED FROM SOURCES TO NEW APPLICATION


 

This step relies on the optional analysis performed in Step (1), where Data profiling was performed for the attributes that need to be pushed to the new application. This analysis will be useful in creating a new feed from the Source Systems to the new application.

Activities Involved

Again, this stage demands that we work closely with the Business Analysts to identify how best to utilize the Legacy sources while they are still around and what attributes to obtain from them.

This stage needs to be planned in such a way that only a smaller subset / segment of the Legacy sources will be fed through after a particular cut-off point in time. We have already obtained a large part of our migrated information from the KPI feeds, and also from the optional data feeds from the Legacy application, up to a point in time. So we need to agree this cut-off with the Sources.

Once the Data profiling analysis is utilized to obtain the mapping rules for every attribute that needs to be brought in, and the transformation logic for the feed is in place, we move on to developing a reconciliation routine to keep a check on Data quality across feeds.
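A bare-bones sketch of such a post-cut-off feed is shown below (plain Python; the cut-off date, attribute names and transformation are assumptions for illustration). Only records on or after the agreed cut-off flow from the sources into the new application; everything earlier has already arrived via the KPI and legacy feeds.

```python
from datetime import date

# Cut-off agreed with the source systems: only records on or after this date
# are fed directly into the new application (illustrative value).
CUTOFF = date(2013, 7, 1)

def transform(record: dict) -> dict:
    """Placeholder transformation built from the data-profiling mapping rules."""
    return {"policy_id": record["pol_ref"], "premium": round(record["gross_prem"], 2)}

def build_feed(source_records: list) -> list:
    """Feed only the post-cut-off segment of the source into the new warehouse."""
    return [transform(r) for r in source_records if r["effective_date"] >= CUTOFF]

sample = [
    {"pol_ref": "P-2001", "gross_prem": 512.456, "effective_date": date(2013, 6, 15)},
    {"pol_ref": "P-2002", "gross_prem": 610.125, "effective_date": date(2013, 7, 10)},
]
print(build_feed(sample))  # only the post-cut-off record flows through
```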


Benefits

  • Re-work on data profiling & reconciliation is reduced if & when new feeds are needed to plug legacy sources into the new application.

  • This does NOT hinder the outputs of the historical data in any way. It is just like any other feed into the warehouse, so we will always be good with the outputs up to a point in time. Any data quality issues that arise will affect only the smaller subset of data that was fed recently.


SOME BEST PRACTICES FOR MIGRATION


 

  • Always involve the Business / Stakeholders in the Data profiling & Reconciliation exercises since this will provide a clear picture of what they are looking for. Do this as early as possible; like during the development of the new application itself.

  • Take care that the reference data that was used in the Legacy application is error free & well tuned for the new & robust application. For this perform as many regression tests as possible.

  • Decommission any unused & unwanted sources of information to avoid any confusion during analysis / implementation.

  • Take a first hand view of what an auditor or an external compliance organization thinks about your Data quality measures and set standards across the organization to stay up to the mark.

  • NEVER look at modifying your design / code for even the smallest of issues that crop up either in Reconciliation or End user reporting.

  • Encourage frequent Knowledge sharing sessions between IT & business team to help make the entire organization see a single version of the truth.

  • Please look at documenting anything that goes beyond the requirements specification as this will be really useful for Support / Maintenance teams in the future. Maintain a separate log for Known / Outstanding issues and their immediate resolution for the benefit of the Level 1 support team.

  • Encourage application and technology enhancements like Mobility & Smart phone apps that help get closer to the real customer / consumer / user.


 


 
