Data rich and knowledge poor. That's a common cliché today, and it is the driving force behind the contemporary BI movement. But as more and more companies begin to explore and use the wealth of data at their fingertips, they are finding new challenges in old problems. In this blog series we are going to explore master data, master data management and the new Master Data Services (MDS) available in SQL Server 2012. This first post is foundational: it establishes a baseline of knowledge and concepts that are important to understanding the what, how and why of MDS 2012.
What is Master Data?
Before we dive in headfirst, let us define master data. When speaking about master data we refer to the nouns or building blocks of a business. This is the non-transactional data that is analogous to dimensional data in a data warehouse environment. Master data represents things like a customer, product, employee or even a store location. These things describe or give context to a business transaction such as a customer ordering a product.
The nature of master data means that it is often created, shared and ultimately maintained across departmental, system and application boundaries. In Figure 1, I've illustrated this concept using the definition of a customer, which is common to most businesses.
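To make the distinction concrete, here is a minimal Python sketch of the idea. The class names, fields and sample values are all hypothetical and chosen only for illustration; they are not taken from MDS or any particular system. The customer and product are master data, while the order is a transaction that references them by key.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Customer:
    """Master data: one of the nouns of the business."""
    customer_id: int
    name: str


@dataclass(frozen=True)
class Product:
    """Master data: another noun of the business."""
    product_id: int
    name: str


@dataclass(frozen=True)
class Order:
    """Transactional data: an event that references master data by key."""
    order_id: int
    customer_id: int
    product_id: int
    quantity: int
    order_date: date


# Hypothetical master data, keyed by identifier.
customers = {1: Customer(1, "Contoso Ltd")}
products = {10: Product(10, "Road Bike")}

# A hypothetical transaction: a customer ordering a product.
order = Order(order_id=1001, customer_id=1, product_id=10,
              quantity=2, order_date=date(2012, 6, 1))


def describe(order: Order) -> str:
    """Use master data to give context to a transaction."""
    customer = customers[order.customer_id]
    product = products[order.product_id]
    return f"{customer.name} ordered {order.quantity} x {product.name}"


print(describe(order))  # prints "Contoso Ltd ordered 2 x Road Bike"
```

Without the customer and product records, the order is just a row of numbers; the master data is what gives it meaning.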
Figure 1: Master Data in the Enterprise
With a working understanding of master data established, we are going to pivot and explore the factors driving the growing importance of both master data and master data management.
The Importance of Data Quality
As far back as 2004, Gartner Research pointed out that poor data quality was a leading cause of project failure. In more recent surveys, companies across industry sectors and around the globe have reported average annual losses of approximately $8.2 million (USD) due to data quality issues.
The challenge of data quality is two-fold. First, there is no magic technological solution to data quality problems. By this I mean there's no software or appliance that you can build or buy that magically fixes your data quality issues. To meet the challenges of data quality you must approach it through a combination of people, process and technology.
The second challenge is an extension, or more accurately a result, of the first. Because data quality is not purely a technology issue and requires both people and process to correct, it cannot be defined or delegated solely as an IT problem. To be managed successfully, data quality must be addressed as a business or enterprise opportunity.
Together, these challenges and the large impact data quality can have on the bottom line have led to renewed interest in data quality projects.
Master Data's Role in Data Quality
In a large number of businesses, master data was relegated to a problem the data warehouse team handled as it integrated data from disparate systems across the enterprise. With the renewed interest in data quality, however, a new focus has been placed on master data.
Businesses are finding that significant financial costs are surfacing due to the decentralized and often inaccurate nature of master data. This has driven interest in establishing a "single version of the truth", which in turn addresses data quality by establishing an authoritative source for what we defined as the building blocks of a business.
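As a rough illustration of what a "single version of the truth" means in practice, the Python sketch below consolidates customer records held by three systems into one authoritative record. Everything in it is an assumption made for the example: the source systems, the field names, matching on a normalized email address, and the rule that the most recently updated non-empty value wins. It is not how MDS or any specific product does this, and in practice the matching and survivorship rules are business decisions.

```python
from datetime import date

# Hypothetical records for the same customer held by three different systems.
records = [
    {"source": "crm", "email": "Jane.Doe@example.com ", "name": "Jane Doe",
     "phone": "", "updated": date(2012, 1, 15)},
    {"source": "billing", "email": "jane.doe@example.com", "name": "",
     "phone": "555-0100", "updated": date(2012, 3, 2)},
    {"source": "support", "email": "JANE.DOE@EXAMPLE.COM", "name": "",
     "phone": "555-0199", "updated": date(2011, 11, 30)},
]


def match_key(record):
    """Assumed matching rule: records with the same normalized email are one customer."""
    return record["email"].strip().lower()


def build_golden_records(records, fields=("name", "phone")):
    """Merge records per customer; the newest non-empty value wins for each field."""
    golden = {}
    # Process oldest first so newer non-empty values overwrite older ones.
    for record in sorted(records, key=lambda r: r["updated"]):
        merged = golden.setdefault(match_key(record), {"email": match_key(record)})
        for field in fields:
            if record[field]:
                merged[field] = record[field]
    return golden


golden = build_golden_records(records)
print(golden["jane.doe@example.com"])
```

Here the name survives from the CRM record and the phone number from the more recently updated billing record, so the consolidated record is more complete than any single source.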
Data Governance & Master Data Management
Data governance, and the management of master data within it, is a broad topic with volumes of literature and research available. A deep dive is beyond the scope of this blog, but it is valuable to define the key concepts.
Master Data Management (MDM) is nothing more than a collection of tools, policies, processes and procedures that facilitate the delivery of a single, consistent view of uniquely identifiable master data across the enterprise. MDM is defined as part of the Data Governance process.
Data Governance, at the 10,000-foot level, is where an organization defines the ownership of data. Policies and procedures are put into place for how that data will be accessed and managed. Quality standards and policies such as data retention are also defined, and processes to handle tasks such as change management and disaster recovery are outlined.
Part of defining ownership is identifying the person or group of people referred to as the data stewards. Each entity commonly gets its own steward, since this person is expected to be the subject-matter expert on the data (for example, the product guru). The responsibility for maintaining the data on a continuing basis falls to these data stewards.
If you are interested in exploring this area in more detail, I will point you to the often-cited book by David Loshin, "Master Data Management".
In the next blog post, we will leave the conceptual realm and take a look at a few different architectures that support the management of master data. We will also get our first look at Master Data Services 2012 and the features/benefits it brings to the table.
Till next time!!