Information Quality Governance
Information governance is a relatively new discipline. Its purpose is to standardize the use of core and extended metadata across data domains. Metadata standardization has become important because of the proliferation of IT systems and metadata as well as the evolution of Big Data. Industries such as medical devices and pharmaceuticals need strict controls over data definitions and usage to adhere to governmental regulations and to reduce risk; these industries have been leaders in information governance. In the past decade, other industries have been working to improve the efficiency and quality of data use as data-consuming applications have proliferated and Big Data has arrived. Data quality has always been a cornerstone of process improvement, but in IT ecosystems with Big Data and automation, process improvement efforts depend on newer analytical tools and methods. In the absence of effective governance, severe issues arise that impair an organization's ability to manage information across data domains and business processes. Examples include inconsistent data capture from multiple intake sources; inconsistent data quality rules for accuracy, consistency, and other quality dimensions; and unclear roles and responsibilities for data ownership, which allow unauthorized metadata changes.
The application of governance is important to ensure consistent definitions and use of shared metadata. Organizations have thousands of data elements (i.e., metadata). Some are more important than others because they are used across several IT applications and processes. As an example, customer profiles are used for quoting, ordering, delivery, and invoicing. Metadata is associated with business processes; this association is called a data domain. There are different data domains with unique data owners (e.g., customer, procurement, manufacturing, finance, and others), each using metadata to control their work. An assumption is that metadata fields have business and IT owners by functional group. As an example, marketing controls customer and contact metadata; manufacturing controls production metadata; and the supply chain controls metadata related to suppliers, logistics, and inventories.
Governance is facilitated by classifying metadata into core data elements and the extended data elements used by functional groups to govern their data domains. Core data elements are used by several IT applications and require strict control by their owners and conformance to quality dimensions such as accuracy; consistency within a database over many transactions and across various databases (i.e., synchronicity); timeliness (i.e., availability); completeness; adherence to a required format; and uniqueness (i.e., one instance). The data lineage must also be known from source to consuming systems, and clear roles and responsibilities must be enforced through policies and standards. These are coordinated by a leadership governance council that approves how core data elements are defined and used across the organization.
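Several of these quality dimensions can be profiled directly from the data. The sketch below scores completeness, format conformance, and uniqueness for one hypothetical core data element (a customer email field); the field name, sample records, and regex rule are illustrative assumptions, not a standard.

```python
import re

def profile_quality(records, field, pattern):
    """Score completeness, format conformance, and uniqueness (0.0-1.0)."""
    values = [r.get(field) for r in records]
    present = [v for v in values if v not in (None, "")]
    # Completeness: share of records with a value at all.
    completeness = len(present) / len(values) if values else 0.0
    # Format conformance: share of present values matching the required format.
    conformance = (sum(1 for v in present if re.fullmatch(pattern, v)) / len(present)
                   if present else 0.0)
    # Uniqueness: share of present values that are distinct instances.
    uniqueness = len(set(present)) / len(present) if present else 0.0
    return {"completeness": completeness, "format": conformance,
            "uniqueness": uniqueness}

# Illustrative sample records only.
customers = [
    {"id": 1, "email": "ana@example.com"},
    {"id": 2, "email": "ana@example.com"},   # duplicate instance
    {"id": 3, "email": "not-an-email"},      # format violation
    {"id": 4, "email": ""},                  # missing value
]
print(profile_quality(customers, "email", r"[^@\s]+@[^@\s]+\.[^@\s]+"))
```

Accuracy and synchronicity cannot be scored this way from a single table; they require comparison against a trusted source and across databases, respectively.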
Data Governance Maturity
The process for building information governance begins with a proof of concept using impactful business use cases. Maturity increases as the number of new use cases expands to include more core metadata that is managed through a governance framework. Table 10.5 describes a simplified version of five maturity stages: assessment, governance framework, reporting structure, improving performance, and sustaining the improvements. In the assessment phase, a core stakeholder council is created, and a few data domains are chosen to prove the governance concept using a few major business problems impacted by poor data quality. Relevant personas and use cases are created to align with the business benefits associated with higher metadata quality. Examples include reducing customer returns caused by inaccurate customer addresses, or reducing high field-service costs incurred when technicians are sent to service equipment at a wrong location. Preliminary scorecards are also created to start measuring metadata quality dimensions.
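A preliminary scorecard can be as simple as mapping dimension scores to traffic-light ratings per core data element. In this sketch, the element names, the dimension scores, and the rating thresholds are all assumed values for illustration:

```python
# Assumed rating thresholds: green at/above 0.95, yellow at/above 0.80.
def rate(score, green=0.95, yellow=0.80):
    """Map a 0.0-1.0 dimension score to a traffic-light rating."""
    return "green" if score >= green else "yellow" if score >= yellow else "red"

# Illustrative dimension scores for two hypothetical core data elements.
scorecard = {
    "customer_address": {"accuracy": 0.78, "completeness": 0.91, "uniqueness": 0.97},
    "ship_to_location": {"accuracy": 0.83, "completeness": 0.99, "uniqueness": 0.96},
}

for element, dims in scorecard.items():
    ratings = {dim: rate(score) for dim, score in dims.items()}
    print(element, ratings)
```

Even this minimal form lets a stakeholder council see at a glance which elements and dimensions need the first cleanup projects.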
Building the governance framework requires developing solutions for the personas and use cases, expanding the leadership governance and working councils based on data domains, defining and assigning roles and responsibilities, and creating policies, standards, definitions, and business rules. This information is formally incorporated into a collaborative IT platform that controls metadata ownership, policies, standards, reporting, definitions, and related information, as well as create, read, update, and delete (CRUD) actions. Roles and responsibilities are used to control access to the collaborative platform. This begins with a leadership council that approves policies and investments aligned with data quality that provide impactful business benefits. The next level is the working council.
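Role-based control of CRUD actions can be sketched as a simple permission map. The role names and their allowed actions below are assumptions for illustration, not the access model of any particular platform:

```python
# Hypothetical role-to-permission map for the collaborative platform.
PERMISSIONS = {
    "data_steward":   {"create", "read", "update", "delete"},
    "business_owner": {"create", "read", "update"},
    "analyst":        {"read"},
}

def is_allowed(role, action):
    """Return True if the role may perform the CRUD action on metadata."""
    return action in PERMISSIONS.get(role, set())

# Stewards hold full CRUD rights; analysts are read-only.
assert is_allowed("data_steward", "delete")
assert not is_allowed("analyst", "update")
```

In practice such a map lives in the platform's access-control layer, but the principle is the same: every CRUD action is checked against an explicitly assigned role.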
The working council is organized around data domains. As an example, in the manufacturing data domain, there would be owners of the bill of materials, testing, routing, and production metadata. The working council includes data stewards who are aligned by data domain. Data stewards approve data policies, standards, and definitions. They also coordinate data cleanup, create data models, and develop the policies, standards, and other supporting collateral for council approval. Collectively, the working council determines which roles may create, read, update, and delete metadata. Based on its governance actions, the working council also recommends investments to the leadership council to solve poor-data-quality issues that impact business owners. Additional members of the working council may include various IT roles associated with maintaining the ecosystem or managing its information, as well as team members who maintain the data and ensure it is available. Team members execute policies, standards, and procedures with the business teams. Working councils help improve data quality, reducing service errors, delivery errors, and other business process errors, by measuring and reporting the dimensional quality of metadata and engaging process improvement professionals.
In the reporting phase, metadata quality dimensions are profiled and correlated to metrics of the impactful use cases using a scorecard like the one shown in Table 10.4. Exception reporting identifies outliers or non-random patterns that require investigation. Projects are then created to improve one or more data quality dimensions. Information governance thereby becomes a closed loop, with reporting made to the councils and projects approved to improve data quality. This momentum leads into the improving-performance phase. The scorecard ratings become progressively higher as projects are completed. As part of this phase, end-to-end metadata lineage across the IT ecosystem is mapped using automated algorithms to understand metadata flows from source to consuming systems (i.e., who is using the metadata and how they are using it). The number of use cases continues to expand across the data domains. Business benefits accrue and serve as a basis for further investment to improve data quality.
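Lineage mapping amounts to a graph traversal: systems are nodes, and an edge means metadata flows from one system to another. The sketch below walks a hypothetical lineage graph from a source system to every downstream consumer; the system names and edges are assumptions for illustration.

```python
from collections import deque

# Hypothetical lineage edges: "crm" feeds quoting and order management, etc.
LINEAGE = {
    "crm":        ["quoting", "order_mgmt"],
    "order_mgmt": ["delivery", "invoicing"],
    "quoting":    ["invoicing"],
}

def consuming_systems(source):
    """Breadth-first walk from a source system to every consuming system."""
    seen, queue = set(), deque([source])
    while queue:
        node = queue.popleft()
        for consumer in LINEAGE.get(node, []):
            if consumer not in seen:
                seen.add(consumer)
                queue.append(consumer)
    return sorted(seen)

# Every system that consumes CRM metadata, directly or indirectly.
print(consuming_systems("crm"))  # ['delivery', 'invoicing', 'order_mgmt', 'quoting']
```

Production lineage tools infer these edges automatically from ETL jobs, queries, and API calls, but the resulting map is queried in essentially this way.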
In the sustain phase, data governance is transitioned into the organization, and data quality information is integrated into the collaborative platform that coordinates data access and CRUD actions by role. The data domains and all of the councils formally coordinate all data governance through the collaborative platform. This limits the ability of one group to make unilateral changes to metadata rules and other attributes that may adversely impact other domains and business owners. Governance is now formal and is the basis for approved policies, reporting, and control of definitions, ownership, key performance indicators, and remediation plans based on root-cause analysis.
A lack of ownership and standardization causes metadata to be interpreted differently in IT applications and their functional groups, and this leads to an inconsistent taxonomy. Inconsistent taxonomy results in inconsistent metadata usage (e.g., building inconsistent customer and sales account profiles or creating different profiles for the same customer in different IT applications). This results in a proliferation of duplicate and inaccurate transaction records, which can cause operational issues (e.g., when referencing information for CRUD actions or identifying the root causes of process issues). As an example, when there are different customer profiles with different shipping addresses for the same customer, the likelihood of shipping products to the wrong location increases, as does the time needed to look up information. Localized interpretation of metadata also creates different metadata definitions and models.
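The duplicate-profile problem can be surfaced with a simple conflict check: normalize the identifying field, group profiles by it, and flag groups whose shipping addresses disagree. The profile records, field names, and normalization rule below are illustrative assumptions; real matching uses fuzzier comparison.

```python
from collections import defaultdict

def normalize(value):
    """Lowercase and collapse whitespace so trivially different spellings match."""
    return " ".join(value.lower().split())

def conflicting_profiles(profiles):
    """Flag customers whose duplicate profiles disagree on ship-to address."""
    addresses, ids = defaultdict(set), defaultdict(list)
    for p in profiles:
        name = normalize(p["name"])
        addresses[name].add(normalize(p["ship_to"]))
        ids[name].append(p["id"])
    # A customer name with more than one distinct address signals a conflict.
    return {name: sorted(ids[name])
            for name, addrs in addresses.items() if len(addrs) > 1}

# Illustrative sample: two profiles for the same customer, different addresses.
profiles = [
    {"id": "C-100", "name": "Acme Corp",  "ship_to": "12 Main St"},
    {"id": "C-250", "name": "ACME  corp", "ship_to": "99 Dock Rd"},
    {"id": "C-300", "name": "Beta LLC",   "ship_to": "9 Oak Ave"},
]
print(conflicting_profiles(profiles))  # {'acme corp': ['C-100', 'C-250']}
```

Each flagged group is exactly the situation described above: one customer, two profiles, two shipping addresses, and no way to know which one is correct without governance.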