An integrated organisation-wide data quality management and information governance framework: theoretical underpinnings

Siaw-Teng Liaw

School of Public Health & Community Medicine, UNSW Medicine and General Practice Unit, South Western Sydney Local Health District, Australia

Christopher Pearce

Inner Eastern Medicare Local, Melbourne, Australia

Harshana Liyanage

Department of Primary Care & Clinical Informatics, University of Surrey, Guildford, UK

Gladys SS Liaw

ZxCel Consulting, Melbourne, Australia

Simon de Lusignan

Department of Primary Care & Clinical Informatics, University of Surrey, Guildford, UK

Cite this article: Liaw S-T, Pearce C, Liyanage H, Liaw GSS, de Lusignan S. An integrated organisation-wide data quality management and information governance framework: theoretical underpinnings. Inform Prim Care. 2014;21(4):199–206.

Copyright © 2014 The Author(s). Published by BCS, The Chartered Institute for IT under Creative Commons license

Author address for correspondence:

Siaw-Teng Liaw

Academic General Practice Unit,

Fairfield Hospital, Cnr of Polding Street and

Prairievale Road, Prairiewood, NSW 2176, Australia.



Introduction Increasing investment in eHealth aims to improve cost effectiveness and safety of care. Data extraction and aggregation can create new data products to improve professional practice and provide feedback to improve the quality of source data. A previous systematic review concluded that locally relevant clinical indicators and use of clinical record systems could support clinical governance. We aimed to extend and update the review with a theoretical framework.

Methods We searched PubMed, Medline, Web of Science, ABI Inform (Proquest) and Business Source Premier (EBSCO) using the terms curation, information ecosystem, data quality management (DQM), data governance, information governance (IG) and data stewardship. We focused on and analysed the scope of DQM and IG processes, theoretical frameworks, and determinants of the processing, quality assurance, presentation and sharing of data across the enterprise.

Findings There are good theoretical reasons for integrated governance, but there is variable alignment of DQM, IG and health system objectives across the health enterprise. Ethical constraints exist that require health information ecosystems to process data in ways that are aligned with improving health and system efficiency and ensuring patient safety. Despite an increasingly ‘big-data’ environment, DQM and IG in health services are still fragmented across the data production cycle. We extend current work on DQM and IG with a theoretical framework for integrated IG across the data cycle.

Conclusions The dimensions of this theory-based framework would require testing with qualitative and quantitative studies to examine the applicability and utility, along with an evaluation of its impact on data quality across the health enterprise.

Keywords: big data, data quality, electronic health records, information, integration, governance, organisation, clinical.


There has been significant investment in eHealth1 and new forms of research taking advantage of information technology, so-called eResearch.2 This has come about as a result of health reform emphasising the need for more efficient integrated care underpinned by electronic health records (EHRs) and clinical information systems (CISs) to monitor and improve the safety and quality of patient care in ways which paper records cannot. Increasingly, data contained in single EHRs/CISs (or data repositories with data from multiple EHRs/CISs) are being used as major sources of information to create knowledge about health practice, both overseas3 and in Australia.4 Indeed, it has been suggested that systematic interrogation of large high-quality social media datasets5,6 or CISs7 may well replace current scientific research methods for creating knowledge, such as randomised controlled trials.

We increasingly work within an information ecosystem, where secondary processors of data may take us beyond the current data paradigms and ensure that semantically integrated data within a health system are of sufficient quality. For example, the Australian national eHealth record system, including the personally controlled EHR (PCEHR), will require semantically interoperable data aggregated from disparate organisational EHRs/CISs. Data and information governance, along with knowledge management, is critical to maintaining the integrity of health data and information and, in turn, their fitness for purpose. However, whilst information governance (IG) is mostly in place, data quality (DQ) is left as a side issue. Good data quality management (DQM) and IG, embedded in good corporate and clinical governance, is therefore an unrealised priority. The burgeoning numbers of health data repositories and increasing volumes of health data within them make it timely to align good DQM and IG with the objectives of the health organisation, establishing roles and responsibilities to ensure high-quality data that support the delivery of safe and effective patient care and the monitoring of the impact, quality and safety of that care.

This article builds on a recent systematic review, which concluded that locally relevant clinical indicators and the use of EHRs could support clinical governance,8 and on a previous review of DQ and DQM.9 These and related studies and reviews suggested the need for good data and information governance and quality management in organisations that routinely collect data in EHRs, along with a robust theoretical framework. This article describes the development of that theoretical framework within the context of clinical and corporate management and governance.

Box 1 Definitions of key terms

Curation: The activity of managing and promoting the use of data from their point of creation, to ensure they are fit for contemporary purpose, and available for discovery and reuse.a

Information Ecosystem: A network that is continuously sharing information, optimising decisions, communicating results and generating new insights for businesses.b

Data Quality Management: An activity that involves definition of DQ standards, definition of data collection strategies and assessment of collected data using DQ indicators.d

Data Governance: An activity that specifies who holds the decision rights and accountability for an organisation’s decisions about its data assets.e

Information Governance: An activity that ensures necessary safeguards for, and appropriate use of, patient and personal information.f

Data Stewardship: An activity that attends to and takes the past into account to influence the future, stretching from data planning to sampling, from data archive to use and reuse. This includes the care of data and information infrastructure, and involves data definitions, data requirements and quality assurance as well as user feedback, redesign and data exchange.c

aLord P, Macdonald A, Lyon L, et al. From Data Deluge to Data Curation. UK e-science All Hands meeting 2004, 371–5.

bDavenport TH, Barth P and Bean R. How ‘big data’ is different. MIT Sloan Management Review. 2012;54(1):43.

cKarasti H, Baker KS and Halkola E. Enriching the notion of data curation in e-science: data managing and information infrastructuring in the long term ecological research (LTER) network. Computer Supported Coop Work. 2006;15:321–58.

dWeidema BP and Wesnæs MS. Data quality management for life cycle inventories—an example of using data quality indicators. Journal of Cleaner Production. 1996;4(3):167–74.

eKhatri V and Brown CV. Designing data governance. Communications of the ACM. 2010;53(1):148–52.

fNHS Definition of Information Governance. URL:


We searched PubMed, Medline, Web of Science, ABI Inform (Proquest) and Business Source Premier (EBSCO) using the terms listed and defined in Box 1, with a focus on developing theoretical underpinnings for clinical governance,8 DQ and DQM,9 using these two reviews as starting points with a view to incorporating IG.

The search was theory driven and conducted iteratively, including only papers with an emphasis on theory or conceptual frameworks. We analysed the scope of DQM and IG processes in the health information ecosystem, theoretical frameworks and determinants of the processing, quality assurance, presentation and sharing of high-quality data across the enterprise to ensure the information is fit for care monitoring, coordination and improvement.

Health, data and technical perspectives must be combined to ensure that health data products are fit for purpose. This approach is consistent with the International Organization for Standardization (ISO) definition of quality as the totality of features and characteristics of an entity that bear on its ability to satisfy stated and implied needs (ISO 8402-1986, Quality Vocabulary).

The quality of routinely collected data is affected by errors of omission and commission at many points, as described by a conceptual framework (Figure 1) for assessing the fitness of routinely collected data for research, audit and quality assurance purposes.10 Mapping the DQM and IG processes within this framework requires a formal system of metadata, covering both primary and secondary variables. Outputs from routinely collected datasets should include information on data provenance and processing methods across the data cycle.10 DQM and IG must address data creation and capture/collection and proceed through the data cycle to data curation, presentation of the data product and user guidance and support. The ISO 9000 series of quality standards emphasises the prevention of defects through the planning and application of best practices at every stage of the business – from design through to installation and servicing.
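The requirement that dataset outputs carry provenance and processing information across the data cycle can be illustrated with a minimal Python sketch. It is not part of the framework itself: the stage names are an illustrative reading of the data cycle described above, and the class names, field names and the example source system are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative stages of the data cycle (an assumption based on the text,
# not a definitive list from the cited framework).
DATA_CYCLE = ("creation", "collection", "extraction",
              "processing", "curation", "presentation")


@dataclass
class ProcessingStep:
    """One documented action on the dataset (hypothetical structure)."""
    stage: str
    method: str
    performed_by: str


@dataclass
class DatasetProvenance:
    """Provenance metadata travelling with a routinely collected dataset."""
    dataset_id: str
    source_system: str
    steps: List[ProcessingStep] = field(default_factory=list)

    def record(self, stage: str, method: str, performed_by: str) -> None:
        # Reject stages outside the agreed data cycle so metadata stay consistent.
        if stage not in DATA_CYCLE:
            raise ValueError(f"unknown data cycle stage: {stage}")
        self.steps.append(ProcessingStep(stage, method, performed_by))

    def missing_stages(self) -> List[str]:
        # Stages with no documented method are gaps in the provenance trail.
        seen = {step.stage for step in self.steps}
        return [stage for stage in DATA_CYCLE if stage not in seen]


prov = DatasetProvenance("demo-dataset", "hypothetical-CIS")
prov.record("creation", "entered at point of care", "clinician")
prov.record("extraction", "scheduled export", "data manager")
print(prov.missing_stages())
```

In this sketch, any stage that lacks a recorded method is reported as a gap, which is one simple way to make undocumented processing visible to data users.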

The health objective is safe and effective care of the patient and population groups and the collection of good quality data to describe the process and content of health care provided. The data cycle begins with the creation and collection of data as part of clinician–patient interaction within the confidential therapeutic relationship. The technical and business objectives are to ensure that corporate data are good enough to support the health objective. Good corporate and clinical governance requires good IG across the DQ cycle to ensure that the organisational environment, information and culture support and facilitate the achievement of the health objective and collection of good data at point of care. Approaches and mechanisms to achieve this must be flexible as they will vary according to the sector, institution or other contextual constraints.8

Organisational systems theory11 can explain how organisations interpret their data, information and knowledge to guide organisational strategy, policy and operations. However, as noted in recent difficulties with implementation of large complex adaptive information and communication technology (ICT) systems in organisations such as the English National Health Service,12 the approach should be phenomenological13–15 and realist12,16–18 to understand the contextual and intrinsic factors that influence and determine the success of large-system transformation. The sociotechnical approach to technology diffusion describes an iterative process, leading to mutual transformation of the users and the technology.13,14 Ciborra19 used the host–guest relationship to illustrate the complex and multidirectional actions and reactions as a result of introducing ICT systems into host organisations; the technology can act as both host and guest in various contexts to produce unpredicted or unintended impacts on the actors and system. The sociotechnical and realist ‘context-mechanisms-impacts’ principles apply to DQM and IG as well as to ICT policies and strategies.

Figure 1. The data production cycle (reprinted with permission)

Drawing on the Habermasian concept of communicative action,20 health data and information can be conceptualised as a product of a communicative action by a health professional interacting and making and acting on decisions with a patient in a Habermasian lifeworld (Box 2). The presence (or absence) of data and information provided by patients and filtered by clinicians is captured in the database schemas and metadata rules in the CIS, a part of the ‘system’ where ‘strategic action’ takes place. The information, which may be rich narratives or presence/absence of standard terms, can then be translated into formats that can be used for other purposes such as decision support, quality improvement and research.

Box 2 Habermas, communicative action and data

Strategic Action: Treats actors as objects, or data points for manipulation by the system.

System: The structured elements of society that are governed by rules.

Communicative Action: Meaningful interactions between persons, derived from their experience of the lifeworld.

Lifeworld: The stock of experiences and competencies used to negotiate the world.

The interpretive schemes of Giddens’ structuration theory,21 which describes the transition of information between structure and communication, further explain the capture of communicative actions, making them and their products available to the system. Data and information are vehicles by which the process and effects of human interaction and communicative action are placed in a system construct and captured in a CIS, where they can shape an organisation/system and associated processes and protocols. This host–guest relationship19 at the data level emphasises that data are not only the product of an interaction for consumption by clinicians, researchers or policy makers but are also agents within the care process, influencing the cycle of care and promoting continuous quality improvement within an informational space and ecosystem.22 Pór described the knowledge ecosystem as a ‘triple network’ that comprises (i) a people network of productive conversations, (ii) a knowledge network of ideas, information and inspiration, supported by (iii) a technology network of knowledge bases and communication links to nurture collective intelligence and systemic wisdom.23

This multilevel conceptual framework with people, technology and data and knowledge dimensions facilitates the understanding of actions on the data as well as the use of data to achieve organisational and system objectives. Data are not just a technical creation, but are also a social construct for personal, professional and organisational purposes. This should guide directors and senior management of organisations to structure the IG and DQM roles, responsibilities and accountabilities of staff, develop processes, and understand technology and social requirements to support decision-making processes and authority for data-related matters in the ‘lifeworld’.


Combining this trans-theoretical framework with our experience with General Practice Networks and Medicare Locals in Australia24 and Clinical Commissioning Groups in England,25 we emphasise the need for alignment and integration across the data production cycle and propose three models of health care organisations, which may explain their affinity to adopt integrated DQM and IG:26

1. The corporate model, in which patients are clients or customers, and the primary concern of the organisation is economic.

2. The organisation as an orchestrator of providers which, like an airport that organises airlines and passengers, devolves some responsibility to the health care providers.

3. The organisation as a community of practice.

The IG structures, ethical frameworks and DQM methods will vary with function; size, from organisation-based ‘small data’ to large national ‘big-data’ repositories; and whether the use of data is primary (e.g. for clinical care) or secondary. We believe this primary/secondary distinction to be artificial and, building on existing work,24 propose a hierarchy of use of data and information (Box 3). All uses are important, but need to be prioritised to create ‘fit for purpose’ data.

The need for alignment and integration across the data cycle

There is little focus on the data creator or collector beyond general statements to ensure that information collected and created is appropriate to business needs. This increasing isolation of data managers from data creators and collectors, who are usually clinicians and scientists, is consistent with our own experience in Australia and England and has major impacts on the quality of routinely collected data in both hospital27 and general practice28 settings. We lack a support system and professional culture that value strategies to support busy clinicians to collect and create comprehensive and consistent health data for use in decision support systems, to conduct population health research or monitor safety and quality of care. By understanding the ‘lifeworld’ as source and the ‘sociotechnical system’ as interpreter within a realist and phenomenological enquiry framework, we can conceptualise a theory-driven approach to align DQM, IG and organisational objectives. DQM, IG and the objectives of the health organisation or system must be aligned logically and operationalised within the information ecosystem of the organisation and system (Figure 2).

Box 3 Conceptual framework of organisational purpose, data, DQ and governance

At the corporate governance level, if the IG is good but DQM is poor, we cannot make good decisions because we do not know how good the data are. If the DQM is good but the IG is poor, the organisation is not well governed as the clinical and corporate governors cannot make good decisions and manage the risks to the organisation. At the operational level, a mal-aligned or unaligned DQM and IG structure and process within the information ecosystem means that there are data in the system but we do not know whether they are fit for purpose, or we have great difficulty accessing the information needed to conduct research and quality monitoring and to audit safety and effectiveness, and we spend money on things that do nothing for patient care (Figure 2).

The optimum situation is to nest DQM and organisational objectives within the information ecosystem, which is governed by an IG framework overseen by an IG authority. Our experience is that most organisations in Australia8,27 and England29 are partially aligned or mal-aligned: IG policy is technology focused with some professional responsibilities, whilst DQM initiatives happen elsewhere in the organisation or are driven by the use of data for pay-for-performance30 or other quality initiatives, or by technical developments such as a unique English National Health Service identifier and laboratory links.3

Figure 2. Alignment of data quality management, information governance and health system objectives

An IG framework must align DQM with organisational objectives

We propose a utilitarian governance framework to assist health services to structure and document their DQM roles, responsibilities and accountabilities to address the accuracy, fidelity and integrity of data as well as the precision of tools used. The DQM and IG process needs to consider six key activities; the relevant actors (clinician, administrator, data manager, planner, CIO, CEO, Board director, etc.) need to ask six key questions in relation to these activities:

1. Data collection and utility (why am I collecting/recording these data?)

2. Data provenance including metadata (how did data come to be?)

3. Errors with data extraction, linkage, processing and translation (do data look right?)

4. Iterative triangulation and validation of data (who or what can I check the accuracy with?)

5. Traceability (where did data come from?)

6. Curation (how do I look up earlier data?).

The use of a DQ matrix, comprising DQM roles and decision activities,31,32 can guide the organisation to assign roles and responsibilities in a consistent and systematic manner (Box 4).

The columns of the matrix indicate the roles in DQM. The rows of the matrix identify the decision activities, qualified by the DQ questions. The cells of the matrix are filled with the responsibilities, that is, they specify the degree of authority each role holds for each decision area. This process will determine who is accountable for, responsible for, consulted about and informed of each task. The model underlying this matrix requires further research and refinement.
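As an illustration only, the matrix described above can be sketched as a small Python data structure. The rows are the six activities and their questions from the text, and the columns are a subset of the actors named in the text. The cell values are hypothetical assignments chosen for the example, and the check for exactly one accountable role per activity is an assumed convention rather than a requirement stated by the framework.

```python
from enum import Enum


class Authority(Enum):
    RESPONSIBLE = "R"
    ACCOUNTABLE = "A"
    CONSULTED = "C"
    INFORMED = "I"


R, A, C, I = (Authority.RESPONSIBLE, Authority.ACCOUNTABLE,
              Authority.CONSULTED, Authority.INFORMED)

# Rows: the six DQM/IG activities and their guiding questions (from the text).
ACTIVITIES = {
    "collection": "Why am I collecting/recording these data?",
    "provenance": "How did data come to be?",
    "errors": "Do data look right?",
    "validation": "Who or what can I check the accuracy with?",
    "traceability": "Where did data come from?",
    "curation": "How do I look up earlier data?",
}

# Columns: an illustrative subset of the actors named in the text.
ROLES = ("clinician", "data_manager", "cio", "board_director")

# Cells: hypothetical degrees of authority, for illustration only.
MATRIX = {
    "collection":   {"clinician": R, "data_manager": C, "cio": I, "board_director": A},
    "provenance":   {"clinician": C, "data_manager": R, "cio": A, "board_director": I},
    "errors":       {"clinician": C, "data_manager": R, "cio": A, "board_director": I},
    "validation":   {"clinician": R, "data_manager": R, "cio": A, "board_director": I},
    "traceability": {"clinician": I, "data_manager": R, "cio": A, "board_director": I},
    "curation":     {"clinician": I, "data_manager": R, "cio": A, "board_director": I},
}


def who(activity, authority):
    """Return the roles holding the given authority for an activity."""
    return [role for role in ROLES if MATRIX[activity].get(role) is authority]


def activities_without_single_owner():
    """Return activities that do not have exactly one accountable role."""
    return [a for a in ACTIVITIES if len(who(a, Authority.ACCOUNTABLE)) != 1]


print(who("validation", Authority.RESPONSIBLE))
print(activities_without_single_owner())
```

Representing the matrix explicitly makes gaps easy to detect: an activity with no accountable role, or with several, is flagged for the IG authority to resolve.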

A number of models for IG exist,32–34 including those promoted by the Data Governance Institute. Most IG programmes make new or align existing rules, decision rights and accountabilities for DQM and information-related processes such as data creation, collection, extraction, linkage, processing, curation and presentation of data. Implementing these rules will need consensus models and standard operating procedures that describe explicitly who can take what actions with what information, when, under what circumstances and using what methods.32–34 Clear roles and responsibilities and a mandate to carry out DQ improvement initiatives are determinants of successful DQM and IG programmes within the five ‘simple rules’ for successful large-system transformation.16

An IG ‘authority’ should exist in any health care organisation. In a large organisation, including government agencies, it may be a designated board committee with a distributed leadership or, in a small organisation such as a general practice, an individual designated as a ‘data quality officer’. This role, which guides the development, implementation and oversight of a DQM and IG programme, should have the resources and delegated authority to implement and monitor an enterprise-wide (or general practice wide) programme and ensure transparency of processes. Specialist DQ stewards trained in health informatics, a multidisciplinary field that integrates the information, biological and clinical sciences, have been proposed to implement and support DQM programmes. These ‘cultural brokers’ can bridge the conceptual gap between clinical data creators and users and non-clinical data stakeholders such as health information managers or technical staff.12 Existing organisational units such as institutional Human Research Ethics Committees and/or Clinical Councils can be tasked specifically with proactive patient advocacy roles and responsibilities.

Success of DQ programmes should be measured in terms of health systems being better able to measure quality and achieve their health objectives, whilst providing patients with a positive experience. To paraphrase Darzi35 in his progress report on the NHS:

Today, with the UK NHS budget approaching £2 billion a week, more staff, and improvements in the quality and availability of information, quality can be at the heart of everything we do in the NHS. It means moving from high quality care in some aspects to high quality care in all.

Box 4 DQM matrix for health organisations or system


We have extended the current work on DQM and IG with a unifying theoretical framework. The phenomenological and realist analytic lens of the framework emphasises human agency and reasoning from both cognitive and affective perspectives, applicable to both patient and clinician. There are good theoretical reasons for aligned and integrated DQM and IG across the enterprise to achieve optimal data utility and quality. Regulatory bodies should require service providers to see their IG responsibilities as making high-quality data available to their health ecosystem partners to monitor delivery of health system objectives and to further improve DQ across the health enterprise and across the data production cycle from data creation and collection to data management, curation and use. The dimensions of this theoretical framework would require testing with qualitative and quantitative studies to examine the applicability and utility, along with an evaluation of its impact.


1. National Health & Hospital Reform Commission. A Healthier Future For All Australians – Final Report of the National Health and Hospitals Reform Commission–June 2009. Canberra: Commonwealth of Australia, 2009.

2. Commonwealth of Australia. National Collaborative Research Infrastructure Strategy. Canberra 2006. Available from:

3. de Lusignan S and Chan T. The development of primary care information technology in the United Kingdom. The Journal of Ambulatory Care Management 2008;31(3):201–10. Available from: PMid:18574377.

4. Pearce C, Shearer M, Gardner K and Kelly J. A division’s worth of data. Australian Family Physician 2011;40(3):167–70. PMid:21597524.

5. Goetz T. Sergey Brin’s Search for a Parkinsons Cure. Wired Magazine, 22 June 2010. Available from:

6. Boyd D and Crawford K. Six provocations for big data. A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society. Oxford Internet Institute, 2011.

7. Stewart W, Shah N, Seina M, Paulus R and Walker J. Bridging the inferential gap: the electronic health record and clinical evidence. Health Affairs (Millwood) 2007;26(2):w181–91. Available from:

8. Phillips C, Pearce C, Hall S, Travaglia J, de Lusignan S, Love T, et al. Can clinical governance deliver quality improvement in Australian general practice and primary care? A systematic review of the evidence. The Medical Journal of Australia 2010;193(10):602–7. PMid:21077818.

9. Thiru K, Hassey A and Sullivan F. Systematic review of scope and quality of electronic patient record data in primary care. British Medical Journal 2003;326(7398):1070–4. doi: 10.1136/bmj.326.7398.1070.

10. de Lusignan S, Liaw S, Krause P, Curcin V, Vicente M, Michalakidis G, et al. Key concepts to assess the readiness of data for International research: data quality, lineage and provenance, extraction and processing errors, traceability, and curation. IMIA Yearbook of Medical Informatics. Netherlands: IOS Press BV, 2011, 112–21. PMid:21938335

11. Weick K. The Impermanent Organisation. Chichester, UK: Wiley, 2009.

12. Greenhalgh T, Russell J, Ashcroft R and Parsons W. Why national eHealth programs need dead philosophers: Wittgensteinian reflections on policymakers’ reluctance to learn from history. Milbank Quarterly 2011;89(4).

13. Berg M. Patient care information systems and health care work: a sociotechnical approach. International Journal of Medical Informatics 1999;55(2):87–101. Available from:

14. Berg M, Aarts J and van der Lei J. ICT in health care: sociotechnical approaches. Methods of Information in Medicine 2003;42(4):297–301. PMid:14534625.

15. Ciborra C. Hospitality and IT. Amsterdam: Universiteit van Amsterdam, 1999.

16. Best A, Greenhalgh T, Lewis S, Saul J, Carroll S and Bitz J. Large-system transformation in health care: a realist review. Milbank Quarterly 2012;90(3):421–56. Available from: PMid:22985277; PMCid:PMC3479379.

17. Greenhalgh T, Voisey C and Robb N. Interpreted consultations as ‘business as usual’? An analysis of organisational routines in general practices. Sociology of Health & Illness 2007;29(6): 931–54. Available from: PMid:17986023.

18. Pawson R and Tilley N. Evidence-Based Policy: A Realist Perspective. London: Sage, 2006.

19. Ciborra C. Hospitality and IT. Amsterdam: Universiteit van Amsterdam, 1999.

20. Habermas J. The Theory of Communicative Action v2; Lifeworld and System: A Critique of Functionalist Reason. Cambridge: Polity Press, 1987.

21. Giddens A. The Constitution of Society: Outline of the Theory of Structuration. Cambridge: Polity Press, 1984.

22. Davenport T and Prusak L. Information Ecology. Oxford, UK: Oxford University Press, 1997.

23. Pór G. Nurturing systemic wisdom through knowledge ecology. The Systems Thinker 2000;11(8):1–5.

24. Pearce C, Shearer M, Gardner K, Kelly J and Xu TB. GP Networks as enablers of quality of care: implementing a practice engagement framework in a General Practice Network. Australian Journal of Primary Health 2012;18(2):101–4. Available from: PMid:22551830.

25. Blumenthal D and Dixon J. Health-care reforms in the USA and England: areas for useful learning. Lancet 2012;380(9850): 1352–7. Available from:

26. McCrickerd J. Metaphors, models and organisational ethics in health care. Journal of Medical Ethics 2000;26(5): 340–5. Available from: PMid:11055036; PMCid:PMC1733288.

27. Liaw ST, Chen HY, Maneze D, Taggart J, Dennis S, Vagholkar S, et al. Health reform: is routinely collected electronic information fit for purpose? Emergency Medicine Australasia 2012;24(1):57–63.

28. Liaw S, Taggart J, Dennis S and Yeo A. Data quality and fitness for purpose of routinely collected data – a case study from an electronic practice-based research network (ePBRN). American Medical Informatics Association Annual Symposium 2011. Washington DC: Springer Verlag, 2011. Available from: PMid:22313561.

29. de Lusignan S, Chan T, Theadom A and Dhoul N. The roles of policy and professionalism in the protection of processed clinical data: a literature review. International Journal of Medical Informatics 2007;77(5):291–304.

30. Sutcliffe D, Lester H, Hutton J and Stokes T. NICE and the Quality and Outcomes Framework (QOF) 2009–2011. Quality in Primary Care 2009;20(1):47–55.

31. Weber K, Otto B and Österle H. One size does not fit all – a contingency approach to data governance. Journal of Data & Information Quality (JDIQ) 2009;1(1):4. Available from:

32. Wende K. A Model for Data Governance – Organising Accountabilities for Data Quality Management. 18th Australasian Conference on Information Systems (5–7 Dec, Toowoomba: Australia), 2007.

33. Röthlin M. Management of Data Quality in Enterprise Resource Planning Systems. BoD-Books on Demand, 2010.

34. Wende K and Otto B. A contingency approach to data governance. Proceedings of International Conference on Information Quality, 2007. pp. 163–76.

35. Darzi A. Our NHS – secured today for future generations. High Quality Care For All – NHS Next Stage Review Final Report. London: Department of Health, 2008.


Online ISSN 2058-4563 - Print ISSN 2058-4555. Published by BCS, The Chartered Institute for IT