Among the many initiatives emerging from the January 2013 HLCM retreat, the Committee agreed to prioritize improving the UN system’s capacity to present UN system data, including its capacity to implement standards for data presentation. In addition, as part of improving its working methods, the Committee agreed to invite experts to address specific priority areas. The Committee therefore welcomed Mr Adam Bly, the CEO and founder of Seed, Inc. and Visualizing.org and a noted specialist in working with public and private sector institutions to analyse data in new and unique ways.
Thanking the Chair, Mr Bly acknowledged the challenges facing UN organizations as they work to modernize and change, and hoped that his presentation would point towards a foundation for making change easier. He stressed that what he had to present was not a single event but a fundamental transformation in the way organizations think about data and in the skills needed to manage it, asserting that the ability to perform these tasks effectively will define successful organizations, including governments, NGOs and the private sector. He noted that in some regards the UN is already leading the data revolution, though not in a concerted way across the system. The presentation aimed to take the Committee through the process of innovating with data and to show how data can be the foundation for innovation.
Mr Bly noted that we live in an era of complexity, and that to look at the world without complexity misses the key point. For example, he noted that understanding health requires an understanding of the interrelationship between disease and such factors as, say, climate models, which in turn requires an analysis of energy composition, which in turn drives greenhouse gas emissions. These, in turn, can be affected by education and population dynamics, which circle back to disease. He followed this by introducing two other aspects of the global environment: velocity, or the rate of change, and austerity, i.e. the financial pressures on institutions.
All three aspects – complexity plus velocity plus austerity – define the era of “big data”. To provide some perspective on the “big” in “big data”, Mr Bly noted that 2.5 quintillion bytes of data are created every day, with 90% of data in the world today created in the last two years. This volume of data presents three opportunities for institutions – smarter decision-making, a new language for collaboration and new knowledge and innovations. Any innovation can take time to become integrated into organizations and effect management culture changes. We are at the point where the innovation, the ability to manage large amounts of data, is happening and we can only speculate on the ultimate impact this will have on organizations. The presentation noted that big data was not a “technology” revolution, but that the fundamental tools are mathematics, science and design.
Thinking about data begins with a needs analysis, focusing on the decisions facing each of us that could benefit from an evidence-based approach. Mr Bly introduced the concept of a data continuum, with activities and actions that are easily measurable at one end of the spectrum and the unmeasurable at the other. In between, the seemingly unmeasurable presents enterprises with opportunities to test and experiment with measuring different aspects of an organization’s activities. He suggested that practitioners start by assuming that the data can assist in decision-making, and then test those assumptions utilizing the tools available.
However, implementing a data-driven decision-making approach assumes that an organization has the appropriate data available. Data available to organizations generally fall into four categories, with each category presenting opportunities and challenges. Proprietary data, the first category, represents data that each organization collects, defines and houses. Often, institutions, as they grow, begin to lose track of the data resident in their systems. Organizations can also purchase data, the second category. Frequently, data purchased is structured differently from in-house repositories, and therefore can present challenges when trying to integrate in a unified way.
The third category, data exhaust, is data generated through other activities. An example is data generated through the use of social media and mobile communications, such as actions on services like Facebook or Twitter. These actions serve as proxies for other behaviours, and although using this data can present privacy concerns, these data streams can offer exciting possibilities for understanding population activities. The fourth and final category is open data, which is freely available to anyone and generally originates from governments and organizations like the United Nations. A key challenge for the UN and its organizations that make data available is to increase the usage of these data sets. Mr Bly suggested that simply making the data available is not enough to ensure its effective use, and that developing tools that provide analysis could increase its value.
Creating a data-driven environment starts with a complete understanding of the data available within the institution. Institutions must first inventory and characterize existing data, which includes its format, structure, taxonomy, frequency of updating and location (i.e. where it is stored and who has access). Furthermore, institutions will need to understand the relationship between this data and activities, an exercise known as “mapping” the data. Mr Bly stressed that these activities are business-related, and not solely an ICT function.
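By way of illustration only, and not drawn from the presentation itself, the inventory and mapping exercise described above might be sketched as follows. The record fields mirror the characteristics listed in the text (format, structure, taxonomy, frequency of updating, location and access); the class name, function names and sample entries are hypothetical and chosen solely for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DatasetRecord:
    """One entry in a hypothetical data inventory."""
    name: str
    data_format: str          # e.g. "CSV", "relational database"
    structure: str            # e.g. "tabular", "free text"
    taxonomy: str             # vocabulary used to describe the contents
    update_frequency: str     # e.g. "monthly", "ad hoc"
    storage_location: str     # where the data is stored
    access: List[str] = field(default_factory=list)      # who has access
    activities: List[str] = field(default_factory=list)  # mapped activities


def unmapped(inventory: List[DatasetRecord]) -> List[str]:
    """Return names of datasets not yet mapped to any activity."""
    return [record.name for record in inventory if not record.activities]


def datasets_by_activity(inventory: List[DatasetRecord]) -> Dict[str, List[str]]:
    """Invert the inventory: for each activity, list the datasets that support it."""
    mapping: Dict[str, List[str]] = {}
    for record in inventory:
        for activity in record.activities:
            mapping.setdefault(activity, []).append(record.name)
    return mapping


# Sample entries are invented for illustration only.
inventory = [
    DatasetRecord("staff_roster", "relational database", "tabular",
                  "internal HR codes", "monthly", "HR system",
                  access=["HR"], activities=["workforce planning"]),
    DatasetRecord("field_survey_2012", "CSV", "tabular",
                  "none documented", "ad hoc", "shared drive",
                  access=["programme staff"]),
]

print(unmapped(inventory))
print(datasets_by_activity(inventory))
```

A listing of unmapped datasets of this kind would give business units, rather than ICT alone, a concrete starting point for deciding which data still needs to be related to activities.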
Only after an organization has inventoried and mapped its data can it begin the process of extracting value, which is achieved through analysis using a variety of mathematical and scientific methods (e.g. correlation analysis, natural language processing, complex systems science, algorithm design and anomaly detection). Visualization tools, which utilize a design-first approach, present aggregated data in graphic form, allowing for the detection of patterns and trends not otherwise easily recognized. The presentation demonstrated both analytical and visualization approaches using examples from the Rio+20 Conference, the MyWorld project (data.myworld2015.org) and the private sector. These examples demonstrated how visualizations can assist in solving a variety of business problems.
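Two of the analytical methods named above, correlation analysis and anomaly detection, can be illustrated with a minimal sketch. It is not taken from the presentation: the Pearson coefficient and the z-score threshold are common textbook choices assumed here for the example, and the function names and sample figures are hypothetical.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def anomalies(values: Sequence[float], threshold: float = 2.0) -> List[int]:
    """Indices of values whose z-score exceeds the threshold in absolute value."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # every value is identical, so nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Invented figures, for illustration only.
spending = [1.0, 2.0, 3.0, 4.0]
outcomes = [2.0, 4.0, 6.0, 8.0]
print(pearson(spending, outcomes))

weekly_reports = [10, 11, 9, 10, 12, 10, 50]
print(anomalies(weekly_reports))
```

A coefficient near 1 or -1 indicates a strong linear relationship, while flagged indices point to observations that merit a closer look; neither result establishes a cause on its own.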
Finally, the presentation noted that, beyond internal business analysis, the analytics and visualization methods described could also be applied to communicating with a specific audience. Mr Bly concluded by emphasising that almost all enterprises, including agencies of the UN system, could benefit from a data-driven business decision-making process.
During the discussion, members of the Committee explored several aspects of the presentation. One comment noted that agencies face a challenge because they depend on governments for data of all types, and that this data does not prove very reliable. Questions addressed the mechanisms used to gather data from populations that do not have access to modern communication technology, the profile of staff needed to effectively employ advanced data analytics and visualization techniques, the risks that agencies may face as they make data publicly available, and ways that agencies can avoid the institutional pitfalls of managing large data repositories, including data silos that develop within organizations. Mr Bly acknowledged the challenges in collecting data from populations that are unconnected, but observed that some projects, such as MyWorld, are having some limited success with manual data collection methodologies. Regarding the skills needed by organizations, Mr Bly suggested that UN organizations may consider developing capacity in the mathematics and science disciplines needed for these types of analysis. He encouraged agencies to begin the process of inventorying and mapping internal data sets, stressing the importance of a uniform taxonomy so that the same terms are used to describe the same things across agencies.
Mr Bly accepted that risks exist in the presentation of data, but noted that there can also be many benefits. He suggested that agencies work towards involving the public in data analysis, especially in the generation of hypotheses that can, in turn, be tested utilizing the analytical tools presented. Risks can also be mitigated by ensuring the data is presented along with any appropriate caveats. He also suggested that agencies work together when collecting data, and not duplicate field data collection activities. Overcoming silos can also present challenges. Senior-level encouragement to bring datasets together is one place to start, with the goal of making the right data available to the right people at the right time.
The HLCM Chair concluded the session by thanking Mr Bly for his enlightening presentation, noting that all of these tools are increasingly fundamental to organizations of the UN system, which must explore how to employ these capabilities. The Chair suggested that a first step should be the adoption by agencies of open data policies, followed by a concerted effort to begin an inventory and mapping process, followed by the development of taxonomies, so that the system understands the data it has available.
Organizations acknowledged the challenges in embarking on a project of this nature, but agreed on the importance of doing so. They further recognized its linkage to the High-Level Committee on Programmes and noted that some efforts in this area are likely already in progress through entities such as, inter alia, the UN Statistics Division of the UN Secretariat, which makes data available through its data.un.org website, and the UN Geographical Information Working Group (UNGIWG).
The Committee agreed to create a working group that would explore this area further and propose common action as part of its Strategic Plan, with respect to open data policies, inventory and mapping of data, and development of taxonomies.