EXCLUSIVE- Discussion on Big Data challenges – unstructured data, privacy, sharing, integration

Senior executives dealing with ICT from public sector, education and health care organisations in Singapore came together at the OpenGov Leadership Breakfast Dialogue, ‘The Big in Big Data – Managing the unmanageable’ in Singapore on the 10th of November. Two hours of interactive discussion yielded fascinating insights into a range of issues related to collection, storage, sharing and analysis of Big Data.

Christopher Aw (below, second from right), Regional Lead, Public Sector Programs, MarkLogic, initiated the dialogue by talking about changes in ‘model’, ‘mentality’ and ‘mission’ in the public sector. Data models have evolved from hierarchical to relational, and now to an era in which massive volumes of data, whether in military intelligence or patient healthcare records, are unstructured. Storing operational data in relational data models is losing its utility for making sense of the data and gaining insights from it. Mr. Aw cited a figure of 12% as the proportion of enterprise data that was held in highly structured databases as of 2014.

Mentality is shifting away from a system-centric approach, in which many different applications, each with a different back-end, are fed data from multiple sources. The current trend favours a data-centric approach, in which all the data is processed in one place, loaded and indexed from multiple ever-changing sources, and the output is delivered to the right user in the right format in real time.

In terms of ‘mission’, Mr. Aw said that IT is merging with operations. It is leading to requirements such as Joint Metadata catalogs for enabling simultaneous search of disparate databases.

These trends necessitate a shift in the way data is dealt with, the manner of its collection and storage.

Henry Chao, Former Deputy CIO and Deputy Director, Office of Information Services, Centers for Medicare & Medicaid Services in the US, shared his experience leading the creation of the insurance marketplace as part of the implementation of the Patient Protection and Affordable Care Act (ACA), or Obamacare. The Act aimed to provide affordable coverage to an additional 20 million Americans.

Mr. Chao broke down the timeline from the signing of the ACA in March 2010 to the launch of the marketplace in October 2013. Such a vastly ambitious project presented a range of the challenges regularly faced when implementing large-scale ICT projects in the public sector.

There was uncertainty in the scope, which was forever changing. Regulations had to be written that laid out how the programme would operate. For a long time, the team operated on a monthly stipend, making it difficult to award contracts. Connections had to be established with numerous federal and state-level agencies. The traditionally months-long process of insurance underwriting had to be reduced to a few seconds. Over 1,600 different insurance products had to be integrated and tested.

The information provided by applicants filling in the online forms had to be preserved, while some of the applicants were shifted to state programmes that were more beneficial for them.

It was a completely new set of complex problems that had never been dealt with before. Mr. Chao highlighted the key question: when will you have enough information to build the critical pieces and have a minimum viable product? The team adopted its own brand of agile development, with parallel development of business processes and communication plans with stakeholders.

If the team had to absorb the shocks and volatility of ever-changing requirements, refactoring the relational model as many times as the application code would have made the process significantly more challenging. During the last three months, there were 180 builds, an average of 2 a day. In that scenario, you don’t want to be encumbered by a changing relational model. A NoSQL database could help in tackling these issues.
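
To illustrate the point about a changing relational model, here is a minimal Python sketch. It is not the system Mr. Chao's team built. It assumes an in-memory list as a stand-in for a document collection, and the field names (`applicant_id`, `household_size`, `state_programme`) are hypothetical. The idea shown is only that a schema-less store can accept a new document shape between builds without a migration step.

```python
import copy

# An in-memory list stands in for a document collection (an assumption for
# illustration; a real document database would persist and index these).
applications = []

def save_application(doc: dict) -> None:
    """Store a document as-is; no fixed column list has to be altered first."""
    applications.append(copy.deepcopy(doc))

def find(field: str, value) -> list:
    """Return documents in which `field` is present and equals `value`."""
    return [d for d in applications if field in d and d[field] == value]

# Early build: the form captures only two fields (hypothetical names).
save_application({"applicant_id": "A-001", "household_size": 3})

# A later build adds a nested field. Older documents stay untouched and
# no schema migration is needed to accept the new shape.
save_application({
    "applicant_id": "A-002",
    "household_size": 1,
    "state_programme": {"name": "example-programme", "eligible": True},
})

with_programme = [d for d in applications if "state_programme" in d]
print(len(applications), len(with_programme))
```

The trade-off is that code reading these documents has to tolerate missing fields, which is why `find` checks for presence before comparing.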

Next, Klaus Felsche (above left), Former Director of Analytics from the Department of Immigration and Border Protection in Australia spoke about data needing to support decisions, actions and services. He presented a 3-step process of seeing, understanding and acting to enable evidence-based decision-making and solve business problems.

Governments collect huge amounts of extremely valuable data, which could be potentially used to improve services and consequently the lives of the citizens. Mr. Felsche said, “We now need to construct systems where we don’t know all of the questions data can answer. We don’t even know some of the questions we need to answer.”

But a pre-requisite for this is the ability to collect and store information in a way that it is available for analysis. Mr. Felsche brought up what he called ‘invisible data’. That is data that cannot be used. It might be lying on a tape in a vault somewhere or in some inaccessible part of the network.

Questions and discussion

The first question posed to the delegates was, ‘What are some of the biggest data challenges in your organisation?’ The most common answer, at 40%, was difficulty in accessing information. Increasing efforts to manage data and challenges posed by manual aggregation of data garnered 25% and 15% of votes respectively.

Rupert Gwee (below right), Director, Human Resources Transformation Office, National Service Affairs Directorate, Human Resources Division, Ministry of Home Affairs, spoke about data not being organised and tagged properly, which makes it difficult to analyse. Forward screening is what is required. You have to come up with the concept and then articulate the requirements. In other words, it is about having a clear problem statement and figuring out what data sources will have to be grouped together. Otherwise, people do not know how to organise the information, and it is not collected in a way that would be useful.

The issue of data silos and privacy also surfaced in the subsequent interesting discussion. Vivien Chow (below), Director Applied Innovation and Partnership, Government Technology Agency (GovTech), responded that integration of data from different sources is required. But the different agencies are at different stages of understanding how to anonymise the data. Sometimes the data set on its own might be adequately anonymised. But it could be reidentified when combined with other data sets. The hurdle of effective anonymisation has to be surmounted before encouraging data sharing.

To tackle the issue, GovTech is working on a proof-of-concept (POC) for homomorphic encryption, which can enable analysis of encrypted data. If the POC is successful, it can be used across the government agencies.
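
As a rough illustration of what "analysis of encrypted data" can mean, the sketch below implements a toy version of the Paillier cryptosystem, which is additively homomorphic: multiplying two ciphertexts yields a ciphertext of the sum of the plaintexts. The article does not say which scheme the GovTech POC uses, so Paillier is an assumption chosen for brevity, and the primes are far too small for real use.

```python
import math
import secrets

# Toy primes, far too small for real use (illustration only).
P, Q = 293, 433
N = P * Q
N_SQ = N * N
G = N + 1                                          # common simplified generator
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)
MU = pow(LAM, -1, N)                               # inverse of LAM mod N (valid for G = N + 1)

def encrypt(m: int) -> int:
    """Encrypt integer m (0 <= m < N) with fresh randomness r."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ

def decrypt(c: int) -> int:
    """Recover the plaintext from ciphertext c."""
    x = pow(c, LAM, N_SQ)
    return (((x - 1) // N) * MU) % N

def add_encrypted(c1: int, c2: int) -> int:
    """Multiply ciphertexts, which adds the underlying plaintexts (mod N)."""
    return (c1 * c2) % N_SQ

# Two parties each encrypt a count; a third party sums them without decrypting.
total_ct = add_encrypted(encrypt(1200), encrypt(345))
print(decrypt(total_ct))  # 1545
```

Only the holder of the private values (`LAM`, `MU`) can read the total; the party doing the addition sees ciphertexts only. Schemes that also support multiplication on encrypted data are considerably more involved.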

Big data Singapore Nov 2016

Peter Tan Chin Seng (below right), Principal Architect, National Architecture Office, Integrated Health Information Systems (IHIS) Pte Ltd, said that even after identifiers are anonymised, data can sometimes be re-identified, especially in boundary cases: for instance, 99-year-olds with a specific condition.
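
The boundary-case risk can be checked mechanically before a data set is shared. The sketch below is one possible approach, not something described at the dialogue: it counts how many records share each combination of indirectly identifying fields and flags combinations that occur fewer than `k` times (the idea behind k-anonymity). The records, field names and threshold are made up for illustration.

```python
from collections import Counter

def small_groups(records, quasi_identifiers, k):
    """Return quasi-identifier combinations shared by fewer than k records."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return {combo: n for combo, n in counts.items() if n < k}

# Hypothetical records with direct identifiers already removed.
records = [
    {"age": 45, "condition": "X"},
    {"age": 45, "condition": "X"},
    {"age": 45, "condition": "X"},
    {"age": 99, "condition": "X"},  # boundary case: a single 99-year-old
]

risky = small_groups(records, ["age", "condition"], k=3)
print(risky)  # {(99, 'X'): 1}
```

A group of size one means that anyone who knows a 99-year-old with that condition can pick out the record, even though no name or ID appears in it.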

Mr. Seng also talked about a problem with early technology adoption, as happened in Singapore’s hospitals. There is now a lot of data that is difficult to integrate, because data structures have changed and existing systems are based on the older ones. Efforts to change systems in order to harmonise data can face resistance.

Current technology is capable of redacting certain parts of the information, such as edge cases, on the fly, whereas previously separating out that one vector was difficult. The data can then be shared without compromising privacy. Encryption, anonymisation and redaction are the three keys to this. Earlier, you would need to encrypt the entire database. Now it can be done based on certain criteria.
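
Criteria-based redaction of this kind can be sketched in a few lines of Python. This is an illustrative assumption about how such rules might look, not a description of any product mentioned at the dialogue; the field names, the age threshold of 90 and the `[REDACTED]` marker are hypothetical.

```python
def redact(record: dict, rules: dict) -> dict:
    """Return a copy of record with a field masked when its rule matches."""
    out = dict(record)
    for field, should_mask in rules.items():
        if field in out and should_mask(record):
            out[field] = "[REDACTED]"
    return out

# Each rule maps a field name to a predicate over the whole record.
RULES = {
    "national_id": lambda r: True,                 # always hide direct identifiers
    "age": lambda r: r.get("age", 0) >= 90,        # hide only the edge cases
}

# Hypothetical source records.
patients = [
    {"national_id": "ID-0001", "age": 45, "condition": "X"},
    {"national_id": "ID-0002", "age": 99, "condition": "X"},
]

# Redaction is applied at read time; the stored records are left unchanged.
shared = [redact(p, RULES) for p in patients]
for row in shared:
    print(row)
```

Because the rules are applied when data is read rather than when it is stored, different audiences can be given different rule sets over the same underlying records.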

The conversation veered to the presentation of data. Paul Gagnon, Director, E-Learning, IT Systems and Services, Nanyang Technological University – Lee Kong Chian School of Medicine, said that his biggest challenge is to find the best possible way to display data to the front-end user, so that they can find information quickly. Different groups of stakeholders can have widely varying opinions on it, and factoring in everyone’s needs or demands can prove to be a tough challenge.

Mr. Felsche said in response that putting out a viable product is important, as subsequent iterations can keep improving it. Trying to satisfy everyone can place government in a gridlock.  

The next query to the delegates was about how they manage unstructured data (documents, attachments, pictures, sound, video etc). Here, a majority, 67%, replied that some of the unstructured data is included but it is not possible to include all.

Dr. John Kan, Chief Information Officer, Agency for Science, Technology and Research described the current approach of designing a content management system around the requirements, so that the essential data at least is classified and stored properly. Sometimes, you might need to choose what to manage.

Mr. Seng said that currently transactional systems are mostly relational. So, for dealing with data, metadata is being indexed into blocks. But this needs improvement.

The right metadata can be critical in managing data. Mr. Gwee provided another angle to this aspect:

Sometimes, there is over-analysis on how to use data. Basic analysis can suffice for most government needs. But when you want to move to the next level, that requires a paradigm shift. He gave the example of 300,000 to 360,000 people crossing the Johor–Singapore Causeway every day. The volume is huge and it is not like the airlines, where the identity of the passengers is verified in advance. Managing that flow, while avoiding intrusive methods, demands smart approaches. Sometimes, simple ideas could be the smartest and heavy crunching might not be required every time. One example is tracking army operations by keeping tabs on purchases of food items from provision shops by the soldiers’ wives.

When asked about the most important IT priorities, responses were split: 40% chose digital transformation and innovation, 30% chose improving efficiencies and costs, and 20% chose developing or deploying customer-facing applications.

It was pointed out by Mr. Mohamad Azman Jaffar, Deputy Director Information Technology, Public Service Division – Prime Minister's Office that transformation and innovation sort of encompasses the other options. Lim Soo Tong, Chief Information Officer, Jurong Health Services concurred, saying that it is their mission.  

Earlier, efficiencies and costs were the primary focus for ICT. That is no longer the case: senior management now demands digital transformation to support business objectives.

Data can be used to make the business case for transformational initiatives, from a holistic perspective. Also discussed were the possibilities of early intervention through predictive analytics, and of cross-pollination yielding new perspectives through the combination and association of different data sets. The dialogue moved to the critical role of interagency collaboration, for instance, between the Ministry of Health and Social services, to achieve these kinds of objectives.

Around 67% of attendees said that their mission critical data resides in multiple Relational Database Management Systems (RDBMS). In most Singapore public sector organisations, that data is already on enterprise document management systems and shared.

Dr. Kan said that it was important to know the initial process that generated the data in the RDBMS, in order to understand what was included and what was left out.

Concluding the dialogue, Mr. Aw talked about the process of continuous learning and improvement. There are still many gulfs to be traversed and potholes to be avoided. But there is no avoiding data. Data, in ever-increasing volumes, velocity and variety will continue to expand its role in how governments function and governments have to evolve, adapt and progress.

