EXCLUSIVE – Innovation and informed decision-making through high-quality open data on Data.gov.sg

Above: Lin Zhaowei, Consultant, Data Science Division, Government Technology Agency of Singapore; Photo courtesy: Data Science Division, Government Technology Agency of Singapore

In April 2015, Prime Minister Lee Hsien Loong described his vision of Singapore as ‘a safe and secure data market place.  A place where companies can easily conduct testing, and extract insights on market research, on consumer trends.  A place where data can be shared in order to unlock value and innovation.  A place where the Government releases many data sets to the public to build applications and services.’

Open data sharing is one of the key priorities in Singapore’s Smart Nation journey. Opening up datasets and APIs (application programming interfaces) can help citizens make better-informed decisions in their daily lives. It can drive innovation, with developers creating new applications and services for fellow citizens. It is about catalysing a civic innovation movement from the ground up.

Data.gov.sg was launched in 2011 as an open data repository of government datasets. Initially, however, the datasets and APIs were not as polished and standardised as they could have been, nor as relevant and understandable to the layman. The government subsequently shifted its focus from the quantity of datasets to their quality, to ensure that data is machine-readable and easily understandable for the public. The public beta of the new portal was launched in July 2015.

Snapshot from Data.gov.sg

Snapshot from Developers.data.gov.sg

Data.gov.sg currently has 990 datasets on the website, a number expected to increase to 1,000 in a few weeks. The higher-frequency datasets, primarily environment and transport data, are on the Developers’ page, launched in April 2016. The last quarter of 2016 saw an average of around 200,000 page views and 2,000 to 3,000 downloads of datasets every month.

OpenGov met with Lin Zhaowei, Consultant, Data Science Division, Government Technology Agency of Singapore to learn more about how the objectives of data.gov.sg are being achieved. He talked about ensuring the quality, usability and usefulness of data, privacy concerns, Open Data Licence and API terms, and increasing awareness about the availability and potential of open data.

Data visualisations and dashboards

In 2015, we launched a portal with a new look and new data standards. We wanted to make the data more understandable to people, which is why at the same time we also launched data visualisations and dashboards.

For the layman, charts are the main touchpoint. They are not going to be interested in 10,000 rows of data. They just want to be able to see the rough trends and that’s what we hope to accomplish through the visualisations. At the same time, because the data is already in a tidy, machine-readable format, power users can work with it.


Snapshot from Data.gov.sg

For example, in 2015, there was a bout of haze. We had a dashboard showing real-time PSI readings. The idea was to present data in a way that is meaningful to people. Right now, we have a general Singapore at-a-glance dashboard. We select datasets that we think would be interesting to people, for example, population, PM2.5 and dengue incidence. These have direct relevance for general citizens.

We have dashboards based on nine topics: economy, education, environment, finance, health, infrastructure, society, technology and transport.

Ensuring data quality

We placed our data quality guide on our GitHub page. The guide was first disseminated to agencies and refined based on feedback and consultations. During the second half of 2016, we released the guide to the public as well. Members of the public can suggest enhancements, which helps us ensure that our data is of the right quality and presented in a consistent way.

Last year, we needed to migrate datasets from the old site to the new one. At the launch of the new portal, we had more than 100 datasets that we had cleaned up on our own, based on our new data quality standards. Then we asked the agencies to implement the standards and submit datasets based on our requirements. We worked through examples with the agencies, explaining what the data is used for and why it should be structured this way.

We also held workshops, as well as individual consultations with the agencies. After they submit the data to us, we do another round of checks before we publish the data to make sure that it is consistent and of the right quality.

Say you have transport-related datasets based on vehicle type, such as cars, motorcycles and so on. We have to ensure that spellings are consistent. For instance, spelling motorcycle with a hyphen in one dataset and without one in another would cause issues when combining or comparing the datasets. We have to avoid such problems and make it easy for users. We want our data to be useful and we want people to be able to use it without having to waste time cleaning the data.
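The hyphenation problem described above can be illustrated with a minimal Python sketch. This is not the portal's actual cleaning process: the function names, the normalisation rule (lower-case, drop hyphens, collapse spaces) and the sample figures are all made up for illustration.

```python
import re
from collections import Counter


def normalise_label(label: str) -> str:
    """Lower-case a category label, drop hyphens and collapse whitespace.

    Hypothetical rule for illustration only; a real data quality guide
    may prescribe a different convention.
    """
    without_hyphens = re.sub(r"-", "", label.strip().lower())
    return " ".join(without_hyphens.split())


def merge_counts(*datasets):
    """Sum counts per vehicle type across datasets after normalising labels."""
    total = Counter()
    for rows in datasets:
        for label, count in rows:
            total[normalise_label(label)] += count
    return dict(total)


# Made-up sample rows: the same vehicle type spelt two different ways.
dataset_a = [("Motor-cycle", 120), ("Car", 300)]
dataset_b = [("motorcycle", 80), ("car", 150)]

combined = merge_counts(dataset_a, dataset_b)
print(combined)
```

Without the normalisation step, "Motor-cycle" and "motorcycle" would be counted as two separate vehicle types, which is exactly the kind of clean-up work that consistent spelling at publication time spares users.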

Data ownership and privacy

Data is collected, governed and protected under strict safeguards according to personal data management rules that all government agencies comply with. These rules largely mirror the Personal Data Protection Act, which governs personal data protection in the private sector, but are written in the context of the public service.

One-stop shop for data

Previously, any agency that wanted to provide real-time data would have its own webpage and sign-up process for users. So, if I needed data from agencies A, B and C, I would need to sign up on three different websites.

With data.gov.sg, we wanted to create a one-stop portal. You can sign up here and access data from different agencies, without having to jump around to figure out which agency has which data.

It is part of our mandate to help members of the public find and use the information they need from the different government agencies. We are the middleman in that sense. We are working to get more APIs on this website. We are working with individual agencies to figure out how to retrieve the data and update it here, with minimal lag.

There’s another benefit to having data from different agencies on one website. It can spark fresh ideas and innovation. Say I am interested in traffic data. I come to this site and I see that there is rain and wind data available. Can I use that as well and draw insights?

Open Data Licence and Terms of use

The Ministry of Finance worked on the Open Data Licence, in consultation with many agencies including GovTech. We drew inspiration from the open data definition provided by Open Knowledge International. Readability was one of the things we considered. If we make it too complex and fill it with legalese, users are not going to read it or understand it.

As with our datasets, we want our licences to be easily understandable. We went through many revisions to ensure that the language is accessible. Jargon is not used unless it is absolutely necessary.

Previously, there was no standardised licence for data published by government agencies. Certain agencies, like the Singapore Land Authority, sell some of their information, such as map data. Each agency had its own version of a data licence, so there was a bunch of licences that you had to read and understand when using published data.

So, we wanted to make it clear that if data is published as open data, then it is free to use. This encourages more people to use the data. It also adheres to international norms on what open data really is.

Previously, you had to inform Data.gov.sg if you wanted to use the data for commercial purposes. That requirement has been eliminated. But we have an attribution clause and we would like it if people attribute us when they use the data.

Outreach

Outreach is a continual process for us. The Data.gov.sg blog aims to illustrate interesting trends and highlight meaningful applications of data. For example, our data science team recently penned a piece on how the Circle Line rogue train was caught with data, while another piece studied data on the NEA grading and hygiene level of hawker food.

We also make it a point to actively engage with citizen groups, such as data science meet-up groups. We have spoken at a couple of these meet-ups to talk about what we have been doing and also to gather feedback on what other data scientists think we should be doing.  

We also held a data visualisation workshop for journalists recently, to go through the basic techniques of using data for data journalism. The response to these engagements has been good. We can’t quantify the outcome yet, but it’s something that we will continue to do.

We have an upcoming competition for students in our universities, polytechnics, junior colleges and institutes of technical education. It is called the National Data Visualisation Video Challenge. Participants have to analyse and use open data to create a short video presentation. We hope to see collaboration between students in different disciplines, such as computer science, analytics and visual media. We want the students to discover ways to present data in a creative and layman-friendly format to their friends and families, who might be intimidated by data.

The way ahead

We have stabilised the platform over the past year and a half. Our main concern now is getting the content out there. We want to work more closely with agencies, in order to respond more quickly to requests from the public, so that we can release more useful datasets.

We also want to increase the number of APIs and improve their reliability, minimising disruptions.

Work on improving the website will continue. We will enhance the user experience by adding functionality, such as allowing people to compare datasets.

As we wrap up our migration of the data, we will start assessing how useful the data is. We will start looking at generating reports for each agency, showing how frequently their data is accessed and how many people are downloading it. If no one is looking at certain datasets for, say, a year and a half after publication, we want to stop maintaining them. We want to make sure that the data we are publishing is actually of use.
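The kind of usage check described above could be sketched roughly as follows in Python. This is an illustration under stated assumptions, not the portal's reporting system: the record fields (`name`, `published`, `last_accessed`), the 548-day approximation of a year and a half, and the sample records are all hypothetical.

```python
from datetime import date, timedelta

# Assumption: "a year and a half" approximated as 548 days.
STALE_AFTER = timedelta(days=548)


def stale_datasets(records, today):
    """Return names of datasets with no recorded access for STALE_AFTER or longer.

    Each record is a dict with hypothetical keys 'name', 'published' and
    'last_accessed' (a date, or None if never accessed since publication).
    """
    flagged = []
    for record in records:
        # Fall back to the publication date when a dataset was never accessed.
        last_seen = record["last_accessed"] or record["published"]
        if today - last_seen >= STALE_AFTER:
            flagged.append(record["name"])
    return flagged


# Made-up records for illustration only.
records = [
    {"name": "dataset-a", "published": date(2015, 1, 1), "last_accessed": None},
    {"name": "dataset-b", "published": date(2015, 1, 1), "last_accessed": date(2016, 12, 1)},
]

flagged = stale_datasets(records, today=date(2017, 1, 1))
print(flagged)
```

A list like this would only be a starting point for discussion with the owning agency; whether to stop maintaining a dataset remains a judgment call.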
