Multi-agency Data Taskforce led by NSW DAC and ACS makes recommendations to support data sharing while preserving privacy

On October 2, a whitepaper was released on Data Sharing Frameworks by a Data Taskforce led by the Australian Computer Society (ACS), and the NSW Data Analytics Centre (DAC). The taskforce was created to address the overarching challenge of developing privacy preserving frameworks which support automated data sharing to facilitate smart services creation and deployment.

The Taskforce has met more than six times since June 2016, with representatives from ACS, the NSW DAC, Standards Australia, the office of the NSW Privacy Commissioner, the NSW Information Commissioner, the Federal Government’s Digital Transformation Agency (DTA), the Commonwealth Scientific and Industrial Research Organisation (CSIRO), Data61, the Department of Prime Minister and Cabinet, the Australian Institute of Health and Welfare, SN-NT DataLink, South Australian Government, Victorian Government, West Australian Government, Queensland Government, Gilbert and Tobin, the Communications Alliance, the Internet of Things Alliance, Objective, Telstra, IBM, Mastercard, and Microsoft.

The report notes that “Underpinning the transformation to a smarter, truly digital economy is the ability to share data beyond the boundaries of an organisation, company, or government agency. Future smart services for homes, factories, cities, and governments rely on sharing of data between individuals, organisations, and governments.”

But many data custodians remain hesitant to share data due to concerns about appropriate use and interpretation of data, unintended consequences of sharing data, concerns about accidental release of sensitive data and adherence to privacy legislation.

Current inter-agency data sharing in the NSW government and cross-jurisdictional data sharing

OpenGov reached out to Dr. Ian Opperman, CEO and Chief Data Scientist, NSW DAC, with some questions and a request for further information on the work of the Taskforce.

In response to our query, Dr. Opperman explained that, at this time, inter-agency data sharing is predominantly undertaken by the NSW Data Analytics Centre. Cross-jurisdictional sharing is limited.

The NSW Data Analytics Centre complies with 50 pieces of State legislation, as well as additional Commonwealth Government legislation, including privacy legislation and the Data Sharing Act, in relation to data sharing.

In 2016, a Statutory Guideline was issued by the NSW Privacy Commission which reinterpreted data sharing and analytics as ‘research’. A Privacy Code is currently under development for some NSW Data Analytics Centre projects to address this issue. More broadly, the NSW Data Analytics Centre is being repositioned within policy, as its projects are directed at enabling the sponsor agency to deliver on its core business of improving services, with the NSW Data Analytics Centre having trusted user status.

Government agencies have different levels of digital and data maturity and data quality, and many hold historical datasets, which makes data sharing challenging.

Goals and challenges

The frameworks developed by the Taskforce will seek to address technical, regulatory, and authorising frameworks. The intention is to identify, adopt, adapt, or develop frameworks for data governance, privacy preservation, and practical data sharing which facilitate smart service creation and cross-jurisdictional data sharing between governments.

The four focus areas are: cross jurisdictional open data sharing, governance, privacy, and practical data sharing.

The Taskforce is looking towards legislation, principles, policies, practice and standards in similar jurisdictions such as the United Kingdom and European Union. The approach adopted by the Taskforce is to identify best practice where it is known to exist; consider existing models in an Australian privacy context or identify ‘whitespace’ opportunities to develop frameworks for Australia.

In the path towards the development of the framework the Taskforce identified five key challenges:

  • Defining the characteristics of data sets which meaningfully span the spectrum covering open data; highly aggregated personal data sets; lightly aggregated personal data sets and data sets which contain personally identifiable information (excluding health information).
  • Characterisation of “smart service” types – and the associated limitations and obligations of service providers – based on the data sets used to create them.
  • Regulatory Clarification – developing a clear, concise statement of the legal and policy frameworks which enable data sharing for smart services types based on the underlying data sets used.
  • Identification of Personally Identifiable Data – developing an unambiguous test for the presence of personally identifiable information within a set of data sets.
  • Development of Trusted Data Sharing Frameworks – Whilst not universally true, many data custodians are hesitant to share data. This is often due to concerns about appropriate use and interpretation of data, concerns about unintended consequences of sharing data, concerns about accidental release of sensitive data and concerns about adherence to legislation. Frameworks for trusted data sharing would help address these challenges.

Information and personal information

Information has been described in this paper in terms of the inverse of the probability of an event occurring out of a set of possible events. The less likely an event is to occur, the more information it carries. News of an unexpected event in politics or international affairs carries a great deal of information.
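The description above matches the standard information-theoretic notion of self-information, in which an event with probability p carries -log2(p) bits. The following is a minimal Python sketch under that assumption; the whitepaper's exact formulation is not reproduced here, and the function name is purely illustrative.

```python
import math


def self_information_bits(probability: float) -> float:
    """Return the information content, in bits, of an event with this probability.

    Assumes the standard definition I(p) = -log2(p); the name is illustrative.
    """
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in the interval (0, 1]")
    return -math.log2(probability)


# A fair coin flip (p = 1/2) versus a rare event (p = 1/1024).
likely_event_bits = self_information_bits(0.5)
rare_event_bits = self_information_bits(1 / 1024)
```

The rarer event carries more bits than the coin flip, which is the sense in which unexpected news "carries a great deal of information".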

Personal information (also called personally identifying information (PII) or personal data) covers a very broad range of information about individuals. Data protection laws in different jurisdictions (including States and Territories within Australia) have adopted different definitions. Courts in those jurisdictions have interpreted these definitions in inconsistent ways.

In NSW, according to the Privacy and Personal Information Protection Act (1998) No 133: “… personal information means information or an opinion (including information or an opinion forming part of a database and whether or not recorded in a material form) about an individual whose identity is apparent or can reasonably be ascertained from the information or opinion”.

This is a very broad definition and in principle, covers any information that relates to an identifiable, living individual. In general, after looking at definitions across jurisdictions, the paper notes that a crucial element of most definitions is that personal information must be ‘about an individual …. who is reasonably identifiable’. Whether an individual is reasonably identifiable requires a context specific inquiry.

Data sets that do not identify particular individuals may be used to create personally identifiable information if other data sets are accessed which enable identification of the individuals to whom the shared data sets relate. This other information might be available either internally – for example, by looking up another data set or externally, such as re-identification of individuals through matching of data sets through use of searchable databases such as ASIC records, Land Titles Office property records or through search engines.
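The linkage risk described above can be illustrated with a small Python sketch. All records, names, and field choices below are invented for illustration; they are assumptions, not examples taken from the whitepaper.

```python
# Hypothetical shared data set: no names, but quasi-identifiers remain.
shared_records = [
    {"postcode": "2000", "birth_year": 1975, "sex": "F", "service": "concession travel"},
    {"postcode": "2000", "birth_year": 1982, "sex": "M", "service": "housing support"},
    {"postcode": "2913", "birth_year": 1975, "sex": "F", "service": "small business grant"},
]

# Hypothetical searchable register that does contain names.
public_register = [
    {"name": "Alex Example", "postcode": "2000", "birth_year": 1975, "sex": "F"},
    {"name": "Sam Sample", "postcode": "2000", "birth_year": 1982, "sex": "M"},
    {"name": "Kim Placeholder", "postcode": "2000", "birth_year": 1982, "sex": "M"},
]

QUASI_IDENTIFIERS = ("postcode", "birth_year", "sex")


def link(shared, register, keys=QUASI_IDENTIFIERS):
    """Return (name, record) pairs where exactly one register entry matches."""
    index = {}
    for person in register:
        index.setdefault(tuple(person[k] for k in keys), []).append(person["name"])
    matches = []
    for record in shared:
        names = index.get(tuple(record[k] for k in keys), [])
        if len(names) == 1:  # a unique match re-identifies the record
            matches.append((names[0], record))
    return matches


reidentified = link(shared_records, public_register)
```

In this toy data, only the first record is re-identified: the second matches two people in the register and the third matches none. Whether a record is "reasonably identifiable" therefore depends on what other data the observer can access.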

The Taskforce goes on to use a hypothetical parameter, the ‘Personal Information Factor’ (PIF), which is a result of the personal information content of each of the individual data sets used to create a service, the functions which operate on the data sets to produce insights and models, the individual knowledge of the observer of the insights or models, and additional information available to the observer that the observer could bring to the insights or models.

A two-dimensional framework for services

The whitepaper presents a two-dimensional framework for service types. One axis is the Personal Information Factor: services based on non-personal data, highly aggregated data, lightly aggregated data, and personally identifiable data. The other axis is access control: services based on freely available data, data available for a ‘nominal fee’, data available for a commercial fee, and data available to selected or qualified users.

Service types according to personal information factor and access (Source: Data Sharing Frameworks - Technical White Paper, page 55)

Conclusions and recommendations

The first recommendation is that existing legal frameworks around privacy be clarified to include quantified descriptions of acceptable levels of risk, in ways which are meaningful for modern data analytics.

Regulatory complexity often obstructs sharing of data. It is easy to read ‘not allowed’ into existing regulations at one or more levels. The ambiguity about the presence of personal information in data sets highlights the limitations of most existing regulatory frameworks. The inability of human judgment to determine ‘reasonable’ likelihood of reidentification when faced with sets of large complex data limits the ability to appropriately apply the regulatory test.

The Taskforce also recommends the development of a framework which supports anonymisation of data which in turn facilitates sharing.

The areas which have the greatest potential to drive productivity in Australia are also the areas which require access to the most sensitive and personal data sets – health, superannuation, human services, and education.

New technologies – determining minimum cohort size, differential privacy[1], homomorphic encryption[2], and privacy preserving linkage – all address concerns associated with re-identification of individuals from linked data sets, and yet all are at relatively early stages of development. In all parts of the world, there is currently only very high-level guidance, nothing quantitative, as to what ‘anonymised’ means, hence many organisations must determine what “anonymised” means to them based on different data sets. Maturing these technologies by encouraging pilot projects and safe trials would benefit all jurisdictions.

Recommendation 3 is the development of a test for the existence of Personally Identifiable Data. Information is created when data sets are joined. Collating data from millions of sensors operating at billions of cycles per second is fundamentally incompatible with relying on human judgements to determine the existence of personally identifiable information. Creating a nationally acceptable test will greatly increase the scope for smart services whilst still leaving room for judgement in risky situations.

Recommendation 4 is to establish agreed standards for minimum cohort size based on data type. In order to protect individual privacy and to acknowledge concerns about “likely” or “reasonably” re-identification, minimum cohort sizes should be agreed and communicated for different levels of data value. This would help data joining and minimise challenges around use of widely varying levels of aggregation.
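A minimum cohort size rule can be sketched in Python as follows. This is only an illustration under stated assumptions: the grouping fields, the threshold of 3, and the choice to suppress (rather than further aggregate) small cohorts are all hypothetical, since the whitepaper calls for such standards to be agreed and does not set them.

```python
from collections import Counter


def suppress_small_cohorts(records, keys, min_cohort_size):
    """Drop every record whose cohort (same values for `keys`) is below the minimum size."""

    def cohort(record):
        return tuple(record[k] for k in keys)

    sizes = Counter(cohort(r) for r in records)
    return [r for r in records if sizes[cohort(r)] >= min_cohort_size]


# Hypothetical records grouped by age band and region.
records = [
    {"age_band": "30-39", "region": "North", "visits": 4},
    {"age_band": "30-39", "region": "North", "visits": 2},
    {"age_band": "30-39", "region": "North", "visits": 7},
    {"age_band": "80-89", "region": "North", "visits": 1},
]

released = suppress_small_cohorts(records, ("age_band", "region"), min_cohort_size=3)
```

Here the single record in the 80-89 cohort is withheld because a cohort of one would point to an individual, while the cohort of three is released.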

The fifth recommendation, which is complementary to Recommendation 4, is to have agreed standards for obfuscation and perturbation. This can not only help provide confidence that data has been robustly de-identified, it can also help with the creation of minimum cohort sizes.
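One widely used form of perturbation is to add random noise to released counts. The Python sketch below uses Laplace noise, the mechanism commonly associated with differential privacy; this is a swapped-in illustrative technique, not a method the whitepaper prescribes. The `epsilon` parameter, the sensitivity of 1, and the function name are assumptions.

```python
import random


def perturb_count(true_count, epsilon, rng):
    """Return the count plus Laplace noise with scale 1/epsilon.

    Assumes each individual changes the count by at most 1 (sensitivity 1).
    The difference of two exponential variates with the same rate is
    Laplace-distributed with scale 1/rate.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rate = epsilon  # Laplace scale is 1/epsilon, so the exponential rate is epsilon
    noise = rng.expovariate(rate) - rng.expovariate(rate)
    return true_count + noise


rng = random.Random(0)  # seeded only so the sketch is repeatable
noisy_count = perturb_count(100, epsilon=1.0, rng=rng)
```

A smaller `epsilon` adds more noise, which gives stronger protection at the cost of accuracy. An agreed standard would need to fix that trade-off for each type of data.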

Development and promotion of open data enablers is the sixth recommendation. In support of Recommendation 2, in-depth guidelines should be developed on anonymisation and de-identification that, like those issued by the UK Office of the Information Commissioner, consider a balanced approach to the risk of harm resulting from any reidentification.

The final recommendation is the establishment and maintenance of a dataset of issues arising from Privacy Impact Assessments. The taskforce notes that much of the data being shared has been collected with some form of express or implied consent, for some specific purpose. Respecting this consent, while supporting sharing, will be a major challenge in establishing effective ‘privacy preserving’ frameworks.

Read the complete whitepaper here

[1]Differential privacy involves the preservation of data sets, sufficient to answer queries accurately, but reducing the likelihood of disclosing personal information. 

[2] Homomorphic encryption allows computations to be carried out on encrypted data, generating an encrypted result which, when decrypted, matches the result of operations performed on the plaintext. It can enable the chaining together of different services without exposing the data to each of those services. 
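The property described in footnote [2] can be seen in a toy Python sketch of the Paillier scheme, which is homomorphic for addition only. The scheme is chosen here purely for illustration and is not named in the whitepaper. The tiny primes and function names are assumptions, and the code is not secure and not suitable for real use.

```python
import math
import random


def generate_keys(p, q):
    """Build a toy Paillier key pair from two small primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; valid because the generator is n + 1
    return n, (lam, mu)


def encrypt(n, message, rng):
    """Encrypt an integer 0 <= message < n under public key n."""
    n_squared = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, message, n_squared) * pow(r, n, n_squared)) % n_squared


def decrypt(n, private_key, ciphertext):
    """Recover the plaintext integer from a ciphertext."""
    lam, mu = private_key
    x = pow(ciphertext, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Combine two ciphertexts so the result decrypts to the sum of the plaintexts."""
    return (c1 * c2) % (n * n)


rng = random.Random(0)
n, private_key = generate_keys(61, 53)  # tiny primes, for demonstration only
encrypted_sum = add_encrypted(n, encrypt(n, 12, rng), encrypt(n, 30, rng))
decrypted_sum = decrypt(n, private_key, encrypted_sum)
```

The party that calls `add_encrypted` never sees 12 or 30, yet the key holder recovers their sum on decryption. This is how services can be chained together without exposing the underlying data to each one.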
