Making 8,000 data sets open quickly, and changing our ways of working to deliver this, can bring risks with it. Within Defra we have experts in understanding and managing these risks, and over the last few weeks we have been working to put in place a simple process to mitigate them.
Why are we doing this?
Defra has long been committed to open data. The Secretary of State, Liz Truss, has accelerated Defra’s transition to becoming a more open, porous organisation by challenging the Department to release 8,000 datasets in 12 months. It’s about more than publishing data though: the 8,000 datasets challenge brings with it a focus on how we use open data ourselves as a Department, and a need to engage better with those who are interested in our data and to start assessing strategically how we collect and use data.
How are we publishing our data?
We are working to ensure that all our data is published as machine readable, structured, open data, with open data certificates. Where we can’t do this (for example, if we have a historic archive of PDFs that form part of a wider archive of machine readable data) we might publish what we have in its current format, under an open licence, and work to improve it subsequently. Engaging with users of our data is going to be essential to help us continue to improve its quality.
Defra has many arm's-length bodies and each is delivering #OpenDefra within the context of its own organisation.
What are the risks?
We consider four broad areas of risk that need to be managed to make a data set open:
- The data includes personal information or other identifiers (for example passport numbers), or can be combined with other data sets to produce personally identifiable information.
- The data uses third-party data or other intellectual property rights (including patents, logos, trademarks and design rights) that Defra does not have the legal right to share.
- Publishing the data could compromise national security, defence or public safety.
- Publishing the data could compromise commercial or legal proceedings including enforcement.
We have huge quantities of data, and it is owned by many teams in our different organisations. We need to ensure data owners can quickly work out for themselves whether it is safe to publish a data set they are responsible for, and to minimise the amount of input needed from the experts where it is not required.
The Open Data Risk Assessment
The open data risk assessment is our solution to this. It is a quick and easy way for people to assess whether there is a risk in publishing a data set as open data.
The open data risk assessment is a series of checks used to ensure datasets are safe to release as open data: for example, that we are not breaching personal or other confidentiality, or infringing third-party IP rights.
Our data owners and holders across Defra will be able to use this tool, in their teams, to assess the risk associated with releasing a data set. If the risk is assessed as low they can publish the data set. For data that the team assesses as higher risk in their initial assessment, a more detailed risk assessment will be undertaken with advice from our experts.
The more detailed risk assessment will challenge the initial assessment, to see what is making the publication of this particular data higher risk, and determine whether the risks can be mitigated. This might include the redaction of certain data points, anonymisation or aggregation of data, or contract renegotiation.
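To make the two-stage flow concrete, here is a minimal sketch in Python of how the initial triage could work. It is an illustration only: the assessment template has not been published yet, so the check names, the `InitialAssessment` class and the `triage` function are all hypothetical, and the simple rule "any flagged risk area means escalate" is an assumption rather than Defra's actual scoring.

```python
from dataclasses import dataclass

# Hypothetical names paraphrasing the four broad risk areas listed in this post.
RISK_AREAS = (
    "personal_information",
    "third_party_rights",
    "security_or_safety",
    "commercial_or_legal",
)


@dataclass
class InitialAssessment:
    dataset: str
    # Maps each risk area to True if the data owner thinks the risk applies.
    flags: dict


def triage(assessment):
    """Return ("publish", []) if no risk area is flagged,
    otherwise ("detailed_assessment", [flagged areas])."""
    missing = [area for area in RISK_AREAS if area not in assessment.flags]
    if missing:
        # Assumption: every risk area must be answered before a decision is made.
        raise ValueError(f"unanswered risk areas: {missing}")

    raised = [area for area in RISK_AREAS if assessment.flags[area]]
    if raised:
        # Higher risk: hand over to the specialists, telling them which areas to challenge.
        return ("detailed_assessment", raised)
    return ("publish", [])


low_risk = InitialAssessment(
    dataset="example low-risk data set",
    flags={area: False for area in RISK_AREAS},
)
higher_risk = InitialAssessment(
    dataset="example data set with third-party content",
    flags={**{area: False for area in RISK_AREAS}, "third_party_rights": True},
)

print(triage(low_risk))
print(triage(higher_risk))
```

In this sketch the first data set can be published by the owning team straight away, while the second is passed on for the more detailed assessment along with the list of flagged areas, so the specialists can look at mitigations such as redaction, anonymisation, aggregation or contract renegotiation.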
We will be publishing the assessment template in the next two weeks and will welcome feedback.
Benefits of this approach
By pushing out the accountability for this initial risk assessment on data release we are empowering our data owners and holders across Defra to make decisions. In many instances, the local data and policy teams who have responsibility for a data asset will have a better understanding of the context in which it’s been created, and the potential risks associated with the data. The approach will enable us to release data quickly and safely at local levels, while making the most effective use of our ‘risk specialists’ (data managers, intellectual property specialists, lawyers, etc.).
We hope to remove some of the perceived barriers and excuses for not publishing data, and help our colleagues feel more comfortable that they understand the risks associated with data publishing. This will make it easier for our organisations and teams to make their data open by default.
Finally, we will encourage our data owners and holders to publish their risk assessments and details of data they decide not to publish. This will help people understand what Defra can and can’t do and why, and perhaps lead to new solutions.