Guidance

Ministry of Justice: Data First

Data First is a pioneering data-linking, research and academic engagement programme led by the Ministry of Justice and funded by ADR UK.

Data First unlocks the potential of the wealth of data already created by the Ministry of Justice (MOJ) by making linked administrative datasets from across the justice system available for research. The programme is led by MOJ and funded by Administrative Data Research UK (ADR UK), an investment by the Economic and Social Research Council (ESRC).

Data from the courts, prison and probation services in England and Wales have been linked to enable new and innovative analysis of user journeys, interactions, and outcomes across the justice system. The programme is also enhancing the linking of justice data with other government departments, including education data from the Department for Education’s (DfE) National Pupil Database (NPD).

Data First enables researchers across government and academia to access these datasets in an ethical and responsible way via secure platforms in the ONS Secure Research Service and SAIL Databank.

By working in partnership with academic experts to facilitate and promote research in line with evidence priorities set out in the MOJ Areas of Research Interest (ARI), Data First is generating new insights to inform the development of government policy and drive real progress in improving justice outcomes.

General programme information

The Data First user guide provides further information about the programme, including the processes for accessing the data for research. The privacy and data protection statement provides information about how we use and share data.

Datasets

Data catalogues are available for all Data First datasets, providing information on the variables contained within each. These data catalogues are currently draft versions that provide basic details of each dataset and will be updated soon with final versions.

Data First has shared six datasets from administrative sources across the courts, prison and probation services in England and Wales: magistrates’ courts, the Crown Court, prisoner custodial journeys, probation services, and the family and civil courts.

The cross-justice system linking dataset can be used to join these six different datasets at a person level. This linking dataset also contains a table which can be used to join magistrates’ courts and Crown Court data at a case level.
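As a rough illustration of what joining at a person level through a linking dataset can look like, the sketch below uses Python's built-in sqlite3 module with toy data. Every table name, column name and value here is hypothetical and chosen only for illustration; the real structure and variable names are set out in the Data First data catalogues.

```python
import sqlite3

# All table names, column names and values are hypothetical examples,
# not the real Data First schema. Check the data catalogues for actual variables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE magistrates (mags_record_id TEXT, offence TEXT);
    CREATE TABLE prisons (prison_record_id TEXT, reception_date TEXT);
    CREATE TABLE linking (person_id TEXT, source TEXT, source_record_id TEXT);

    INSERT INTO magistrates VALUES ('M1', 'theft'), ('M2', 'burglary');
    INSERT INTO prisons VALUES ('P1', '2019-03-01'), ('P2', '2020-07-15');
    INSERT INTO linking VALUES
        ('person_A', 'magistrates', 'M1'),
        ('person_A', 'prisons', 'P1'),
        ('person_B', 'magistrates', 'M2');
""")


def person_level_join(connection):
    """Return (person_id, offence, reception_date) for people present in both sources."""
    query = """
        SELECT lm.person_id, m.offence, p.reception_date
        FROM linking AS lm
        JOIN magistrates AS m
            ON m.mags_record_id = lm.source_record_id
        JOIN linking AS lp
            ON lp.person_id = lm.person_id AND lp.source = 'prisons'
        JOIN prisons AS p
            ON p.prison_record_id = lp.source_record_id
        WHERE lm.source = 'magistrates'
        ORDER BY lm.person_id
    """
    return connection.execute(query).fetchall()


rows = person_level_join(conn)
print(rows)
```

In this toy example only the person who has a record in both sources appears in the result, because the linking table is used twice: once to map the court record to a person, and once to map that person to a prison record.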

Separately, data on criminal histories from the Police National Computer (PNC) have been linked to education and social care data in England from the DfE NPD as part of the MOJ-DfE data share. Please contact DataLinkingTeam@justice.gov.uk or data.sharing@education.gov.uk for the latest available metadata for the MOJ-DfE data share.

MOJ cross-justice system datasets

Applying for data access  

Data First datasets can be accessed through the ONS Secure Research Service (SRS) or SAIL Databank (except for the MOJ-DfE data share, which is only available through the ONS SRS).

Requests to access data through the ONS SRS require completion of the Secure Access to Data Form here: Application form for secure access to data

Guidance for completing the application form can be found in the Data Sharing Guidance, and the list of datasets and access routes can be found here. Further information on the process overall is included within the Data First user guide above.

To access data within the SAIL Databank, please apply through SAIL.

A register of external research projects which have been approved to use MOJ data is available to view here.

Analytical outputs

Statistical and social research publications using Data First data have been delivered by MOJ analysts or in collaboration with other government departments. Outputs have also been produced by ADR UK-funded Research Fellows. These publications can be found below:

Splink: Data linkage at scale

Through Data First, the MOJ has developed a free and open-source software library to enable data linkage at scale. This software has been used to link some of the largest datasets held by MOJ as part of Data First.

Splink is now in its third version. It is a freely available, open-source Python package that is:

  • faster and more accurate than other free tools
  • able to link large datasets of tens of millions of records or more
  • developed with advice from academic experts in data linkage
  • able to produce a wide range of interactive data visualisations that help to build effective models, explain linkage predictions, diagnose problems and quality assure models
  • compatible with multiple databases and big data processing engines, meaning it can run on a wider range of computer systems
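For readers new to data linkage, the sketch below gives a toy illustration of the general idea behind probabilistic record linkage, where agreement and disagreement on each field add or subtract evidence that two records refer to the same person. It is not Splink's API and does not use Splink; the field names, the m and u probabilities and the prior are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical parameters for illustration only (not taken from Data First or Splink):
#   m = probability the field agrees when two records are the same person
#   u = probability the field agrees when two records are different people
PARAMS = {
    "surname": (0.9, 0.01),
    "date_of_birth": (0.95, 0.001),
    "postcode": (0.8, 0.05),
}


def match_weight(agreements):
    """Sum the log2 evidence across fields for one pair of records."""
    total = 0.0
    for field, agrees in agreements.items():
        m, u = PARAMS[field]
        if agrees:
            total += math.log2(m / u)  # agreement is evidence for a match
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement is evidence against
    return total


def match_probability(weight, prior=0.001):
    """Convert a match weight into a probability, given an assumed prior match rate."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * 2 ** weight
    return posterior_odds / (1 + posterior_odds)


all_agree = match_weight({"surname": True, "date_of_birth": True, "postcode": True})
dob_differs = match_weight({"surname": True, "date_of_birth": False, "postcode": True})
print(match_probability(all_agree), match_probability(dob_differs))
```

A pair that agrees on every field receives a higher weight, and so a higher match probability, than a pair that disagrees on date of birth. In practice the parameters would be estimated from the data rather than set by hand.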

You can find out more on the Splink website, where you can download and start using Splink. You can also ask us a question or raise an issue on the public GitHub repository. We are very happy to hear from researchers interested in using the software for their work.

Awards and Recognition

Contact

Contact the Data First team at datafirst@justice.gov.uk if you would like further information or have any queries.

Published 30 June 2020
Last updated 10 May 2024
  1. General user information has been updated to reflect new datasets and linkages. Updates to the User Guide and data catalogues will follow. The order of sections of the document has changed. New contact information has been added.

  2. Splink information added.

  3. Data First Family Court data catalogue updated.

  4. Data First prisoner custodial journey data catalogue updated.

  5. Analytical outputs section added.

  6. User guide updated and Data First probation data catalogue, Data First criminal courts, prisons and probation linking data catalogue published.

  7. User guide updated and Data First Family Court data catalogue published.

  8. User guide, privacy statement, Data First magistrates' court defendant data catalogue, Data First Crown Court defendant data catalogue and Data First criminal courts and prisons linking data catalogue updated.

  9. User guide updated and Data First prisoner custodial journey data catalogue published.

  10. User guide updated and Data First linked magistrates’ and Crown Court data catalogue published.

  11. Documents updated and Data First Crown Court defendant data catalogue published.