Dr Paul Schofield

Experimental and informatic approaches to understanding human disease using model organisms.
Reader in Biomedical Informatics

Office Phone: +44 (0) 1223 333878, Fax: +44 (0) 1223 333840

Research Interests

Integration and exploitation of Big Data for human health

Comparing phenotypes between species can provide invaluable insights into the pathobiology and etiology of human disease. Phenotypic characterisation of, for example, mouse and zebrafish mutants can provide information that can be used to prioritise gene lists derived from human genome-wide association studies, allow the dissection of loci involved in copy-number variation lesions, and provide functional validation of disease gene candidates, as well as insights into basic biological processes. Crossing the species divide has long been a thorny problem, as human and model organism phenotypes are described using different formal ontologies and conceptual approaches. To address this, we are developing a series of ontologies, and tools that use them, to allow the seamless integration of phenotypic data between species. We are now applying semantic approaches to the integration of large public datasets, including patient electronic health records, drug effect data and the phenotypes of mutant model organisms. This work concentrates on using these data to develop new therapeutic approaches to human disease, for example through the repositioning of existing drugs.
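
As a rough illustration of how a shared ontology lets phenotypes from different species be compared, the sketch below ranks candidate genes by the overlap between a patient's phenotype profile and the profiles of mouse mutants. It is a minimal sketch under stated assumptions, not the group's actual method: the toy ontology, the term names, the gene labels and the choice of Jaccard overlap are all hypothetical and chosen only for illustration.

```python
# Toy cross-species phenotype ontology: each term maps to its parent terms.
# All term names are hypothetical and exist only for this illustration.
PARENTS = {
    "enlarged heart": ["abnormal heart morphology"],
    "thin ventricular wall": ["abnormal heart morphology"],
    "abnormal heart morphology": ["cardiovascular phenotype"],
    "abnormal gait": ["nervous system phenotype"],
    "cardiovascular phenotype": ["phenotype"],
    "nervous system phenotype": ["phenotype"],
    "phenotype": [],
}


def closure(terms):
    """Return the given terms together with all of their ancestors."""
    seen = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(PARENTS[term])
    return seen


def jaccard(profile_a, profile_b):
    """Overlap of two phenotype profiles after adding ancestor terms."""
    a, b = closure(profile_a), closure(profile_b)
    return len(a & b) / len(a | b)


def rank_genes(patient_profile, model_profiles):
    """Sort candidate genes by similarity of their mutant phenotype to the patient."""
    scores = {
        gene: jaccard(patient_profile, profile)
        for gene, profile in model_profiles.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical mouse mutant phenotypes, keyed by hypothetical gene labels.
mouse_models = {
    "gene_A": ["thin ventricular wall"],
    "gene_B": ["abnormal gait"],
}
patient = ["enlarged heart"]

ranking = rank_genes(patient, mouse_models)
for gene, score in ranking:
    print(f"{gene}: {score:.2f}")
```

Because both profiles are expanded to their ancestor terms before comparison, the patient and the gene_A mutant are judged similar even though they share no exact term; they meet at the more general heart and cardiovascular terms.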

Data sharing initiatives

Data access and integration have become central to modern biology. Drawing on our experience, we are developing, with the German Federal Radiation Protection agency (BfS) and the MELODI initiative, a public database for primary experimental and epidemiological data from radiation biology: STORE. This database will provide a platform for international data sharing and uses state-of-the-art informatics to maximise data discovery and recovery.


Prof Robert Hoehndorf, Computational Biology, King Abdullah University of Science and Technology, Saudi Arabia
Prof George Gkoutos, Centre for Computational Biology, College of Medical and Dental Sciences, University of Birmingham, UK
Prof Peter Robinson, Institut für Medizinische Genetik und Humangenetik, Charité - Universitätsmedizin Berlin, Germany
Prof John Sundberg, The Jackson Laboratory, Bar Harbor, Maine, USA
Dr Bernd Grosche, Bundesamt fuer Strahlenschutz, Neuherberg, Germany

Key Publications

Boudellioua I, Mahamad Razali RB, Kulmanov M, Hashish Y, Bajic VB, Goncalves-Serra E, Schoenmakers N, Gkoutos GV, Schofield PN, Hoehndorf R, (2017), Semantic prioritization of novel causative genomic variants, PLoS Computational Biology, 13, e1005500

Gkoutos GV, Schofield PN, Hoehndorf R, (2017), The anatomy of phenotype ontologies: principles, properties and applications, Briefings in Bioinformatics, bbx035

Schofield PN, Bard JB, (2015), Human anatomy informatics, Commentary 2.1 in Gray’s Anatomy, 41st Edition, ed. Standring S, Elsevier

Hoehndorf R, Schofield PN, Gkoutos GV, (2015), Analysis of the human diseasome using phenotype similarity between common, genetic, and infectious diseases, Scientific reports, 5, 10888

Groza T, Kohler S, Moldenhauer D et al., (2015), The Human Phenotype Ontology: Semantic Unification of Common and Rare Disease, American Journal of Human Genetics, 97, 111-124

Hoehndorf R, Schofield PN, Gkoutos GV, (2015), The role of ontologies in biological and biomedical research: a functional perspective, Briefings in Bioinformatics, 10.1093/bib/bbv011

Hoehndorf R, Gruenberger M, Gkoutos GV, Schofield PN, (2015), Similarity-based search of model organism, disease and drug effect phenotypes, Journal of Biomedical Semantics, 6, 6

Hoehndorf R, Slater L, Schofield PN, Gkoutos GV, (2015), Aber-OWL: a framework for ontology-based data access in biology, BMC bioinformatics, 16, 26

Moeller M, Hirose M, Mueller S, Roolf C, Baltrusch S, Ibrahim S, Junghanss C, Wolkenhauer O, Jaster R, Kohling R, Kunz M, Tiedge M, Schofield PN, Fuellen G, (2014), Inbred mouse strains reveal biomarkers that are pro-longevity, antilongevity or role switching, Aging Cell, 13, 729-738

Ibn-Salem J, Kohler S, Love MI et al., (2014), Deletions of chromosomal regulatory boundaries are associated with congenital disease, Genome Biology, 15, 423

Kohler S, Doelken SC, Mungall CJ et al., (2014), The Human Phenotype Ontology project: linking molecular biology and disease through phenotype data, Nucleic Acids Research, 42, D966-974

Hoehndorf R, Hancock JM, Hardy NW, Mallon AM, Schofield PN, Gkoutos GV, (2014), Analyzing gene expression data in mice with the Neuro Behavior Ontology, Mammalian Genome, 25, 32-40

Hoehndorf R, Hiebert T, Hardy NW, Schofield PN, Gkoutos GV, Dumontier M, (2014), Mouse model phenotypes provide information about human drug targets, Bioinformatics, 30, 719-725

Sundberg JP, Roopenian DC, Liu ET, Schofield PN, (2013), The Cinderella effect: searching for the best fit between mouse models and human diseases, The Journal of Investigative Dermatology, 133, 2509-2513

Hoehndorf R, Schofield PN, Gkoutos GV, (2011), PhenomeNET: a whole-phenome approach to disease gene discovery, Nucleic Acids Research, 39, e119

Plain English

Large archives of medical and genetic data are becoming more and more useful in understanding and curing diseases. These include clinical electronic health records and databases of experiments with non-human model organisms, such as the mouse. We have developed computer software that can understand and analyse large sets of clinical and experimental data using artificial intelligence and machine learning in order to identify the mutations associated with disease in humans. Discovery of the genetic variants underlying both rare and common genetic disease will help us understand the mechanisms underlying disease, with the aim of improving diagnosis and therapy.

Above: An overview of the disease–disease similarity network generated by considering the semantic relatedness between phenotypic profiles of 6220 human diseases segmented into six disease modules obtained by filtering for disease categories in Disease Ontology.

The graph is based on a force-directed layout using the similarity between diseases as attraction force. Nodes are colored according to the top-level DO category in which they fall: cyan–disease of cellular proliferation, blue–nervous system and mental disease, red–cardiovascular disease, yellow–metabolic disease, green–infectious disease, magenta–immune system disease, brown–integumentary disease, pink–musculoskeletal disease, gray–urinary system disease. From Hoehndorf et al. (2015).
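
To make the idea of a force-directed layout with similarity as the attraction force concrete, here is a minimal sketch. It is not the layout software used for the figure, which the caption does not name. The node names, similarity values and force constants are all hypothetical. Every pair of nodes repels, and each pair is also pulled together with a strength proportional to its similarity, so similar diseases settle close to each other.

```python
import math


def force_layout(nodes, similarity, steps=500, step_size=0.05, repulsion=0.01):
    """Place nodes in 2D so that more similar pairs end up closer together.

    similarity maps frozenset({u, v}) to a non-negative score (assumed input).
    """
    n = len(nodes)
    # Deterministic start: nodes evenly spaced on the unit circle.
    pos = {
        node: [math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)]
        for i, node in enumerate(nodes)
    }
    for _ in range(steps):
        force = {node: [0.0, 0.0] for node in nodes}
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[v][0] - pos[u][0]
                dy = pos[v][1] - pos[u][1]
                dist = max(math.hypot(dx, dy), 1e-3)
                sim = similarity.get(frozenset((u, v)), 0.0)
                # Attraction grows with similarity and distance;
                # repulsion keeps nodes from collapsing onto each other.
                magnitude = sim * dist - repulsion / dist ** 2
                fx = magnitude * dx / dist
                fy = magnitude * dy / dist
                force[u][0] += fx
                force[u][1] += fy
                force[v][0] -= fx
                force[v][1] -= fy
        for node in nodes:
            mx = step_size * force[node][0]
            my = step_size * force[node][1]
            length = math.hypot(mx, my)
            if length > 0.1:  # cap each move to keep the iteration stable
                mx, my = mx * 0.1 / length, my * 0.1 / length
            pos[node][0] += mx
            pos[node][1] += my
    return pos


def distance(pos, u, v):
    return math.hypot(pos[u][0] - pos[v][0], pos[u][1] - pos[v][1])


# Hypothetical diseases and similarity scores, for illustration only.
nodes = ["disease_a", "disease_b", "disease_c", "disease_d"]
similarity = {
    frozenset(("disease_a", "disease_b")): 0.9,
    frozenset(("disease_c", "disease_d")): 0.9,
    frozenset(("disease_a", "disease_c")): 0.1,
    frozenset(("disease_a", "disease_d")): 0.1,
    frozenset(("disease_b", "disease_c")): 0.1,
    frozenset(("disease_b", "disease_d")): 0.1,
}

layout = force_layout(nodes, similarity)
print(distance(layout, "disease_a", "disease_b"))
print(distance(layout, "disease_a", "disease_c"))
```

In this toy input the two highly similar pairs form two tight clusters, which is the same effect that groups related diseases into visible modules in a larger network.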

The human diseasome can be viewed and queried on