Improved genetic database an “invaluable resource” for global disease research

31 January 2017

The growing popularity and use of free-access scientific databases is crucial to research into animal and human diseases, say researchers from The Pirbright Institute, who help manage one such database in partnership with scientists from the blood cancer charity Anthony Nolan.

The IPD-MHC (Immuno Polymorphism Database - Major Histocompatibility Complex) Database project is run by scientists for scientists and enables the study of MHC, the set of highly variable proteins on the surface of cells that are central to controlling the immune response.

The project works to collect, name and categorise the many varied genetic sequences of MHC proteins and is supported by the Biotechnology and Biological Sciences Research Council (BBSRC) and The European Bioinformatics Institute. It has grown significantly since its launch in 2003 and, thanks to successful international collaboration, now hosts information on 70 animal species.

Dr John Hammond, who leads the Immunogenetics Group at Pirbright and manages the database on behalf of the Institute, said: “In relation to MHC research, there are a potentially confusing array of sequences, so there is a real need for standardisation in regard to nomenclature, or the naming, of MHC sequences.

“Initiatives such as this, and others like it, provide an invaluable resource to researchers, enabling the efficient sharing of data, helping avoid duplication of effort and providing the tools required to facilitate complex analysis. All of this helps save time and money and is vital in helping strengthen our ability to understand immune responses and fight disease.”

Experts, chosen by members of the MHC committee for the relevant species, select and organise the submitted sequences and assign an official name to each allele (a particular version of a gene). This ensures the quality of the data and provides more specialised information, so the high-quality data can then be analysed with confidence.

The IPD-MHC Database is now the key resource in its field, with over 5,000 pages viewed and an average of 1,500 unique visitors per month. As the database has grown in size and complexity, however, it has created major challenges in maintaining and organising the information and in incorporating new allele submissions, of which more than 7,000 now feature on the database.

Dr Hammond said: “New DNA sequencing technologies have vastly increased the number of sequences submitted to public databases. The funding we received from BBSRC helped us develop a new version of the database, which is more flexible and better able to cope with expansion.

“We also have better analysis tools that can handle more complex requirements and will support the work of categorisation too. An article about the database and the significance of the improvements we’ve made was recently published in Nucleic Acids Research. I’m hopeful that this will further raise its profile, as the more scientists that contribute, the better the content will be. Routine updates are planned, as it is vital the project keeps pace with the growing importance and complexity of this area of research.”

More information about the IPD-MHC Database project, its collaborators and the development team is available from the project website.
