The IPD-IMGT/HLA Database is a DNA repository for HLA and HLA-related genes, used by laboratories and researchers around the world, primarily for immunogenetics, immunotherapy and matching leukemia patients with potential donors. More information on the database can be found at https://www.ebi.ac.uk/ipd/imgt/hla/ and https://hla.alleles.org/nomenclature/index.html.
My role as a database curator is to curate the database itself and the data sent to it by submitters around the world. Submitters send us text files called flatfiles, which contain all of the information needed to name a new DNA sequence or confirm an existing one. Each flatfile includes a description of the sequence, how it was obtained, information on the cell it originated from, who submitted it, and the DNA sequence itself. Submissions are sent either through our online submission tool or, for some companies, separately in bulk for us to process. This data is released to the public every quarter; it is also stored locally in an Oracle database.
To address my team's lack of visualisation tools for this data, I have set out to build a Power BI report from scratch that visualises different statistics about the IPD-IMGT/HLA Database. I hope this report will be useful to my team and upper management.
Walkthrough I: Cumulative Number of Submissions
Walkthrough II: Cumulative Number of Submissions - By Source
Walkthrough III: Number of Genes/Alleles
Walkthrough IV: Processing Time