Overview:

The IPD-IMGT/HLA Database is a DNA repository for HLA and HLA-related genes, used by laboratories and researchers around the world, primarily for immunogenetics, immunotherapy and matching leukemia patients with potential donors. More information on the database is available at https://www.ebi.ac.uk/ipd/imgt/hla/ and https://hla.alleles.org/nomenclature/index.html.

My role as a database curator is to curate both the database itself and the data sent to it by submitters around the world. Submitters send us text files called flatfiles, which contain all of the information needed to name a new DNA sequence or confirm an existing one. Each flatfile includes a description of the sequence, how it was obtained, information on the cell it originated from, who submitted it and the DNA sequence itself. Submissions are sent either through our online submission tool or, for some companies, separately in bulk for us to process. This data is released to the public every quarter; it is also stored locally in an Oracle database.

To address my team's lack of visualisation tools for this data, I set out to build a Power BI report from scratch that visualises different statistics surrounding the IPD-IMGT/HLA Database. I hope this report will be beneficial to my team and upper management.

Pages:

Completed Report

Project Description and Aims

Walkthrough I: Cumulative Number of Submissions

Walkthrough II: Cumulative Number of Submissions - By Source

Walkthrough III: Number of Genes/Alleles

Walkthrough IV: Processing Time

Walkthrough V: Pending and Waiting

Walkthrough VI: SQL Update