My aim for this project:
- Create a report for management and my team to track various statistics within the IPD-IMGT/HLA Database submissions pipeline.
- Include the visuals that are usually shown in department presentations, reducing the time spent manually rebuilding these graphics in Excel each time.
- Automate or semi-automate this process. I want to be able to run a single script that updates all of the tables with new data, giving upper management and my team quick, up-to-date information on the whole database.
- Track and compare different years, to highlight areas where the submission naming process has been streamlined as well as areas that need improving.
- Highlight areas of the database which need cleaning and maintaining.
- Ideas:
  - Track the number of HLA submissions received over the course of a year, and compare this to previous years.
  - Track the number of HLA submissions, broken down by their source (Histogenetics, DKMS, ANRI, external).
  - Track the number of submissions remaining in pending, compared to the number waiting for additional information.
  - Show a breakdown of the genes that were named each year, maybe broken down by source and year.
  - Compare the length of sequences sent in (are they full length or CDS only?).
  - Show a world map of where submissions are being sent from.
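
As a rough illustration of the first two ideas, here is a minimal Python sketch of counting submissions per year, per year and source, and cumulatively. It is not the method used in the report: the `submissions` records and all function names are hypothetical, and real data would come from an export of the submissions pipeline rather than a hard-coded list.

```python
from collections import Counter
from datetime import date
from itertools import accumulate

# Hypothetical records: (date received, source).
# In practice these would be loaded from a database export.
submissions = [
    (date(2021, 3, 14), "Histogenetics"),
    (date(2021, 9, 2), "DKMS"),
    (date(2022, 1, 20), "external"),
    (date(2022, 5, 5), "Histogenetics"),
    (date(2022, 11, 30), "ANRI"),
]


def count_by_year(records):
    """Number of submissions received in each calendar year."""
    return Counter(received.year for received, _ in records)


def count_by_year_and_source(records):
    """Number of submissions for each (year, source) pair."""
    return Counter((received.year, source) for received, source in records)


def cumulative_by_year(records):
    """Running total of submissions at the end of each year."""
    per_year = count_by_year(records)
    years = sorted(per_year)
    totals = accumulate(per_year[year] for year in years)
    return dict(zip(years, totals))


print(count_by_year(submissions))
print(count_by_year_and_source(submissions))
print(cumulative_by_year(submissions))
```

Keeping each statistic in its own small function means a single script can call them all and refresh every table in one run, which is the kind of automation described above.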
Completed Report
Walkthrough I: Cumulative Number of Submissions