This is my very first Power Bi project, here I am learning everything from scratch, there is a lot of bad DAX and some bad decisions that will get rectified later on, this is me learning as I go.
The first visual I wanted to create is a graph showing the cumulative number of HLA submissions received each year. I chose this as it is often used in presentations and provides a clear idea of how the database has grown in a given year compared to another year. This was originally made in Excel using a pivot table, but this would take a while to generate. I wanted to be able to create a visual that would be easy to update with new information and provide clear useful information.
To start this off, I would need to retrieve a dataset to work with. For this I created a SQL that would retrieve the necessary information for this visual.
I wanted to get the ID of each submission, the date it was submitted and who submitted, (subid, submitted and source respectively), this was taken from the submission_entry table. To make the spooled file easier to work with, I added “||chr(9)||” between each select as this would create a tab separations, a csv format. Which would be easier to import into Power BI.
The SQL statement is shown below
spool submissions.csv
set header off
set newpage none
select subid||chr(9)||submitted||chr(9)||source from submission_entry where hostdb = 'HLA';
spool off;
This produced a table which had the following format (but with a few thousand more rows):
10076560 | 04-Jan-2022 | HISTOGENETICS |
---|---|---|
10076561 | 04-Jan-2022 | HISTOGENETICS |
10076562 | 04-Jan-2022 | HISTOGENETICS |
10076563 | 04-Jan-2022 | HISTOGENETICS |
Next was to import this all into Power Bi and start making the visuals.
For this graph, I want to count the number of submissions (subid) that have been added to the database starting at 0, and cumulatively add until you get to the total number of submissions to-date.
This is the initial DAX formula that I tried:
Amount Cumulative =
CALCULATE(
COUNT(Sheet1[SubID]),
'Calendar'[Legend] <= MAX ('Calendar'[Legend])
)
I was hoping this would count the number of submissions from the start of the calendar date to the end of the calendar date, the resulting graph did not look like this, and instead seemed to count the number of submissions each year over the time period…
I then tried to play around with this by splitting this graph into different time eras and tried 1984-2009 (I should have used a slicer to just change the time rather than adding a filter in DAX):
Cumulative_Total_1984_2009 =
SUMX (
FILTER (
Sheet1,
Sheet1[SUBMITTED] >= DATE(1984, 1, 1) && Sheet1[SUBMITTED] <= DATE(2009, 12, 31)
),
CALCULATE (
COUNT ( Sheet1[SubID] ),
FILTER (
ALLSELECTED ( Sheet1[SUBMITTED] ),
Sheet1[SUBMITTED] <= MAX ( Sheet1[SUBMITTED] ) && Sheet1[SUBMITTED] >= MIN ( Sheet1[SUBMITTED] )
)
)
)