This query is used to to provide summary statistics for drug era end dates (drug_era_end_date) across all drug era records: the mean, the standard deviation, the minimum, the 25th percentile, the median, the 75th percentile, the maximum and the number of missing values. No input is required for this query.
Input:
<None>
Sample query run:
The following is a sample run of the query. The input parameters are highlighted in blue
--Oracle
SELECT min(tt.end_date) AS min_date
, max(tt.end_date) AS max_date
, avg(tt.end_date_num) + tt.min_date AS avg_date
, (round(stdDev(tt.end_date_num)) ) AS stdDev_days
, tt.min_date + (PERCENTILE_DISC(0.25) WITHIN GROUP( ORDER BY tt.end_date_num ) )
AS percentile_25_date
, tt.min_date + (PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY tt.end_date_num ) )
AS median_date
, tt.min_date + (PERCENTILE_DISC(0.75) WITHIN GROUP (ORDER BY tt.end_date_num ) )
AS percential_75_date
/*for RedShift
PERCENTILE_DISC(x) value from row with a
ROW_NUMBER() OVER (PARTITION BY part_col ORDER BY order_col)= CEILING(COUNT(*) OVER (PARTITION BY part_col) * x
*/
FROM
( SELECT (t.drug_era_end_date - MIN(t.drug_era_end_date) OVER()) AS end_date_num,
t.drug_era_end_date AS end_date,
MIN(t.drug_era_end_date) OVER() min_date
FROM drug_era t
) tt GROUP BY tt.min_date
;
Output:
Output field list:
Field |
Description |
drug_era_id |
A system-generated unique identifier for each drug era. |
drug_era_end_date |
The end date for the drug era constructed from the individual instance of drug exposures. It is the end date of the final continuously recorded instance of utilization of a drug. |
percentile_25_date |
|
median_date |
|
percentile_75_date |
|
Sample output record:
Field |
Description |
drug_era_id |
|
drug_era_end_date |
|
percentile_25_date |
|
median_date |
|
percentile_75_date |
|
|
|