This query is used to to provide summary statistics for condition occurrence start dates (condition_occurrence_start_date) across all condition occurrence records: the mean, the standard deviation, the minimum, the 25th percentile, the median, the 75th percentile, the maximum and the number of missing values. No input is required for this query.
Input:
<None>
Sample query run:
The following is a sample run of the query.
with start_rank as
(
SELECT condition_start_date-'0001-01-01' as num_start_date,condition_start_date,
sum(1) over (partition by 1
order by condition_start_date asc rows between unbounded preceding and current row)
as rownumasc
FROM (SELECT condition_start_date FROM condition_occurrence limit 14)
),
other_stat as
(
SELECT count(condition_start_date) as condition_start_date_count,
min(condition_start_date) as condition_start_date_min,
max(condition_start_date) as condition_start_date_max,
to_date('0001-01-01', 'yyyy/mm/dd')+ cast(avg(condition_start_date-'0001-01-01')
as int) as condition_start_date_average,
stddev((condition_start_date-'0001-01-01')) as condition_start_date_stddev
FROM condition_occurrence
WHERE condition_start_date is not null
)
select
(
SELECT count(condition_start_date)
FROM condition_occurrence
WHERE condition_start_date is null
) AS condition_start_date_null_count,
* FROM other_stat,
(
SELECT to_date('0001-01-01', 'yyyy/mm/dd')+cast(avg(condition_start_date-'0001-01-01')
as int) as condition_start_date_25percentile
FROM (select *,(select count(*) from start_rank) as rowno from start_rank)
WHERE (rownumasc=cast (rowno*0.25 as int) and mod(rowno*25,100)=0) or
(rownumasc=cast (rowno*0.25 as int) and mod(rowno*25,100)>0) or
(rownumasc=cast (rowno*0.25 as int)+1 and mod(rowno*25,100)>0)
) AS condition_start_date_25percentile,
(
SELECT to_date('0001-01-01', 'yyyy/mm/dd')+cast(avg(condition_start_date-'0001-01-01')
as int) as condition_start_date_median
FROM (select *,(select count(*) from start_rank) as rowno from start_rank)
WHERE (rownumasc=cast (rowno*0.50 as int) and mod(rowno*50,100)=0) or
(rownumasc=cast (rowno*0.50 as int) and mod(rowno*50,100)>0) or
(rownumasc=cast (rowno*0.50 as int)+1 and mod(rowno*50,100)>0)
) AS condition_start_date_median,
(
SELECT to_date('0001-01-01', 'yyyy/mm/dd')+cast(avg(condition_start_date-'0001-01-01')
as int) as condition_start_date_75percentile
FROM (select *,(select count(*) from start_rank) as rowno from start_rank)
WHERE (rownumasc=cast (rowno*0.75 as int) and mod(rowno*75,100)=0) or
(rownumasc=cast (rowno*0.75 as int) and mod(rowno*75,100)>0) or
(rownumasc=cast (rowno*0.75 as int)+1 and mod(rowno*75,100)>0)
) AS condition_start_date_75percentile
Output:
Output field list:
Field |
Description |
condition_start_date_null_count |
Number of condition occurrences where start date is null |
condition_start_date_count |
Number of condition occurrence start dates |
condition_start_date_min |
The earliest start date of a condition occurrence |
condition_start_date_max |
The latest start date of a condition occurrence |
condition_start_date_average |
The average start date (spanning from the earliest to the latest date and counted by days) |
condition_start_date_stddev |
The standard deviation of start dates, in number of days (spanning from the earliest to the latest date and counted by days) |
condition_start_date_25percentile |
A start date where 25 percent of the other end dates are earlier |
condition_start_date_median |
A start date where half of the other end dates are earlier and half are later |
condition_start_date_75percentile |
A start date where 75 percent of the other end dates are earlier |
Sample output record:
Field |
Value |
condition_end_date_null_count |
0 |
condition_end_date_count |
224523674 |
condition_end_date_min |
2003-01-01 |
condition_end_date_max |
2011-11-08 |
condition_end_date_average |
2008-11-30 |
condition_end_date_stddev |
651.27 |
condition_end_date_25percentile |
2007-10-30 |
condition_end_date_median |
2009-05-06 |
condition_end_date_75percentile |
2010-05-03 |
|
|