This query is used to provide summary statistics for the age across all observation records stratified by gender (gender_concept_id): the mean, the standard deviation, the minimum, the 25th percentile, the median, the 75th percentile, the maximum and the number of missing values. The age value is defined by the earliest observation date. Age is summarized for all existing gender_concept_id values.
Input:
Sample query run:
The following is a sample run of the query.
Oracle only
SELECT gender
, count(*) AS num_people
, min( age ) AS min_age
, max( age ) AS max_age
, round( avg( age ), 2 ) AS avg_age
, round( stdDev( age ), 1 ) AS stdDev_age
, PERCENTILE_DISC(0.25) WITHIN GROUP( ORDER BY age )
AS percentile_25
, PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY age )
AS median
, PERCENTILE_DISC(0.75) WITHIN GROUP (ORDER BY age )
AS percential_75
FROM /* person, gender, age */
( SELECT person_id, NVL( concept_name, 'MISSING' ) AS gender
, extract( year FROM first_observation_date ) - year_of_birth AS age
FROM -- person, first observation period date
( SELECT person_id
, min( observation_period_start_date ) AS first_observation_date
FROM observation_period
GROUP BY person_id
)
JOIN person USING( person_id )
LEFT OUTER JOIN vocabulary.concept ON concept_id = gender_concept_id
WHERE year_of_birth IS NOT NULL
)
GROUP BY gender
Output:
Output field list:
Field |
Description |
gender |
Gender concept name |
num_people |
Number of people with specific gender |
min_age |
Minimum age across observation of people with specific gender |
max_age |
Maximum age across observation of people with specific gender |
avg_age |
Average age across observation of people with specific gender |
stdDev_age |
Standard deviation of age across observation within specific gender |
percentile_25 | 25th percentile age across observation within specific gender | median | Median age across observation within specific gender | percentile_75 | 75th percentile age across observation within specific gender |
Sample output record:
Field |
Description |
gender |
MALE |
num_people |
1607472 |
min_age |
0 |
max_age |
103 |
avg_age |
40.78 |
stdDev_age |
18.60 |
percentile_25 | 29 | median | 45 | percentile_75 | 55 |
|
|