Changes

From Genome Analysis Wiki
Line 30:
Individuals who either never smoked, or on whom we have no data (e.g., someone was a former smoker but former smoking was never assessed) will be excluded from analysis.  Only cigarettes will be included in the estimate.  If preferable, repeated measures designs (longitudinal data) can use all assessments by scaling and correcting for covariates within waves of assessment, then averaging across assessments.
 
   −
For studies that collect a quantitative measure of CPD, where the respondent is free to provide any integer (e.g., 13 CPD), '''we will bin responses into the following bins: 1-10, 11-20, 21-30, 30+.''' If some study collected binned responses from the outset, and those bins happen to differ from ours (e.g., 1-5, 6-15, etc.), then we will simply use whatever bins the study has collected. Please contact Scott if your study does something completely different.
+
For studies that collect a quantitative measure of CPD, where the respondent is free to provide any integer (e.g., 13 CPD), '''we will bin responses into the following bins: 1-10, 11-20, 21-30, 31+.''' If some study collected binned responses from the outset, and those bins happen to differ from ours (e.g., 1-5, 6-15, etc.), then we will simply use whatever bins the study has collected. Please contact Scott if your study does something completely different.
 +
 
 +
Please note, however, that when we report descriptive statistics about our phenotypes we will want to report the original participant responses. Even though we'll bin the data for analysis, we'll still report quantitative CPD (when possible) when we describe each study's phenotype.
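
As an illustration of the binning rule above, here is a minimal Python sketch. The function name <code>bin_cpd</code> and the string labels are assumptions made for this example, not part of the analysis plan. It assumes never-smokers and missing values have already been excluded, and that the original quantitative value is kept alongside the bin for descriptive statistics.

```python
def bin_cpd(cpd):
    """Map an integer cigarettes-per-day response to one of the four bins.

    Hypothetical helper: the name and the label strings are illustrative only.
    """
    if cpd < 1:
        # Never-smokers and missing data are excluded before binning.
        raise ValueError("CPD must be at least 1 for included individuals")
    if cpd <= 10:
        return "1-10"
    if cpd <= 20:
        return "11-20"
    if cpd <= 30:
        return "21-30"
    return "31+"


# Keep the raw response next to the bin so descriptives can use the original value.
responses = [13, 30, 31]
binned = [(cpd, bin_cpd(cpd)) for cpd in responses]
```

Studies that collected their own bins from the outset would skip this step and use their bins as collected.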
    
=== (2) Smoking Initiation ===
 
Every study had some useable measure of whether a respondent has ever regularly smoked.  Almost all asked directly.  Some have necessary information for this variable (e.g., 100 cigs lifetime? Ever smoked every day for 2 weeks straight?).
+
Every study had some usable measure of whether a respondent has ever regularly smoked.  Almost all asked directly.  Some have necessary information for this variable (e.g., 100 cigs lifetime? Ever smoked every day for 2 weeks straight?).
   −
Note that we’re among the first groups conducting such meta-analyses, and our analysis pipeline is currently restricted to continuous traits. Until methods are developed for binary traits, it is proposed that we analyze smoking initation as a continuous trait.
+
Note that we’re among the first groups conducting such meta-analyses, and our analysis pipeline is currently restricted to continuous traits. Until methods are developed for binary traits, it is proposed that we analyze smoking initiation as a continuous trait.
    
=== (3) Pack Years ===
 
Number of cigarettes per day, divided by 20, then multiplied by the number of years the person has smoked.
+
Number of cigarettes per day, divided by 20, then multiplied by the number of years the person has smoked. For this measure please use the quantitative CPD, and not the binned responses discussed above under the CPD heading. If your study collected binned responses from the outset, please use the midpoint of the range in calculating Pack Years. For example, individuals stating they smoked 11-20 CPD would be assumed to have smoked 15.5 on average.
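
The Pack Years calculation can be sketched as follows. This is only an illustration: the function names <code>pack_years</code> and <code>bin_midpoint</code> are hypothetical, and the ten-year smoking duration is a made-up example value.

```python
def pack_years(cpd, years_smoked):
    """Cigarettes per day, divided by 20 (one pack), times years smoked."""
    return cpd / 20 * years_smoked


def bin_midpoint(low, high):
    """Midpoint of a closed range, for studies that only collected binned CPD."""
    return (low + high) / 2


# Quantitative CPD: a pack a day for 10 years.
quantitative = pack_years(20, 10)

# Binned CPD: the 11-20 bin is replaced by its midpoint, 15.5, before the calculation.
from_bin = pack_years(bin_midpoint(11, 20), 10)
```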
    
=== (4) Age of Initiation of Smoking ===
 
The age an individual first became a regular smoker.
+
The age an individual first became a regular smoker. Please check for obvious outliers and remove them (5 years old or younger).
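
A minimal sketch of the outlier check, assuming ages are held in a plain Python list. The function name <code>drop_age_outliers</code> and the sample ages are hypothetical. Only the lower cutoff stated above (5 years old or younger) is applied, since no upper cutoff is given.

```python
def drop_age_outliers(ages, youngest_allowed=6):
    """Remove ages of initiation of 5 or younger, i.e. keep 6 and above."""
    return [age for age in ages if age >= youngest_allowed]


cleaned = drop_age_outliers([4, 5, 6, 16, 30])
```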
    
=== (5) Average drinks per week, either as a current drinker or former drinker ===
 
Individuals who either never drank, or on whom we have no data (e.g., someone was a former drinker but former drinking was not assessed) will be excluded from analysis.  All types of liquor will be combined in the total estimate.  If preferable, repeated measures designs (longitudinal data) can use all assessments by scaling and correcting for covariates within waves of assessment, then averaging across assessments.   
+
Individuals who either never drank, or on whom we have no data (e.g., someone was a former drinker but former drinking was not assessed) will be excluded from analysis.  Please combine all types of liquor in the total estimate.  If preferable, repeated measures designs (longitudinal data) can use all assessments by scaling and correcting for covariates within waves of assessment, then averaging across assessments.   
 
  −
There was some cross-study variability on this measure.  Some studies specified avg drinking during a specific window, such as the 12 months or last one month; most made no such specification. Two studies forced the respondent to select ranges.
     −
=== Ordinal versus Quantitative Phenotypes ===
+
If your study forced the respondent to report ranges (e.g., 1-5, 6-10, 11-15, 16-20, etc.) please simply use the midpoint of the range. For example, if one range is 1-5 DPW, we assume they drink 3 DPW on average. Then use these midpoints in all subsequent analyses.
For studies with ordinal measures, such as questions that force the respondent to report ranges (e.g., 1-5, 6-10, 21-30 cigarettes per day), please use the midpoint of the range. These midpoints should then be treated as if they were quantitative variables in all further analysis.
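
One way to convert range responses to midpoints is sketched below. It assumes the ranges are stored as text labels of the form "low-high". The function name <code>range_midpoint</code> is hypothetical. Open-ended categories (for example a top category with no upper bound) are not covered by the guidance above, so this sketch rejects them rather than guessing a value.

```python
def range_midpoint(label):
    """Return the midpoint of a "low-high" range label as a float."""
    parts = label.split("-")
    if len(parts) != 2:
        # Open-ended or malformed labels need a study-specific decision.
        raise ValueError(f"cannot take a midpoint of {label!r}")
    low, high = float(parts[0]), float(parts[1])
    return (low + high) / 2


# The midpoints are then treated as quantitative values in later analysis.
midpoints = [range_midpoint(label) for label in ["6-10", "11-20", "21-30"]]
```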
      
== Covariates ==
 