*******************************************************************************
*******************************************************************************
*******************************************************************************
*
* UNIVARIATE IMPUTATION
*******************************************************************************

use "C:\Users\tosi\Desktop\shareW1&W2.dta", clear
keep maxgrip mergeid country wave gender age_int hhsize interview adl

* The first step is always to ask whether any values are missing by
* construction and whether they should be imputed deterministically
* from the other information available
drop if interview!=1 | age_int<0

* Pattern of missing values
recode adl (min/-1=.)
misstable summarize
misstable patterns maxgrip adl
misstable nested maxgrip adl
misstable tree maxgrip adl

* To keep things simple we work with a single incomplete variable (maxgrip):
* missing adl values are recoded to 0 below.
* adl is the number of limitations in activities of daily living (a count).
* We can treat it as a continuous or as an ordinal variable.
mean adl
tab adl
recode adl (.=0)

* About 6000 values are missing, roughly 9%
misstable summarize

************************************************
mi set mlong
mi register imputed maxgrip
mi register regular mergeid country wave gender age_int hhsize adl
mi passive: gen age2 = age_int^2

***********************************************
***********************************************
*
* IMPUTATION MODEL 3: PMM
*
***********************************************
/* PMM = Predictive Mean Matching
***********************************************
With knn(1), mi impute pmm uses one nearest neighbor to draw from. That is,
it replaces each missing value with the observed value whose linear prediction
is closest to that of the missing value. Using only one nearest neighbor
will typically result in high variability of the MI estimates. You can
increase the number of nearest neighbors from which the imputed value is drawn.
*/

* Test of the normality assumption
mvtest normality maxgrip adl, univariate bivariate

* PMM may be preferred because it relaxes the normality assumption
* knn() = number of nearest observations (donors) used to impute each missing value
* adl enters as a categorical variable and we also add age squared
mi impute pmm maxgrip i.country wave gender age_int age2 i.adl hhsize, add(5) knn(1) rseed(2232)

**************************************
* diagnostics
**************************************
* midiagplots and mimrgns (used below) are user-written commands
* and must be installed before running this file
midiagplots maxgrip
midiagplots maxgrip, plottype(hist) separate
mi xeq 0 1 3 5: summarize maxgrip
mi describe

*************************************
* analysis model
*************************************
/* The right header column reports the number of observations used, the
average relative variance increase (RVI) due to nonresponse, the largest
fraction of missing information (FMI), a summary about parameter-specific
degrees of freedom (DF), and the overall model test that all coefficients,
excluding the constant, are equal to zero. */
mi estimate, dots: ologit adl maxgrip i.country wave i.gender age_int hhsize
mimrgns gender, cmdmargins predict(outcome(6))
mimrgns, dydx(gender) cmdmargins predict(outcome(6))
*** AME=-0.0075 ***

* For comparison: the same model on the original data (m=0, no imputed values)
mi xeq 0: ologit adl maxgrip i.country wave i.gender age_int hhsize
mi extract 0, clear
ologit adl maxgrip i.country wave i.gender age_int hhsize
margins gender, predict(outcome(6))
margins, dydx(gender) predict(outcome(6))
*** AME= -.0020 ***

/* Recall that ologit is a logistic model for ordinal outcomes, in which the
probability of observing outcome i corresponds to the probability that the
linear function of the predictors (plus random error) falls within a given
range between two cutpoints. The model therefore assumes that the effect of
the predictors is proportional, that is, the same across the different
cutpoints (for example from 0 to 1, or from 1 to 2).
*/

***********************************************
***********************************************
*
* IMPUTATION MODEL 3b: PMM
*
***********************************************
* PMM = Predictive Mean Matching, 5 nearest neighbors
***********************************************
mi set mlong
mi register imputed maxgrip
mi register regular mergeid country wave gender age_int hhsize adl age2

**********************************************
* Here adl enters as a continuous variable
mi impute pmm maxgrip i.country wave gender age_int adl hhsize, add(5) knn(5) rseed(2232)

**********************************************
* diagnostics
**********************************************
midiagplots maxgrip
midiagplots maxgrip, plottype(hist) separate
mi xeq 0 1 3 5: summarize maxgrip
mi describe
list maxgrip age_int _mi_id _mi_miss _mi_m if _mi_id==5

mi estimate, dots: ologit adl maxgrip i.country wave i.gender age_int hhsize
mimrgns gender, cmdmargins predict(outcome(6))
mimrgns, dydx(gender) cmdmargins predict(outcome(6))
*** AME=-0.0080 ***

*******************************************************************
* Since ADL is actually a count variable,
* a Poisson regression may be preferable.
* THE MARGINAL EFFECT in this type of model is the DIFFERENCE
* BETWEEN THE VALUES (OR SCORES) PREDICTED (!!) by the model
*******************************************************************
mi estimate, dots: poisson adl maxgrip i.country wave i.gender age_int hhsize
mimrgns gender, cmdmargins
mimrgns, dydx(gender) cmdmargins
*** AME=-0.865 ***

mi extract 0, clear
poisson adl maxgrip i.country wave i.gender age_int hhsize
margins gender
margins, dydx(gender)
*** AME=-0.1520 ***
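
*******************************************************************
* SKETCH (not part of the original analysis): sensitivity to knn()
* The note on PMM above says that a single nearest neighbor tends to
* give more variable MI estimates. One possible way to check this is
* to re-impute with several knn() values and compare the pooled
* coefficient of maxgrip side by side.
* Assumptions: the data in memory are the extracted original data
* (after mi extract 0, clear), so they are not mi set; the values
* 1, 5 and 10 and the names knn1, knn5, knn10 are illustrative only.
*******************************************************************
foreach k of numlist 1 5 10 {
    preserve
    mi set mlong
    mi register imputed maxgrip
    * same imputation model as 3b, only knn() changes
    mi impute pmm maxgrip i.country wave gender age_int adl hhsize, add(5) knn(`k') rseed(2232)
    * post makes the pooled results available to estimates store
    mi estimate, post: poisson adl maxgrip i.country wave i.gender age_int hhsize
    estimates store knn`k'
    restore
}
* pooled coefficient and standard error of maxgrip for each knn() value
estimates table knn1 knn5 knn10, keep(maxgrip) b(%9.4f) se(%9.4f)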