**************************************************************
**************************************************************
*                 ****************************
*                 * INTRODUCTION TO STATA 13 *
*                 ****************************
**************************************************************
* the language is case-sensitive
**************************************************************

**************************************************************
* open a file and set the memory (only needed for large datasets)
**************************************************************
* set the working directory where the files are stored
cd "G:\Il mio Drive\Multifonte\AA 2024_25"
use easySHARE_rel8-0-0.dta, clear

* to reopen the file after raising the variable limit and clearing
* the data, matrices and Mata objects left over from earlier analyses
* (set maxvar is available in Stata/SE and Stata/MP only)
clear
clear matrix
clear mata
set maxvar 30000
use easySHARE_rel8-0-0.dta, clear
notes

***************************************************
* language: help shows the description of a command
***************************************************
help describe

***************************************************
* inspect the data
***************************************************
describe mergeid int_year, numbers

***************************************************
* browse the data matrix
***************************************************
br
br mergeid hhid eurod country

***************************************************
* sort the data matrix
***************************************************
sort mergeid wave
br mergeid wave

***************************************************
* list the cases
***************************************************
set more off
list mergeid int_year

***************************************************
* some descriptive statistics
* if introduces a condition, e.g. var==value,
* or var>value, var<value
***************************************************
sum female age wave, detail
summarize female if int_year < 2010
summarize female if int_year >= 2010

***************************************************
* one-way tables
***************************************************
tab female
tab1 country female
* fre is a user-written command and has to be installed once
ssc install fre
fre female int_year

***************************************************
* graphs for univariate descriptives
***************************************************
* categorical
*************
graph bar, over(female) bar(1, fc(gs9)) graphregion(c(white)) ///
    ylab(0(10)60) title(Gender distribution)
graph bar (count), over(female) horiz bar(1, fc(gs9)) graphregion(c(white)) ///
    title(Gender distribution)

*********
* ordinal
*********
* bars sorted by height, negative (missing) codes excluded
graph bar if isced1997>-1, over(isced1997_r, sort(1) descending) horiz ///
    bar(1, fc(gs9)) graphregion(c(white)) ///
    ylab(0(10)40) title(Distribution of educational degree)
* bars in category order, negative (missing) codes excluded
graph bar if isced1997>-1, over(isced1997_r) horiz ///
    bar(1, fc(gs9)) graphregion(c(white)) ///
    ylab(0(10)40) title(Distribution of education)
* same graph without the if condition, for comparison
graph bar, over(isced1997_r) horiz ///
    bar(1, fc(gs9)) graphregion(c(white)) ///
    ylab(0(10)40) title(Distribution of education)

************
* continuous
************
hist age if age>0, graphregion(c(white)) color(gs9) lc(black) percent

***************************************************
* two-way tables
***************************************************
tab country female, row chi2
tab2 country female wave, chi2

***************************************************
* graph: share of women interviewed, by country
***************************************************
graph bar female, over(country, label(angle(45))) bar(1, fc(gs9)) graphregion(c(white))

****************
* group means
****************
tabstat age, by(female) st(mean min p25 p50 p75 max n)
* one-sided and two-sided tests
ttest age, by(female)

*********************************
* graph: means with intervals
*********************************
bys female: egen mean_age=mean(age)
/* create variables for the mean and the confidence interval;
   note that gen and egen support different functions */
* semean() comes from the user-written egenmore package
findit egenmore
bys female: egen se_age=semean(age)
gen ci_age1=mean_age+1.96*se_age
gen ci_age2=mean_age-1.96*se_age
tab1 ci_age1 ci_age2
two (scatter mean_age female) || (rcap ci_age2 ci_age1 female, lc(black)), ///
    xscale(r(-0.5(0.5)1.5)) xlab(0 1) ///
    xtitle("gender") ytitle("age") graphregion(c(white)) ///
    legend(order(1 "mean age" 2 "C.I."))

***************************************************
* scatterplot
* household income & age
***************************************************
scatter thinc_m age if thinc_m>0 & thinc_m<400000 & age>0, ///
    msymbol(Oh) mc(gs9) jitter(5)
* quadratic fit with confidence band
two (scatter thinc_m age if thinc_m>0 & thinc_m<400000 & age>0, ///
        msymbol(Oh) mc(gs9) jitter(5)) ///
    (qfitci thinc_m age if thinc_m>0 & thinc_m<400000 & age>0)
* median spline
two (scatter thinc_m age if thinc_m>0 & thinc_m<400000 & age>0, ///
        msymbol(Oh) mc(gs9) jitter(5)) ///
    (mspline thinc_m age if thinc_m>0 & thinc_m<400000 & age>0)
* means by age
graph bar (mean) thinc_m, over(age) bar(1, fc(gs9)) graphregion(c(white))

************************************
* correlation
************************************
corr age thinc_m

************************************************
* decode value labels into strings
************************************************
decode female, gen(sex)
tab sex
tab sex, nol mi
br sex
* and back from string to labelled numeric
encode sex, gen(sesso)
tab sesso
tab sesso, nol
* drop variables
drop sesso sex
* drop cases
drop in 1/5

**************************************************************************
* generate new variables
**************************************************************************
gen sesso=female
tab sesso female
******************
* if + condition
gen sex=female if wave==1
tab sex female, mi
drop sex sesso
*****************
* dummy variables
*****************
* one dummy per category: sesso1, sesso2
tab female, gen(sesso)
tab sesso1 sesso2, mi
* a single dummy from a logical expression
gen sesso=(female==1)
tab sesso female
* sex was already dropped above, so only the sesso* variables are left to drop
drop sesso*

***************************************************
* generate a new variable by recoding, with labels
***************************************************
recode female (0=2 "male") (1=1 "female"), gen(sex)
tab sex female, nol
recode age (min/50=0) (50.1/60=1) (60.1/70=2) (70.1/80=3) (80.1/90=4) ///
    (90.1/100=5) (100.1/max=6), gen(age_cat)
tab age_cat wave, col

**********************************************************
* generate a new variable with functions (mean) by group
**********************************************************
bys female: egen age_mean=mean(age) if age!=. & age>49.99
tab age_mean female, mi
* mean over the subsample of men
egen male_age=mean(age) if female==0
tab male_age
* overwrite a variable
replace male_age=age if female==0
fre male_age
* maximum by group (country)
bys country: sum age
bys country: egen age_max=max(age)
* maximum across a set of variables: ac002d1 to ac002d7
egen ac_max=rowmax(ac002d1-ac002d7)
* number of non-missing values across ac002d1 to ac002d7
* (count() takes an expression, so ac002d1-ac002d7 would be read as a
*  subtraction; rownonmiss() takes a variable list)
egen ac_nomiss=rownonmiss(ac002d1-ac002d7)
tab ac_nomiss country
* minimum by group (country); age_max already exists, so use a new name
bys country: sum age
bys country: egen age_min=min(age)
* z-score: mean 0, sd 1
egen zage=std(age)
hist zage
sum zage

******************************************
* generate several variables at once
******************************************
tab1 ac002d1-ac002d7
forvalues x=1/7 {
    gen activity_`x'=1 if ac002d`x'==1
}

*********************
* reduce the dataset
*********************
* keep does not accept a variable list and an if condition together,
* so select the cases first and the variables second
keep if female==1
keep mergeid age_mean
clear
use easySHARE_rel8-0-0.dta, clear

****************************************************
* SORT A DATASET BY ...
***************************************************
sort mergeid wave
br mergeid wave

*******************************************************
* see the waves in which each individual is interviewed
*******************************************************
* (wave) keeps the observations in wave order within each individual
bys mergeid (wave): gen num=_n
tab num wave

************************************************
* see the number of observations per individual
************************************************
bys mergeid: egen num_tot=max(num)
tab num_tot
br mergeid wave num num_tot

*******************************************************
* see when individuals enter the survey
*******************************************************
sort mergeid wave
gen baseline=1 if mergeid!=mergeid[_n-1]
tab country wave if baseline==1

**********************************************
* create transitions over time
* transition from married or in a partnership
* to divorced in 2 consecutive waves
**********************************************
sort mergeid wave
gen divorce=1 if mar_stat==5 & (mar_stat[_n-1]==1 | mar_stat[_n-1]==2) ///
    & mergeid==mergeid[_n-1]

**********************************************
* transition from married or in a partnership
* to divorced in 2 waves, consecutive
* OR NOT CONSECUTIVE
**********************************************
gen divorce2=.
forvalues x=1/7 {
    replace divorce2=1 if mar_stat==5 ///
        & (mar_stat[_n-`x']==1 | mar_stat[_n-`x']==2) ///
        & divorce2==. & mergeid==mergeid[_n-`x']
}
tab divorce divorce2, mi

************************************************
* create a variable for everyone who moves
* into divorce at some point in time
************************************************
sort mergeid wave
gen divorce3=divorce
* spread the flag to the neighbouring observations of the same individual;
* without the mergeid check the flag would leak to the previous respondent
replace divorce3=1 if divorce3[_n+1]==1 & divorce3==. & mergeid==mergeid[_n+1]
replace divorce3=1 if divorce3[_n-1]==1 & divorce3==. & mergeid==mergeid[_n-1]
* note: the [_n+1] step reaches only one observation back per run, while the
* [_n-1] step cascades forward; bys mergeid: egen with max(divorce) would
* flag every wave of the individual in one step
br mergeid wave divorce divorce3

*********************************************************
* see changes over time in continuous variables
*********************************************************
recode eurod (min/-1=.), gen(depress)
sort mergeid wave
gen depress_change=depress-depress[_n-1] if mergeid==mergeid[_n-1]
hist depress
hist depress_change

*************************************************
* find and install user-written commands
*************************************************
findit reclink2
ssc install reclink2

************************************************
* recode several similar items at once
************************************************
tab ac002d1
recode ac002d* (min/-1=.)

*************************************************************
*
* a variable "activity"=1 if any of the activities in ac002 ==1
* loops are useful when we want to modify or analyse several
* variables at once
*
*************************************************************
gen activity=.
forvalues x=1/7 {
    replace activity=1 if ac002d`x'==1 & activity==.
}

*******
* or, one indicator per activity
forvalues x=1/7 {
    gen activity_`x'=1 if ac002d`x'==1
}

************************************************************************
************************************************************************

***************
/* LAB EXERCISE 1
***************
We want to study the psychological health of parents vs. childless
respondents interviewed in the last wave
***********************************************************************
Q1: Sample selection
Q2: How to define the outcome of interest
Q3: Which analyses are possible
***********************************************************************
***********************************************************************
*
* And what if we wanted to study the differences between
* parents who have a child living less than 1 km away
* and those whose children live farther away?
* What would the selected sample be?
*
**********************************************************************/
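
**********************************************************************
* A possible starting point for Lab Exercise 1: a sketch under stated
* assumptions, not the solution. It reuses only commands shown above.
* Assumptions to check with describe and the easySHARE codebook:
*   - ch001_ is taken to be the number of children (assumed name)
*   - negative values are taken to be missing codes, as for eurod above
*   - parent and last are names introduced here for illustration
**********************************************************************
use easySHARE_rel8-0-0.dta, clear

* Q1: keep the respondents observed in the last wave present in the file
sum wave
local last=r(max)
keep if wave==`last'

* Q2: outcome = EURO-D depression score, negative codes set to missing
recode eurod (min/-1=.), gen(depress)

* grouping variable: 0 = childless, 1 = at least one child
recode ch001_ (min/-1=.) (0=0 "childless") (1/max=1 "parent"), gen(parent)
tab parent, mi

* Q3: compare the outcome between the two groups
tabstat depress, by(parent) st(mean sd n)
ttest depress, by(parent)

* For the follow-up question the sample would be restricted to parents
* (keep if parent==1) and then split on a child-proximity variable;
* which variable holds that information is left to the codebook.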