********************
***Bayesian logistic regression
********************

***Change directory
cd "/Users/amaral/Documents/TAMU/TXRDC_workshop"

***Load data
use lbw, clear
describe low age smoke

********************
***Classical logistic regression
********************

logit low age smoke

********************
***Bayesian logistic regression
********************

***Uninformative prior: normal with mean=0 and variance=10,000
***We could also use a flat prior
***The variance has to be chosen relative to the scale of the dependent variable
***We need to specify a prior for all coefficients
set seed 14
bayesmh low age smoke, likelihood(logit) ///
	prior({low:age} {low:smoke} {low:_cons}, normal(0,10000))

***The result above is not very different from the classical logistic regression,
***because we have an uninformative prior

***Concise version (shortcut): {low:} refers to all coefficients of the low equation
set seed 14
bayesmh low age smoke, likelihood(logit) prior({low:}, normal(0,10000))

***Save results for later
bayesmh, saving(logit_bmh_mcmc, replace)
estimates store logit_bmh

***Model estimated with the bayes prefix:
set seed 14
bayes: logit low age smoke

********************
***MCMC diagnostics for all parameters
********************

***Trace plot
***Good when it looks homogeneous (no trends or drifts)
bayesgraph trace _all

***Autocorrelation plot
***Good when it drops to zero after a few lags
bayesgraph ac _all

***Density plot
***We want the overall density,
***the density for the first half and
***the density for the second half
***to be similar
bayesgraph kdensity _all

***All important graphs together
***Researchers usually use this command
***to see all graphs at once
bayesgraph diagnostics _all

***We can also generate graphs for specific parameters
bayesgraph diagnostics {low:_cons}
bayesgraph diagnostics {low:age}
bayesgraph diagnostics {low:smoke}
bayesgraph diagnostics {low:age} {low:smoke}

********************
***Scatterplot matrix
********************

***There is high correlation between the constant and age
***This makes the MCMC sampling inefficient,
***which could affect the smoke coefficient as well
bayesgraph matrix _all

********************
***Efficiency of MCMC estimates
********************

***Effective sample size (ESS)
***How many independent observations we have
***within the MCMC sample
***Efficiency = ESS / MCMC sample size
***Efficiency closer to 1 is better
***Efficiency > 0.1 is good
***Efficiency < 0.01 is a concern
***If 0.01 < efficiency < 0.1,
***we have to look at the MCSE (digits of precision)
***Do we want more digits of precision?
***It depends on the scales of the parameters being estimated
bayesstats ess

***The efficiency estimated above is not high,
***because of the correlation between age and the constant
***Maybe we should specify a more informative prior
***for some of our coefficients
***The number of independent observations available to estimate
***each coefficient (ESS) is low
***We should try to decrease the MCSE,
***for example by increasing the MCMC sample size

********************
***Posterior estimates
********************

bayesstats summary
bayesstats summary {low:}
bayesstats summary {low:smoke}

********************
***Posterior estimates for odds ratios
********************

bayesstats summary (OR_age:exp({low:age}))
bayesstats summary (OR_smoke:exp({low:smoke}))
bayesstats summary (OR_age:exp({low:age})) (OR_smoke:exp({low:smoke}))

********************
***Model with a separate prior for each coefficient
********************

bayesmh low age smoke, likelihood(logit) ///
	prior({low:age}, normal(0,10000)) ///
	prior({low:smoke}, normal(5,10)) ///
	prior({low:_cons}, flat)

********************
***Increase MCMC sample size to 100,000
********************

set seed 14
bayesmh low age smoke, likelihood(logit) ///
	prior({low:}, normal(0,10000)) ///
	mcmcsize(100000) dots
bayesgraph diagnostics _all

********************
***Change variance of prior distribution
********************

bayesmh low age smoke, likelihood(logit) prior({low:}, normal(0,10000))
bayesmh low age smoke, likelihood(logit) prior({low:}, normal(0,.1))
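
********************
***Sketch: center age to reduce its correlation with the constant
********************

***This block is an illustration added to the workshop code, not the
***workshop's own method. The scatterplot matrix section notes a high
***correlation between the constant and age, which lowers MCMC efficiency.
***One common remedy, assumed here to be appropriate for these data,
***is to center age at its sample mean before fitting the model.
***age_c is a hypothetical variable name introduced for this sketch.
summarize age, meanonly
generate double age_c = age - r(mean)

***Refit with the same uninformative prior as before
set seed 14
bayesmh low age_c smoke, likelihood(logit) prior({low:}, normal(0,10000))

***Compare ESS and the scatterplot matrix with the uncentered model
***Centering changes the interpretation of the constant only:
***it becomes the log odds at the mean age
bayesstats ess
bayesgraph matrix _all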