************************************
************************************
***PARTIAL CORRELATION, MULTIPLE REGRESSION, AND CORRELATION (chapter 15)
************************************
************************************

************************************
***Clear memory
************************************
clear all

***Run only one of the two blocks below (Windows or Macintosh):
***a second "log using" fails while a log is already open

************************************
***Windows
************************************
***Start saving results window
log using "C:\course\progs\Stata15.log", replace text

***Shortcut for data folders
global data = "C:\course\data"

***Shortcut for output folders
global output = "C:\course\output"

************************************
***Macintosh
************************************
***Start saving results window
log using "/course/progs/Stata15.log", replace text

***Shortcut for data folders
global data = "/course/data"

***Shortcut for output folders
global output = "/course/output"

************************************
***Opening commands
************************************
***Tell Stata to not pause for "more" messages
set more off

***Change directory
cd "$data"

************************************
***Append different years
************************************
***Open 2016 GSS
use "GSS2016.dta", clear

***Append 2010 GSS
append using "GSS2010.dta"

***Append 2004 GSS
append using "GSS2004.dta"

***Verify year
tab year, missing
tab year, m

************************************
***Generate variables
************************************
***Generate dummy variable for Democrats vs. Republicans
***"Independents" will be missing
tab partyid, m
tab partyid, m nolabel
gen democrat=.
replace democrat=1 if partyid>=0 & partyid<=2
replace democrat=0 if partyid>=4 & partyid<=6
label variable democrat "Political party"
label define party 1 "Democrats" 0 "Republicans"
label values democrat party
tab partyid democrat, m
tab democrat, m

************************************
***Complex survey design
************************************
svyset [weight=wtssall], strata(vstrat) psu(vpsu) singleunit(scaled)

************************************
***Ordinary least squares (OLS) regression
************************************
***Respondent's income by age and years of schooling
svy: reg conrinc age educ if year==2016

***Standardized regression coefficients
***(i.e., standardized partial slopes, beta-weights)
***The beta option does not allow the use of GSS complex survey design
reg conrinc age educ if year==2016, beta

************************************
***Logarithm of income
************************************
***Dependent variable does not have normal distribution
hist conrinc if year==2016, freq normal

***Summary statistics of income
sum conrinc if year==2016, d

***Generate the logarithm of income
gen lnconrinc = ln(conrinc)

***Log of income has a distribution closer to normal
hist lnconrinc if year==2016, freq normal

***Log of respondent's income by age and years of schooling
svy: reg lnconrinc age educ if year==2016

***Standardized regression coefficients
***(i.e., standardized partial slopes, beta-weights)
***The beta option does not allow the use of GSS complex survey design
reg lnconrinc age educ if year==2016, beta

************************************
***Interpret coefficients with log of income
************************************
***When x increases by 1,
***y increases by 100*[exp(coefficient)-1] percent,
***controlling for the effects of all other independent variables

***Example of coefficient for age
di 100*(exp(0.0157926)-1)

***When coefficient has a small magnitude,
***we can use 100*coefficient
di 100*(0.0157926)

***Example of coefficient for years of education
di 100*(exp(0.1229175)-1)
di 100*(0.1229175)

************************************
***Dummy variables for age and education
************************************
***Age does not have a normal distribution
hist age if year==2016, freq normal

***Education does not have a normal distribution
hist educ if year==2016, freq normal

***Generate age group variable
***18-24; 25-34; 35-49; 50-64; 65+
egen agegr = cut(age), at(18,25,35,50,65,90)
tab agegr, m
table agegr, contents(min age max age count age)

***Generate dummy variables for age
tab agegr, gen(agegr)
tab agegr agegr1, m
tab agegr agegr2, m
tab agegr agegr3, m
tab agegr agegr4, m
tab agegr agegr5, m

***Generate dummy variables for education
***Use "degree" variable
tab degree, m
tab degree, gen(educgr)
tab degree educgr1, m
tab degree educgr2, m
tab degree educgr3, m
tab degree educgr4, m
tab degree educgr5, m

************************************
***OLS regression with log of income and dummy independent variables
************************************
***35-49 as reference group (agegr3): largest sample size
tab agegr

***High school as reference group (educgr2): largest sample size
tab degree

***Regression
svy: reg lnconrinc agegr1 agegr2 agegr4 agegr5 educgr1 educgr3 educgr4 educgr5 if year==2016

************************************
***Interpret coefficients with log of income
************************************
***When x increases by 1,
***y increases by 100*[exp(coefficient)-1] percent,
***controlling for the effects of all other independent variables

***Example of coefficient for educgr3 (junior college)
***compared to educgr2 (high school)
di 100*(exp(0.2367316)-1)

***When coefficient has a small magnitude,
***we can use 100*coefficient
di 100*(0.2367316)

***Since the coefficient for agegr1 (18-24),
***compared to agegr3 (35-49), has a large magnitude,
***we cannot use 100*coefficient
di 100*(exp(-1.166963)-1)
di 100*(-1.166963)

************************************
***Standardized regression coefficients
************************************
***(i.e., standardized partial slopes, beta-weights)
***The beta option does not allow the use of GSS complex survey design
reg lnconrinc agegr1 agegr2 agegr4 agegr5 educgr1 educgr3 educgr4 educgr5 if year==2016, beta

************************************
***Export results to Word with outreg2 command
************************************
***If your Stata doesn't have the outreg2 command,
***type "ssc install outreg2" to install it.
***Run only the outreg2 line that matches your operating system;
***running both "append" lines would add the same model twice.

************************************
***2004 model
************************************
***Coefficients
svy: reg lnconrinc agegr1 agegr2 agegr4 agegr5 educgr1 educgr3 educgr4 educgr5 if year==2004

***Export to Word
outreg2 using "$output\OLS.rtf", replace word // Windows
outreg2 using "$output/OLS.rtf", replace word // Macintosh

***Standardized coefficients
reg lnconrinc agegr1 agegr2 agegr4 agegr5 educgr1 educgr3 educgr4 educgr5 if year==2004, beta

***Export to Word
outreg2 using "$output\OLS.rtf", append word stat(beta) // Windows
outreg2 using "$output/OLS.rtf", append word stat(beta) // Macintosh

************************************
***2010 model
************************************
***Coefficients
svy: reg lnconrinc agegr1 agegr2 agegr4 agegr5 educgr1 educgr3 educgr4 educgr5 if year==2010

***Export to Word
outreg2 using "$output\OLS.rtf", append word // Windows
outreg2 using "$output/OLS.rtf", append word // Macintosh

***Standardized coefficients
reg lnconrinc agegr1 agegr2 agegr4 agegr5 educgr1 educgr3 educgr4 educgr5 if year==2010, beta

***Export to Word
outreg2 using "$output\OLS.rtf", append word stat(beta) // Windows
outreg2 using "$output/OLS.rtf", append word stat(beta) // Macintosh

************************************
***2016 model
************************************
***Coefficients
svy: reg lnconrinc agegr1 agegr2 agegr4 agegr5 educgr1 educgr3 educgr4 educgr5 if year==2016

***Export to Word
outreg2 using "$output\OLS.rtf", append word // Windows
outreg2 using "$output/OLS.rtf", append word // Macintosh

***Standardized coefficients
reg lnconrinc agegr1 agegr2 agegr4 agegr5 educgr1 educgr3 educgr4 educgr5 if year==2016, beta

***Export to Word
outreg2 using "$output\OLS.rtf", append word stat(beta) // Windows
outreg2 using "$output/OLS.rtf", append word stat(beta) // Macintosh

************************************
***CLOSING COMMANDS
************************************
***Save data
save "Stata15.dta", replace

***Save log
log close
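
************************************
***Optional sketch (not part of the original program):
***percent change from stored coefficients
************************************
***The interpretation sections of this program type the coefficients by hand.
***As an alternative, assuming the data and the svyset declaration
***from this program are still in memory, _b[varname] returns
***the coefficient from the most recent regression,
***so the percentages update if the model or the data change.
***Run these lines before "log close" to keep the results in the log.
svy: reg lnconrinc age educ if year==2016

***Percent change in income for one more year of age
di 100*(exp(_b[age])-1)

***Percent change in income for one more year of schooling
di 100*(exp(_b[educ])-1)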