************************************
************************************
***SOCI 420: ADVANCED METHODS OF SOCIAL RESEARCH
***ASSOCIATION BETWEEN VARIABLES MEASURED AT THE INTERVAL-RATIO LEVEL (chapter 13)
************************************
************************************

************************************
***CLEAR MEMORY
************************************

clear all

************************************
***CREATE SHORTCUTS AND LOG FILE
************************************

***Shortcut for folders
global codes = "H:\course\codes"
global data = "H:\course\data"
global output = "H:\course\output"

***Start saving results window
log using "$codes\Stata13.log", replace text

************************************
***OPENING COMMANDS
************************************

***Tell Stata to not pause for "more" messages
set more off

***Open 2021 GSS
use "$data\GSS2021.dta", clear

***Complex survey design
svyset [weight=wtssnrps], strata(vstrat) psu(vpsu) singleunit(scaled)

************************************
***GENERATING VARIABLES
************************************

***Generate age group variable
egen agegr = cut(age), at(18,25,45,65,90)

***Create label for variable
label variable agegr "Age group"

***Create labels for categories
label define agecode 18 "18-24" 25 "25-44" 45 "45-64" 65 "65-89"

***Assign labels to categories of the variable
label values agegr agecode

***Verify new variable
tab agegr, m
table agegr, stat(min age) stat(max age) stat(count age)

************************************
***SCATTERPLOT - Income by age
************************************

***Scatterplot without regression line
twoway scatter conrinc age

***Scatterplot with regression line
twoway scatter conrinc age || lfit conrinc age, ///
    ytitle("Respondent's income") xtitle("Age")

***Same graph, using parentheses instead of ||
twoway (scatter conrinc age) (lfit conrinc age), ///
    ytitle("Respondent's income") xtitle("Age")

***Save graph
graph export "$output\age-income_scatter.png", replace

***Regression coefficients
***Least-squares regression model
***They can be reported in the footnote of the scatterplot
***Income = F(Age)
svy, subpop(if conrinc!=.i): reg conrinc age

************************************
***LINE GRAPH - Mean income by age
************************************

***Generate variable with mean income by age
bysort age: egen mincage=mean(conrinc)
sum mincage, d

***Line graph of income by age
twoway line mincage age [aweight=wtssnrps], ///
    ytitle("Mean respondent's income") ylabel(0(20000)100000)

***Save graph
graph export "$output\age-income_line.png", replace

***Regression coefficients
***Least-squares regression model
***They can be reported in the footnote of the line graph

***Generate age squared
gen agesq=age * age

***Income = F(Age, Age squared)
svy, subpop(if conrinc!=.i): reg conrinc age agesq

************************************
***TABLE - Mean income by age group
************************************

***Use "aweight" to get sample size by age group
tabstat conrinc [aweight=wtssnrps], by(agegr) stat(mean sd n)

***Regression coefficients
***Reference category: 45-64
***Income = F(Age groups)
svy, subpop(if conrinc!=.i): reg conrinc ib45.agegr

************************************
***PEARSON'S r
************************************

***Respondent's income, age
corr conrinc age [aweight=wtssnrps]
pwcorr conrinc age [aweight=wtssnrps] // same as above
pwcorr conrinc age [aweight=wtssnrps], sig // with significance test

***Coefficient of determination (r-squared)
di .1974^2

************************************
***CORRELATION MATRIX
************************************

***Note: educational attainment variable is ordinal, not interval-ratio

***Total number of cases
count if conrinc!=.i & age!=.i & age!=.n & educ!=.d & educ!=.n

***Respondent's income, age, education
pwcorr conrinc age educ [aweight=wtssnrps], sig

***Coefficient of determination (r-squared)
***Respondent's income and age
di .1974^2

***Coefficient of determination (r-squared)
***Respondent's income and education
di .3406^2

************************************
***CLOSING COMMANDS
************************************

***Save data
save "$data\Stata13.dta", replace

***Close log
log close