************************************
************************************
***SOCI 420: ADVANCED METHODS OF SOCIAL RESEARCH
***BIVARIATE ASSOCIATION FOR NOMINAL- AND ORDINAL-LEVEL VARIABLES (chapter 12)
************************************
************************************

************************************
***CLEAR MEMORY
************************************
clear all

************************************
***CREATE SHORTCUTS AND LOG FILE
************************************

***Shortcut for folders
global codes = "H:\course\codes"
global data = "H:\course\data"
global output = "H:\course\output"

***Start saving results window
log using "$codes\Stata12.log", replace text

************************************
***OPENING COMMANDS
************************************

***Tell Stata to not pause for "more" messages
set more off

***Open 2021 GSS
use "$data\GSS2021.dta", clear

***Complex survey design
svyset [weight=wtssnrps], strata(vstrat) psu(vpsu) singleunit(scaled)

************************************
***GENERATING VARIABLES
************************************

***Generate dummy variable for Democrats vs. Republicans
***"Independents" will be missing
tab partyid, m
tab partyid, m nolabel
gen democrat=.
replace democrat=1 if partyid>=0 & partyid<=2
replace democrat=0 if partyid>=4 & partyid<=6
label variable democrat "Political party"
label define party 1 "Democrats" 0 "Republicans"
label values democrat party
tab partyid democrat, m
tab democrat, m

***Generate dummy variable for hispanic
tab hispanic, m
tab hispanic, m nolabel
gen hisp=.
replace hisp=0 if hispanic==1
replace hisp=1 if hispanic>=2 & hispanic<=50
tab hispanic hisp, m
tab hisp, m

***Generate race/ethnicity variable
tab race, m
tab race, m nolabel
gen raceeth=.
replace raceeth=1 if race==1 & hisp==0 // non-Hispanic white
replace raceeth=2 if race==2 & hisp==0 // non-Hispanic black
replace raceeth=3 if hisp==1 // Hispanic
replace raceeth=4 if race==3 & hisp==0 // other
label variable raceeth "Race/Ethnicity"
label define race 1 "White" 2 "Black" 3 "Hispanic" 4 "Other"
label values raceeth race
tab raceeth race, m
tab raceeth hisp, m
tab raceeth, m

***Generate education group
tab educ, m
tab educ, m nolabel
gen educgr=.
replace educgr=1 if educ>=0 & educ<=11 // Less than high school
replace educgr=2 if educ==12 // High school
replace educgr=3 if educ>=13 & educ<=15 // Some college
replace educgr=4 if educ==16 // College
replace educgr=5 if educ>=17 & educ<=20 // 5+ years of college, graduate school

***Create label for variable
label variable educgr "Education group"

***Create labels for categories
label define educgr 1 "Less than high school" 2 "High school" ///
	3 "Some college" 4 "College" 5 "Graduate school"

***Assign labels for categories
label values educgr educgr

***Verify new variable
tab educgr, m
tab educ educgr, m

************************************
***ASSOCIATIONS BETWEEN NOMINAL-LEVEL VARIABLES
************************************

************************************
***PHI - Political party by ethnicity
************************************

***Remember to report column percentages
***taking into account survey weights
tab democrat hisp [aweight=wtssnrps], col nofreq // column percentages
tab democrat hisp // sample size
tab democrat hisp, m // missing cases

***Phi correlation coefficient
***Phi is designed to measure the degree
***of relation for two binary variables
***(i.e., dichotomous variables, dummy variables)
***To compute Phi, first code both binary variables as 1's and 0's,
***then estimate Pearson's r correlation
corr democrat hisp // for two 0/1 variables, Pearson's r is the same as Phi
pwcorr democrat hisp // same as above
pwcorr democrat hisp, sig // Phi with test of significance
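
***Optional check (a sketch, not part of the original exercise):
***For a 2x2 table, the absolute value of Phi equals sqrt(chi-square / N).
***This assumes the unweighted Pearson chi-square from "tab ..., chi2",
***which leaves r(chi2) and r(N) behind as stored results.
***The square root drops the sign, so compare it with the
***absolute value of the correlation reported by corr above.
quietly tab democrat hisp, chi2
di sqrt(r(chi2)/r(N)) // |Phi| computed from chi-square and sample size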

************************************
***CHI SQUARE, LAMBDA, CRAMER'S V - Political party by race/ethnicity
************************************

***Remember to report column percentages
***taking into account survey weights
tab partyid raceeth [aweight=wtssnrps], col nofreq // column percentages
tab partyid raceeth // sample size
tab partyid raceeth, m // missing cases

***Chi square
tab partyid raceeth, chi // weights not allowed
svy: tab partyid raceeth // chi square test with complex survey design (correct form)

***Cramer's V
tab partyid raceeth, V // weights not allowed

***Chi square, Cramer's V
tab educgr raceeth, chi V // weights not allowed

***Lambda
***If your Stata doesn't have the lambda command,
***type "ssc install lambda" to install it.
***ssc install lambda
***Note: When row totals are very unequal,
***Lambda can be zero even when there is an association between the variables.
***For very unequal row marginals, it's better to use
***a Chi Square based measure of association.
lambda partyid raceeth [aweight=wtssnrps]

************************************
***ASSOCIATIONS BETWEEN ORDINAL-LEVEL VARIABLES
************************************

************************************
***GAMMA - Political party by education group
************************************

***Remember to report column percentages
***taking into account survey weights
tab partyid educgr [aweight=wtssnrps], col nofreq // column percentages
tab partyid educgr // sample size
tab partyid educgr, m // missing cases

***Gamma measures the strength and pattern/direction of the association
tab partyid educgr, gamma // weights not allowed

***Test statistic: Z = gamma / ASE
***ASE: Asymptotic Standard Error
di -0.1259/0.015 // test statistic

***p-value
***The normal() function returns the area under the standard normal curve below the Z-score
***If Z is positive, p-value (one-tailed test): di 1-normal(Z)
***If Z is negative, p-value (one-tailed test): di normal(Z)
di normal(-8.3933333) // p-value

************************************
***SPEARMAN'S RHO - Political party by years of schooling
************************************

tab partyid, m
tab educ, m

***Remember to report column percentages
***taking into account survey weights
***These commands might not work, because education has too many values
tab partyid educ [aweight=wtssnrps], col nofreq // column percentages
tab partyid educ // sample size
tab partyid educ, m // missing cases

***Spearman's rho (rank correlation coefficient)
spearman partyid educ // weights not allowed

***Spearman's rho squared
di (-0.1320) * (-0.1320)
di (-0.1320)^2 // parentheses required: Stata evaluates -0.1320^2 as -(0.1320^2)

************************************
***CLOSING COMMANDS
************************************

***Save data
save "$data\Stata12.dta", replace

***Save log
log close