************************************
************************************
***SOCI 420: ADVANCED METHODS OF SOCIAL RESEARCH
***MEASURES OF DISPERSION (chapter 4)
************************************
************************************

************************************
***CLEAR MEMORY
************************************
clear all

************************************
***CREATE SHORTCUTS AND LOG FILE
************************************

***Shortcut for folders
global codes = "H:\course\codes"
global data = "H:\course\data"
global output = "H:\course\output"

***Start saving results window
log using "$codes\Stata04.log", replace text

************************************
***OPENING COMMANDS
************************************

***Tell Stata to not pause for "more" messages
set more off

***Open 2021 GSS
use "$data\GSS2021.dta", clear

***Complex survey design
svyset [weight=wtssnrps], strata(vstrat) psu(vpsu) singleunit(scaled)

************************************
***GENERATE VARIABLES
************************************

************************************
***Sex

***Original variable
tab sex
tab sex, m
tab sex, m nolabel

***Generate variable
generate female=.
replace female=0 if sex==1
replace female=1 if sex==2

***Verify variable
tab female, m
tab sex female, m

***Labels
label variable female "Sex" // Label for variable
label define female 0 "Male" 1 "Female" // Labels for categories
label values female female // Assign labels for categories
tab female, m

************************************
***Hispanic

***Original variable
tab hispanic
tab hispanic, m
tab hispanic, m nolabel

***Generate variable
gen hisp=.
replace hisp=0 if hispanic==1
replace hisp=1 if hispanic>=2 & hispanic<=50

***Verify variable
tab hisp, m
tab hispanic hisp, m

***Labels
label variable hisp "Hispanic" // Label for variable
label define hisp 0 "Non-Hispanic" 1 "Hispanic" // Labels for categories
label values hisp hisp // Assign labels for categories
tab hisp, m

************************************
***Race/ethnicity

***Original variable
tab race
tab race, m
tab race, m nolabel

***Generate variable
gen raceeth=.
replace raceeth=1 if race==1 & hisp==0 // Non-Hispanic white
replace raceeth=2 if race==2 & hisp==0 // Non-Hispanic black
replace raceeth=3 if hisp==1 // Hispanic
replace raceeth=4 if race==3 & hisp==0 // Other

***Verify variable
tab raceeth, m
tab raceeth race, m
tab raceeth hisp, m

***Labels
label variable raceeth "Race/Ethnicity" // Label for variable
label define racecode 1 "Non-Hispanic white" 2 "Non-Hispanic black" 3 "Hispanic" 4 "Other" // Labels for categories
label values raceeth racecode // Assign labels for categories
tab raceeth, m

************************************
***Age group

***Original variable
tab age
tab age, m
tab age, m nolabel

***Generate variable
egen agegr = cut(age), at(18,25,45,65,90)

***Verify variable
tab agegr, m
table agegr, stat(min age) stat(max age) stat(count age)

***Labels
label variable agegr "Age group" // Label for variable
label define agecode 18 "18-24" 25 "25-44" 45 "45-64" 65 "65-89" // Labels for categories
label values agegr agecode // Assign labels for categories
tab agegr, m

************************************
***Education group

***Original variable
tab educ, m
tab educ, m nolabel

***Generate variable
gen educgr=.
replace educgr=1 if educ>=0 & educ<=11 // Less than high school
replace educgr=2 if educ==12 // High school
replace educgr=3 if educ>=13 & educ<=15 // Some college
replace educgr=4 if educ==16 // College
replace educgr=5 if educ>=17 & educ<=20 // 5+ years of college, graduate school

***Verify variable
tab educgr, m
tab educ educgr, m

***Labels
label variable educgr "Education group" // Label for variable
label define educgr 1 "Less than high school" 2 "High school" 3 "Some college" 4 "College" 5 "Some graduate school" // Labels for categories
label values educgr educgr // Assign labels for categories
tab educgr, m

************************************
***Religion

***Original variable
tab relig
tab relig, m
tab relig, m nolabel

***Generate variable
gen religion=.
replace religion=1 if relig==1 //protestant
replace religion=2 if relig==2 //catholic
replace religion=3 if relig==3 //jewish
replace religion=4 if relig>=5 & relig<=13 //other
replace religion=5 if relig==4 //none

***Verify variable
tab religion, m
tab relig religion, m

***Labels
label variable religion "Religion" // Label for variable
label define relcode 1 "Protestant" 2 "Catholic" 3 "Jewish" 4 "Other" 5 "None" // Labels for categories
label values religion relcode // Assign labels for categories
tab religion, m

************************************
***Veterans

***Original variable
tab vetyears
tab vetyears, m
tab vetyears, m nolabel

***Generate variable
gen veteran=.
replace veteran=1 if vetyears>=1 & vetyears<=4 // Some years of active duty
replace veteran=0 if vetyears==0 // No active duty

***Verify variable
tab veteran, m
tab vetyears veteran, m

***Labels
label variable veteran "Veteran" // Label for variable
label define veteran 0 "Non-Veteran" 1 "Veteran" // Labels for categories
label values veteran veteran // Assign labels for categories
tab veteran, m

************************************
***Immigration attitude

***Original variable
tab letin1, m
tab letin1, m nolabel

***Generate variable
***"remain the same as it is" will be missing
gen proimmig=.
replace proimmig=1 if letin1==1 | letin1==2
replace proimmig=0 if letin1==4 | letin1==5

***Verify variable
tab proimmig, m
tab letin1 proimmig, m

***Labels
label variable proimmig "Immigration attitude" // Label for variable
label define proimmig 0 "Anti-immigration" 1 "Pro-immigration" // Labels for categories
label values proimmig proimmig // Assign labels for categories
tab proimmig, m

************************************
***Political party

***Original variable
tab partyid, m
tab partyid, m nolabel

***Generate variable
***"Independents" will be missing
gen democrat=.
replace democrat=1 if partyid>=0 & partyid<=2
replace democrat=0 if partyid>=4 & partyid<=6

***Verify variable
tab democrat, m
tab partyid democrat, m

***Labels
label variable democrat "Political party" // Label for variable
label define party 1 "Democrats" 0 "Republicans" // Label for categories
label values democrat party // Assign labels for categories
tab democrat, m

************************************
***INCOME BY CATEGORIES OF ONE VARIABLE
************************************

***Income
tabstat conrinc [aweight=wtssnrps], stat(min p25 p50 p75 max iqr mean sd)

***Income by sex
tabstat conrinc [aweight=wtssnrps], by(female) stat(min p25 p50 p75 max iqr mean sd)

***Income by race/ethnicity
tabstat conrinc [aweight=wtssnrps], by(raceeth) stat(min p25 p50 p75 max iqr mean sd)

***Income by age group
tabstat conrinc [aweight=wtssnrps], by(agegr) stat(min p25 p50 p75 max iqr mean sd)

************************************
***INCOME BY COMBINATIONS OF MORE THAN ONE VARIABLE
************************************

***Income by sex and race/ethnicity
table raceeth female [aweight=wtssnrps], stat(min conrinc) stat(p25 conrinc) ///
    stat(p50 conrinc) stat(p75 conrinc) ///
    stat(max conrinc) stat(iqr conrinc) ///
    stat(mean conrinc) stat(sd conrinc)

***Income by sex and age group
table agegr female [aweight=wtssnrps], stat(min conrinc) stat(p25 conrinc) ///
    stat(p50 conrinc) stat(p75 conrinc) ///
    stat(max conrinc) stat(iqr conrinc) ///
    stat(mean conrinc) stat(sd conrinc)

************************************
***INCOME WITH COMPLEX SAMPLE DESIGN
************************************

***No weight
mean conrinc if conrinc!=.
estat sd

***Weight
***It corrects the mean
mean conrinc if conrinc!=. ///
    [aweight=wtssnrps]
estat sd

***Complex survey design
***It corrects the mean, standard error, and standard deviation
svy, subpop(if conrinc!=.): mean conrinc
estat sd

***Income by sex
svy, subpop(if conrinc!=.): mean conrinc, over(female)
estat sd

***Income by race/ethnicity
svy, subpop(if conrinc!=.): mean conrinc, over(raceeth)
estat sd

***Income by age group
svy, subpop(if conrinc!=.): mean conrinc, over(agegr)
estat sd

***Income by sex and race/ethnicity
svy, subpop(if conrinc!=.): mean conrinc, over(female raceeth)
estat sd
svy, subpop(if conrinc!=.): mean conrinc, over(raceeth female)
estat sd

***Income by sex and age group
svy, subpop(if conrinc!=.): mean conrinc, over(female agegr)
estat sd
svy, subpop(if conrinc!=.): mean conrinc, over(agegr female)
estat sd

************************************
***BOXPLOT
************************************

***Income
graph box conrinc [aweight=wtssnrps], ytitle(Respondents' income)
graph hbox conrinc [aweight=wtssnrps], ytitle(Respondents' income)

***Income by sex
graph hbox conrinc [aweight=wtssnrps], over(female) ytitle(Respondents' income)

***Income by race/ethnicity
graph hbox conrinc [aweight=wtssnrps], over(raceeth) ytitle(Respondents' income)

***Income by age group
graph hbox conrinc [aweight=wtssnrps], over(agegr) ytitle(Respondents' income)

***Income by sex and race/ethnicity
graph hbox conrinc [aweight=wtssnrps], over(female) over(raceeth) ytitle(Respondents' income)
graph hbox conrinc [aweight=wtssnrps], over(raceeth) over(female) ytitle(Respondents' income)

***Income by sex and age group
graph hbox conrinc [aweight=wtssnrps], over(female) over(agegr) ytitle(Respondents' income)
graph hbox conrinc [aweight=wtssnrps], over(agegr) over(female) ytitle(Respondents' income)

************************************
***CLOSING COMMANDS
************************************

***Save data
save "$data\Stata04.dta", replace

***Save log
log close