************************************
* Comparing Interval and Nominal/Ordinal Variables
* Comparison of Means
* 2000 NES (from Companion)
* Attention to Hillary Clinton Feeling Thermometer (hillary) and Party ID (partyid7)
************************************

use nes2000.dta, clear

tabulate partyid7, summarize(hillary) nostandard
tabulate partyid7, summarize(hillary)

tabstat hillary, by(partyid7) stats(mean n)
tabstat hillary, by(partyid7) stats(mean n sd semean median) // p50 is same as median
tabstat hillary, by(partyid7) stats(mean n sd semean median) long format

* Note: contents() is the pre-Stata 17 syntax for table
table partyid7, contents(mean hillary n hillary)
table partyid7, contents(mean hillary n hillary) format(%9.2f) center row col

// Illustrate changes with graphs.
// The equivalent of SPSS line graphs are "dot charts" (graph dot (mean))
// (note, mean is the default)
// Bar charts can also illustrate the same effects, as in SPSS

tabulate partyid7
numlabel partyid7, remove

graph bar (mean) hillary, over(partyid7)

* Handling label problem
graph bar (mean) hillary, over(partyid7,label(angle(45))) ///
    ytitle("Average Feelings Toward Hillary") ///
    ylabel(,angle(horizontal)) scheme(s1mono)

* Boxplots are also quite nice
graph box hillary, over(partyid7, label(alt))

graph dot (mean) hillary, over(partyid7)

* More familiar perspective
graph dot (mean) hillary, over(partyid7, label(angle(45))) vertical

* Problem: Cannot connect the points in a dot plot with lines, as SPSS can
* Solution: create artificial "variables" that do not correspond to
* observations.
* Each point is the mean or percentage of a category

*********************
* Suggested approach for a line graph
* Creation of variables:
*   y - average feelings toward Hillary
*   x - party identification
* There should only be 7 observations (one mean of hillary for each
* category of party ID)

// Step 1: Create x
* Need the unique numerical values of partyid7, which range from 0 to 6
egen pid7_hillary = seq() in 1/7, from(0) to(6)
label values pid7_hillary partyid7

* Note, egen is a shorthand command for a series of generate and replace
* commands

* Alternative approach, using only generate and replace
* If there is an egen implementation, that will usually be easier
drop pid7_hillary   // remove the egen version before recreating it
gen pid7_hillary = .
forvalues i=1/7 {
    replace pid7_hillary = `i' - 1 in `i'
}
label values pid7_hillary partyid7

// Step 2: Create y
* Calculate the means for each category of partyid7
gen hillary_mean = . // initializes the y variable
forvalues i=1/7 {
    quietly summarize hillary if partyid7==(`i'-1)
    replace hillary_mean = r(mean) in `i'
}

* The forvalues command does the following:
* For i=1, summarize hillary for partyid7==0 (Strong Democrat);
* r(mean) then contains the mean of hillary for Strong Democrats
* The second line assigns r(mean) to y (hillary_mean) in the appropriate
* position (first observation)
* The forvalues command then repeats the same calculations for
* i equals 2 through 7 (partyid7 is 1 through 6), assigning the
* mean for each party identification category to the correct position

* Alternative method using matrices
drop pid7_hillary hillary_mean
global counter = 1
matrix xpid = J(7,1,.)
matrix hillary_mean = J(7,1,.)
forvalues i = 0(1)6 {
    matrix xpid[$counter,1] = `i'
    quietly summarize hillary if partyid7== `i'
    matrix hillary_mean[$counter,1] = r(mean)
    global counter = $counter + 1
}
* svmat names each new variable after the matrix plus the column number,
* so the variables are created as hillary_mean1 and xpid1
svmat hillary_mean
svmat xpid
rename hillary_mean1 hillary_mean
rename xpid1 pid7_hillary
label values pid7_hillary partyid7   // reattach the value labels lost by the drop

* Simple version of the graph
* Do this first, to get a sense of what to clean up
graph twoway line hillary_mean pid7_hillary

* Clean up graph
graph twoway connected hillary_mean pid7_hillary ///
    , xlabel(0(1)6,valuelabels angle(45)) ///
    xscale(range(-0.5 6.5)) ///
    xtitle("Party Identification") ///
    ytitle("Average Feelings Toward Hillary") scheme(s1mono)

* Control variable
table partyid7, contents(mean hillary n hillary)
table partyid7 gender, contents(mean hillary n hillary)

* Graphs
graph bar (mean) hillary, over(partyid7) over(gender)
graph dot (mean) hillary, over(partyid7) over(gender) vertical

* Generate x axis first (7 observations)
* Use pid7_hillary from above

* Generate mean variables for men and women
gen hillary_mpid = .
gen hillary_wpid = .
* Calculate the means for each category of partyid7 for Women and Men
forvalues i=1/7 {
    * Women
    quietly summarize hillary if partyid7==(`i'-1) & gender == 2
    quietly replace hillary_wpid = r(mean) in `i'
    * Men
    quietly summarize hillary if partyid7==(`i'-1) & gender == 1
    quietly replace hillary_mpid = r(mean) in `i'
}

graph twoway line hillary_wpid hillary_mpid pid7_hillary

* Clean up graph
* (x axis padded to match the 0 to 6 range of partyid7)
graph twoway line hillary_wpid hillary_mpid pid7_hillary ///
    , xscale(range(-0.5 6.5)) xlabel(0(1)6, valuelabels angle(45)) ///
    xtitle("Party Identification") ///
    ylabel(#8,angle(horizontal)) ///
    ytitle("Average Feelings Toward Hillary Clinton") ///
    clpattern(solid dash) clwidth(*2 *2) ///
    legend(pos(1) ring(0) col(1) label(1 "Women") label(2 "Men")) ///
    scheme(s1mono)

* Same graph with the x-axis labels written out by hand
graph twoway line hillary_wpid hillary_mpid pid7_hillary ///
    , xscale(range(-0.5 6.5)) ///
    xlabel(0 "Strong Democrat" 1 "Weak Democrat" ///
           2 "Leaning Democrat" 3 "Independent" ///
           4 "Leaning Republican" 5 "Weak Republican" ///
           6 "Strong Republican",angle(45)) ///
    xtitle("Party Identification") ///
    ylabel(#8,angle(horizontal)) ///
    ytitle("Average Feelings Toward Hillary Clinton") ///
    clpattern(solid dash) clwidth(*2 *2) ///
    legend(pos(1) ring(0) col(1) label(1 "Women") label(2 "Men"))