Week 7: Ground Game

Monday, October 21, 2024
14 Days until Presidential Election

Two more weeks until Election Day! Right now, both candidates are making last-ditch efforts to attract undecided voters. Trump and Harris are jumping between battleground states (and other key electorates), holding media appearances, rallies, fundraisers, and other campaign events. The New York Times reports that Harris has campaigned at 39 events since September 1, while Trump has campaigned at 59. Whether this discrepancy affects vote share is the focus of this week’s post.

Binomial Logit Simulations and Probabilistic Models

Up until this point, I have generally used linear regression models to predict party vote shares and turnout. Over the weeks, a couple of problems have consistently appeared. Most notably, linear models place no bounds on their predictions, so predicted vote shares can add up to over 100%. Binomial logit regressions, by contrast, keep predicted probabilities between 0 and 1. What’s more, binomial logit regressions naturally express the odds of one outcome over another, which is most helpful in determining who is most likely to win an election.
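To make the bounded-output point concrete, here is a minimal sketch in Python (it is not the code behind this post). It compares a linear prediction with a logistic one over a range of poll support values. The `intercept` and `slope` defaults are made-up illustrative numbers, not estimates from my models.

```python
import math


def linear_prediction(poll_support, intercept=-0.2, slope=0.025):
    """Linear model: nothing stops the output from leaving the 0-1 range."""
    return intercept + slope * poll_support


def logit_prediction(poll_support, intercept=-6.0, slope=0.12):
    """Binomial logit: the inverse-logit link maps any input into (0, 1)."""
    log_odds = intercept + slope * poll_support
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical poll support values, in percentage points.
for poll in (0, 25, 50, 75, 100):
    print(poll, round(linear_prediction(poll), 3), round(logit_prediction(poll), 3))
```

With these illustrative coefficients, the linear prediction drops below 0 at low poll support and climbs above 1 at high poll support, while the logistic prediction always stays strictly between the two.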

The above charts show, for each state, the relationship between hypothetical poll support for each party and the probability that a member of the state’s voting-eligible population votes for that party. Comparing North Carolina and Georgia, we see two different non-linear relationships. In Georgia, as poll support for either party increases, the probability of an eligible voter voting for that party increases relatively gradually. In North Carolina, as poll support for Democrats increases, the probability of an eligible voter voting for Democrats increases dramatically (the relationship for Republicans appears more gradual and closer to linear).

In Georgia, my home state, the voting eligible population increases linearly with each year.

The distribution of predicted draws on the win margin for Trump in Georgia shows a firm lead with a range of around 4 points in his favor.

We have simulated fluctuations in the probability that a member of the voting-eligible population votes for each party by placing a prior on that probability whose spread is the standard deviation of the party’s polls. The resulting distribution shows an incredibly close race between Harris and Trump, but one skewed toward Trump, suggesting a Trump victory.
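The simulation idea can be sketched as follows, in Python and under stated assumptions; this is not the code that produced the figures in this post. Every input in the example call (the voting-eligible population, the per-voter probabilities, and the standard deviations) is a hypothetical placeholder. I am also assuming a normal prior, and I approximate each binomial draw with a normal distribution of the same mean and variance, which is reasonable when the voting-eligible population is in the millions.

```python
import math
import random


def simulate_margins(vep, p_dem, p_rep, sd_dem, sd_rep, n_sims=10_000, seed=1347):
    """Simulate Republican-minus-Democratic margins, in points of the two-party vote."""
    rng = random.Random(seed)
    margins = []
    for _ in range(n_sims):
        # Draw each party's per-voter probability from its prior, clamped to [0, 1].
        prob_dem = min(max(rng.gauss(p_dem, sd_dem), 0.0), 1.0)
        prob_rep = min(max(rng.gauss(p_rep, sd_rep), 0.0), 1.0)
        # Normal approximation to Binomial(vep, prob) for each party's vote count.
        votes_dem = max(rng.gauss(vep * prob_dem, math.sqrt(vep * prob_dem * (1 - prob_dem))), 0.0)
        votes_rep = max(rng.gauss(vep * prob_rep, math.sqrt(vep * prob_rep * (1 - prob_rep))), 0.0)
        total = votes_dem + votes_rep
        if total > 0:
            margins.append(100.0 * (votes_rep - votes_dem) / total)
    return margins


# Hypothetical inputs, chosen only to illustrate a close race.
margins = simulate_margins(vep=7_800_000, p_dem=0.30, p_rep=0.31, sd_dem=0.03, sd_rep=0.03)
rep_win_share = sum(m > 0 for m in margins) / len(margins)
print(f"mean margin: {sum(margins) / len(margins):.2f} points")
print(f"share of simulations won by the Republican: {rep_win_share:.2%}")
```

The share of simulations with a positive margin is a rough stand-in for a win probability, and the spread of `margins` is what a histogram of predicted draws would display.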

Field Offices and Campaign Events

Table 1: Predictors of Obama and Romney Field Office Placement (2012)
	Model 1 (Obama)	Model 2 (Romney)
(Intercept)	−0.340	0.001
	(0.196)	(0.079)
romney12fo	2.546
	(0.114)
swingTRUE	0.0006	−0.012
	(0.055)	(0.011)
core_repTRUE	0.007
	(0.061)
battleTRUE	0.541	0.014
	(0.096)	(0.042)
medage08	−0.0003	−0.0009
	(0.003)	(0.001)
pop2008	0.0000007	−7e−08
	(4e−08)	(2e−08)
medinc08	−0.000002	0.000001
	(0.000001)	(0.0000006)
black	0.003	0.00005
	(0.001)	(0.0005)
hispanic	0.0002	0.0008
	(0.001)	(0.0006)
pc_less_hs00	0.506	−0.130
	(0.259)	(0.112)
pc_degree00	0.951	0.305
	(0.223)	(0.097)
as.factor(state)Arizona	−0.028	−0.050
	(0.156)	(0.067)
as.factor(state)Arkansas	0.076	0.001
	(0.090)	(0.039)
as.factor(state)California	−0.076	−0.099
	(0.104)	(0.045)
as.factor(state)Colorado	0.163	−0.173
	(0.094)	(0.041)
as.factor(state)Connecticut	0.036	−0.145
	(0.200)	(0.086)
as.factor(state)Delaware	0.207	−0.135
	(0.309)	(0.134)
as.factor(state)Florida	−0.290	0.244
	(0.091)	(0.039)
as.factor(state)Georgia	0.029	−0.018
	(0.077)	(0.033)
as.factor(state)Hawaii	0.179	−0.126
	(0.272)	(0.117)
as.factor(state)Idaho	0.154	−0.054
	(0.109)	(0.047)
as.factor(state)Illinois	0.117	−0.033
	(0.088)	(0.038)
as.factor(state)Indiana	0.140	−0.032
	(0.090)	(0.039)
as.factor(state)Iowa	0.081	−0.115
	(0.081)	(0.035)
as.factor(state)Kansas	0.147	−0.046
	(0.091)	(0.039)
as.factor(state)Kentucky	0.105	0.008
	(0.084)	(0.036)
as.factor(state)Louisiana	−0.004	−0.0009
	(0.092)	(0.040)
as.factor(state)Maine	0.274	−0.088
	(0.150)	(0.065)
as.factor(state)Maryland	−0.047	−0.074
	(0.128)	(0.055)
as.factor(state)Massachusetts	−0.118	−0.110
	(0.163)	(0.070)
as.factor(state)Michigan	−0.083	0.175
	(0.093)	(0.040)
as.factor(state)Minnesota	0.271	−0.073
	(0.092)	(0.040)
as.factor(state)Mississippi	−0.034	0.0002
	(0.087)	(0.038)
as.factor(state)Missouri	0.040	0.059
	(0.085)	(0.037)
as.factor(state)Montana	0.167	−0.050
	(0.103)	(0.045)
as.factor(state)Nebraska	0.167	−0.031
	(0.095)	(0.041)
as.factor(state)Nevada	−0.054	0.208
	(0.143)	(0.062)
as.factor(state)New Hampshire	0.091	0.178
	(0.178)	(0.077)
as.factor(state)New Jersey	−0.166	−0.087
	(0.137)	(0.059)
as.factor(state)New Mexico	0.093	0.080
	(0.128)	(0.055)
as.factor(state)New York	−0.019	−0.056
	(0.097)	(0.042)
as.factor(state)North Carolina	0.230	0.054
	(0.084)	(0.036)
as.factor(state)North Dakota	0.163	−0.032
	(0.102)	(0.044)
as.factor(state)Ohio	0.305	−0.031
	(0.084)	(0.036)
as.factor(state)Oklahoma	0.113	−0.022
	(0.092)	(0.040)
as.factor(state)Oregon	0.270	−0.093
	(0.115)	(0.050)
as.factor(state)Pennsylvania	−0.351	0.111
	(0.089)	(0.039)
as.factor(state)Rhode Island	0.117	−0.123
	(0.246)	(0.106)
as.factor(state)South Carolina	−0.003	−0.027
	(0.101)	(0.044)
as.factor(state)South Dakota	0.156	−0.033
	(0.097)	(0.042)
as.factor(state)Tennessee	0.081	0.005
	(0.087)	(0.038)
as.factor(state)Texas	0.044	−0.037
	(0.082)	(0.035)
as.factor(state)Utah	0.036	0.060
	(0.124)	(0.054)
as.factor(state)Vermont	0.146	−0.080
	(0.158)	(0.068)
as.factor(state)Virginia	−0.396	0.027
	(0.083)	(0.036)
as.factor(state)Washington	0.280	−0.122
	(0.112)	(0.048)
as.factor(state)West Virginia	0.142	0.003
	(0.099)	(0.043)
as.factor(state)Wyoming	0.160	−0.064
	(0.135)	(0.058)
romney12fo × swingTRUE	−0.765
	(0.116)
romney12fo × core_repTRUE	−1.875
	(0.131)
obama12fo		0.374
		(0.020)
core_demTRUE		0.004
		(0.027)
obama12fo × swingTRUE		−0.081
		(0.020)
obama12fo × core_demTRUE		−0.164
		(0.023)
Num.Obs.	3110	3110
R2	0.712	0.651
R2 Adj.	0.706	0.644
AIC	4851.7	−366.8
BIC	5226.3	7.9
Log.Lik.	−2363.855	245.384
RMSE	0.52	0.22

Table 1: Effects of Field Offices on Turnout and Vote Share
	Model 1 (Turnout)	Model 2 (Democratic vote share)
(Intercept)	0.029	0.022
	(0.002)	(0.003)
dummy_fo_change	0.004	0.009
	(0.001)	(0.002)
battleTRUE	0.024	0.043
	(0.002)	(0.003)
as.factor(state)Arizona	−0.012	0.0004
	(0.005)	(0.007)
as.factor(state)Arkansas	−0.026	−0.055
	(0.003)	(0.004)
as.factor(state)California	−0.021	0.020
	(0.003)	(0.005)
as.factor(state)Colorado	−0.024	−0.035
	(0.003)	(0.005)
as.factor(state)Connecticut	−0.022	0.008
	(0.006)	(0.010)
as.factor(state)Delaware	−0.001	0.033
	(0.010)	(0.015)
as.factor(state)District of Columbia	0.035	−0.002
	(0.017)	(0.026)
as.factor(state)Florida	−0.035	−0.048
	(0.003)	(0.005)
as.factor(state)Georgia	−0.001	0.002
	(0.002)	(0.004)
as.factor(state)Hawaii	−0.021	0.069
	(0.009)	(0.013)
as.factor(state)Idaho	−0.023	0.005
	(0.003)	(0.005)
as.factor(state)Illinois	−0.029	−0.004
	(0.003)	(0.004)
as.factor(state)Indiana	−0.030	−0.010
	(0.003)	(0.004)
as.factor(state)Iowa	−0.038	−0.039
	(0.003)	(0.005)
as.factor(state)Kansas	−0.035	−0.009
	(0.003)	(0.004)
as.factor(state)Kentucky	−0.029	−0.029
	(0.003)	(0.004)
as.factor(state)Louisiana	−0.006	−0.014
	(0.003)	(0.004)
as.factor(state)Maine	−0.030	0.007
	(0.005)	(0.007)
as.factor(state)Maryland	0.005	0.020
	(0.004)	(0.006)
as.factor(state)Massachusetts	−0.005	−0.007
	(0.005)	(0.007)
as.factor(state)Michigan	−0.035	−0.019
	(0.003)	(0.004)
as.factor(state)Minnesota	−0.021	0.0008
	(0.003)	(0.004)
as.factor(state)Mississippi	0.002	0.017
	(0.003)	(0.004)
as.factor(state)Missouri	−0.037	−0.045
	(0.003)	(0.004)
as.factor(state)Montana	−0.015	−0.019
	(0.003)	(0.005)
as.factor(state)Nebraska	−0.020	0.008
	(0.003)	(0.004)
as.factor(state)Nevada	−0.039	−0.039
	(0.005)	(0.007)
as.factor(state)New Hampshire	−0.038	−0.032
	(0.006)	(0.009)
as.factor(state)New Jersey	−0.018	0.023
	(0.004)	(0.006)
as.factor(state)New Mexico	−0.032	−0.008
	(0.004)	(0.006)
as.factor(state)New York	−0.035	0.019
	(0.003)	(0.004)
as.factor(state)North Carolina	0.004	−0.014
	(0.003)	(0.004)
as.factor(state)North Dakota	−0.009	0.002
	(0.003)	(0.005)
as.factor(state)Ohio	−0.049	−0.041
	(0.003)	(0.005)
as.factor(state)Oklahoma	−0.046	−0.026
	(0.003)	(0.004)
as.factor(state)Oregon	−0.033	0.006
	(0.003)	(0.005)
as.factor(state)Pennsylvania	−0.050	−0.047
	(0.003)	(0.005)
as.factor(state)Rhode Island	−0.009	0.007
	(0.008)	(0.012)
as.factor(state)South Carolina	0.014	0.013
	(0.003)	(0.005)
as.factor(state)South Dakota	−0.044	−0.002
	(0.003)	(0.004)
as.factor(state)Tennessee	−0.033	−0.048
	(0.003)	(0.004)
as.factor(state)Texas	−0.025	−0.009
	(0.002)	(0.003)
as.factor(state)Utah	−0.018	−0.015
	(0.004)	(0.006)
as.factor(state)Vermont	−0.025	0.035
	(0.005)	(0.007)
as.factor(state)Virginia	−0.014	−0.033
	(0.003)	(0.005)
as.factor(state)Washington	−0.009	0.009
	(0.003)	(0.005)
as.factor(state)West Virginia	−0.043	−0.044
	(0.003)	(0.005)
as.factor(state)Wisconsin	−0.046	−0.037
	(0.003)	(0.005)
as.factor(state)Wyoming	−0.021	−0.011
	(0.004)	(0.006)
as.factor(year)2012	−0.021	−0.045
	(0.0007)	(0.001)
dummy_fo_change × battleTRUE	−0.002	0.007
	(0.002)	(0.003)
Num.Obs.	6224	6224
R2	0.424	0.473
R2 Adj.	0.419	0.469
AIC	−28783.2	−23658.9
BIC	−28412.7	−23288.4
Log.Lik.	14446.586	11884.440
RMSE	0.02	0.04

Trump Field Offices	Clinton Field Offices	Romney Field Offices	Obama Field Offices
165	538	283	791

For now, we return to linear regressions to evaluate how state and county demographics relate to the presence of field offices for the Obama (Model 1) and Romney (Model 2) campaigns in 2012. The Obama model shows that, on average, each additional Romney field office in a county is associated with about 2.5 more Obama field offices (coefficient 2.546), although the negative interaction terms indicate a weaker relationship in swing and core Republican states. Both campaigns were more likely to have field offices in counties with higher rates of degree attainment. The Obama campaign was also more likely than the Romney campaign to have field offices in counties with higher shares of residents with less than a high school education. We could analyze the relationships between these campaigns and the demographics of the counties they operate in endlessly, but the models are mainly useful in giving color to the idea that the decision to set up a field office in a particular place is intentional.
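Because the Obama model includes interaction terms, the association between Romney and Obama field offices differs by state type. A small Python sketch of that arithmetic, using only coefficients copied from Model 1 of the field office table, is below. The function name `obama_slope` is my own label, and I am assuming the baseline category is counties in states that are neither swing nor core Republican.

```python
# Coefficients copied from Model 1 (Obama field offices) in the table above.
ROMNEY_FO = 2.546               # romney12fo
ROMNEY_FO_X_SWING = -0.765      # romney12fo x swingTRUE
ROMNEY_FO_X_CORE_REP = -1.875   # romney12fo x core_repTRUE


def obama_slope(swing=False, core_rep=False):
    """Expected change in Obama offices per additional Romney office in a county."""
    return ROMNEY_FO + ROMNEY_FO_X_SWING * swing + ROMNEY_FO_X_CORE_REP * core_rep


print("baseline states:       ", round(obama_slope(), 3))
print("swing states:          ", round(obama_slope(swing=True), 3))
print("core Republican states:", round(obama_slope(core_rep=True), 3))
```

Read this way, the roughly 2.5-to-1 relationship applies to the baseline states; the implied slope is about 1.8 in swing states and about 0.7 in core Republican states.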

In another set of models, we can evaluate the effects of the presence of field offices on turnout (Model 1) and Democratic vote share (Model 2). On average, Democratic vote share and turnout were marginally higher in the counties of battleground states with field offices. We can also see the discrepancy in the sheer number of field offices between the 2012 and 2016 campaigns: the Democratic candidates, Obama and Clinton, had far more field offices than their Republican opponents (791 to 283 in 2012 and 538 to 165 in 2016).

Table 2: Can the number of campaign events predict state-level vote share?
	Model 1 (Democratic vote share)	Model 2 (Republican vote share)
(Intercept)	48.189	51.810
	(0.369)	(0.369)
n_ev_D	0.126
	(0.034)
ev_diff_D_R	0.105
	(0.067)
n_ev_R		−0.126
		(0.034)
ev_diff_R_D		0.230
		(0.078)
Num.Obs.	714	714
R2	0.021	0.021
R2 Adj.	0.019	0.019
AIC	4910.1	4910.2
BIC	4928.4	4928.5
Log.Lik.	−2451.039	−2451.089
F	7.778	7.776
RMSE	7.49	7.49

The above maps visualize the locations of campaign events held by Democrats and Republicans in the current election and the past two. Across all three maps, we see a concentration of events in the Northeast. We also see that from 2016 to 2024, the number of events in Florida greatly diminishes; this is likely because Florida is now considered much less of a swing state than it used to be. There are fewer events on the whole in 2020 because of the pandemic. By 2024, campaign events occur almost entirely in battleground states or in major fundraising centers for each party (e.g., New York and California for the Democrats).

Take a look at the summary statistics of the model that predicts vote share from the number of campaign events, and you will find sizable coefficients for the Democrats when they hold more events than Republicans (Model 1) and for the Republicans when they are the ones with the positive margin (Model 2). This apparent predictive power is quickly humbled by an abysmal R-squared value of 0.021 in both models. For this reason, I leave the number of campaign events out of my forecasting model.
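To put those coefficients in perspective, here is a short Python sketch that plugs hypothetical event counts into Model 1 of Table 2. The coefficients and RMSE are copied from the table. The event counts are invented for illustration, and I am assuming that `ev_diff_D_R` is simply Democratic events minus Republican events.

```python
# Values copied from Table 2, Model 1 (Democratic vote share).
INTERCEPT = 48.189
COEF_N_EV_D = 0.126       # n_ev_D
COEF_EV_DIFF_D_R = 0.105  # ev_diff_D_R
RMSE = 7.49


def predicted_dem_share(n_ev_d, n_ev_r):
    """Predicted Democratic vote share, assuming ev_diff_D_R = n_ev_d - n_ev_r."""
    return INTERCEPT + COEF_N_EV_D * n_ev_d + COEF_EV_DIFF_D_R * (n_ev_d - n_ev_r)


# Hypothetical state: 10 Democratic events and 5 Republican events.
no_events = predicted_dem_share(0, 0)
with_events = predicted_dem_share(10, 5)
print(f"predicted share with no events: {no_events:.3f}")
print(f"predicted share with 10 D and 5 R events: {with_events:.3f}")
print(f"shift from events: {with_events - no_events:.3f} points (model RMSE: {RMSE})")
```

In this hypothetical, the events move the prediction by under 2 points, while the model's typical error is about 7.5 points, which is another way of seeing why an R-squared of 0.021 makes the variable hard to justify in a forecast.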

Here, I visualize the lead that one party has over the other in the number of campaign events held in key battleground states. I look at the current campaign and the past two presidential campaign years. In 2016 and 2020, the Republicans generally held more events in battleground states than the Democrats; in fact, in 2016, they held more events than the Democrats in every battleground state. By 2024, though, Democrats are holding more events in these states. At the same time, I am still not including campaign event lead in my predictive model, and I do not believe that Democrats having more events in these states is determinative or even indicative of a win for them.

Updating Model Predictions

state	electors	winner
Alabama	9	Republican
Alaska	3	Republican
Arizona	11	Republican
Arkansas	6	Republican
California	54	Democrat
Colorado	10	Democrat
Connecticut	7	Democrat
Delaware	3	Democrat
District Of Columbia	3	Democrat
Florida	30	Republican
Georgia	16	Republican
Hawaii	4	Democrat
Idaho	4	Republican
Illinois	19	Democrat
Indiana	11	Republican
Iowa	6	Republican
Kansas	6	Republican
Kentucky	8	Republican
Louisiana	8	Republican
Maine	4	Democrat
Maryland	10	Democrat
Massachusetts	11	Democrat
Michigan	15	Republican
Minnesota	10	Democrat
Mississippi	6	Republican
Missouri	10	Republican
Montana	4	Republican
Nebraska	5	Republican
Nevada	6	Republican
New Hampshire	4	Democrat
New Jersey	14	Democrat
New Mexico	5	Democrat
New York	28	Democrat
North Carolina	16	Republican
North Dakota	3	Republican
Ohio	17	Republican
Oklahoma	7	Republican
Oregon	8	Democrat
Pennsylvania	19	Republican
Rhode Island	4	Democrat
South Carolina	9	Republican
South Dakota	3	Republican
Tennessee	11	Republican
Texas	40	Republican
Utah	6	Republican
Vermont	3	Republican
Virginia	13	Democrat
Washington	12	Democrat
West Virginia	4	Republican
Wisconsin	10	Republican
Wyoming	3	Republican

winner	electoral_votes
Democrat	223
Republican	315

This week’s model is virtually the same as last week’s, except that I rely on elastic-net regression instead of LASSO regression. I fear that LASSO regression penalizes too heavily, and I think it is wise to take the best of the ridge and LASSO models for something as uncertain as election forecasting. With updated polling and economic data and this new regularization method, Trump is predicted to win the election by taking the electoral votes of all seven battleground states.
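For readers unfamiliar with how elastic net blends the two penalties, here is a minimal Python sketch of the penalty term alone. I am assuming the parameterization used by R's glmnet package, where `alpha` is the mixing weight and `lam` the overall strength. The coefficient values are invented, and nothing here reflects the tuning values actually used in my model.

```python
def elastic_net_penalty(coefs, lam, alpha):
    """Elastic-net penalty: lam * (alpha * L1 + (1 - alpha) / 2 * squared L2)."""
    l1 = sum(abs(b) for b in coefs)
    l2_squared = sum(b * b for b in coefs)
    return lam * (alpha * l1 + (1 - alpha) / 2 * l2_squared)


# Hypothetical coefficient vector, for illustration only.
example_coefs = [0.5, -2.0, 0.0]

print("LASSO (alpha = 1):  ", elastic_net_penalty(example_coefs, lam=1.0, alpha=1.0))
print("ridge (alpha = 0):  ", elastic_net_penalty(example_coefs, lam=1.0, alpha=0.0))
print("blend (alpha = 0.5):", elastic_net_penalty(example_coefs, lam=1.0, alpha=0.5))
```

Setting `alpha` to 1 recovers the LASSO penalty and setting it to 0 recovers ridge, so any value in between is a compromise between shrinking coefficients exactly to zero and shrinking them smoothly.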

Conclusion

According to this week’s models, Trump will win the 2024 Presidential Election, taking 315 electoral votes.
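As a quick check on how much of that total rests on the battlegrounds, the short Python sketch below sums the electors of the seven states usually treated as 2024 battlegrounds. The elector counts and the 315 total are copied from the prediction tables above. The list of seven states is my assumption about what counts as a battleground, and 270 is the majority of the 538 electors.

```python
# Elector counts copied from the state prediction table; all seven are predicted Republican.
BATTLEGROUND_ELECTORS = {
    "Arizona": 11,
    "Georgia": 16,
    "Michigan": 15,
    "Nevada": 6,
    "North Carolina": 16,
    "Pennsylvania": 19,
    "Wisconsin": 10,
}
REPUBLICAN_TOTAL = 315  # from the predicted electoral vote summary
VOTES_TO_WIN = 270      # majority of 538 electors

battleground_total = sum(BATTLEGROUND_ELECTORS.values())
without_battlegrounds = REPUBLICAN_TOTAL - battleground_total

print("battleground electors:", battleground_total)
print("Republican total without them:", without_battlegrounds)
print("still a majority without them?", without_battlegrounds >= VOTES_TO_WIN)
```

The predicted win therefore depends on the battleground calls: without those seven states, the Republican total falls below the 270 needed.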

In comparison to last week’s model, this week’s presents a landslide victory for Trump. I regularized my model through a more generous method this week, and I think that is mainly why it produces a much larger margin. I will continue to regularize my models going forward. A party’s lead in the number of campaign events does not seem to meaningfully affect vote share or other variables that could tip the election in either direction. In my last week of forecasting, I hope to create one final, robust model (taking the best methods from each week) and tee up the ball for my final prediction. I look forward to seeing you next week!

Sources

The New York Times. “Where Are Trump and Harris Campaigning?” The New York Times, updated 18 Oct. 2024, https://www.nytimes.com/interactive/2024/10/16/us/politics/harris-trump-2024-campaign.html.

Polling Data Provided by GOV 1347: Election Analytics teaching staff (which itself drew from the FiveThirtyEight GitHub)

Economic Data Provided by GOV 1347: Election Analytics teaching staff (which itself drew from the Bureau of Economic Analysis and Federal Reserve Economic Data)