A Socio-Economic Analysis of Polling Booth Catchments at the 2019 Australian Federal Election

Abstract Within the Australian political geography literature, a growing body of work has aimed to understand the distinctive socio-economic and demographic patterns that characterize the Australian federal political landscape. Using results from the last two federal elections (2016 and 2019), this paper analyses polling booth catchments in the context of stability and change in voter support. Polling booths are assigned to one of four types (stable Coalition, stable Labor, change to Coalition and change to Labor) depending on outcomes between 2016 and 2019. Each polling booth is then given a unique spatially defined catchment, and the catchments are compared across a range of socio-economic and demographic variables. The results suggest significant differences in polling booth catchments across several key indicators. These differences can be used to understand the distinct and changing nature of Australia's political geography and add an important component to the existing literature.


Introduction
At the 2019 Australian federal election, voters defied predictions by polling agencies and other political professionals by returning the Liberal/National Party Coalition (the Coalition) to power. The two-party preferred vote of 51.57 per cent resulted in the Coalition winning 77 of the 151 seats on offer, ahead of the Australian Labor Party (Labor), who won 68 seats with 48.7 per cent of the votes. On a two-party basis, Labor suffered a 1.17 per cent swing against them. The election postmortem focused on the ability of the Coalition to win over some of Labor's traditional support base, the failure of muddled Labor policy to gain traction, and Labor's approach to climate change and its attitude toward Adani (a large coal mining company) and coal mining in general, which saw large numbers of voters become increasingly alienated. The electorate, according to some commentators, could be represented by increasing divisions between an Australia made up of two parts, & Pons, 2019; Clarke, Goodwin, & Whiteley, 2017; Cook, Hill, Trichka, Hwang, & Sommers, 2017; Goot & Watson, 2007; Heath & Goodwin, 2017; Niemi, Written, & Franklin, 1992; Pasek et al., 2009). Aided by the widespread availability of survey data, studies within this genre have considered the voting choices expressed by individuals and have compared these revealed choices with the characteristics of each individual respondent. In the US, for example, Cook et al. (2017) used exit polls to understand support for Donald Trump, while in the UK, Bailey et al. (2020) utilized the findings of a panel survey of voters to identify the factors, including the socio-economic characteristics of voters, explaining the results of the Brexit vote. In the Australian context, the availability of the Australian Electoral Survey conducted at each federal election provides a valuable insight into voting outcomes.
Considering the most recent data, which covers the 2019 federal election, Cameron and McAllister (2019) analyzed Australians' political attitudes and behavior, noting the clear divisions between different cohorts of voters and their voting behavior. In particular, the findings from the survey illustrate that males, those in older age cohorts, and those characterized as middle-class tended to lean more toward supporting the conservative Liberal/National Party Coalition. In contrast, females, younger voters and working-class voters tended more toward Labor support.
While survey data such as that referred to above provide an interesting and useful insight into voting patterns and outcomes, researchers working in the broad area of political geography have added a further layer to our understanding (Pacione, 2014). Researchers such as RL Johnston, Shelly, and Taylor (1979) have had a significant influence on the field of electoral geography, publishing a range of studies (RJ Johnston, 1980; Ron Johnston, Manley, Pattie, & Jones, 2018; Ron Johnston & Pattie, 2006; Rossiter, Pattie, & Johnston, 2021) and providing the context for the significant collection of international literature (Forest, 2018; Furlong, 2019; Goetz, Davlasheridze, Han, & Fleming-Muñoz, 2019; Scala, Johnson, & Rogers, 2015).
Within the broadly defined Australian political geography literature, several studies have attempted to understand voting patterns through a geographical lens. Early studies such as the work by Herbert (1975), RJ Johnston and Forrest (1985) and Jones (1981) considered the impact of 'geography' and 'space' on voting outcomes within limited methodological approaches, while more contemporary work has benefited from advances in both data availability and methodology, including the widespread use of GIS technologies. While only focusing on one electorate, Forrest et al. (2001) illustrated the utility of combining polling booth data and spatially linked census data to analyze the socio-demographic characteristics of voting outcomes. Looking at the Australian state of New South Wales and the federal electoral division of Farrer, the authors used GIS software to build a unique dataset representing polling booth catchment areas combined with voting outcomes and socio-demographic data. Using voting outcome as a dependent variable and the socio-demographic characteristics of polling booth catchment areas as independent variables, the researchers found that election outcomes in terms of first preference votes differed across parties by gender, age, socio-economic status, education level, country of birth and location in an urban or rural catchment area.
Using a similar approach of attaching polling booth outcome data to socio-demographic data at a spatial level, Davis and Stimson (1998) considered the rise of Pauline Hanson's One Nation Party during the 1998 Queensland State election. Using a range of variables representing factors that were thought to attract or repel support for One Nation, the researchers identified key characteristics that predicted a positive outcome for the party. Specifically, they found that the core areas of One Nation support appeared in communities with significant levels of population change associated with in-migrants searching for a particular urban-rural lifestyle. Considering their results at length, the researchers conclude: 'Where these areas contain unskilled workers in blue-collar industries, few indigenous Australians or people born overseas, and have a high number of people either achieving or attempting to achieve the Australian dream of homeownership, then the ONP is likely to do well. Conversely, inner-city areas, particularly those areas marked by high levels of multicultural populations or higher incomes, are the safe havens of the major political parties' (p. 81).
Following on from this analysis, Stimson, together with several other colleagues (Stimson, McCrea, and Shyy 2006; Stimson and Shyy 2009; Stimson and Shyy 2013), subsequently published a series of papers analyzing election outcomes across the 2001, 2004 and 2007 federal elections. Again utilizing polling booth catchment areas to assess the socio-demographic characteristics of voting outcomes, Stimson and his collaborators identified several typologies of local voter support that discriminated between primary vote support across parties. Using a two-way continuum that differentiated between advantage and disadvantage on one axis and multicultural-younger and monocultural-older on the other, the researchers illustrated that each party was positioned in a different quadrant depending on the socio-demographic characteristics of the polling booth catchment areas where it had the most support. For instance, the Liberal party was located in the quadrant represented by relatively high levels of socio-economic advantage and high levels of mono-culturalism and older cohorts. In contrast, the Labor party was located in the exact opposite quadrant, representing relatively low levels of socio-economic advantage and higher multiculturalism and younger age cohorts. These patterns held up across the three federal elections that the researchers considered and reflected the post-election analyses conducted using individual survey data.
While these previous studies have been framed within the context of electoral geography, they have largely been devoid of any sophisticated spatial analysis of voting outcomes, being instead limited to the use of aspatial regression approaches or basic thematic maps. A recent exception in the Australian literature has been the work of Forbes, Cook, and Hyndman (2020), who considered the outcomes of the two-party preferred vote in the federal elections held between 2001 and 2016 using a spatial error model. Their detailed analysis identified the relative stability of the socio-demographic variables used to differentiate between the two-party preferred votes. They found that industry of employment and type of work were influential drivers of voting outcomes, with energy-related manufacturing, construction and administrative roles being strongly linked to the Coalition across all elections. Income also had an important effect, with higher incomes associated with Coalition support. Labor maintained support from those with higher education levels throughout much of the period considered, while birthplace diversity was strongly associated with Labor support. Places with higher unemployment moved away from Labor over time, and places with higher household mobility tended to favor the Coalition.
The current paper is set within this theme of changing voting behavior and investigates how the 2019 Australian Federal Election outcomes can be understood according to the socio-economic characteristics of polling booth catchment areas classified in terms of stability or change. In doing so, the paper utilizes a combination of Australian Electoral Commission polling booth outcome data and Australian Bureau of Statistics Census socio-demographic data aggregated into polling booth catchment areas. Four possible scenarios characterize the polling booth catchment areas: polling booths that remained either Coalition held or Labor held between the two elections, and booths that changed hands between the two elections (Coalition to Labor or Labor to Coalition). Once classified, each group is compared with reference to a range of socio-economic variables using a geographically weighted regression approach to help discern the potential drivers of stability and change.

Data and approach
This paper adopts an aggregate socio-economic analytical approach to consider voting outcomes at the 2019 Australian federal election. Specifically, it adopts a geographically weighted logistic regression approach to consider how the socio-economic characteristics of polling booth catchments are associated with observed voting outcomes, characterized as stable Coalition, change to Coalition, stable Labor and change to Labor.

Developing polling booth catchments
As the goal of this paper is to undertake an analysis of the socio-economic patterns of voting outcomes, it was first necessary to develop a suitable dataset containing both voting outcomes at the polling booth level and socio-economic census data at a chosen level of spatial aggregation. To do so, a vector shapefile containing polling booth locations across Australia (Figure 1) was obtained from the Australian Urban Research Infrastructure Network (AURIN) online portal and combined with the Australian Bureau of Statistics Statistical Area 1 (SA1) shapefile. Using the Voronoi polygon function in QGIS, a series of polygons was produced, with each polygon centered on a given polling booth point. This provided the basis for a potential catchment area for each polling booth. Individual SA1s were then attached to each polygon using the 'join attributes by location' function, filtering by SA1s that intersected with any given catchment polygon (Figure 2). This provided a file that listed polling booths and their associated SA1s, which could then be used to aggregate SA1 level Census of Population and Housing data to represent the socio-economic characteristics of polling booth catchment areas. In cases where polling booths could not be separated, results were aggregated. This aggregated data set, which contained 7530 identifiable catchment areas, was then combined with polling booth election results data obtained from the Australian Electoral Commission to develop the final dataset used for the analysis in this paper (see below). Following Reid and Lui (2017), it is assumed that the majority of voters visit polling booths close to their homes and that associations are possible between booth-level results and their surrounding areas' population characteristics (p. 7). Such an assumption was supported by data from the Australian Electoral Commission, which showed that at the time of the 2019 Federal election, around 84 per cent of voters cast their votes at their local booth (Australian Electoral Commission, 2019).
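The catchment logic described above can be illustrated with a small sketch. A Voronoi cell is, by definition, the set of locations closer to its own generating point than to any other, so assigning each SA1 to its nearest booth approximates the polygon-based join. This is not the QGIS workflow used in the paper: it assumes SA1s are represented by centroids in a projected (planar) coordinate system, whereas the paper joined SA1 polygons that intersected each catchment polygon. The function name and the example identifiers are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def assign_to_nearest_booth(sa1_centroids, booths):
    """Map each SA1 id to the id of its closest booth.

    Assumption: both inputs are dicts of id -> (x, y) in planar units.
    Being closest to a booth is equivalent to lying in that booth's
    Voronoi cell, so this mimics a centroid-in-polygon join.
    """
    return {
        sa1_id: min(booths, key=lambda booth_id: dist(xy, booths[booth_id]))
        for sa1_id, xy in sa1_centroids.items()
    }


# Hypothetical coordinates, for illustration only.
booths = {"booth_a": (0.0, 0.0), "booth_b": (10.0, 0.0)}
sa1_centroids = {
    "sa1_1": (1.0, 1.0),
    "sa1_2": (9.0, -1.0),
    "sa1_3": (4.0, 3.0),
}
catchments = assign_to_nearest_booth(sa1_centroids, booths)
```

A centroid-based assignment places every SA1 in exactly one catchment, whereas an intersection-based join can attach a boundary SA1 to more than one polygon; which behavior is preferable depends on how boundary SA1s should be treated.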

Election outcomes, demographic and socio-economic indicators
To analyze the election outcomes for the two major parties (excluding independents and other minor parties), booth results for two-party preferred outcomes were obtained for the 2016 and 2019 Australian federal elections. Australian elections are run on a preferential voting system whereby the two-party preferred (TPP) outcome is the electoral result after preferences have been distributed to the highest two candidates, who in some cases can be independents. For the purposes of TPP, the Liberal/National Coalition (the Coalition) is usually considered a single party, with Labor being the other major party. Typically, the TPP is expressed as the percentages of votes attracted by the two major parties, e.g., "Coalition 50%, Labor 50%", where the values include primary votes and preferences. Considering the result for each of the matching polling booths across the two elections allowed a four-category dependent variable (polling booth outcome) to be developed. The polling booth outcome categories are stable Coalition, change to Coalition, stable Labor and change to Labor. This polling booth outcome data was assigned to the matching polling booth catchment and used as the dependent variable in the analysis.
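The four-way classification can be sketched as follows. This is a minimal illustration, assuming each booth's result is held as the Coalition's TPP percentage in each year and that the party with more than 50 per cent is treated as holding the booth. The paper does not state how exact 50/50 ties were handled, so the sketch returns None for them; the function name is hypothetical.

```python
def classify_booth(coalition_tpp_2016, coalition_tpp_2019):
    """Classify a booth from its Coalition TPP percentage in each election.

    Assumption: a share above 50 means the Coalition held the booth and a
    share below 50 means Labor held it. Exact ties return None because the
    source does not say how they were treated.
    """
    if coalition_tpp_2016 == 50.0 or coalition_tpp_2019 == 50.0:
        return None
    coalition_2016 = coalition_tpp_2016 > 50.0
    coalition_2019 = coalition_tpp_2019 > 50.0
    if coalition_2016 and coalition_2019:
        return "stable Coalition"
    if not coalition_2016 and not coalition_2019:
        return "stable Labor"
    if coalition_2019:
        return "change to Coalition"
    return "change to Labor"


# Hypothetical booth results (Coalition TPP %, 2016 then 2019).
examples = {
    "booth_a": classify_booth(58.2, 61.0),
    "booth_b": classify_booth(47.5, 52.3),
    "booth_c": classify_booth(44.0, 41.9),
    "booth_d": classify_booth(51.1, 48.6),
}
```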
The independent variables used in this paper comprised a dataset of socio-economic indicators and were built around the SA1s used to construct the polling booth catchments discussed above. Raw census data for each SA1 was aggregated into the associated polling booth catchments prior to being converted into percentages for use in the analysis. As the electoral data represented the 2016 and 2019 Federal elections, census data for 2016 was used to analyze the shift between the two election dates. This is in keeping with earlier research by Forbes, Cook, and Hyndman (2019), who identify the necessity to match voting data with the most appropriate census year.
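The aggregate-then-convert step matters because averaging SA1 percentages would weight small and large SA1s equally. A minimal sketch of summing raw counts first is given below. It assumes each SA1 record carries raw counts plus a total-persons denominator; the paper does not list the denominators used for each indicator, and the variable names and figures are hypothetical.

```python
from collections import defaultdict


def catchment_percentages(sa1_counts, sa1_to_booth, total_key="persons"):
    """Sum raw SA1 counts per catchment, then convert to percentages.

    Assumption: every SA1 dict holds raw counts, with `total_key` as the
    denominator for all indicators. Catchments with a zero denominator
    are skipped rather than producing a division error.
    """
    sums = defaultdict(lambda: defaultdict(float))
    for sa1_id, counts in sa1_counts.items():
        booth = sa1_to_booth[sa1_id]
        for name, value in counts.items():
            sums[booth][name] += value

    percentages = {}
    for booth, totals in sums.items():
        denominator = totals[total_key]
        if denominator == 0:
            continue
        percentages[booth] = {
            name: 100.0 * value / denominator
            for name, value in totals.items()
            if name != total_key
        }
    return percentages


# Hypothetical counts for two SA1s falling in the same catchment.
sa1_counts = {
    "sa1_1": {"persons": 400, "private_renters": 100},
    "sa1_2": {"persons": 600, "private_renters": 300},
}
sa1_to_booth = {"sa1_1": "booth_a", "sa1_2": "booth_a"}
result = catchment_percentages(sa1_counts, sa1_to_booth)
```

In this example the catchment value is 400 renters out of 1000 persons (40 per cent), whereas a simple mean of the two SA1 percentages (25 and 50) would give 37.5.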
A-priori it was assumed, based on existing studies and detailed previous analysis of voting patterns ( The final list of independent variables used in the analysis is outlined in Table 1. The variables cover generational age cohorts, household characteristics, ethnic background, religious affiliation, housing characteristics, employment and occupational characteristics and income. Given the lag between the 2016 census results and the 2019 election, data on the generational age cohorts and income were adjusted to align with the election year.

Analytical technique
An initial analysis of the data across the four polling booth catchment types suggested significant overlap between variables and the potential for significant multi-collinearity (linear relationships) between the independent variables. As a result, a multivariate factor analysis technique (principal component analysis) was used to produce several uncorrelated factors, which increases the interpretability of the data while at the same time minimizing information loss. Principal components analysis creates new uncorrelated variables that successively maximize variance, allowing the newly created variables to be used in statistical approaches such as regression analysis without concern around multi-collinearity. The separate socio-economic and demographic variables were subjected to a principal components analysis followed by a varimax rotation procedure run in SPSS. Considering the rotated solution, five significant factors were identified with eigenvalues greater than 1. The rotated output was used to name the five new factors, with the new variables being extracted using the regression method in SPSS. These new variables were used as independent variables in the subsequent analysis.
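The eigenvalue-greater-than-1 retention rule can be illustrated with NumPy. This is a sketch only, not the SPSS procedure used in the paper: it extracts unrotated principal components from the correlation matrix and omits the varimax rotation and regression-method factor scores. The function name and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def kaiser_components(X):
    """Unrotated PCA on the correlation matrix, keeping eigenvalues > 1.

    Returns (eigenvalues in descending order, number retained, scores).
    Note: no varimax rotation is applied here.
    """
    X = np.asarray(X, dtype=float)
    # Standardize so that the covariance of Z is the correlation matrix.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(Z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)  # ascending order for symmetric R
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    k = int(np.sum(eigvals > 1.0))
    scores = Z @ eigvecs[:, :k]  # component scores are mutually uncorrelated
    return eigvals, k, scores


# Synthetic data: six indicators driven by two independent latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
X = np.hstack([
    latent[:, [0]] + 0.1 * rng.normal(size=(500, 3)),
    latent[:, [1]] + 0.1 * rng.normal(size=(500, 3)),
])
eigvals, k, scores = kaiser_components(X)
```

Because the scores are uncorrelated by construction, they can enter a regression together without the multi-collinearity present among the original indicators.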
In order to explore the relationships between the dependent variable (polling booth outcomes) and the independent variables, a geographically weighted logistic regression was performed using the MGWR software (Li et al., 2019). Geographically weighted logistic regression is an extension of geographically weighted regression adapted to allow for the modeling of binary dependent variables. The output from the model includes a global model together with local results. The results from the global model include global coefficient estimates, standard errors and t-values together with standard model fitting results. The local results include individual component estimates, standard errors, and t-values. The local results allow for the visualization of outcomes via mapping in GIS software, thereby providing an indication of spatial association and diversity between the dependent variable and the individual independent variables. To allow the modeling of the dependent variable, the four categories of polling booth types were re-coded into four separate binary variables and then used as the dependent variable in four separate regression equations. In each case, the dependent variable coded 1 or 0 (e.g., stable Coalition = 1, all others = 0) was regressed against the new independent variables (components produced from the PCA). Table 2 presents the results from the principal component analysis of the original independent variables, while Figures 3-7 present the spatial distribution of the factor scores in the form of thematic maps. Factor one was interpreted as reflecting a continuum of occupational status from more blue-collar occupations (positive scores) to more white-collar (negative scores). Factor two was interpreted as reflecting various characteristics of family status.
In particular, the high positive loadings on variables such as couples with children, youth (the presence of young children), the presence of higher percentages of generation-x cohort and the proportion of home purchasers suggested that higher positive scores on this factor represent polling booth catchments with higher levels of households in the family formation/growth stage, while lower/negative scores represent higher levels of non-family households (older couples without children or single people). Factor three was interpreted as representing polling booth catchments according to the level of community or neighborhood stability. Given the high positive loadings on the variables generation-y, private renters and residential in-movement, catchment areas with higher scores on this component were considered to reflect lower levels of residential or community stability. Factor four was interpreted as representing the racial or ethnic composition of a polling booth catchment. The high positive loadings on the variables overseas-born, recent arrivals, non-Christian religions and of Islamic faith suggest that higher scores are representative of a higher level of migrant communities within a polling booth catchment. The final factor was interpreted as reflecting the level of social disadvantage within a polling booth catchment. The positive loadings on the variables single parents, public sector renters, indigenous and unemployment suggest that higher scores on this factor are representative of higher levels of social disadvantage within a polling booth catchment.
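The recoding of the four-category outcome into four binary dependent variables, as described in this section, amounts to a one-versus-rest encoding. A minimal sketch follows; the function name and the sample labels are hypothetical, and the resulting 0/1 vectors would each be passed to a separate logistic model.

```python
OUTCOMES = (
    "stable Coalition",
    "change to Coalition",
    "stable Labor",
    "change to Labor",
)


def one_vs_rest(labels, outcomes=OUTCOMES):
    """Build one 0/1 indicator list per outcome category.

    Each indicator is 1 where the booth has that outcome and 0 otherwise,
    so every booth is coded 1 in exactly one of the four lists.
    """
    return {
        outcome: [1 if label == outcome else 0 for label in labels]
        for outcome in outcomes
    }


# Hypothetical booth outcomes, for illustration only.
labels = [
    "stable Coalition",
    "stable Labor",
    "change to Coalition",
    "stable Coalition",
    "change to Labor",
]
binary_targets = one_vs_rest(labels)
```

Fitting four separate binary models, rather than a single multinomial model, means each set of coefficients contrasts one outcome with all remaining booths combined, which should be kept in mind when comparing coefficients across the four equations.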

Results
To illustrate the spatial variation in each of the five factors, the extracted scores were mapped using QGIS (Figures 3-7). These figures show the clear spatial distribution of the factors across the polling booth catchments included in the analysis. For example, the factor accounting for occupational status (factor 1, Figure 3) seems to reflect a clear blue-collar/white-collar divide across the country, with more blue-collar catchments (shaded brown) concentrated in outer urban areas and regional/rural locations, and more white-collar catchments (shaded light yellow) concentrated in the inner and middle suburbs of the major urban regions. Similarly distinct spatial patterns are evident when the remaining four components are mapped, suggesting that the spatial distribution of the factors that a-priori might be thought to be associated with voting outcomes will likely impact the geographic differences noted in voting outcomes (see Figure 8).

Geographically weighted regression
Given the inherent geographical variation in polling booth catchments, a geographically weighted logistic regression was performed on each dependent variable representing voting outcomes (stable Coalition = 1, other = 0; change to Coalition = 1, other = 0; stable Labor = 1, other = 0; change to Labor = 1, other = 0). The results for the global model, together with the geographically weighted logistic regression results and comparative statistics for each type of voting outcome, are presented below.
When comparing voting outcomes, the starkest contrast is between the stable Coalition booths and the stable Labor booths. In both cases the results for the global models and the geographically weighted logistic regression models reflect the generally understood socio-political divide that is representative of Australian politics (Cameron and McAllister 2019). The global model for the conservative Coalition booths (Table 3) suggests that stable booths were more likely to be characterized by more white-collar occupations, more stable communities (less residential/community in-movement), less overseas-born population and less disadvantage. In contrast, the stable Labor booths (Table 4) were more likely to be characterized by more blue-collar occupations, more community in-movement, more overseas-born populations and higher levels of disadvantage.
Considering the geographically weighted logistic regression results (Figures 9 and 10), it is clear that there was marked geographic variation across the local coefficients (upper panel), with the signs on the coefficients matching those in the global models. Moreover, when the local significance results are considered (lower panel of each set of maps), it is clear that most of the independent variables make a relevant contribution to understanding the outcomes for stable Coalition and stable Labor across most of the country. This is not unexpected given the wide geographical spread of the voting outcomes for stable Coalition and Labor across Australia (Figure 8).
Much more diverse findings are identified when the outcomes for change to Coalition are considered (Table 5 and Figure 11). The global coefficients suggest that the likelihood of a booth being classified as change to Coalition increases as the catchment area is more blue-collar than white-collar, has higher residential in-movement, has lower levels of overseas-born and has lower levels of disadvantage. From a global perspective, booths that changed to Coalition between 2016 and 2019 shared some characteristics with stable Labor booths and some characteristics with the stable Coalition booths. Overall, this accords with much of the post-election discussion that suggested that the Coalition successfully stole votes from Labor's traditional support base (Cameron and McAllister 2019) as some voters turned away from Labor's policies.
The global results are further confirmed by the geographically weighted logistic regression results mapped in Figure 11. Considering the local coefficient and significance maps for the occupation status variable, while the local coefficients are largely significant, there is a large amount of variation in the geographical pattern of this variable. In particular, the greatest apparent concentrations of high coefficients are located in the north-east and south-west of the country, suggesting it was these areas where the blue-collar characteristics were more important in explaining the change to Coalition. Interestingly, these areas, especially those located in the state of Queensland, were locations where the Labor vote was punished in electorates heavily reliant on mining, electorates that would likely have high concentrations of blue-collar mining workers and their families and local communities. Considering the residential or community stability measure, once again there is significant variation across the country, with locally concentrated significance located in the north-east of the country. Here a relatively high positive coefficient indicates that higher levels of community in-movement increased the likelihood of a change to Coalition voting outcome.
The maps representing ethnicity and disadvantage reflected broadly similar patterns of local coefficients to those for the stable Coalition booths. The difference appeared in the significance maps where, especially for the maps showing the local outcomes for the measure of disadvantage, the significant coefficients were more discrete and localized compared to either the stable Coalition or stable Labor outcomes.
The final set of output relates to those booths that changed to Labor between 2016 and 2019. The global results (Table 6) suggest that the likelihood of a booth changing to Labor increased as the occupational characteristics became more white-collar, as family characteristics came to represent older households or families, or where social disadvantage was lower. These global figures suggest that Labor was able to pick up votes in local areas that were previously Coalition held.
Considering the output from the geographically weighted logistic regression (Figure 12), once again there is marked variation in the local coefficients on the variables considered (occupation, household and disadvantage). The geographic influence of the occupation variable shows the most variation in the southeast corner of the country, with the occupation status variable having less influence on the likelihood of a change to Labor compared to other areas. The local influence of the family status variable showed a large degree of variation, ranging from negative coefficients to positive coefficients. Markedly, negative coefficients were most obviously significant in the northeast of the country and especially in the state of Queensland, while positive coefficients were significant in the state of South Australia. The maps for the final variable, disadvantage, illustrate that higher negative coefficients, representing lower levels of disadvantage, were influential for the change to Labor vote in the southeast and southwest of the country, while relatively higher disadvantage levels were influential elsewhere.

Conclusion
This paper has undertaken an analysis of voting outcomes for the 2019 Australian federal election measured at the level of polling booths. The aim of the analysis was to understand how differences in polling booth outcomes at both a global and geographical (local) level are influenced by a range of socio-economic factors. In doing so, the paper used a geographically weighted logistic regression approach, which not only provides a means of exploring the global relationship between voting outcomes and socio-economic factors but also accounts for the spatiality that is inherent in the dataset. Importantly, the approach accounts for the spatial heterogeneity in the relationship between voting outcomes and the socio-economic covariates.
In a global sense, when the four possible voting outcomes were considered, a set of clear patterns was evident. There was a clear dichotomy between stable Coalition and stable Labor outcomes that reflected expected patterns and, in many ways, supported a range of existing political science dialogue. These patterns reflect the earlier work by Stimson et al. (2009), who located the Liberal party in a higher socioeconomic status, lower multicultural quadrant of their two-factor model, and the recent work by Forbes, Cook, and Hyndman (2020), who found that seats won by the Coalition tended to have higher incomes, more persons employed in specific industries including the energy sector, and lower levels of recent migration. The global results for the change to Coalition also reflect much of the narrative around 2019 election outcomes, especially the importance of disaffected traditional voter support (Cameron and McAllister 2019).
While the global regression results are interesting in their own right, it is the results from the geographically weighted logistic regression that provide the more distinctive insights into voting outcomes. The most apparent finding is that when the different voting outcomes are considered, there is considerable spatial variation in the patterns observed, illustrating that while voting outcomes are necessarily impacted by a range of expected socio-economic factors, how these factors are expressed across space differs considerably. Of the four possible voting outcomes considered, it was the change to Coalition and, to a lesser extent, the change to Labor outcomes that were the most geographically distinct. In several cases, while the global findings suggested a particular outcome, the local results suggested that there were likely to be local issues or factors leading to a particular outcome. The clearest example of this relates to the widely acknowledged shift to Coalition that occurred in regional mining areas once held by Labor candidates, an outcome clearly illustrated by the local significance of the occupational status variable in traditional mining areas.
Overall, the findings from this paper have illustrated the utility of adopting a geographical analytical approach to understand voting outcomes, especially where local geographic variations in voting are thought to be important. While global outcomes provide a useful starting point, a more nuanced understanding of voting patterns emerges once local variation in factors is considered. As such, analysis such as that undertaken in this paper has a valuable contribution to make to the understanding of voting patterns and should be considered an important extension to the large body of political science literature that purports to understand election outcomes.

Disclosure statement
No potential conflict of interest was reported by the authors.

Availability of data and material
Data is available from the author upon request.