Employing the Townsend Index as a measure of deprivation, we can capture change over time for local areas from 1971 to 2021 in England and Wales. The Townsend Index was originally devised using 1981 Census data but has been reproduced (for a variety of geographical small areas) using data from the 1971, 1981, 1991, 2001 and 2011 Censuses (see, for example, Norman, 2010, 2016, 2017; Norman & Darlington-Pollock, 2017). Here we further extend the time-series using the most recent 2021 Census data.
Calculating a measure which tracks change over time in deprivation for small areas requires that:
- All input variables are available and have sufficiently similar definitions at each census time point;
- The geographies for which the data were originally released can be made compatible with common spatial units over time; and
- Once adjusted to consistent geographies, the raw data can be standardised and combined to a composite index so that a change in deprivation score has a meaningful interpretation in terms of improving or worsening deprivation.
The use of census data meets these criteria. We discuss and illustrate these aspects below.
Input variables to the Townsend Index
The Townsend Index focuses on ‘material’ deprivation which relates to the “lack of goods, services, resources, amenities and physical environment which are customary, or at least widely approved in the society under consideration” (Townsend et al., 1988, p.36). Since these aspects are not directly measurable, four proxy indicators identified from the Census, on unemployment, car ownership, home ownership and household overcrowding, were selected by Townsend and colleagues.
The input variable definitions, and extracts explaining their inclusion from Townsend et al. (1988), are shown in Table 1. The relevant raw data are all available for small areas from the 1971 Census through to the most recent 2021 Census. The numerators and denominators for unemployment are derived from individual responses on the census forms, and for non-home ownership, car ownership and overcrowding from the household returns. The unemployment variable is effectively comparable over time but has minor differences in the age boundaries of economic activity. This is due to a younger school leaving age in 1971 (15, thereafter 16) and an upper limit of age 59 (female) and 64 (male) rising to age 74 in more recent censuses. The home ownership and car variables are consistent over time. We appraise differences in the overcrowding variable below.
Table 1
Input variables to the original Townsend Deprivation Index
Indicator | Definition | Reasoning |
Unemployment | Percentage of economically active residents who are unemployed | “Unemployment … reflects a great deal more than lack of access to earned income and the facilities of employment, in that it carries implications for a general lack of material resources and the insecurity to which this gives rise.” |
Home ownership | Percentage of private households not owner occupied | “Non-owner occupation reflects lack of wealth as well as income, and therefore … choice … in the housing market … Taken together [non-car and non-home ownership] offer a fairly good reflection of income levels in different areas.” |
Car ownership | Percentage of private households who do not possess a car | “The lack of a car is perhaps a more controversial choice, for it is not a clearcut and direct reflection of household or individual deprivation … However, a number of studies show that it is probably the best surrogate for current income.” |
Overcrowding | Percentage of private households with more than one person per room | “Overcrowding gives a more general guide to living circumstances and housing conditions … This overcrowding indicator also helps balance that on housing tenure, bearing in mind that … owner occupation by no means always represents substantial command of resources.” |
Note: Adapted from Townsend et al. (1988, pp. 36-7).
Reproducing the Townsend Index over time necessitates appraisal of the overcrowding measure for 2021. Counting total rooms in the household was required for the 1971, 1981, 1991 and 2001 Censuses. In 2011 there was a question about the number of rooms and an additional question specifically about the number of bedrooms. However, in the 2021 Census, respondents in England and Wales were not asked to count the number of rooms in the household: only the number of bedrooms. Thus, for the latest 2021 Census, it is not possible to calculate the percentage of households with more than one person per room.
The bedroom-based occupancy rating provides a measure of whether a household’s accommodation is overcrowded or under-occupied (ONS, 2023). A measure based on the number of bedrooms should be more reliable than one based on total rooms, since the census instructions on which rooms to count have varied over time (and between the UK’s constituent countries). A negative occupancy rating (-1 or -2) implies that a household has fewer bedrooms than the standard requirement based on the household’s demographic structure. Note that in 2021 the Valuation Office Agency (VOA) provided estimated counts of rooms by household, which are available alongside the census counts. However, we prefer to use the bedroom occupancy measure since the VOA information may be subject to different biases from those affecting information supplied by census respondents.
To determine the impact of different ‘household overcrowding’ measures on measuring deprivation and change over time using the Townsend Index, the previous ‘more than one person per room’ variables in 1991, 2001 and 2011 were compared with the bedroom occupancy variables in 2011 and 2021 (for simplicity, excluding the 1971 and 1981 data). The appraisal here is for England and Wales, with all data converted from the previous census geographies to the 2021 Lower Super Output Areas (LSOAs) (see below). A total of 35,056 (98%) of the 35,672 LSOAs in 2021 are included. The missing areas were unpopulated at time points before 2021.
Table 2 shows strong positive correlations across the years and between the differently defined measures, indicating that the geography of household overcrowding is persistent and that the measures are sufficiently comparable. The variable names in bold are proposed for inclusion as overcrowding indicators in the calculation of comparable Townsend Scores over time. In 1991 and 2001 these are the standard ‘more than one person per room’ definition. In 2011 this is an average of persons per room and the combined negative bedroom occupancies (-1 and -2). In 2021 the variable comprises the negative bedroom occupancies. The average is used in 2011 to alleviate possible discontinuities, and also because it correlates marginally more strongly with the earlier and later censuses than either single variable.
Table 2
Correlations between measures of household overcrowding: England and Wales 1991 to 2021
Correlations | ppr91 | ppr01 | ppr11 | occ11min1 | occ11min2 | pprocc11 | occ21min1 | occ21min2 |
ppr91 | | 0.840 | 0.765 | 0.745 | 0.765 | 0.775 | 0.679 | 0.724 |
ppr01 | 0.840 | | 0.882 | 0.835 | 0.864 | 0.882 | 0.782 | 0.822 |
ppr11 | 0.765 | 0.882 | | 0.915 | 0.945 | 0.979 | 0.849 | 0.912 |
occ11min1 | 0.745 | 0.835 | 0.915 | | 0.989 | 0.974 | 0.796 | 0.916 |
occ11min2 | 0.765 | 0.864 | 0.945 | 0.989 | | 0.992 | 0.835 | 0.927 |
pprocc11 | 0.775 | 0.882 | 0.979 | 0.974 | 0.992 | | 0.852 | 0.933 |
occ21min1 | 0.679 | 0.782 | 0.849 | 0.796 | 0.835 | 0.852 | | 0.898 |
occ21min2 | 0.724 | 0.822 | 0.912 | 0.916 | 0.927 | 0.933 | 0.898 | |
Note: Variables prefixed ‘ppr’ are based on persons per room. Variables prefixed ‘occ’ are based on bedroom occupancy. The variable prefixed ‘pprocc’ is an average of the two definitions available in 2011. The suffixes min1 and min2 refer to occupancy ratings of -1 and -2, where a household has fewer bedrooms than the standard requirement. All correlations are statistically significant (p < 0.05).
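The comparison of overcrowding measures can be sketched in a few lines of Python. This is a minimal illustration only: the five area values are hypothetical rather than census data, the function name pearson is our own, and we assume that the 2011 combined indicator is a simple mean of the two definitions for each area.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical percentages for five areas (illustrative values, not census data).
ppr11 = [2.0, 5.5, 1.2, 9.8, 3.4]    # % households with more than one person per room
occ11 = [3.1, 7.9, 2.0, 12.5, 4.0]   # % households with a negative bedroom occupancy rating

# Assumed form of the 2011 combined indicator: the mean of the two definitions.
pprocc11 = [(p + o) / 2 for p, o in zip(ppr11, occ11)]

# Correlation between the two differently defined 2011 measures.
r = pearson(ppr11, occ11)
```

A high value of r for real data would indicate, as in Table 2, that the two definitions rank areas similarly.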
Creating a set of consistent geographical units
Having assessed the comparability of input variables and definitions over time, the issue of geographical unit comparability is next addressed. The geographical units used for the release of census data are different at each time point. This is due to variations in strategies for the geography of data collection and release. The latter is affected by the decisions on threshold counts of population and household and geographic scale with respect to the protection of personal confidentiality (Cockings et al., 2011). Unless data are adjusted to a consistent set of zones, a time-series analysis will be hampered by any boundary changes (Norman et al., 2003).
The analytical units for the Townsend deprivation index over time are the Lower Super Output Areas (LSOAs) used for the release of the 2021 Census in England and Wales. LSOAs are a statistical geography at a scale which has become commonly used in deprivation applications (DCLG, 2015; Norman & Darlington-Pollock, 2017). In 2021 there were 35,672 LSOAs in England and Wales, which comprise between 997 and 9,898 persons (mean 1,671) and between 400 and 1,980 households (mean 695). We adjust the data from the previous censuses to this geography so that the data and findings are relevant to a contemporary policy setting. We term the units of data release at previous time points the ‘source’ geography and the 2021 LSOAs the ‘target’ geography. For the reliability of the conversions, we use units which are inherently smaller than the target geography. In 1971, 1981 and 1991 these are termed Enumeration Districts (EDs) but are different zonal systems at each of these time points. For 2001 and 2011 we use the Output Area geography which nested into the LSOAs at each of those time points, but not necessarily in 2021.
The method of conversion is defined in Norman et al. (2003) and used subsequently in Norman (2010, 2016) and in Lloyd et al. (2023b). A residential postcode location directory is used to connect the source units to the target zones using GIS point and polygon linkages. Since there will be partial overlaps between the zonal systems, the postcodes which fall in the intersections are used to apportion the population counts from the source to target geographies. Conceptually, the method is a hybrid of areal interpolation (population distribution proxied by postcode distribution) and dasymetric mapping (postcode presence for where people live). The list of residential (i.e. not business premises) postcodes is defined to be as contemporary as possible with each census using the dates for when the postcode was ‘live’. As detailed in Norman and Riva (2012, p.489), this is feasible back to 1981, but not for 1971, and so the 1981 list is used in that case.
Figure 1 illustrates a geographic conversion scenario for the Roundhay area in Leeds, UK. Figure 1a shows the ‘source’ geography EDs which were used for the dissemination of the 1981 Census. Figure 1c shows the 2021 LSOAs, the target geography to which all data need to be adjusted. The ED and LSOA polygons have effectively no correspondence. Figure 1b illustrates the postcode distribution (point location symbols weighted by address counts) over an Open Street Map (https://www.openstreetmap.org/) background. The postcode locations are associated with the residential areas, with very few in the area of Roundhay Park.
Table 3a has an extract from the linked postcode directory which lists the source 1981 ED and target 2021 LSOA with which each postcode is linked, along with the number of addresses at each postcode. The numbers of postcodes and addresses are then summed for the intersections of the ED and LSOA polygons and also summed across the EDs. Dividing the intersection count by the total count in the ED gives the proportion of the raw population data which would be allocated to the target LSOA. Taking the first two rows in Table 3b as an example, for ED 08DABD03, ~0.27 of a population count will be apportioned to LSOA E01011650 and ~0.73 to LSOA E01011651. Table 3c lists the 1981 EDs which each contribute to E01011650 and those which contribute to E01011651. The apportioned ED / LSOA intersection populations would be summed for these LSOAs.
Table 3
Extracts of postcode links and calculation of conversion weights, Roundhay, Leeds, UK
a) Postcode location linked to source and target geographies |
Postcode | Addresses at each Postcode | 1981 ED Code | 2021 LSOA Code | 2021 LSOA Name | |
LS8 2EQ | 10 | 08DABD03 | E01011650 | Leeds 020A | |
LS8 2FD | 46 | 08DABD03 | E01011651 | Leeds 020B | |
LS8 2HG | 16 | 08DABD03 | E01011650 | Leeds 020A | |
LS8 2HQ | 17 | 08DABD03 | E01011650 | Leeds 020A | |
LS8 2ET | 22 | 08DABD06 | E01011653 | Leeds 020C | |
LS8 2EX | 15 | 08DABD06 | E01011650 | Leeds 020A | |
LS8 2EY | 17 | 08DABD06 | E01011653 | Leeds 020C | |
LS8 2EZ | 25 | 08DABD06 | E01011653 | Leeds 020C | |
LS8 2HA | 16 | 08DABD06 | E01011653 | Leeds 020C | |
LS8 2HB | 21 | 08DABD06 | E01011653 | Leeds 020C | |
LS8 2HD | 4 | 08DABD06 | E01011653 | Leeds 020C | |
LS8 2HE | 11 | 08DABD06 | E01011650 | Leeds 020A | |
b) Conversion weights from source to target geographies |
1981 ED Code | 2021 LSOA Code | 2021 LSOA Name | Addresses in Intersection | Total Addresses in ED | Proportion of Addresses in Overlap |
08DABD03 | E01011650 | Leeds 020A | 65 | 240 | 0.2708 |
08DABD03 | E01011651 | Leeds 020B | 175 | 240 | 0.7292 |
08DABD04 | E01011651 | Leeds 020B | 68 | 185 | 0.3676 |
08DABD04 | E01011652 | Leeds 024C | 117 | 185 | 0.6324 |
08DABD05 | E01011650 | Leeds 020A | 122 | 175 | 0.6971 |
08DABD05 | E01011651 | Leeds 020B | 53 | 175 | 0.3029 |
08DABD06 | E01011650 | Leeds 020A | 61 | 250 | 0.2440 |
08DABD06 | E01011653 | Leeds 020C | 189 | 250 | 0.7560 |
08DABD07 | E01011650 | Leeds 020A | 57 | 221 | 0.2579 |
08DABD07 | E01011651 | Leeds 020B | 16 | 221 | 0.0724 |
08DABD07 | E01011652 | Leeds 024C | 28 | 221 | 0.1267 |
08DABD07 | E01011653 | Leeds 020C | 120 | 221 | 0.5430 |
c) Constituent (part) source units contributing to target areas |
1981 ED Code | 2021 LSOA Code | 2021 LSOA Name | Addresses in Intersection | Total Addresses in ED | Proportion of Addresses in Overlap |
08DABD01 | E01011650 | Leeds 020A | 25 | 236 | 0.1059 |
08DABD02 | E01011650 | Leeds 020A | 111 | 237 | 0.4684 |
08DABD03 | E01011650 | Leeds 020A | 65 | 240 | 0.2708 |
08DABD05 | E01011650 | Leeds 020A | 122 | 175 | 0.6971 |
08DABD06 | E01011650 | Leeds 020A | 61 | 250 | 0.2440 |
08DABD07 | E01011650 | Leeds 020A | 57 | 221 | 0.2579 |
08DABD22 | E01011650 | Leeds 020A | 35 | 261 | 0.1341 |
08DABD23 | E01011650 | Leeds 020A | 143 | 145 | 0.9862 |
08DABD01 | E01011651 | Leeds 020B | 211 | 236 | 0.8941 |
08DABD02 | E01011651 | Leeds 020B | 126 | 237 | 0.5316 |
08DABD03 | E01011651 | Leeds 020B | 175 | 240 | 0.7292 |
08DABD04 | E01011651 | Leeds 020B | 68 | 185 | 0.3676 |
08DABD05 | E01011651 | Leeds 020B | 53 | 175 | 0.3029 |
08DABD07 | E01011651 | Leeds 020B | 16 | 221 | 0.0724 |
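The calculation of conversion weights and the apportionment of counts can be sketched as follows. The address counts are the first four rows of Table 3b; the ED population counts and the function names conversion_weights and apportion are hypothetical and used only for illustration.

```python
from collections import defaultdict

# (1981 ED code, 2021 LSOA code, addresses in the intersection), from Table 3b.
intersections = [
    ("08DABD03", "E01011650", 65),
    ("08DABD03", "E01011651", 175),
    ("08DABD04", "E01011651", 68),
    ("08DABD04", "E01011652", 117),
]

def conversion_weights(rows):
    """Share of each ED's addresses that falls in each overlapping LSOA."""
    ed_totals = defaultdict(int)
    for ed, _lsoa, addresses in rows:
        ed_totals[ed] += addresses
    return {(ed, lsoa): addresses / ed_totals[ed] for ed, lsoa, addresses in rows}

def apportion(ed_counts, weights):
    """Allocate ED counts to LSOAs and sum the parts arriving in each LSOA."""
    lsoa_counts = defaultdict(float)
    for (ed, lsoa), weight in weights.items():
        lsoa_counts[lsoa] += ed_counts.get(ed, 0) * weight
    return dict(lsoa_counts)

weights = conversion_weights(intersections)

# Hypothetical 1981 ED population counts (illustrative values, not census data).
ed_population = {"08DABD03": 600, "08DABD04": 400}
lsoa_population = apportion(ed_population, weights)
```

Because the weights for each ED sum to one, the total population is preserved by the conversion. In a full application every ED contributing to an LSOA (as in Table 3c) would be included, not only the two shown here.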
Calculating deprivation change over time
As a composite measure of deprivation, the Townsend Index is the unweighted sum of the standardised (using z scores) percentages of unemployment (natural log, to account for the skewness in unemployment percentages), non-car ownership, non-home ownership and household overcrowding (natural log, again to account for skewness). The original version and the calculations for other individual census years and geographies are cross-sectional measures of deprivation, applicable to the census year of the input variables. Crucially, a change in index score (or quantile) between censuses cannot be interpreted as an absolute change in deprivation because the cross-sectional measures are time point specific. Time comparable versions of the Townsend Index were developed in Norman (2010) to assess change between 1991 and 2001 (UK coverage), and later extended in time to cover each census from 1971 to 2011 (GB coverage) (Norman, 2016; Norman & Darlington-Pollock, 2017). An equivalent method has been applied to the Carstairs Deprivation Index for Scotland for 1981 to 2011 (Exeter et al., 2019).
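In symbols, the composite score can be written as below. The notation is ours and is not used in the sources cited: for area i, U is the percentage unemployed, C the percentage of households without a car, H the percentage of households not owner occupied and O the percentage of overcrowded households. The exact form of the log transformation (for example, how zero percentages are handled) is not specified here, so the plain natural log is shown.

```latex
T_i = z\!\left(\ln U_i\right) + z\!\left(C_i\right) + z\!\left(H_i\right) + z\!\left(\ln O_i\right),
\qquad
z(x_i) = \frac{x_i - \bar{x}}{s_x}
```

Here \bar{x} and s_x are the mean and standard deviation of the variable concerned. For a cross-sectional index they are taken over all areas at one census; the choice of which observations they are taken over is what distinguishes the cross-sectional and time comparable versions.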
Using the same two LSOAs in Roundhay, Leeds, as used to illustrate geographic data conversion (Fig. 1 / Table 3), Table 4 displays percentages of non-home ownership at each census from 1971 to 2021. For the cross-sectional standardisation to a z score in 1971, the mean across all LSOAs in England and Wales (48.44%) is subtracted from the observation (for the first LSOA here, 47.68%) and the result is divided by the standard deviation (SD) for all LSOAs (26.31). The resulting score is -0.03. Negative z scores are for values which are lower than the national (England and Wales) average and positive z scores for those which are higher. E01011650 is very close to the mean. E01011651 at 27.54% is well below the national average, so the resulting z score is further from zero at -0.79. All the other ‘cross-sectional’ z scores are calculated accordingly. The difficulty with these cross-sectional values is that one year cannot be compared with another in a meaningful way. In E01011650, non-home ownership increased between 2001 and 2011 from 14.91% to 19.33%. However, the z score (-0.79) remained the same because the national average also rose.
To overcome this issue, in the ‘Whole Time Period’ part of Table 4, the mean and standard deviation values are for all areas across all time points. The average non-home ownership over the six censuses was 37.77% and the standard deviation was 23.73. Using these as inputs for the z score calculations with each of the percentages for the individual LSOAs results in scores which are comparable across space and time. For E01011650, in 1971, with non-home ownership above the average for the whole time period, the z score is positive. It then reduces by 1981 as the percentage of non-home ownership fell relative to the whole-period average. In contrast to the counter-intuitive cross-sectional measures, between 2001 and 2011 the z score for E01011650 moves closer to zero as the percentage of non-home ownership rises.
Table 4
Calculation of cross-sectional and time comparable z scores, LSOAs in Roundhay, Leeds, UK
| Non-Home Ownership |
Percentages | 1971 | 1981 | 1991 | 2001 | 2011 | 2021 |
E01011650 | 47.68 | 35.80 | 27.26 | 14.91 | 19.33 | 19.87 |
E01011651 | 27.54 | 31.12 | 25.63 | 33.17 | 37.30 | 34.12 |
Cross-sectional | | | | | | |
Mean | 48.44 | 41.08 | 31.39 | 31.33 | 35.67 | 37.72 |
Standard Deviation | 26.31 | 26.35 | 21.65 | 20.90 | 20.60 | 20.52 |
Z Scores: Year Specific | | | | | | |
E01011650 | -0.03 | -0.20 | -0.19 | -0.79 | -0.79 | -0.87 |
E01011651 | -0.79 | -0.38 | -0.27 | +0.09 | +0.08 | -0.18 |
Whole Time Period | | | | | | |
Mean | 37.77 |
Standard Deviation | 23.73 |
Z Scores: Comparable Over Time |
E01011650 | +0.42 | -0.08 | -0.44 | -0.96 | -0.78 | -0.75 |
E01011651 | -0.43 | -0.28 | -0.51 | -0.19 | -0.02 | -0.15 |
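The two standardisations in Table 4 can be reproduced for E01011650 with the short sketch below. All input values are taken from Table 4; the variable and function names are ours.

```python
def z_score(value, mean, sd):
    """Standardise a value against a given mean and standard deviation."""
    return (value - mean) / sd

# Non-home ownership (%) for E01011650 at the 1971 to 2021 Censuses (Table 4).
pct_e01011650 = [47.68, 35.80, 27.26, 14.91, 19.33, 19.87]

# Year-specific national means and standard deviations (Table 4).
year_mean = [48.44, 41.08, 31.39, 31.33, 35.67, 37.72]
year_sd = [26.31, 26.35, 21.65, 20.90, 20.60, 20.52]

# Mean and standard deviation for all areas across all six censuses (Table 4).
POOLED_MEAN, POOLED_SD = 37.77, 23.73

# Cross-sectional: each census is standardised against its own mean and SD.
cross_sectional = [
    round(z_score(p, m, s), 2)
    for p, m, s in zip(pct_e01011650, year_mean, year_sd)
]

# Time comparable: every census is standardised against the pooled mean and SD.
comparable = [round(z_score(p, POOLED_MEAN, POOLED_SD), 2) for p in pct_e01011650]
```

The cross-sectional values for 2001 and 2011 are both -0.79, whereas the time comparable values move from -0.96 to -0.78, reflecting the rise in non-home ownership in this LSOA.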
Using this approach, for each LSOA and census year, the percentages of non-home ownership and no car and the log transformed percentages of unemployment and overcrowding have had time comparable z scores calculated across the six censuses. These are then summed, unweighted (as is standard for the Townsend measure), to a final deprivation score. The areas with higher levels of deprivation have scores which are positive, with negative scores representing areas with lower levels of deprivation. Since the scores are comparable over time, if the scores are reducing, then the area is becoming less deprived (and vice versa).
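A minimal sketch of the composite calculation is given below. The four records are hypothetical (two areas at two censuses) and are not census data. We assume the population standard deviation and a plain natural log; the treatment of zero percentages is not specified in the text and is not handled here.

```python
from math import log
from statistics import mean, pstdev

# Hypothetical records (illustrative values only):
# (area, year, % unemployed, % no car, % not owner occupied, % overcrowded)
records = [
    ("A", 1971, 4.0, 50.0, 48.0, 6.0),
    ("A", 2021, 5.0, 20.0, 20.0, 3.0),
    ("B", 1971, 8.0, 70.0, 75.0, 12.0),
    ("B", 2021, 9.0, 45.0, 60.0, 8.0),
]

def pooled_z(values):
    """Standardise against the mean and SD of all area-year observations."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

# One list of z scores per input variable, pooled over areas and years.
columns = [
    pooled_z([log(r[2]) for r in records]),  # unemployment, natural log
    pooled_z([r[3] for r in records]),       # no car
    pooled_z([r[4] for r in records]),       # non-home ownership
    pooled_z([log(r[5]) for r in records]),  # overcrowding, natural log
]

# Unweighted sum of the four z scores for each area and year.
scores = {
    (r[0], r[1]): sum(col[i] for col in columns)
    for i, r in enumerate(records)
}
```

In this toy example both areas have lower scores in 2021 than in 1971, which would be read as both becoming less deprived over the period.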
Many applications use the continuous scores categorised into quantiles (see Norman et al., 2023). We categorise the deprivation scores into quintiles such that the cut-offs partition the scores into groups of equal population size. Here LSOAs in quintile 1 are the least deprived areas and those in quintile 5 are the most deprived. If an area changes quintile over time, a logical interpretation can then be made as to whether it has become more or less deprived.
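One way to form quintiles of equal population size is sketched below. The area codes, scores and populations are hypothetical, and the rule of assigning an area by the midpoint of its cumulative population span is our assumption; the text does not state how areas straddling a cut-off are treated.

```python
def population_quintiles(areas):
    """Assign quintiles 1 (least deprived) to 5 (most deprived).

    `areas` is a list of (code, score, population) tuples. Areas are ranked
    by score and allocated by cumulative population share in steps of 20%.
    """
    total = sum(pop for _code, _score, pop in areas)
    quintiles = {}
    cumulative = 0
    for code, _score, pop in sorted(areas, key=lambda a: a[1]):
        # Assumed rule: use the midpoint of the area's population span.
        midpoint_share = (cumulative + pop / 2) / total
        quintiles[code] = min(int(midpoint_share * 5) + 1, 5)
        cumulative += pop
    return quintiles

# Hypothetical areas (illustrative values only): (code, score, population).
areas = [
    ("X3", 0.0, 1000),
    ("X1", -4.0, 1000),
    ("X5", 5.0, 1000),
    ("X2", -2.0, 1000),
    ("X4", 2.0, 1000),
]
quintiles = population_quintiles(areas)
```

With the cut-offs for the pooled scores held fixed across censuses, a change in an area's quintile between time points can be read directly as a change in its deprivation.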
Using the same approach as Exeter et al. (2019), in addition to using quintiles, we also cluster areas using k-means classification of the z scores of each input variable. This results in five categories of LSOAs, each representing a different deprivation trajectory over the period 1971 to 2021. We provide further details about this classification in the analysis below.
Note that for this time-series of scores, quintiles and trajectory categories, if an area had a population of fewer than 100 people (as in the original versions of both the Townsend and Carstairs indexes), then it is excluded from any calculation for that time point. This may be the case where areas with few or no persons present at previous censuses have subsequently been developed with residential housing. Other areas may previously have been populated, but the housing was demolished and later redeveloped, so the time series of census data is interrupted.