Study Design
Building on previous cross-sectional analyses, we used longitudinal data to examine changes in NP supply in NYS and comparison states (i.e., Pennsylvania [PA] and New Jersey [NJ]). We first examined unadjusted longitudinal trends in NYS to describe NP supply in primary care practices from 2012-2018. However, these unadjusted trends cannot be interpreted causally, because any increase in NP supply in NYS could have occurred independent of the NP Modernization Act, as NP supply has been increasing across the United States. Therefore, we used a difference-in-differences design to compare NP supply in primary care practices in NYS to that in neighboring comparison states before and after the policy change. In a difference-in-differences design, the impact of the law is estimated by comparing the experience in NYS before and after policy implementation to what would have happened in the absence of the policy change (i.e., the counterfactual). To establish the counterfactual, we compared changes in NP supply in primary care practices in NYS to those in PA and NJ before and after implementation of the NP Modernization Act. The comparison states border NYS, have similar population and healthcare market characteristics, and serve as reasonable controls for the secular increase in NP counts. We would conclude that the law had a positive impact if the pre-post changes for practices in NYS were greater than the pre-post changes for practices in PA and NJ, and a negative impact if the pre-post changes were smaller than those in PA and NJ.
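The logic of the estimator can be illustrated with a simple 2x2 calculation; the numbers below are purely illustrative, not study results:

```python
# Hypothetical mean NP counts per practice, before vs. after 2015 (illustrative only)
ny_pre, ny_post = 1.2, 1.8        # New York State
cmp_pre, cmp_post = 1.1, 1.4      # comparison states (PA/NJ)

# Difference-in-differences: NYS's pre-post change minus the comparison states' change
did = (ny_post - ny_pre) - (cmp_post - cmp_pre)
print(round(did, 2))  # 0.3 -> a positive estimated impact of the law
```

In the positive-impact case sketched here, NYS's change (0.6) exceeds the comparison states' change (0.3), so the counterfactual trend is netted out and the remainder is attributed to the policy.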
Data Sources
SK&A database. We used longitudinal data from the SK&A outpatient database (2012-2018) to identify primary care practices in NYS and in the comparison states (i.e., PA and NJ). The SK&A database contains information on the population of office-based providers in the U.S. and is the most complete resource of its kind (32). It provides information on providers (e.g., NPs, physicians) including name, practice name and location, contact information, network affiliation, and National Provider Identifier (NPI). This data source has an advantage over other sources in that it also contains information on the number of providers, site specialty, and practice ownership that is not available elsewhere (32). For our study, we retained only primary care practices, defined as practices in which more than half of the providers had an individual specialty of Family Practitioner, General Practitioner, Geriatrician, Internal Medicine/Pediatrics, Internist, Preventive Medicine Specialist, or Pediatrics. For our analysis, we retained only practices that had observations for all seven years (2012-2018) so that we could trace the full path of changes in NP counts in those practices over the entire study period.
American Community Survey (ACS). The ACS is an annual random survey of approximately 3.5 million housing units conducted by the U.S. Census Bureau. We merged these data with the SK&A practice-level data to identify practices located in underserved (i.e., rural or low-income) Zip codes. At the Zip-code level, only 5-year average estimates are available for income.
Area Health Resource File (AHRF). We also used AHRF, which is a dataset of county-level health information assembled by the Health Resources and Services Administration. The AHRF pulls information about health professionals, facilities, and demographic information from over 50 discrete data sources. These data were used to construct market-level covariates.
Variables
Dependent Variables. Using the SK&A dataset, we calculated practice-level dependent variables. The SK&A file provides provider-level observations, which we used to sum the total number of NPs within a practice. We then used these counts to calculate the following variables: 1) whether a practice had at least one NP and 2) the total count of NPs. In the SK&A database, some NPs are attributed to multiple primary care practices. To ensure that we did not double count these providers, we estimated a full-time equivalent (FTE) for each NP and divided those FTEs across all primary care practice sites to which the NP was attributed in the SK&A database. Because we did not have data on the actual amount of time each NP spent at each practice, we divided each NP's FTE evenly across the attributed practices. For example, if an NP had three listed practices as their employment setting in the SK&A database, each of those practices would be assigned 0.33 FTE for that NP.
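The even FTE split can be sketched as follows; the provider and practice identifiers are hypothetical, not SK&A records:

```python
from collections import defaultdict

# Hypothetical NP-to-practice attributions (identifiers are invented)
np_practices = {
    "np_001": ["practice_A", "practice_B", "practice_C"],  # 3 sites -> 1/3 FTE each
    "np_002": ["practice_A"],                              # 1 site  -> 1.0 FTE
}

fte_by_practice = defaultdict(float)
for np_id, sites in np_practices.items():
    share = 1.0 / len(sites)      # split each NP's single FTE evenly across sites
    for site in sites:
        fte_by_practice[site] += share

print(round(fte_by_practice["practice_A"], 2))  # 1.33 (1/3 from np_001 + 1.0 from np_002)
```

Summing these shares at the practice level avoids counting a multi-site NP more than once in the total.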
Independent Variable. Our primary independent variable of interest was the NY NP Modernization Act. The NP Modernization Act was implemented in January of 2015. In order to estimate the impact of the Act, we generated a binary treatment variable which took the value of “1” in NYS in the years 2015, 2016, 2017, and 2018 and “0” otherwise.
We compared the impact of the NP Modernization Act overall and separately for practices in rural areas and those in low-income areas. Using ACS data, we characterized each practice as being in an underserved area, defined as a rural and/or low-income community. We considered a practice rural if it was located in a rural (RUCA > 4) Zip code according to Rural-Urban Commuting Area Codes (RUCAs) (33, 34). We identified a practice as being in a low-income community if the median household income of its Zip code was in the lowest quartile among Zip codes across the study states (35, 36).
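The classification rule can be sketched as a small function; the income cutoff below is a placeholder argument, whereas the study derives the lowest-quartile cutoff from the ACS:

```python
def is_underserved(ruca_code: int, median_income: float, income_q1_cutoff: float) -> bool:
    """Flag a practice Zip code as underserved: rural (RUCA > 4) and/or low-income
    (median household income in the lowest quartile of study-state Zip codes)."""
    rural = ruca_code > 4
    low_income = median_income <= income_q1_cutoff
    return rural or low_income

# Hypothetical inputs: RUCA 7 (rural) with an above-cutoff median income
print(is_underserved(ruca_code=7, median_income=62_000, income_q1_cutoff=45_000))  # True
```

Because the definition is a union of the two criteria, a practice needs to satisfy only one of them to be classified as underserved.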
Covariates. In the analytic models, we controlled for a number of practice- and market-level variables. For practice characteristics, we included the number of physicians in the practice to control for practice size. For market characteristics, in all regressions, we included the following AHRF-derived county characteristics in each year: the percent of the population that is Medicare-eligible, Medicare Advantage penetration rate, per capita income, poverty rate, unemployment rate, number of active physicians per 100,000 population, number of hospital beds per 100,000 population, number of federally qualified health centers (FQHCs), and number of rural health clinics. These represent time-varying patient demand and healthcare supply factors that may influence NP supply. We have used these control variables in other studies (20).
Data Analysis
We first examined and described unadjusted trends in the supply of NPs in primary care practices from 2012-2018 in NYS as well as the comparison states. We then estimated the association between the NYS NP Modernization Act and NP supply in primary care practices using a difference-in-differences design. The difference-in-differences model is constructed as a fixed effects model, which introduces a separate intercept for each practice and each year. This regression specification allows us to trace within-practice changes in NP counts over time both before and after the policy change. Specifically, we used ordinary least squares regression to estimate the following model for practice i in county j and year t:

Y_ijt = Σ_k β_k(NYS_i × 1[Year_t = k]) + λX_ijt + δ_t + ϒ_i + ε_ijt,  k ∈ {2012, …, 2018}, k ≠ 2015
Where Y represents the dependent variables, X represents a matrix of practice and market covariates, δ is the year fixed effect, ϒi is the practice-level fixed effect, and ε is the error term. The coefficients βk represent the effect of the NP Modernization Act separately in each of the 3 years leading into the policy (i.e., 2012-2014) and the 3 years following the policy (i.e., 2016-2018) for practices located in NYS relative to practices in the comparison states. The year 2015, in which the policy was implemented, is omitted as the reference year. To calculate a single overall effect of the policy, we averaged the post-2015 coefficients and used the delta method to calculate standard errors on those averages. We estimated these models for the full sample and then separately by rural and low-income areas to assess differential effects of the policy on practices located in these communities.
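Averaging the post-period coefficients is a linear combination a'β, so the delta-method standard error reduces to sqrt(a'Va), where V is the coefficients' covariance block. A sketch with hypothetical estimates (not study results):

```python
import numpy as np

# Hypothetical post-period event-study coefficients and their covariance block
b_post = np.array([0.10, 0.15, 0.20])        # beta_2016, beta_2017, beta_2018
V_post = np.array([[0.004, 0.001, 0.001],
                   [0.001, 0.005, 0.001],
                   [0.001, 0.001, 0.006]])

a = np.full(3, 1 / 3)                        # equal weights for the average
avg_effect = a @ b_post                      # single overall post-policy effect
se = np.sqrt(a @ V_post @ a)                 # delta method (exact for a linear combination)
print(round(float(avg_effect), 2), round(float(se), 3))  # 0.15 0.048
```

Because the combination is linear, the delta method gives the exact variance rather than a first-order approximation.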
The primary assumption of the difference-in-differences design is parallel trends in the pre-policy period: that is, absent the policy, practices in NYS would have experienced the same trend in outcomes as practices located in PA and NJ. While this assumption is not directly testable, since it concerns trends in potential outcomes after the policy change, we tested for differential trends in the pre-period by conducting an F-test that the pre-period coefficients are jointly equal to zero. A lack of differences in the pre-period would raise confidence in the credibility of the research design, and in particular in using practices in PA and NJ as valid comparisons.
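Given estimated pre-period coefficients and their covariance, the joint test is a standard Wald statistic (equivalent to the F-test up to a degrees-of-freedom scaling); the numbers below are hypothetical:

```python
import numpy as np

# Hypothetical pre-period coefficients (2012-2014) and their covariance
b_pre = np.array([0.010, -0.020, 0.015])
V_pre = np.diag([0.004, 0.004, 0.004])

# Wald statistic b' V^{-1} b, chi-squared with 3 df under the null of no pre-trend
wald = b_pre @ np.linalg.solve(V_pre, b_pre)
print(round(float(wald), 2))  # 0.18, well below the 5% chi2(3) critical value of 7.81
```

In this illustrative case the statistic is far below the critical value, so the null of jointly zero pre-period coefficients would not be rejected, supporting the parallel-trends assumption.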
All models were estimated using ordinary least squares regression. Models using the binary variable of having at least one NP in the practice were estimated as linear probability models. In each regression, we clustered the standard errors by practice to account for autocorrelation within a practice. When modeling count data, the error term is unlikely to be normally distributed, and test statistics may therefore be invalid; we therefore bootstrapped the standard errors. As a sensitivity check, we also re-estimated our models using the full dataset rather than the balanced sample (i.e., only practices present in our dataset in each year of the study). The results were the same.
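A practice-level (cluster) bootstrap resamples whole practices with replacement, preserving within-practice autocorrelation when recomputing the statistic of interest. A minimal sketch with hypothetical data, shown for a simple mean rather than the full regression:

```python
import random
import statistics

random.seed(42)

# Hypothetical yearly NP counts for four practices (clusters)
practices = {
    "A": [1, 1, 2, 2], "B": [0, 0, 0, 1],
    "C": [2, 2, 3, 3], "D": [1, 2, 2, 2],
}

ids = list(practices)
boot_means = []
for _ in range(1000):
    sample = random.choices(ids, k=len(ids))        # resample whole practices
    obs = [y for pid in sample for y in practices[pid]]
    boot_means.append(statistics.mean(obs))

boot_se = statistics.stdev(boot_means)              # clustered bootstrap SE of the mean
print(round(boot_se, 2))
```

Resampling at the cluster level, rather than observation by observation, keeps each practice's year-to-year correlation intact within every bootstrap draw.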