Baseload power potential from optimally-configured wind, solar and storage power plants across the United States


In the present paper, we assessed the potential for local wind, solar PV, and energy storage to provide baseload (constant, uninterrupted) power in every county of the contiguous United States. The amount of available capacity between 2020 and 2050 was determined via a least-cost optimization model that took into account changing costs of constituent technologies and local meteorological conditions. We found that, by 2050, the potential exists for about 6.8 TW of renewable baseload power at an average cost of approximately $50 / MWh, which is competitive with current wholesale market rates for electricity. The optimal technology configurations always resulted in over two hours of emergency energy reserves, with the amount increasing as the price of energy storage falls. We also found that, given current price decline trajectories, the model tends to select more solar capacity than wind over time. A second part of the study performed three million simulations followed by a regression analysis to generate an online map-based tool that allows users to change input cost assumptions and compute the cost of renewable baseload electricity in every contiguous US county.


Introduction
There is substantial debate in the scientific literature 1,2,3,4,5,6 and popular discourse 7,8 about the ability of variable renewable energy sources, such as wind and solar, to replace firm or "baseload" generation assets, such as natural gas, coal, or nuclear power. This discussion is often further hampered by a lack of spatial detail: the same solar panel or wind turbine could have vastly different production characteristics in different locations, whereas the output of thermal plants does not vary as strongly with location.
Policy support 9 and rapid declines in the costs of wind and solar have led to their wide deployment across the world. In the United States (US), almost half of generation capacity deployments from 2009 to 2018 were renewables 10 and they are expected to continue to grow in the future 11 . However, their benefit is limited by their reliance on variable, just-in-time "fuel" delivery in the form of solar radiation and wind. The ability of an electricity grid to incorporate high levels of renewables can also be limited by stability constraints such as required inertia levels 12 .
Recent analysis has suggested that it might be economically beneficial to overbuild renewable energy capacity and simply curtail the excess energy 13 . But, while this might be true when considering overall system costs, it would be harder to justify in deregulated markets where individual power plant operators are looking to maximize profits. Other studies have looked at grid-scale energy storage systems as a way to buffer such concerns by reducing the amount of curtailed energy 14 , and recent all-technology competitive utility capacity RFPs (requests for proposals) have been won by combinations of renewables and storage 15 .
Further analyses have shown that renewable energy and storage systems can be used to fit useful load shapes at reasonable costs 16 as well as minimize behind the meter costs when differences in rates and feed-in tariffs allow for load shifting and demand response 17 . There also exists the potential for transmission and distribution infrastructure expansion deferral with the appropriate placement of energy storage systems within a network 18 .
Renewables paired with energy storage systems (ESS), or "hybrid projects", are becoming more popular as the costs of each component decline. However, the optimal amounts of renewables and energy storage needed to produce a given output vary depending on both location and the relative costs of each component. For instance, when energy storage costs are high, it might be preferable to deploy more lower-cost renewables with complementary output and less storage, but as storage costs decline, one might need less generation capacity overall.
Most analyses have focused on the impact of combinations of renewables and storage for just a few sites, or a narrow set of use cases, but, to the authors' knowledge, there has not been a systematic assessment of the amount, and cost, of providing a standard output over a wide band of prices and locations. The present analysis seeks to fill that knowledge gap.
For the present study, we perform the following steps: 1) Build an open-source optimization model that finds the least-cost combinations of solar PV capacity (single-axis tracking), onshore wind capacity (100m hub height), and energy storage capacity (both power and energy components) required to provide a constant (1 MW in the model, but scalable), uninterrupted, baseload generation output of electricity for multiple calendar years; 2) Apply the model over each and every county in the contiguous US for three (2014, 2015, and 2016) calendar years of hourly constructed power data; 3) Repeat the analysis with changing technology cost assumptions between 2020 and 2050; 4) Run the model again in each county with over 1,000 input price combinations to capture uncertainty in future cost projections; 5) Build an online tool that allows users to use their own cost assumptions with our methodology.

Methods and data
The present study seeks to provide a robust, geographic and temporal, direct comparison between different baseload generation options. First, we cost-optimize combinations of wind, solar and energy storage capacity that would be required to supply a constant, uninterrupted, hourly 1 MW output for three calendar years over every contiguous US county given 1,000 combinations of prices between the renewable generators and energy storage systems. To perform such optimization, we must construct three calendar years of hourly county-level capacity factor data for solar PV (single-axis tracking, tilted at latitude with North-South azimuth) and wind (100-meter hub height). We create the wind and solar data using data from the National Oceanic and Atmospheric Administration (NOAA) High Resolution Rapid Refresh (HRRR) weather forecast model and applying power algorithms.
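To generate the combined wind, solar PV, and storage baseload power plants, we created an open-source optimization model (WIS:dom®-B 1 ) that was derived from larger-scale optimization models using the following basic equations:

$$\min_{x_\phi,\, D_t,\, \mathcal{E}^{c}_{t}} \; \sum_{\phi \in \Phi} C_\phi\, x_\phi + \mathcal{E}^{v} \sum_{t} D_t \qquad (1)$$

such that

$$\mathcal{R}^{s}_{t}\, x_{s} + \mathcal{R}^{w}_{t}\, x_{w} + D_t - \mathcal{E}^{c}_{t} \ge \mathcal{L}_t \quad \forall t \qquad (2)$$

where $\Phi$ is the set of capacities that the model can choose from, including solar PV ($x_s$), wind ($x_w$), energy storage power ($x_p$) and the battery energy capacity ($x_e$). $C_\phi$ is the amortized cost of each of the technologies that the model can choose from, $D_t$ is the discharge of the energy storage system (ESS) at time $t$, and $\mathcal{E}^{v}$ is the variable operations and maintenance cost of the energy storage system. $\mathcal{L}_t$ is the demand at time $t$ that the model must meet given the chosen suite of technologies; $\mathcal{R}^{s}_{t}$ is the capacity factor of the solar PV at the location considered at time $t$; $\mathcal{R}^{w}_{t}$ is the capacity factor of the wind at the location considered at time $t$; and $\mathcal{E}^{c}_{t}$ is the charging (or energizing) of the energy storage system at time $t$.

The energy storage system is further constrained by

$$0 \le D_t \le x_p, \qquad 0 \le \mathcal{E}^{c}_{t} \le x_p \quad \forall t \qquad (3)$$

$$S^{\min} \le S_t \le x_e \quad \forall t \qquad (4)$$

$$S_t = (1 - \delta)\, S_{t-1} + \eta_c\, \mathcal{E}^{c}_{t} - \frac{D_t}{\eta_d}, \qquad S_0 = S_T \qquad (5)$$

where $S^{\min}$ is the minimum amount of energy that the energy storage system is required to have at all times, $S_t$ is the energy storage system state of charge at time $t$, and $T$ is the final time step. Note that we use cyclic boundary conditions for the ESS state of charge equation set. In Equation set (5), $\delta$ is the self-discharge rate, $\eta_c$ is the efficiency of charging, and $\eta_d$ is the efficiency of discharging of the ESS.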
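The storage constraints described above can be illustrated with a small hourly simulation. The following is a minimal sketch, not the WIS:dom®-B optimizer: it takes the capacities as given and applies a simple greedy charge/discharge rule rather than solving for the least-cost mix. The function name, the argument names, and the default efficiency values are assumptions made for illustration; only the 5% minimum state of charge comes from the text.

```python
def simulate_dispatch(solar_cf, wind_cf, x_s, x_w, x_p, x_e,
                      load=1.0, soc_min_frac=0.05,
                      eta_c=0.9, eta_d=0.9, delta=0.0):
    """Greedy hourly dispatch of a wind/solar/storage plant against a flat load.

    solar_cf, wind_cf : hourly capacity factors (0..1)
    x_s, x_w          : solar and wind capacity (MW)
    x_p, x_e          : storage power (MW) and energy (MWh) capacity
    eta_c, eta_d      : charge/discharge efficiencies (illustrative values)
    delta             : hourly self-discharge rate (illustrative value)
    Returns (unserved_mwh, curtailed_mwh, soc_trace).
    """
    soc = x_e                      # start from a full store (assumption)
    soc_min = soc_min_frac * x_e   # energy that must always remain in reserve
    unserved = 0.0
    curtailed = 0.0
    trace = []
    for cf_s, cf_w in zip(solar_cf, wind_cf):
        soc *= (1.0 - delta)                     # self-discharge
        generation = cf_s * x_s + cf_w * x_w
        if generation >= load:
            surplus = generation - load
            # charging is limited by inverter size and remaining headroom
            charge = min(surplus, x_p, (x_e - soc) / eta_c)
            soc += eta_c * charge
            curtailed += surplus - charge
        else:
            deficit = load - generation
            # discharging is limited by inverter size and usable stored energy
            usable = max(soc - soc_min, 0.0) * eta_d
            discharge = min(deficit, x_p, usable)
            soc -= discharge / eta_d
            unserved += deficit - discharge
        trace.append(soc)
    return unserved, curtailed, trace
```

A configuration is a valid baseload plant in the sense used here only if `unserved` is zero over every hour of the weather record; the optimization model searches for the cheapest capacities for which that holds.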
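The annual cost, $C_\phi$, of each technology is given by:

$$C_\phi = (1 - \mathcal{I}_\phi)\, \mathcal{O}_\phi\, \frac{\mathcal{R}_\phi\, (1 + \mathcal{R}_\phi)^{\mathcal{B}_\phi}}{(1 + \mathcal{R}_\phi)^{\mathcal{B}_\phi} - 1} + F_\phi + V_\phi\, P_\phi \qquad (6)$$

where $\mathcal{I}_\phi$ is the available investment tax credit; $\mathcal{O}_\phi$ is the overnight price of technology $\phi$; $\mathcal{R}_\phi$ is the weighted cost of capital associated with technology $\phi$; $\mathcal{B}_\phi$ is the book life of technology $\phi$; $F_\phi$ is the annual fixed operations and maintenance costs of technology $\phi$; $V_\phi$ is the annual variable costs of technology $\phi$; and $P_\phi$ is the expected electricity production of technology $\phi$ (given per unit of wind or solar PV deployed for a given location). Note that $V_\phi$ and $P_\phi$ are only included in Eq. (6) for wind and solar PV; because the expected production per unit of capacity is known for a given location, the variable costs of these technologies are combined into their fixed costs without loss of generality.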
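As an illustration of the annualization described above, the sketch below converts an overnight price into an annual cost. It assumes that the amortization uses the standard capital recovery factor; the function and argument names are hypothetical, and the numbers in the usage line are placeholders rather than values from Table 1.

```python
def capital_recovery_factor(rate, book_life):
    """Fraction of an up-front investment repaid each year over book_life years."""
    if rate == 0:
        return 1.0 / book_life
    growth = (1.0 + rate) ** book_life
    return rate * growth / (growth - 1.0)


def annual_cost(overnight, itc, rate, book_life, fixed_om,
                variable_cost=0.0, expected_production=0.0):
    """Annual cost per unit of capacity.

    overnight           : overnight price (e.g. $/MW)
    itc                 : investment tax credit as a fraction (0..1)
    rate                : weighted cost of capital as a fraction
    book_life           : book life in years
    fixed_om            : annual fixed O&M (e.g. $/MW-yr)
    variable_cost       : variable cost (e.g. $/MWh), wind and solar only
    expected_production : expected annual output per unit capacity (MWh/MW)
    """
    capital = (1.0 - itc) * overnight * capital_recovery_factor(rate, book_life)
    return capital + fixed_om + variable_cost * expected_production


# Placeholder inputs for illustration only.
example = annual_cost(overnight=1_000_000, itc=0.0, rate=0.05,
                      book_life=20, fixed_om=15_000)
```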
From the results of the optimization, we calculate a metric defined as the levelized cost of delivered electricity (LCODE) for the optimized systems. LCODE is calculated with the following formula:

$$\mathrm{LCODE} = \frac{\sum_{\phi \in \Phi} C_\phi\, x_\phi + \mathcal{E}^{v} \sum_{t} D_t}{\sum_{t} \mathcal{L}_t} \qquad (7)$$

where the numerator is the total cost of the optimized system (the annual cost $C_\phi$ of each installed capacity $x_\phi$ plus the variable operations and maintenance cost $\mathcal{E}^{v}$ applied to the storage discharge $D_t$) and $\mathcal{L}_t$ is the demand at time $t$. Note that in the formulation of the LCODE, only the electricity used to meet demand is considered as useful (for an annual 1 MW constant output, $\sum_t \mathcal{L}_t$ = 8,760 MWh). Any curtailment of electricity (combined generation from wind and solar above demand that was not stored in the ESS) contributes to costs, but not consumption, although that energy could have useful value to the grid or to other processes such as hydrogen generation.

Table 1 shows a breakdown of the relative costs of each component in the hybrid system from 2020 to 2050. While the cost projections in Table 1 provide a wide range of possible costs for the model to consider, predicting the future cost of a particular technology, or the future relative costs of different technologies, is challenging. For example, cost declines might stall for one technology while accelerating for another, resulting in an optimal solution that differs from the solutions given the inputs in Table 1. Thus, to explore a richer parameter space of how differing cost changes between the technologies impact the results, we ran a parametric analysis of prices in each county.
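The LCODE definition can be expressed in a few lines. This is a minimal sketch under the assumption that the numerator is the full annual system cost (annualized capacity costs plus storage variable O&M) and the denominator is only the delivered demand; the function name, dictionary keys, and the numbers in the usage example are hypothetical.

```python
def lcode(capacities, annual_costs, storage_vom, discharge, demand):
    """Levelized cost of delivered electricity ($/MWh).

    capacities   : {technology: installed capacity}
    annual_costs : {technology: annual cost per unit of capacity}
    storage_vom  : storage variable O&M cost ($/MWh discharged)
    discharge    : hourly storage discharge (MWh)
    demand       : hourly delivered demand (MWh); curtailed energy is excluded
    """
    capacity_cost = sum(annual_costs[tech] * capacities[tech] for tech in capacities)
    variable_cost = storage_vom * sum(discharge)
    return (capacity_cost + variable_cost) / sum(demand)


# Hypothetical one-year example for a flat 1 MW output (8,760 MWh delivered).
hours = 8760
value = lcode(
    capacities={"solar": 4.0, "wind": 1.0, "ess_power": 1.0, "ess_energy": 20.0},
    annual_costs={"solar": 60_000, "wind": 110_000,
                  "ess_power": 20_000, "ess_energy": 12_000},
    storage_vom=0.0,
    discharge=[0.0] * hours,
    demand=[1.0] * hours,
)
```

Because curtailed energy appears in the cost but not in the denominator, oversizing generation always raises the LCODE unless it displaces more expensive storage.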
To compute the parameterization, we divided the amortized cost of each technology by that of solar PV to define a range of relative costs between the technologies. Dividing the costs of wind and energy storage power by solar costs results in unitless relative costs; however, dividing energy storage energy costs by solar costs results in relative costs with units of per hour [hr⁻¹]. We chose this approach because the relative cost of energy storage is a practical unit.
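The units of the relative costs follow from a short dimensional check. The block below is an illustrative sketch that assumes power-rated costs are quoted per kW per year and energy-rated costs per kWh per year; the symbols $C_s$, $C_w$, $C_p$, and $C_e$ (amortized costs of solar PV, wind, storage power, and storage energy) are shorthand introduced here.

```latex
\begin{align*}
\frac{C_w}{C_s},\ \frac{C_p}{C_s} &:\quad
  \frac{[\$\,\mathrm{kW^{-1}\,yr^{-1}}]}{[\$\,\mathrm{kW^{-1}\,yr^{-1}}]} = [\,\text{unitless}\,] \\[4pt]
\frac{C_e}{C_s} &:\quad
  \frac{[\$\,\mathrm{kWh^{-1}\,yr^{-1}}]}{[\$\,\mathrm{kW^{-1}\,yr^{-1}}]}
  = \frac{[\mathrm{kW}]}{[\mathrm{kWh}]} = [\mathrm{hr^{-1}}]
\end{align*}
```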
Finally, we took the ranges of each "normalized" cost and extended them further to capture a wider range of possible relative costs. For each technology, we divided the new ranges into ten equally spaced percentages as shown in Table 2. We executed 1,000 optimizations (all combinations of the three columns in Table 2) for each county, resulting in over three million individual optimizations in total (each solving over three calendar years at hourly resolution). This allowed us to explore an uncertain future more fully and provided a large dataset for our regression analysis.
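The structure of the parametric sweep can be sketched as a full factorial grid. The ranges below are placeholders, since the Table 2 values are not reproduced here, and the function names are hypothetical; the point is only that ten levels for each of three relative costs yield 1,000 combinations per county.

```python
from itertools import product


def equally_spaced(low, high, steps):
    """Return `steps` equally spaced values from low to high (inclusive)."""
    return [low + (high - low) * i / (steps - 1) for i in range(steps)]


def build_price_grid(wind_range, ess_power_range, ess_energy_range, steps=10):
    """All combinations of relative costs (each normalized by the solar cost)."""
    return list(product(
        equally_spaced(*wind_range, steps),
        equally_spaced(*ess_power_range, steps),
        equally_spaced(*ess_energy_range, steps),
    ))


# Placeholder ranges, NOT the values from Table 2.
grid = build_price_grid(wind_range=(0.5, 3.0),
                        ess_power_range=(0.1, 1.0),
                        ess_energy_range=(0.05, 0.5))
runs_per_county = len(grid)  # 10 * 10 * 10 combinations
```

With roughly 3,100 counties in the contiguous US, 1,000 runs per county gives a total on the order of three million optimizations, consistent with the count stated above.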
Renewable power data set
To calculate the county-level renewable capacity factors, the present paper used three (2014-2016) years' worth of hourly weather data based on the NOAA HRRR weather forecast model, which is a specially configured version of Advanced Research WRF (ARW) model. The HRRR is run hourly on a 3-km grid resolution and its domain covers the continental United States as well as portions of Canada and Mexico 20 . Using multiple years' worth of data allowed the model to compensate for a wide range of meteorological events such as the polar vortex of 2014 and the higher than normal jet stream of 2015.

Wind power dataset
The wind power dataset is created by using the three-dimensional wind, temperature, air density and specific humidity data from the HRRR and using the rotor equivalent formulations 21,22 . The rotor-equivalent formulation area-weights the effect of the variables to account for variations as a function of height within the wind turbine rotor layer. Therefore, in the case of wind speed, the rotor equivalent formulation takes into account the effects of wind speed shear and wind direction veer, each of which has been shown to have, on average, an effect of around 5% on wind power generation 22,23 . Once the rotor equivalent values are calculated, the wind power can be estimated using:

$$P = \frac{1}{2}\, C_p\, \rho\, A\, v_{eq}^{3}$$

where $P$ is the power output of the wind turbine, $C_p$ is the coefficient of performance for the wind turbine 2 , $\rho$ is the density of air, $A$ is the rotor swept area of the turbine, and $v_{eq}$ is the rotor-equivalent wind speed.
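The power relation above can be sketched as follows. The rotor-equivalent speed here is an assumed, simplified form: an area-weighted cube-root mean of the wind speeds in horizontal slices of the rotor disk, accounting for shear only and ignoring veer. The formulations cited in the text may differ in detail, and the function names and example numbers are hypothetical.

```python
def rotor_equivalent_speed(speeds, areas):
    """Area-weighted rotor-equivalent wind speed (shear only, veer ignored).

    speeds : wind speed in each horizontal slice of the rotor disk (m/s)
    areas  : rotor area covered by each slice (m^2)
    """
    total_area = sum(areas)
    mean_cube = sum((a / total_area) * v ** 3 for v, a in zip(speeds, areas))
    return mean_cube ** (1.0 / 3.0)


def wind_power(cp, air_density, swept_area, v_eq):
    """Power output in watts: P = 0.5 * Cp * rho * A * v_eq^3."""
    return 0.5 * cp * air_density * swept_area * v_eq ** 3


# Hypothetical three-slice rotor with faster wind aloft.
v_eq = rotor_equivalent_speed(speeds=[7.0, 8.0, 9.0], areas=[2000.0, 4000.0, 2000.0])
power_w = wind_power(cp=0.4, air_density=1.225, swept_area=8000.0, v_eq=v_eq)
```

The sketch treats the coefficient of performance as a constant; in practice it varies with wind speed and the turbine's cut-in, rated, and cut-out limits.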

Solar power dataset
The solar power dataset is created using downwelling short and longwave radiation coupled with 2-m temperature from the HRRR. For 2014 and 2015, the HRRR did not output direct normal irradiance (DNI) and diffuse horizontal irradiance (DHI) and, therefore, these were calculated by performing multiple multi-variate regressions (for GHI and DNI separately) using the HRRR variables and observations from GOES-East and GOES-West as exogenous variables and ground-based observations of GHI and DNI from the NOAA SURFRAD and SOLRAD sites as endogenous variables 24 . The regression has the added benefit of correcting for model biases in downwelling radiation that mainly arose from improper representation of the cloud field. Starting in 2016, the satellite measurements were assimilated into the HRRR and the model output both DNI and DHI. Therefore, for 2016, the regression is performed using just the model variables against the ground-based observations to correct for any model biases in estimates of irradiance components. Once the irradiance components are obtained, the solar PV production is estimated using an empirically derived model developed by King et al. 25 at Sandia National Laboratories.
The above power calculations are performed for each 3-km HRRR grid cell at 5-minute time resolution. For the purpose of the present paper, the data is spatially aggregated to county level and temporally aggregated to hourly. It is assumed that all the generation occurring within a county needs to be moved to the population-weighted center for consumption. The aggregated county wind and solar generation profiles are calculated as an average of the power capacity factors, weighted by the available capacity potential (taking into account siting restrictions) and derated for transmission loss, as shown here:

$$\overline{CF}_{t} = \frac{\sum_{i} \mathcal{P}_{i}\, CF_{i,t}\, (1 - \tau\, d_{i})}{\sum_{i} \mathcal{P}_{i}}$$

where $\overline{CF}_{t}$ is the average hourly capacity factor for a given county; $\mathcal{P}_{i}$ is the potential within the HRRR cell $i$; $CF_{i,t}$ is the hourly capacity factor of the resource within the HRRR cell $i$; the transmission loss, $\tau$, is assumed to be 1.5625% per 100 km; and $d_{i}$ is the distance from cell $i$ to the population centroid of the county.
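The county aggregation can be illustrated with a short function. This is a sketch that assumes the transmission loss scales linearly with distance at the stated 1.5625% per 100 km; the function and argument names, and the two-cell example, are hypothetical.

```python
LOSS_PER_KM = 0.015625 / 100.0  # 1.5625% per 100 km, as stated in the text


def county_capacity_factor(potentials, cell_cfs, distances_km,
                           loss_per_km=LOSS_PER_KM):
    """Potential-weighted, loss-derated capacity factor for one hour.

    potentials   : siteable capacity in each HRRR cell (MW)
    cell_cfs     : capacity factor of each cell for this hour (0..1)
    distances_km : distance from each cell to the county population centroid
    Returns None when the county has no siteable potential.
    """
    total_potential = sum(potentials)
    if total_potential == 0:
        return None
    delivered = sum(p * cf * (1.0 - loss_per_km * d)
                    for p, cf, d in zip(potentials, cell_cfs, distances_km))
    return delivered / total_potential


# Two equal-potential cells: one at the centroid, one 100 km away.
cf_hour = county_capacity_factor([90.0, 90.0], [0.4, 0.2], [0.0, 100.0])
```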

Siting data set
To ensure that the model does not site wind and solar generation in unsuitable locations, the HRRR cells that are unsuitable for wind and solar siting are removed from consideration using a method similar to that explained in the SI materials of MacDonald et al. 26 . The present paper expanded the exclusion zones to include data from the USGS land use (re-gridded to the HRRR 3-km grid cells).
The following filters were applied to remove cells from consideration for capacity installation:
1. Remove all sites that are not on appropriate land-use categories.
2. Remove all sites that have protected species.
3. Remove all protected lands, such as national parks, forests, etc.
4. Compute the slope, direction and soil type and determine their suitability for wind and solar installations.
5. Remove prohibited military and other government regions.
6. Avoid radar zones and shipping lanes.
7. Avoid migration pathways of birds and other species.
Of the HRRR cells that remain after the above filtering, the maximum potential density allowed was defined as 10 MW/km² and 5 MW/km² for utility-scale solar PV and onshore wind, respectively. These densities might not be reached in some locations because local aspects (such as slope and pre-existing wind or solar capacity) might prohibit build out.
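The density limits translate directly into a maximum potential per grid cell. The sketch below assumes a nominal 3 km by 3 km cell (9 km²); the `usable_fraction` argument is a hypothetical stand-in for the local factors mentioned above that can prevent a cell from reaching the full density.

```python
CELL_AREA_KM2 = 3.0 * 3.0                            # nominal HRRR cell area
MAX_DENSITY_MW_PER_KM2 = {"solar": 10.0, "wind": 5.0}  # limits stated in the text


def cell_potential_mw(technology, usable_fraction=1.0):
    """Upper bound on siteable capacity in one cell that passed the filters."""
    if not 0.0 <= usable_fraction <= 1.0:
        raise ValueError("usable_fraction must be between 0 and 1")
    return MAX_DENSITY_MW_PER_KM2[technology] * CELL_AREA_KM2 * usable_fraction


solar_mw = cell_potential_mw("solar")   # full-density solar cell
wind_mw = cell_potential_mw("wind")     # full-density wind cell
```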

Demand data set
For the present paper, we forced the model to match a steady, flat output. Thus, the solar/wind/ESS hybrid plant must supply a flat 1 MW "base load" output for every hour of the year. Because the model has perfect knowledge of the future (the data we give it to optimize over), we do not allow the plant to miss any hours even though that makes for a more expensive system overall 16 . Note that it is trivial to show that providing a baseload output guarantees that any load shape can be provided (albeit at a higher LCODE). During the analysis, we tested various load shapes, in particular ones representing the historical load profiles within each county; however, for clearer comparison to other baseload generation power plants, we chose to analyze only a flat, constant load. We note that matching county loads reduced the LCODE compared with the baseload version, which is consistent with the premise that demand profiles have some weather component driving their behavior.

Results and discussion
The following section displays a subset of the results of the analysis for the particular years considered in Table 1. For brevity, we only display results for the years 2020 and 2050 in the main body of this paper. All data and the intermediate maps (2025-2045) are available in the Appendix. Figure 1 shows how much optimal renewable baseload capacity is available in 2020 and 2050.

Figure 1: Optimal renewable baseload capacity in each county in 2020 (left) and 2050 (right). Grayed out counties indicate locations where it is infeasible to site renewables for various reasons.
The optimal renewable baseload capacity available in 2020 is 5,800 GW, with the ability to provide 50,808 TWh of electricity per year. Between 2020 and 2050, the available optimal (upper-bound) renewable baseload capacity increases to 6,800 GW (59,568 TWh), a 17% increase. The increase is due to the model choosing more energy storage relative to renewable generation capacity in 2050 as prices for energy storage systems decline. Because each unit of baseload output then requires less land for siting renewables (the land needed to site energy storage systems was considered negligible), more renewable baseload capacity could be deployed overall. For reference, in 2018, the total installed capacity of all types of power plants in the US was about 1,200 GW 10 (with 4,178 TWh of electricity produced), so the technical potential for renewables to provide for all US electricity needs exists many times over. In fact, 6,800 GW of baseload power would be almost ten times the coincident US electricity peak demand in 2019 27 .
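The capacity and energy figures quoted above are related by the number of hours in a year, since baseload capacity is by definition delivered in every hour. The following short check, with hypothetical function names, reproduces the arithmetic.

```python
HOURS_PER_YEAR = 8760


def annual_energy_twh(baseload_capacity_gw):
    """Energy delivered in one year by constant output (GW * h = GWh; /1000 = TWh)."""
    return baseload_capacity_gw * HOURS_PER_YEAR / 1000.0


energy_2020 = annual_energy_twh(5800)        # 50,808 TWh
energy_2050 = annual_energy_twh(6800)        # 59,568 TWh
growth = 6800 / 5800 - 1.0                   # about 0.17, i.e. a 17% increase
multiple_of_2018_output = energy_2050 / 4178  # 2050 potential vs. 2018 US generation
```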
It can be observed in Figure 1 that the majority of the optimal renewable baseload power available is concentrated in regions with relatively low populations. Thus, even though there is potential to cover all the US demand with these power plants, there will still be a need for coordination and transmission infrastructure.
As the cost of each component of the renewable baseload power plant falls, so does the overall cost to provide baseload power from renewables. Figure 2 shows the levelized cost of delivered electricity (LCODE) for the optimal renewable baseload generation in 2020 and 2050 (see Equation 7). These average LCODE values are also competitive with the projected future costs of other baseload technologies, such as coal, natural gas and nuclear; see the Appendix for more discussion.
It should be noted that our LCODE values are conservative in that we only consider the baseload energy delivered in the divisor of our calculation, i.e. we give no value to any energy that is curtailed in the process of providing baseload power. The model might choose to curtail energy if: 1) it does not deem it to be economically optimal to build more energy storage power capacity (bigger inverter and power electronics) to absorb that produced power; or 2) it does not deem it to be economically optimal to build more energy storage energy capacity (more battery cells) to hold the excess energy. With decreases in energy storage costs, the curtailment values dropped from a locational average (median) value of 50% (50%) in 2020 to 45% (45%) in 2050.
These curtailment numbers are high and indicate that if value could be derived from the excess energy production, such as to make hydrogen, ammonia, renewable natural gas, or some other dense energy carrier, the value proposition of renewable baseload generation "energy centers" could be even better than indicated here.
In most counties, the model deployed more solar capacity than wind capacity. In fact, the ratio of solar to wind capacity deployed increased from 11 to 15 from 2020 to 2050 (see Appendix for more discussion). Also, as part of the analysis, the model was limited in its ability to discharge the energy storage units and was required to keep the systems at a state of charge (SOC) of at least 5% (see Table 1). Thus, there are always reserves available in case of an unexpected shortfall in generation (e.g., forecast errors) or some other interruption (e.g., a solar eclipse). The average length of renewable baseload reserves in 2020 was about 2.1 hours (median 2.2 hours) and in 2050 it was about 2.5 hours (median 2.6 hours); see the Appendix for more details.
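One way to read the reserve figures is as the time the minimum stored energy could sustain the baseload output on its own. The sketch below rests on that assumption, which is not stated explicitly in the text, and it ignores discharge losses; the function name and the 42 MWh example are hypothetical and chosen only because they reproduce a value of about 2.1 hours.

```python
def reserve_hours(energy_capacity_mwh, min_soc_fraction=0.05, output_mw=1.0):
    """Hours of baseload output covered by the energy held at the SOC floor."""
    return min_soc_fraction * energy_capacity_mwh / output_mw


# A hypothetical 42 MWh store per MW of baseload output held at a 5% floor.
hours = reserve_hours(42.0)
```

Under this reading, longer reserves in 2050 follow directly from the model building more storage energy capacity per MW of baseload output as storage prices fall.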

Online tool development
For any given county in the contiguous US, the cost of providing renewable baseload generation declines as the costs of the respective components fall through time. However, given that the future costs and relative costs of each type of technology are uncertain, we developed an online tool that allows users to deviate from our input assumptions in Table 1.
The basis for the online tool is a regression model, unique to each county, that is fit to the 1,000 optimizations run over the range of relative costs shown in Table 2 and predicts the LCODE of the hybrid system in that county for the chosen technology price inputs. The regression models fit the data well, with the county average r-squared value being 96% (the minimum in a county being 82%). A screenshot of the baseload tool is shown in Figure 3, and the online tool can be found here: https://www.vibrantcleanenergy.com/products/wisdom-b/. Password will be removed when published.
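The per-county regression step can be sketched as an ordinary least-squares fit of LCODE against the three relative costs. The functional form of the regression is not given in the text, so the linear model below is an assumption, and the function names and synthetic data are hypothetical. The solver uses the normal equations with Gaussian elimination so that the example has no external dependencies.

```python
def fit_ols(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... via the normal equations."""
    rows = [[1.0] + list(x) for x in X]          # prepend intercept column
    n = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    c = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(n)]
    for col in range(n):                          # forward elimination
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        c[col], c[pivot] = c[pivot], c[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for k in range(col, n):
                A[r][k] -= factor * A[col][k]
            c[r] -= factor * c[col]
    beta = [0.0] * n
    for i in reversed(range(n)):                  # back substitution
        beta[i] = (c[i] - sum(A[i][k] * beta[k] for k in range(i + 1, n))) / A[i][i]
    return beta


def predict(beta, x):
    """Predicted LCODE for one set of relative costs."""
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))


def r_squared(beta, X, y):
    """Coefficient of determination of the fit."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - predict(beta, x)) ** 2 for x, yi in zip(X, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in for one county's optimization results (not real data).
levels = (0.5, 1.0, 1.5)
X = [(a, b, c) for a in levels for b in levels for c in levels]
y = [20.0 + 10.0 * a + 5.0 * b + 2.0 * c for a, b, c in X]
beta = fit_ols(X, y)
```

Once fitted, evaluating `predict` for user-supplied costs is far cheaper than rerunning an hourly three-year optimization, which is what makes an interactive map practical.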
Another, more sophisticated online tool, which shows the capacities as well as the LCODE but offers fewer options for changing cost inputs, can be found here: https://www.vibrantcleanenergy.com/products/wisdom-b-expanded-model/. Password will be removed when published. All of the open-source WIS:dom®-B code and data are also available on GitHub at: <URL>. Will be released when published.

Conclusions & Future Work
Climate change necessitates quickly moving societies to cleaner and more affordable sources of energy. In the electricity sector, some of those sources need to provide a controllable, steady, and reliable output, something that variable renewables cannot do alone. This could become even more important when considering possible electrification pathways to decarbonize the economy. Moreover, it is critical to have a metric by which all generation sources can be compared. The LCODE considered in the present paper facilitates comparison between local combined variable generation with storage (hybrid plants) and any other type of power plant matching the same demand.
A simplified picture is produced in the present paper, where the hybrid plants must match a constant, uninterrupted base load without fail for three years. The costs are then directly comparable to other baseload generators. We showed that with current costs, renewable baseload generation is more expensive than traditional baseload generators; however, within the next few decades it will become competitive. Further, we provide a free online tool (generated by performing millions of optimizations) in which users can adjust the costs of technologies to compare LCODEs for themselves.
The present analysis also showed that enough renewable baseload potential exists across the US to meet the current electricity demand ten times over. It further showed that the most abundant locations for renewable baseload power plants are remote (long on supply) and that substantial infrastructure would still be required to transport the electricity to the demand centers that are short on supply. Finally, the analysis produced optimal configurations with substantial quantities of curtailed (oversupplied or non-captured) electricity, which could be co-optimized with fuel production (hydrogen or ammonia, for example) and provide another income stream for these renewable energy centers.
We would like to extend the present study to: 1) run finer spatial and temporal resolution for more years' worth of data (880,000 sites with 105,120 time steps per year); 2) assess the impacts of climate change on the meteorologically-based renewable baseload potential; 3) add more regionally-specific technology cost multipliers to better capture local costs; and 4) develop methods to utilize curtailed energy (up to 87%) for hydrogen (or other fuels) production to help determine the economics of these sites for economy-wide decarbonization.