Don’t confuse New York State with New York City; the former subsumes the latter, though imperious residents of the latter will probably tell you the subsuming goes the other way. But either way, don’t confuse New York City’s open data site – to which we’ve travelled any number of times – with the State’s site, a novel stop-over for us, in which I scrolled my way into the euphonious
Statewide_200W_or_less_Residential_Non-Residential_Solar_Photovoltaic_Incentive_Program_Map_Beginning_2000 workbook, whose title balloons my word count by exactly one (blame the underscores).
All of which you can get here.
(You’ll note by the way that the site brings the data to our attention in the first instance in map form; nevertheless, a click of the blue Export button will repackage these into what the site calls a static spreadsheet.)
The workbook tracks installations of solar energy-saving equipment, or photovoltaic (PV) systems (detailed here, I think), but learning more about the data and the activity they detail will require some extra-worksheet due diligence. Click the unassumingly small right-pointing arrow beneath the data’s title on the open data site and you’ll bring out:
That legend is a mite clearer on the site itself, but only a mite. If you’re looking for more, and click the right-of-the-screen’s About link, this legend emerges:
That is, it’s the same content we saw in the first screen shot (trust me). I also had to look elsewhere to learn something about PV inverters, modules, and nameplates, elements that figure in data set fields, and which aren’t denoted in the workbook.
In any case, the data themselves look pretty good, a reliable virtue of US-based open data sets. Once you perform the customary column auto-fits you’ll probably want to rid the workbook of the undifferentiating State field in the D column, whose 32,670 rows contain 32,670 entries for NY. (I’d also loosen the wrapped text in the Location 1 field.) Once you’ve gotten that far the data should be in line for some – ahem – illuminating, if not electrifying, finds.
Start simply enough, for example, with a pivot-table breakout of installations by sector types:
Row Labels: Sector
Values: Sector (Count, befitting Sector’s text data)
Sector (again, by % of Column Total).
The numbers are clear, though I don’t know what ultimately distinguishes a Commercial from an Industrial site. In any case residential installations dominate the mix, not terribly surprisingly.
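For anyone following along outside a spreadsheet, the same Count and % of Column Total breakout can be sketched in a few lines of Python. The sector values below are hypothetical stand-ins, not the workbook's actual rows, and the function name is mine.

```python
from collections import Counter

# Hypothetical stand-in values; the real ones live in the workbook's Sector column.
sectors = ["Residential", "Residential", "Commercial", "Residential", "Industrial"]

def count_and_share(values):
    """Return {label: (count, percent of total)}, i.e. Count plus % of Column Total."""
    counts = Counter(values)
    total = sum(counts.values())
    return {label: (n, 100 * n / total) for label, n in counts.items()}

breakout = count_and_share(sectors)
for label, (n, pct) in sorted(breakout.items()):
    print(f"{label:<12} {n:>3} {pct:6.1f}%")
```

The same function would serve for the by-year counts that follow, given a list of installation years instead of sectors.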
Next, an enumeration of installations by year could inform us as well:
Row Labels: Date Install (grouped by Year)
Values: Date Install (Count)
Date Install (again by % of Column Total; and you won’t need Grand Totals here)
Remember that the 2015 data extend only through May 31 (effectively the 29th, the date of the last recorded installation) and thus augur a record year, one much in keeping with the definitive upward installation arc drawn across the data. On the other hand, one needs to key those 32,000 installations to a statewide population of nearly 20 million; some inter-state comparisons might be in point here.
We could also ask about the average waiting time separating an application for a system and its date of installation. Here we could easily enough poke our mouse into the next available column, head it Application Wait or something like it, format it in Number mode (sans decimals) and enter, in what should be T2,
=J2-I2+1
(If you’ve in fact deleted the D column and its endless NYs the above would read =I2-H2+1.) That 1 plays its part in quantifying an installation rolled out on the very day of its request, and there are (apparently) a few such models of promptitude in there. Absent the 1, these swiftly-appointed rounds would return a zero.
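The role of that 1 is perhaps easiest to see in a small sketch. The function name and the sample dates here are my own illustrations, not values drawn from the workbook; the only thing borrowed from the worksheet is the subtract-and-add-one arithmetic.

```python
from datetime import date

def application_wait(applied, installed):
    """Days from application to installation, counting both end dates.

    The +1 mirrors the worksheet formula, so a same-day installation
    counts as 1 rather than 0.
    """
    return (installed - applied).days + 1

# Hypothetical dates, chosen only to show the two cases.
same_day = application_wait(date(2015, 5, 29), date(2015, 5, 29))
one_week = application_wait(date(2015, 5, 1), date(2015, 5, 8))
print(same_day, one_week)
```

Whether to count inclusively is a judgment call; the point is simply that a same-day job registers as a wait of one rather than vanishing into a zero.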
This pivot table then awaits:
Row Labels: Date Install (again grouped by Year)
Values: Application Wait (Average, two decimals)
Application Wait (Count).
It seems clear that the one-day averages declared for all the installations effected in the program’s first five years serve a placeholder role, perhaps paving over absent or defective wait data. But note as well the ragged course of wait-time averages, strung below across a rudimentary pivot chart:
The spike in average waits in 2011 also dovetails with that year’s singular, backtracking installation totals. A story line, in there, perhaps? But note on the other hand the impressively speedy average for 2015 thus far, that for a year that foretells record installation demand.
And there’s at least one other interesting relationship to explore – the interaction between what the workbook calls Project Cost and $Incentive. I suspect – but cannot yet prove – that the installation savings signified by $Incentive sits atop the Project Cost denominator, so that an incentive of $1,190.00 on a Cost of $2,496.00 yields an effective installation charge of $1,306.00, or a 48% economy. The alternative understanding – whereby the $2,496.00 stands as the resultant of a savings of $1,190.00 applied to a $3,686.00 project – could be entertained, but I suspect the data mean to communicate the former.
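The two readings are easy to set side by side. The sketch below simply reruns the arithmetic from the paragraph above under each assumption; the function and key names are mine, and neither reading is confirmed by the data set's documentation.

```python
def incentive_readings(project_cost, incentive):
    """Compare the two possible meanings of Project Cost.

    Reading A: Project Cost is the gross price and the incentive comes off it.
    Reading B: Project Cost is already net of the incentive.
    """
    return {
        "A_net_charge": project_cost - incentive,
        "A_saving_share": incentive / project_cost,
        "B_gross_cost": project_cost + incentive,
        "B_saving_share": incentive / (project_cost + incentive),
    }

readings = incentive_readings(2496.00, 1190.00)
for name, value in readings.items():
    print(name, round(value, 4))
```

Under reading A the saving comes to about 48% of the price; under reading B it would be closer to 32%, so the choice matters for everything computed downstream.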
Assuming for the next five minutes I’m correct, we could move to break out average incentive savings by year, once we do some thinking about the Cost numbers. If you sort Project Cost by Smallest to Largest the numbers bottom out in row 32,497, after which the next 174 rows set down no cost at all. And the cost figure in 32,497 – $9,532,080,102.00, for an installation in the town of Calverton in Suffolk County on Long Island – can’t be right, can it? Its speck-like $4,144.90 incentive doesn’t remotely cohere with that monumental outlay, and because it doesn’t I’m imposing a blank row atop 32,497, thus beating back that big number along with all the missing project cost records.
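Outside a spreadsheet, the blank-row trick amounts to filtering out records with a missing or implausible cost. Here is a minimal sketch of that idea; the ceiling is a hypothetical number I picked for illustration, not anything the data set prescribes, and only the Calverton figures and the $2,496/$1,190 pair come from the text.

```python
# Hypothetical ceiling for a believable project cost; an assumption, not a rule from the data.
MAX_PLAUSIBLE_COST = 50_000_000

rows = [
    {"cost": 2496.00, "incentive": 1190.00},
    {"cost": 9_532_080_102.00, "incentive": 4144.90},  # the Calverton record
    {"cost": None, "incentive": 500.00},               # made-up record with no cost
]

def usable(row):
    """Keep a row only if its cost is present, positive and under the ceiling."""
    cost = row["cost"]
    return cost is not None and 0 < cost <= MAX_PLAUSIBLE_COST

kept = [row for row in rows if usable(row)]
print(len(kept), "of", len(rows), "rows kept")
```

An explicit rule of this kind has the small advantage of being repeatable when the site refreshes the data, though eyeballing the sorted column, as above, is what actually surfaced the outlier.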
Now we can pivot table the remaindered 32,495 data-bearing rows:
Row Labels: Date Install (again, grouped by Year)
Values: Project Cost (Sum)
And in the interests of dividing summed Incentives by summed Project Costs to realize percentages, I’d piece together a simple calculated field:
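A pivot table calculated field works on the summed fields, so the percentage for each year is summed incentives over summed costs, not an average of row-by-row percentages. A rough Python equivalent follows, with hypothetical records standing in for the workbook's rows and a function name of my own.

```python
from collections import defaultdict

# Hypothetical (install year, project cost, incentive) records.
records = [
    (2003, 20000.00, 10000.00),
    (2003, 30000.00, 12000.00),
    (2014, 25000.00, 5000.00),
    (2014, 15000.00, 3000.00),
]

def incentive_share_by_year(rows):
    """Return {year: summed incentive / summed project cost}."""
    cost_sum = defaultdict(float)
    incentive_sum = defaultdict(float)
    for year, cost, incentive in rows:
        cost_sum[year] += cost
        incentive_sum[year] += incentive
    return {year: incentive_sum[year] / cost_sum[year] for year in sorted(cost_sum)}

shares = incentive_share_by_year(records)
for year, share in shares.items():
    print(year, f"{share:.1%}")
```

Dividing sums weights each installation by its cost, which seems the fairer reading of a program-wide incentive percentage.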
I get, once all our numbers get the formats they deserve:
(Remember that you can retitle any and all the field headers.)
Now that, I would submit, is interesting. From 2002, when the numbers start getting big, incentive percentages slip every single year, and the slippage is big, too. Have these yearly contractions been itemized in some New York State agenda for the PV program, or is some other, less planful market vector tightening the screw?
I don’t know, but hey, story-seekers: are – ahem – any light bulbs going off?