Mystery Solved, I Think

3 Oct

Mystification has its place. As an antidotal corrective to haughtiness and unbecoming outbreaks of self-confidence, an anomic stumble or two through the data can work wonders for the reality checker within.

Beware, my brother’s philosophy professor warned, of people who are sure of themselves; and sureness gets taken for a ride through the Beds Open Overnight workbook that scratched at my head in the previous post and across the last couple of weeks.

The problems, again, are those intermittently cooperative pivot tables we had put to the task of grouping various worksheet data, some reporting the data in fastidiously rounded-off integers, and some in tranches spewing rancid jets of decimals, like this:

 group1

These several outcomes appeared to perhaps have be geared to respective formats, with the excesses above perpetrated by the garden-variety Accounting mode, and the prim, decimal-less groupings tonsured by some exotic custom format, e.g.

 group2

But, if I may anticipate your question, when I painted the custom format atop the numbers suited in Accounting, the decimals yet refused to go away; nothing changed. And that’s mystifying.

In search of enlightenment I retold the mystery to Michelle Tenwick, the contact person making herself thoughtfully available right there in the NHS Trust by Sector sheet. She replied, in part:

“Here the variations seen in the formatting of the data fields are as mentioned so that they are consistent with previous publications. Most importantly, the figures remain the same regardless of whether they are custom or accountancy format.”

That’s all true, but not quite explanatory, and indeed – why would the sheets have been troubled to assume different numeric formats to begin with?

I then put the matter before Debra Dalgleish, a bona fide pivot table expert who’s authored three books on the feature. She offered that because some of the data cells reveal hyphens in lieu of numbers – apparently standing in for 0, that would cause Excel to treat the entire field in Text terms, thus resisting any efforts at rounding. Perhaps, but when I drew up a demo data set comprising nothing but decimal-freighted numbers, the rounding off would not happen.

I was thus left to speculate that the operative formattings on the sheets – be they Custom or Accounting – have nothing to do with the problem. My suspicion – though remember, I’m officially mystified – was that the data themselves had been put through some sort of ringer, perhaps via a download from some other, presently unknown source or sources (note Tenwick’s note about consistency with past publications). I should add that my past experience with the grouping/rounding issue has instructed me that decimal-retention is the default, and so it’s the data here that are amenable to the rounding that pose the question.

But then I had an idea, one which returns me, via the back door, to Debra Dalgleish’s thought about the hyphenated entries, viewed this time on its head. Remember that in fact the hyphens aren’t text – they’re numeric zeros, formatted as an Accountant might (keep the jokes to yourself). It seems, after having subjected the data to a bit of controlled experimentation, that as long as the data has at least one zero, however it’s formatted, the grouping will consent to a rounding. It may be that the intervallic intelligence of the grouping feature requires the grounding of a zero somewhere in the data, thus establishing a floor atop which the intervals can be built.

 

But whatever the reasoning, a data set seems to call for at least one zero, at least if you like your grouped numbers all smoothed and polished and rid of those scruffy decimals. But what if your data have no zero? You could throw in a dummy, zero-valued record, but that intention would have to be thought through, particularly if you’re counting and/or averaging the data.

And so if the data are zero-less you may also alternatively want, or need, to round the numbers in a user-introduced column in the data set with the ROUND, or FLOOR, or CEILING functions, a matter I’ve already broached here:

http://datadrivenjournalism.net/resources/groupthink_grouping_values_in_excel_pivot_tables

And that link points to wider pivot table strategy, an open secret of the enterprise: the trick is to do as much preparatory work in the data set before it washes up onto the pivot table. For all its aggregating might, pivot tables won’t do everything, and a little data-organizational proactivity can help the table along.

And that’s because the pivot table Group Selection command is zero-tolerant.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: