Search results for 'oecd'

Time (and Record) Management: OECD Data

25 Jul

The OECD clearly has a lot of time on its hands – after all, think how long it takes to say The Organisation for Economic Co-operation and Development. But time likewise figures in its research remit, here keying a survey of the time management habits of denizens of its member nations (at least most of them, along with representations from China, India, and South Africa, designated OECD partners), and recapitulated here:



The data – and it’s the Total sheet I’m reviewing – spread out some neat cross-national takes on how respondents apportion their daily round, normalized for each nation’s activities to the day’s 1440 minutes (and I assume normalization means reconciling the inter-national variance in activity definitions; see the Activity category tab). You’ll be interested to know, for example, that South Africans appear to be able to devote 562 minutes a day or night – nine hours and 22 minutes – to sleeping, and that while earmarking but 463 minutes to that circadian necessity we’re left to conclude that the Japanese and Norwegians seem to be drinking the most coffee. It is also noteworthy that respondents from Turkey – the only preponderantly Muslim country in the national roster – spend the most time in religious activity (row 32), at 19 minutes (South Africans come in one minute behind). It’s all interesting, but I can’t resist the geekish aside to the effect that the row 32 numbers are variously positioned – with some middle-aligned, (along the vertical axis), and others right (and you might very well also want to pump up the 8-point text prevailing throughout).

But at the same time the numbers urge a familiar question upon the plenary: For whom is the spreadsheet intended – a public of passive, if interested, readers, or that A-Team of caffeinated deep thinkers who want nothing better than to do something with the data? If you’re siding with the latter insurgents, and want to act upon what it is you’re viewing here with your canteen of slicers (and Slicers) and dicers, then the data – and we’ve seen this before – have to be reshaped. That’s a question we’ve asked in so many words before, and it requires asking here as well.

Let’s see. For starters, we need to do something about row 3, harboring as it does survey date identifiers impersonating as a row of data. And a similar reprisal need be practiced upon those purple subtotal rows, which should be deleted. Leaving them in place and proceeding to drop the data as they stand into a pivot table will do nothing less than double any summed results, and yes, we’ve seen this before (e.g., my January 24, 2013 post).

On the other hand stripping those purple rows will also deprive the sheet of their associated headings, e.g., Paid Work or Study, Unpaid Work. But again, rows of data should be of a piece; insinuating rows of titles into the average daily minutes is tantamount to playing checkers, when in fact we’re playing chess. Moreover, the Men & Wom field in the A column sufficiently identifies the activities subject to the minutes-per-day estimates. But there’s more to be said about this.

But before we expound the larger issue betokened above, swing over to column AC and its swath of computations averaging the numbers lining up to their immediate left. Try cell AC4:


And try explaining it. Note all the references in the expression are alphabetically contiguous; and given its cellular continuity how do we defend the formulation above, when we have


at our disposal? I can’t answer my own question.

And indeed – those averages ignore the three partner nations on the other side of AC, thus plunking the wrong kind of data amid the time-estimate numbers. And the blank AD column has to go.

But there’s that larger issue, reprising an earlier question, the one I asked and attempted to resolve in my August 22, 2013 post. The Total sheet confers field status upon tufts of data that should more rightly conceive of themselves as items relenting to a larger, governing field. Thus, for example, the field-defined member countries in the sheet should be made to submit to the controlling aegis of a Country field. It seems to me, then, that a reconstructed data set would wheel out records looking something like this:


Again, umbrella fields on the order of Gender or Country are far mightier enablers of the kinds of grouping and ordering feats that pivot tables perform, and well justify the necessary surcharge the additional data entry would exact (again, turn back to August 22).

For example, the data reforms I’m (again) proposing would free a pivot tabler to rank national time outlays by any activity, by grabbing the Rank Largest to Smallest marble from the Show Values As bag of alternatives. The point is that Rank Largest to Smallest can do what it does only among items planted in the same field – and not between data settled in different ones.

So if you’re wearing the A-Team colors, why not reorganize the data here and shop it to the OECD – if you have the time?

Putting OEDC Asylum Data on the Table

5 Dec

Shopping for spreadsheets? If you’re in the neighborhood, poke through the inventory at the Organization for Economic Co-ordination and Development, acronymized OECD, you’ll be happy to know. We’ve been here before, on July 25 to be exact;  but I told you it’s a large inventory.

In fact there appear to be several points of ingress to the holdings, but here’s one

a self-described Public Beta page that enables you to “Find core datasets in the data warehouse and make your own tables and charts in OECD Stat Extracts”, a triflingly ambiguous inducement, appearing as it does to conflate the term “datasets” and the “Database” informing the title atop the quoted passage, but again, that’s a trifle.  More to the point is the availability of the data in either unmediated spreadsheet form, or downloadable as such via an Export button.

In any case one such set, a brief of statistics on asylum seekers to OECD circa 2013, seized the attention of the Guardian’s datablog (note that site’s redesign, and the disappearance of its erstwhile Data link from its home page), and is the one I think I’ve managed to capture here -even as the OECD heads its source page Stacks and flows of immigrants, 2002-2012 and so seems to have made off with a year (I’ve also retitled the workbook, overwriting its long, number-beset native name):

Inflows of Asylum seekers in OECD members and the Russian Federation

A first view of the sheet exposes it in Page Break Preview, a perspective on the data that won’t particularly advantage the analyst, and so you’ll want to make your way back to the standard Normal View. Moreover, the recourse toward the Freeze Panes selection at the columnar intersection of A and B seems puzzlingly needless, as the sheet bares all its numbers and country names immediately. I’d unfreeze the panes right away, even as I continue to wonder about the dictionary form which Microsoft mines its verbs. What would your teacher say about unfreeze? In addition, the missing field header in column A needs to be supplied.

Note, on the other hand, the proper distancing of both the sheet’s title and its yearly grand totals from the body of the data with the expedient of choice, the blank rows. But at the same time recall the OECD’s invitation to its visitors to “Find core datasets…and make your own charts and tables…”.  Again, our sheet assigns its yearly data to one field per year, when in fact all the years and their associated numbers would more profitably be reined into greater, subsuming fields. Not this, then:



Butt again, this:


(For a straightforward review of the means toward the above transformation over and above what I’ve said about it, look here.)

The companion question, then, is what plans the OECD have for the data once they hand them over to us. True, the workbook’s as-it-stands configuration does happily supports a round of charting, but less certain are the data’s congeniality to making “your own tables”. Once, however, the information fills the pared-parameter shape assumed by the second of the above screen shots, the prospects for this kind of pivot table loom a lot larger:

Row Labels:  Country

Column Labels: Year

Values: Values

Now this facile breakout of the asylum numbers by year is ready to happen, because all the years share the same Year field as an item. Try the above pivot table that when the each and every year occupies a field to itself.

But hold on. When the pontificating is done, my self-assured amelioration cooks up nothing but precisely the worksheet that already presents itself at the point of download:


Same numbers, same juxtaposition of fields. So what was my point?

That’s a sweat-inducingly good question. But once all those years and their concomitant data recuse themselves from fields into items you can also treat the values to the % of Row Total styling, and learn, for example, that 30.88% of Turkey’s asylum seekers during the 2002-13 interval made their way to that country in 2013 alone, a clear attestation of the Syrian conflict’s scattering effect upon its populace. Less obvious is Hungary’s 2013 percentage of 37.22 of all the seekers it received across the 12 years, a skew I can’t yet explain.

Now consider this OECD book, counting member populations by their residents’ country of origin:

OEDC Immigration data 2011

Compare that far larger dataset’s field structure with that of the one above, e.g.


That’s really more like it, even if the data themselves could stand some clarification (I’d bang out a lookup table that would return country names in full, and the data are available on the OECD site in a PDF for doing so. Is AUS, after all, Austria or Australia?).

And these variously-structured workbooks drag one ore question into the fray: why should some OEDC workbooks be stitched along this, or that, pattern of design – fields as items, or items as fields? I don’t know, that answer; nor do I know if the OEDC has endeavored to think about it.

But I’ll be happy to think about it, folks.  Here’ – I’ve attached my business card.