My goodness, there’s been a blog drought! But never fear, I have been hatching some ideas. They may not be to everyone’s liking. They are not what I would consider controversial, but some might want to stop reading now because I’m going to talk about numbers – statistics, data, you know…
What got me thinking about this particularly riveting topic was, of course, COVID. I’m wondering if our constant stream of COVID statistics is having broader effects. Are we becoming more used to dealing with data, given that last year saw record numbers of people hanging out each morning for the case numbers etc. to be released? We don’t do that for CPI or census data, do we?
One particular item caught my attention, but once I started thinking about it lots more examples popped up. My initial question arose around the use of two little words – ‘with’ and ‘because’. The numbers reported were for people in hospital with COVID and people in hospital because of COVID.
Stay with me here. These are two quite different ideas and the numbers are useful and informative, but for quite different reasons. The number ‘with’ is really important for hospitals – to know how many people they are treating and isolating for the disease, for starters. But it really doesn’t tell those outside the hospital system particularly much. The number ‘because of’ is important as a general indicator of public health and the public health response. This tells us how severe symptoms are and how people are faring generally with COVID. It might sometimes be quite difficult for those collecting the data to know which option to choose for a given patient, but I imagine most of the time there is good information on why the patient first presented at the hospital. Over time, of course, it is quite possible for patients to move from one group to the other. Generally this migration should be small, but maybe not?
See, one small change to the wording of the data descriptor and I’m off questioning the data.¹
When, as a young undergraduate, I was required to endure Statistics 101, our very patient lecturer tried very hard to make the subject speak to a group of mostly disengaged late teens. One particular example has stayed with me – road crash numbers. No matter how hard he tried to find a way for the statistic to show otherwise, young novice drivers were always more likely to have more, and more serious, road crashes. That is unless, of course, they were females, and then the blokes outshone them by a long mark. Mind you, this was back in the dinosaur era so things may have changed since then.
But I think what he was banking on was that once we become personally engaged with the data we start to take an interest in it and question, query and dig deeper – sometimes. I hope this is true – at least for some of us. Curiosity about data and what it means, thinking about what the numbers purport to show and what they might actually show, are really important to understanding so much about what’s going on around us.²
Presenting this data in a readily digestible format also improves its accessibility and its reach. Cheers to the fabulous Casey Briggs (ABC TV for those of you living outside Australia – or under a rock if inside Australia) for his skilful use of modern technology coupled with a strong grasp of how to explain numbers. Second prize goes to NSW for its recently updated COVID daily dashboard, which makes the numbers easier to see. And the people’s choice award goes to the young kids in Victoria who showed everyone else how good they were at collating the data and presenting it in a readily accessible form during last year’s never-ending lockdowns.
Quite a few COVID commentators have raised issues about what numbers are reported publicly and why some are apparently not collected. There are always numbers not collected, for all sorts of reasons, and for some of us there’s never enough data, but lots of these queries are valid. Why, for example, was Victoria alone in reporting the breakdown of ‘with’ and ‘because’ for hospital admissions? I have no idea what the other states’ numbers mean, and given that Victoria now just publishes ‘hospitalisation’ I am left wondering what it means in Victoria too – my guess is it’s people in hospital with COVID. Maybe I blinked and missed the commentary around the explanation for dropping the dual reporting.
NSW publishes weekly data on hospitalisations broken down by vaccination status. Seriously useful data which I would presume other jurisdictions collect, but why don’t they publish it?
What our fixation on these numbers may have provided is an introduction to statistics my erstwhile lecturer could only dream of. So while I don’t expect a sudden interest in every ABS release or an explosion in graph drawing, maybe we can expect a slightly more informed understanding of what is presented to us in the form of statistics.
In the Guardian on 30 January 2022, there was a very interesting explanation of how to interpret the data on the vaccination status of people in hospital with COVID. I won’t try to reproduce it here because they had some very nifty graphics, but here’s a link – I hope it still works. If it doesn’t, I’ve included a less professional looking link at the end.
It shows the full context of who’s in hospital and the relative size of each cohort – relative being the most important factor here. The proportion of fully vaccinated people who end up in hospital is way smaller than the proportion of non-vaccinated people who do. This is even though there are more vaccinated people in hospital. Very simply, a small proportion of a large number (those vaccinated in this case) can easily be bigger than a larger proportion of a small number (those not vaccinated). It’s a very neat rebuttal of anti-vax claims that because more vaccinated people are in hospital, vaccines don’t work.
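For those who like to see the arithmetic, here is a minimal sketch of the "small proportion of a large number" point in Python. Every number in it is made up purely for illustration: the population sizes and hospitalisation rates are my assumptions, not figures from the Guardian piece or any health department.

```python
# Illustrative, made-up numbers only: NOT real COVID data.
vaccinated = 950_000      # assumed size of the vaccinated cohort
unvaccinated = 50_000     # assumed size of the unvaccinated cohort

# Assumed hospitalisation rates, expressed per 100,000 people in each cohort.
rate_vaccinated_per_100k = 100       # 0.1% of vaccinated people
rate_unvaccinated_per_100k = 1_000   # 1% of unvaccinated people

# Head counts in hospital: cohort size multiplied by that cohort's rate.
in_hospital_vaccinated = vaccinated * rate_vaccinated_per_100k // 100_000
in_hospital_unvaccinated = unvaccinated * rate_unvaccinated_per_100k // 100_000

# How many times more likely an unvaccinated person is to be in hospital.
relative_risk = rate_unvaccinated_per_100k / rate_vaccinated_per_100k

print(f"Vaccinated people in hospital:   {in_hospital_vaccinated}")
print(f"Unvaccinated people in hospital: {in_hospital_unvaccinated}")
print(f"Unvaccinated rate is {relative_risk:.0f} times the vaccinated rate")
```

With these invented inputs there are more vaccinated people in hospital in absolute terms, yet the unvaccinated rate is ten times higher. Change the cohort sizes and the head counts swing around while the rates stay put, which is why the rate, not the raw count, is the number to compare.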
I guess my message is, we all need to think about what we are served up as reliable data. It should go without saying that you should use reliable sources. But beyond that, think hard about what numbers have actually been collected. Ask yourself: do I really understand what they represent, or am I assuming this?
Then think hard about how to use the numbers. And maybe you’ll find yourself hankering for those labour market statistics and the regular GDP figures will be fascinating – or not. But maybe political polling is your thing – same applies here – always find out who they surveyed, what they asked and how the survey was conducted.
And of course, always be very sceptical about things you read online….
Footnotes
1 This particular super power is probably also connected to my other more well-known super power of breaking forms by interrogation.
2 Or have I just listened to one too many More or Less podcasts? For the curious, More or Less is a relatable, if slightly nerdy, podcast that ‘likes to be at the sharp end of statistics’. I highly recommend it for some interesting number dives.
Thanks Geraldine,
I agree with you that covid is making us love statistics and all the different ways the data can be presented and interpreted – see for example
https://covid-19-au.com/ https://coronavirus.data.gov.uk/ https://covid.cdc.gov/covid-data-tracker/#datatracker-home
Vin Martin uses footy data to get young ones excited about stats … (?)
Cheers, Kerry