Thursday, November 8, 2012

Type I and Type II errors

A recent news item (see here for the BBC report) covered research into the UK breast cancer screening programme.  The problem with screening is that it is not perfect: it can miss some genuine tumours, but it can also 'over-diagnose', i.e. signal a tumour which is actually harmless.

These should be familiar as Type I and Type II errors.  The null hypothesis is the absence of a tumour, so a Type I error is incorrectly diagnosing a tumour when there isn't one.  A Type II error is missing a genuine tumour.  (One could equally look at this the other way round, with the null being the presence of a tumour, and so on.)

According to the report, for every life saved, three women had unnecessary treatment for cancer.  This seems quite a high ratio, but it partly reflects the fact that the incidence of cancer is actually quite low.  The probability of a Type I error is given as 1% in the article.  This would be consistent with something like the following: for every thousand women tested, 10 are incorrectly diagnosed and treated, while three are correctly diagnosed and treated (hence approximately three times as many false positives as genuine cases).
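A quick back-of-the-envelope version of this arithmetic, in Python (the cohort of 1,000 and the three genuine cases are illustrative numbers rather than figures from the study):

```python
n_screened = 1_000       # illustrative cohort of women screened
type_i_rate = 0.01       # probability of a false positive, as given in the article
genuine_cases = 3        # illustrative number of correctly diagnosed tumours

false_positives = type_i_rate * n_screened    # 10 women treated unnecessarily
print(false_positives / genuine_cases)        # roughly 3 false positives per genuine case
```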

As well as the probabilities, the costs of the errors should also be taken into account.  The cost of missing a diagnosis is apparent to us, which is why there is a national system of screening.  The costs of over-diagnosis are less obvious but can be substantial.  The treatment is unpleasant, to say the least.  The costs of over-diagnosis might also be masked because it is concluded that the treatment has worked, rather than that there never was a cancer.

Election polls and odds

The recent US election provides some interesting opportunities to look at the opinion polls.  One of the most accurate predictions turned out to be that of Nate Silver of the New York Times, who gathered up all the opinion poll data and turned it into a prediction of victory.

Many journalists and opinion-formers in the US were saying the election was 'too close to call' even on the eve of the election itself.  But this seems to confuse two quite different possibilities:

1. Strong evidence of a narrow win for Obama
2. No evidence of a strong win for either side

Many commentators went with 2 above, but 1 is the correct interpretation.  Let's see how this works.

Silver gives evidence for Colorado, one of the 'tipping point states' that could be decisive in the election.  Based on the various polls, Silver projected vote shares of 50.8% for Obama and 48.3% for Romney.  On this basis it looks fairly close, and this is probably how the commentators viewed it (especially as the polls carry a margin of error of about ±3 percentage points).  However, Silver also gives the projected probabilities of winning, which are 80% Obama, 20% Romney.  This looks much more decisive.  How do we get from the poll figures to odds of 80:20?

If we take the margin of error as representing two standard errors, as is usual, then the standard error is 1.5 percentage points.  Disregarding the 0.9% of voters not supporting either candidate, we have p = 50.8/99.1 = 51.26% for Obama and hence 48.74% for Romney.

We then ask: how likely is it that Obama's true share of the vote is less than 50%?  This is a question about a sample proportion, so we calculate the z-score as:

z = (51.26 - 50)/1.5 = 0.84

This cuts off 20.05% in the upper tail of the standard normal distribution.  In other words, there is about a 20% chance of getting such evidence (a sample proportion of 51.26% or more) if Obama's true vote share is 50% or less.  Hence there is a 20% chance of a Romney victory, 80% for Obama.
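For anyone who wants to check the arithmetic, here is a minimal sketch in Python.  The function name win_probability is my own invention, and the calculation is just the simple normal approximation described above, not Silver's actual model:

```python
from statistics import NormalDist

def win_probability(share_a, share_b, margin_of_error):
    """Approximate probability that candidate A's true two-party share exceeds 50%.

    share_a, share_b: projected vote shares in percent (need not sum to 100).
    margin_of_error: the poll's margin of error in percentage points,
        taken to represent two standard errors.
    """
    p_a = 100 * share_a / (share_a + share_b)   # discard undecided voters
    se = margin_of_error / 2                    # one standard error, in percentage points
    z = (p_a - 50) / se                         # how far the projected share sits above 50%
    return NormalDist().cdf(z)                  # P(true share is above 50%)

# Colorado: Obama 50.8, Romney 48.3, margin of error +/-3 points
print(round(win_probability(50.8, 48.3, 3.0), 2))   # about 0.80
```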

This is my own take on the evidence; Silver's procedure is probably more sophisticated, but our approximation seems to work.  (You could try it out on other states to see if you too can replicate it.  Here's Virginia, another tipping point state: Obama 50.7, Romney 48.7, margin of error ±2.5.  Silver's odds for this are 79:21 for Obama.)
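Plugging the Virginia figures into the same hypothetical win_probability sketch gives a number very close to Silver's:

```python
# Virginia: Obama 50.7, Romney 48.7, margin of error +/-2.5 points
print(round(win_probability(50.7, 48.7, 2.5), 2))   # about 0.79
```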

Note also that ±3 percentage points is a typical margin of error for polls.  Recall that the standard error of a proportion is the square root of p(1-p)/n.  If p is approximately 50% and n is about 1,000 (a typical poll size), the formula gives a standard error of about 1.58 percentage points.  Doubling this gives the usual margin of error of roughly 3 points.
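The same calculation written out (using the round figures above, not any particular poll):

```python
from math import sqrt

p, n = 0.5, 1000                # vote share of about 50%, poll of about 1,000 people
se = sqrt(p * (1 - p) / n)      # standard error of a proportion
print(round(100 * se, 2))       # about 1.58 percentage points
print(round(200 * se, 1))       # doubling gives a margin of error of roughly 3 points
```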

Thursday, July 12, 2012

Seasonal adjustment can be dangerous

Here's an interesting post from Paul Krugman talking about the implications of smoothing a data series, in this case GDP.  Although it's about a technique called the Hodrick-Prescott (H-P) filter, this is just a fancy means of smoothing, as covered in chapter 11 of the book, on seasonal adjustment.

Essentially, Krugman's argument is that economists have smoothed the GDP data and then called this 'potential output'.  Since actual GDP is not far off the smoothed value (in 2012, as I write), some interpret this to mean there is not much of a recession.  Hence little need for active fiscal or monetary policy to address the problem.

However, the smoothed series inevitably follows the actual series (even though it changes more slowly) and so can never stray too far from it.  The implicit assumption is that when actual output falls, so does potential output, which is generally not warranted.  The H-P filter only filters out short-term fluctuations and does not deal well with large, persistent changes such as a recession.
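To see the point concretely, here is a minimal sketch using the Hodrick-Prescott filter from statsmodels.  The 'GDP' series is simulated with a deliberate recession-style drop; it illustrates the mechanism rather than using real data:

```python
import numpy as np
from statsmodels.tsa.filters.hp_filter import hpfilter

# Simulated quarterly log-GDP: steady growth, then a permanent 6% fall in the level
log_gdp = np.cumsum(np.full(80, 0.005))
log_gdp[40:] -= 0.06

# lamb=1600 is the conventional smoothing parameter for quarterly data
cycle, hp_trend = hpfilter(log_gdp, lamb=1600)

# Because the fitted trend bends down after the drop, the measured 'output gap'
# (actual minus trend) at the end of the sample is far smaller than the 6% fall in actual output.
print(round(100 * (log_gdp[-1] - hp_trend[-1]), 2))
```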

Who would have thought that such a technical issue could have such a powerful effect upon the debate around economic policy?

Thursday, June 28, 2012

Economists are over-confident

This is an interesting story about economists' use of statistics, arguing that we are generally overconfident about our results, in particular about forecasts.  It is good to see some empirical evidence about this.  The authors found that readers tend to focus on the point estimate and not give enough attention to the standard error and confidence interval.  Interestingly, when the results were supplemented with graphs there was no improvement in understanding of the numbers and their implications.  However, when only graphs were shown, the interpretations were better.

Take a look at the original paper (follow the link to the earlier version, which is freely available) to learn more.

This is food for thought when considering how you might present statistical results.

Monday, April 16, 2012

How an anecdote becomes a statistic (based on 'research')


Everyone has heard about those astonishing facts whose source proves tricky to uncover.  “We only use 10% of our brain” is one example, usually followed by some encouragement to undertake a course in new ways of thinking.  There is no scientific basis for this supposed fact.

The same is true of statistics, when numbers are touted around supposedly supporting an argument (or perhaps prejudice) but whose origin proves elusive.  Here, for once, is a nice example where we can trace the story back to its source, and discover how tenuous it is. 

The Daily Telegraph reported on October 31st 2011 (see here) that there are more Porsches in Greece than there are taxpayers declaring an income above 50,000 Euros per annum, “according to research by Professor Herakles Polemarchakis, former head of the Greek prime minister’s economic department”.   This suggests support for the assertion that tax evasion is widespread in Greece, that therefore the fiscal crisis is their own fault and hence that austerity measures can be justified.

The story is interesting for several reasons:
1. A simple anecdote is preferred to statistical evidence,
2. For once, there is a reported source of the story, and
3. There is some confusion about whether the story relates to all Porsches or the Cayenne model in particular (different news agencies have different versions).

The BBC followed up this story on 16th April 2012 (so the story had been going unchallenged for at least six months) with some fact-checking; see here for details.  First they found that there were 311,428 people with incomes above 50,000 Euros in 2010.  Then they asked Porsche how many Cayennes they had sold in Greece.  Their spokesman (after he finished laughing) said they had sold around 1,500 over the previous nine years.

So what of Professor Polemarchakis’ research?  When asked, he replied that his remark was casual, based on what had been circulating in policy circles in Greece a few years back.  He said the only hard fact he was aware of was "the per capita number of Cayennes in [the Greek city of] Larissa was twice that of Cayennes in the OECD countries".

So it is that a casual observation becomes an anecdote, which becomes evidence based on 'research'.  However, it is one of those stories where one should immediately be suspicious of the “fact”.  A quick Google check shows Porsche’s European sales in 2011 were 42,084, so Greece is unlikely to take more than (at a guess) 10% of them, around 4,000 vehicles a year.  To have more than 300,000 Porsches in the country (never mind counting only the Cayenne model) they must have been amassing them for roughly 75 years…
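The rough arithmetic behind that last step, written out (the 10% share of European sales is the post's own guess, not a reported figure):

```python
porsche_european_sales_2011 = 42_084   # Porsche's European sales in 2011, from a quick search
greek_share_guess = 0.10               # guessed upper bound on Greece's share of those sales
taxpayers_over_50k = 311_428           # Greeks declaring income above 50,000 Euros in 2010

greek_sales_per_year = porsche_european_sales_2011 * greek_share_guess   # roughly 4,200 cars a year
print(round(taxpayers_over_50k / greek_sales_per_year))                  # about 74 years' worth of sales
```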