Thursday, September 17, 2015

An estimate based on... zero evidence?

News agencies today report that "RAF strikes since 2014 have killed around 330 IS fighters".  The UK defence secretary said the figure was "highly approximate" since there were no UK ground troops there to confirm casualties.

He also said ministers did not believe the action had caused civilian casualties.

It looks like both these claims are highly speculative and based on very little information.  No source is given for the claim about civilian casualties.  To be fair to the minister, he was answering a Parliamentary question so had to say something, but what he did say should probably be heavily discounted.

Stories such as this (not just about warfare) appear all the time in the press.  It is always useful to ask yourself 'what is the evidence for this?'.  If little or none is given then one should not take the claims too seriously (even if the issue itself is serious).

Wednesday, September 2, 2015

Material for the new edition - how to improve a graph

Here is something that I am intending to put in the new edition, showing how one can improve a graph to make it more readable and impart a message more clearly.  This will be in the form of a 'boxout', separated off from the main explanatory text.  When learning about any subject, it is useful to see examples of bad presentation as well as good.  Do you think this is a useful example from which you can learn something?  Add your comment below.

Improving the presentation of graphs - an example


Today we are assailed with information presented in the form of graphs, sometimes done well but often badly.  We give an example below of how presentation might be improved for one particular graph, showing employers’ perceptions of Economics graduates’ skills.  One can learn a lot from looking at examples of graphs in reports and academic papers and thinking how they might be improved. The original graph is not actually a bad one but it could be better.






Problems with this picture include:

1. The category labels are difficult to read because the text is small and wraps across several lines.
2. The vertical axis title is sideways, so it is difficult to read.
3. It is difficult to compare across categories.  For example, which skill has the most ‘very high’ or ‘fairly high’ responses?
4. This is a subjective judgement, but the colours are not particularly harmonious.

The version below takes the same data but presents it slightly differently:



Turning the graph on its side means that the labels are much easier to read, as is the horizontal axis label.  Making it a stacked bar chart saves space and makes it look less cluttered.  It is fairly easy to see that ‘interpreting quantitative data’ scores the most ‘very high’ or ‘fairly high’ responses – hopefully this book makes some contribution towards that!  Using different shades of the same colour makes for a better appearance (and probably works better if printed in grayscale too).

You might have noticed that the categories are now in a different order.  This is a quirk of Excel; the same data table was used for both charts.  Fortunately the ordering does not matter.  We shall give similar examples at other places in this book.
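For readers who prefer code to Excel, here is a rough sketch in Python of the same idea: a horizontal stacked bar chart drawn in shades of a single colour.  The figures, the placeholder skill names, the two 'low' response levels and the function names are all invented for illustration; they are not the data behind the charts above.  The plotting function assumes the matplotlib library is installed.

```python
# Hypothetical survey shares (percent of employers). These numbers are made up
# for illustration and are NOT the data behind the charts in this post.
LEVELS = ["Very high", "Fairly high", "Fairly low", "Very low"]
DATA = {
    "Skill B (placeholder)": [25, 40, 25, 10],
    "Interpreting quantitative data": [40, 45, 10, 5],
    "Skill C (placeholder)": [30, 35, 20, 15],
}


def segment_offsets(shares):
    """Return the left edge of each segment when shares are stacked end to end."""
    offsets, running = [], 0
    for share in shares:
        offsets.append(running)
        running += share
    return offsets


def sort_by_top_two(data):
    """Order categories by combined 'very high' + 'fairly high' share, largest first."""
    return sorted(data.items(), key=lambda item: item[1][0] + item[1][1], reverse=True)


def plot_stacked(data, levels, path="skills.png"):
    """Draw a horizontal stacked bar chart and save it to `path`."""
    # Imported here so the helpers above can be used without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # draw to a file; no display window is needed
    import matplotlib.pyplot as plt

    rows = sort_by_top_two(data)
    labels = [name for name, _ in rows]
    # Shades of one colour, darkest for the highest rating.
    steps = max(len(levels) - 1, 1)
    shades = plt.cm.Blues([0.9 - 0.6 * i / steps for i in range(len(levels))])

    fig, ax = plt.subplots(figsize=(8, 3))
    for i, level in enumerate(levels):
        widths = [shares[i] for _, shares in rows]
        lefts = [segment_offsets(shares)[i] for _, shares in rows]
        ax.barh(labels, widths, left=lefts, color=tuple(shades[i]), label=level)
    ax.invert_yaxis()  # put the first (highest-scoring) category at the top
    ax.set_xlabel("Percentage of employers")
    ax.legend(ncol=len(levels), loc="lower center",
              bbox_to_anchor=(0.5, 1.0), frameon=False)
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
    return path
```

Calling plot_stacked(DATA, LEVELS) writes the chart to skills.png.  Sorting the categories by their combined 'very high' and 'fairly high' share is an optional extra rather than part of the original chart; it is one way of making the comparison across categories easier.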




New edition!

I am currently working on a new edition of my textbook, which will be the seventh.  It has been pretty successful - it is the market leader in the UK - and by now its structure and content are reasonably settled.  However, I do want to make some changes to the new edition based on the opinions and suggestions of reviewers and on my own experience of teaching and working with statistics.

It would be great if I could hear from you also about both the existing text and about my ideas for changes in the next edition.  What do you like about the book and what do you not like?  Are there parts where you find the explanations difficult to follow?  Is there anything you think is missing?  Do you use the associated web site with quizzes etc?  Please contribute by leaving comments on this post.

Now here's what I plan to change in the next edition:

  • Chapter 1 on descriptive statistics - add some examples of good and bad graphs, showing how to turn a bad graph into a better one.
  • Chapter 2 on probability - give a more detailed explanation of the principles of probability, including the use of Venn diagrams to illustrate those principles and make them more intuitively clear.
  • Chapter 8 on multiple regression - add examples showing how one can graph the regression coefficients rather than list them in a table.  The graphical presentation is easier to comprehend and interpret.
  • Chapter 9 on data collection - update the sources of data to reflect the huge increase in online sources, including issues around so-called 'big data'.
  • Generally - add a few more examples related to business and management; currently the text focuses a bit more on Economics.

I will be following up with posts covering the new material.  You can also comment on these as they go online.