R is a wonderful, flexible, if somewhat arcane tool for analytics of all kinds. Part of its power, yet also its ability to bewilder, comes from the fact that there are so many ways of doing the same, or similar, things. Many of these ways are instantly available thanks to many heroes of the R … Continue reading My favourite R package for: correlation
Tag: Statistics
The Datasaurus: a monstrous Anscombe for the 21st century
Most people trained in the ways of data visualisation will be very familiar with Anscombe's Quartet. For the uninitiated, it's a set of four fairly simple-looking X-Y scatterplots that look like this. What's so great about those then? Well, the reason data vizzers get excited starts to become clear when you realise that the dotted grey … Continue reading The Datasaurus: a monstrous Anscombe for the 21st century
Simpson’s paradox and the importance of segmentation
Here's a classic business analysis scenario, which I'd like to use to illustrate one of my favourite mathematical curiosities. Your marketers have sent out a bunch of direct mail to a proportion of your previous customers, and deliberately withheld the letters from the rest of them so that they can act as a control group. As analyst extraordinaire, you get … Continue reading Simpson’s paradox and the importance of segmentation
New website launch from the Office of National Statistics
Yesterday, the UK Office for National Statistics, the institution that is "responsible for collecting and publishing statistics related to the economy, population and society", launched its new website. As well as a new look, they've concentrated on improving the search experience and making it accessible to mobile device users. The front page is a nice at-a-glance … Continue reading New website launch from the Office of National Statistics
The Sun and its dangerous misuse of statistics
Here's the (pretty abhorrent) front cover of yesterday's Sun newspaper. Bearing in mind that several recent terrorist atrocities are top of everyone's mind at the moment, it's clear what the Sun is implying here. The text on the front page is even more overt: Nearly one in five British Muslims have some sympathy with those who have fled … Continue reading The Sun and its dangerous misuse of statistics
Kruskal Wallis significance testing with Tableau and R
Whilst Tableau has an increasing number of advanced statistical functions - a case in point being the newish analytics pane from Tableau version 9 - it is not usually the easiest tool to use to calculate any semi-sophisticated function that hasn't yet been included. Various clever people have tried to work some magic around this, for instance by … Continue reading Kruskal Wallis significance testing with Tableau and R