Are vast numbers of Britons really exiting the workforce due to long-term sickness?

There's been a claim made over at least the past couple of years by all manner of sources that vast swathes of the working-age UK population have exited the workforce because long-term illness has apparently rendered them incapable of work. Whilst there have been exceptions, the version of the explanation for this that seems to …

You can finally turn off Microsoft Excel’s incessant desire to mess up your data by auto-converting it to an inappropriate type

I'm late to discovering this, but in case I'm not the last: In what might be a data analyst's best gift of the year 2023, I recently learned that you can now stop Microsoft Excel from automatically "recognising" and converting certain types of data to other types. Think here of Excel's ability to decide that …

Most Britons favour legalising assisted dying, irrespective of political affiliation

Today sees the big vote by our British Parliamentarians on introducing a law to allow assisted dying. There have been strong views espoused in both directions from the politicians, oftentimes transcending party lines. Curious coalitions have formed, not least the one between Diane Abbott and Edward Leigh, who penned a joint …

Aggregating and analysing location data using H3 in Snowflake or R

Geographic location analysis has been an important subset of data analysis since time immemorial. One of the most famous examples from times past is the visualisation that John Snow created in response to an outbreak of cholera almost 170 years ago. That dataviz led to an action - the disabling of a water pump that …

Creating a granular “votes cast” dataset for the last century’s worth of UK General Elections

Earlier this year the UK had a General Election, which, in many quarters, was declared a landslide victory for Labour. Certainly we have - thank goodness - a new government, and the switch towards Labour in terms of the number of seats (and hence MPs) was dramatic. But there remains an ongoing debate as to …

How to get a Wikipedia (or other HTML) table into R as a dataframe

I recently wanted to use some data I found in a Wikipedia article for analysis in R. Acknowledging of course the historical buyer-beware status of Wikipedia data - although these days it often seems as reliable as any other source. It turns out it's pretty easy to do. You can use the rvest library, which …
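The post itself works in R with rvest. As a rough illustration of the underlying idea only (walk the table's rows and cells, then treat the first row as column names), here is a minimal sketch in Python using just the standard library. It is not the post's method: the `TableParser` class, the `table_to_records` helper and the sample HTML are all made up for this example, and a real Wikipedia table with merged cells or nested tables would need more care.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collects the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_records(html):
    """Return one dict per body row, keyed by the header row's cell text."""
    parser = TableParser()
    parser.feed(html)
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical sample data, standing in for a downloaded page.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Count</th></tr>
  <tr><td>Alpha</td><td>10</td></tr>
  <tr><td>Beta</td><td>5</td></tr>
</table>
"""

records = table_to_records(SAMPLE_HTML)
print(records)
```

Every value comes back as a string, so numeric columns still need converting before analysis.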

Today’s young Britons really seem to dislike the Conservative party

Following the surprise, albeit much needed, announcement of a UK general election, I've not been able to resist hoovering up any and all available data on the political dynamics of the situation. In terms of voting intentions, it's a vastly different situation to any seen within the last quarter of a century or so. Labour …

Situations when multicollinearity in regression model variables isn’t important

When creating basic multiple regression models, if your predictor variables correlate with each other this usually presents a problem in that you can end up with unstable estimates for the resulting coefficients. One way to test for multi-collinearity is to check for a relatively high Variance Inflation Factor, or VIF. Many packages exist that make …

The great SQL leading vs trailing commas debate

It might seem a small thing, but I noticed that a recent update of the Snowflake database now allows you to have a trailing comma at the end of the column list in a SQL SELECT statement. For example, this now works: "SELECT my_field, my_field_2, FROM my_table", whereas before that'd give an error. It's arguably bad form nonetheless, but …

The Data Is Plural newsletter provides a mass of free and fascinating data

I recently chanced upon "Data Is Plural" - an email newsletter, currently on issue 370. Each week it provides a list and some commentary on "useful/curious datasets". There's a ton of links in each issue for anyone who wants data to play or work with to get stuck into. To give a taster of what …