The Data Is Plural newsletter provides a mass of free and fascinating data

I recently chanced upon "Data Is Plural" - an email newsletter, currently on issue 370. Each week it provides a list of, and some commentary on, "useful/curious datasets". Each issue contains a ton of links for anyone who wants data to play or work with. To give a taster of what … Continue reading The Data Is Plural newsletter provides a mass of free and fascinating data →

Notes on the book “Becoming a Data Head”

Below are notes that I took when reading Alex J. Gutman and Jordan Goldmeier's book "Becoming a Data Head - How to Think, Speak, and Understand Data Science, Statistics, and Machine Learning". The notes simply aim to summarise the parts of the book that most attracted my attention, sometimes reworded or reorganised, and don’t necessarily … Continue reading Notes on the book “Becoming a Data Head” →

Using ChatGPT’s Data Analysis bot to analyse your data

One less widely known feature of OpenAI's large language model chatbot, ChatGPT, is that if you become a paying subscriber then you can create your own bots that are attuned to be good at doing specific types of task. OpenAI also provides you with a few examples that they created, which include the one I'm … Continue reading Using ChatGPT’s Data Analysis bot to analyse your data →

The ongoing battle between human creators and AI trainers

In order for the current generation of generative AI tools - large language model chatbots, art generators et al - to work they must first undergo an extensive training process whereby they are fed a huge number of examples of the sort of content they will be later expected to produce. Per Wikipedia, the basic … Continue reading The ongoing battle between human creators and AI trainers →

Does protesting work?

There's been a lot of news about political protests recently in the UK, and some rather unsavoury efforts by the government to stop particular instances of them happening. Outside the UK, recent years have also seen some very high-profile protests. 2020 saw the Black Lives Matter related protests after the killing of George … Continue reading Does protesting work? →

Is British gas and oil really 4x as good for the environment as imported fuel?

The British Prime Minister, Rishi Sunak, recently declared that he's going to enable a huge expansion of North Sea gas and oil extraction. There is a lot to criticise about this plan, to say the least. But here I will endeavour to restrict myself to digging into one of his more surprising claims about this … Continue reading Is British gas and oil really 4x as good for the environment as imported fuel? →

Writing conditional filter statements in dplyr

Somehow only recently did I realise that you can use if statements directly within R’s dplyr library filter function. This lets you create conditional filter criteria that can filter on different variables based on some other condition external to the function call. For instance you can change what you filter for by referencing another unrelated variable in your code. … Continue reading Writing conditional filter statements in dplyr →
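
As a rough illustration of the idea in this excerpt, here is a minimal sketch rather than the code from the full post. It assumes R's built-in mtcars dataset; the function name filter_cars and the flag only_efficient are hypothetical names invented for the example. The if statement sits directly inside filter() and picks which column to filter on, based on a value that is not part of the data.

```r
library(dplyr)

# Hypothetical helper: `only_efficient` is an ordinary logical value that
# lives outside the data frame and decides which filter condition is applied.
filter_cars <- function(data, only_efficient) {
  data %>%
    # The if/else returns a logical vector, which filter() then uses as
    # its condition: either a test on mpg or a test on cyl.
    filter(if (only_efficient) mpg > 25 else cyl == 4)
}

efficient_cars <- filter_cars(mtcars, only_efficient = TRUE)   # filters on mpg
four_cyl_cars  <- filter_cars(mtcars, only_efficient = FALSE)  # filters on cyl
```

Because the if statement is evaluated once per call rather than once per row, its condition should be a single TRUE or FALSE value; row-wise conditional logic would need something like if_else() or case_when() instead.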

The risks of using generative AI when writing code

I've been nearly as fixated on artificial intelligence over the last few months as much of the rest of the online world, particularly the large language models (LLMs) out there, ChatGPT being the most famous example. Whilst the hype seems to be dying down a little - it's been a while since I've seen … Continue reading The risks of using generative AI when writing code →

Are AIs developing unpredictable new abilities, or are we just measuring them badly?

One of the things that make people nervous, awestruck, or both about the development and release of recent AI models is the prospect of them developing "emergent abilities". The terminology here can be complicated. Different people mean different things by "emergent abilities". Here in the context of large language models (LLMs), we're talking about the … Continue reading Are AIs developing unpredictable new abilities, or are we just measuring them badly? →