As terrifying a thought as it might be, Jason from Pushshift.io has extracted pretty much every Reddit comment from 2007 through to May 2015 that isn't protected, and made it available for download and analysis. This is about 1.65 billion comments, in JSON format. It's pretty big, so you can download it via a torrent, as per the … Continue reading Free dataset: all Reddit comments available for download
Author: Adam
4 ways to make Alteryx tools easier to find
Alteryx is surely the king of easy-to-use data manipulation tools, and as a bonus handles predictive and spatial analysis via a very friendly set of tools. One method by which it makes things easy to use is that it has a huge array of built-in tools to drag and drop, thus saving you the task … Continue reading 4 ways to make Alteryx tools easier to find
When is it safe to stop watching the match?
Despite the Harvard Business Review's insistence that data analyst is the sexiest job of the 21st century, ask a non-quant about popular references to data analysis and you are quite likely to hear some reference to Moneyball (be that book or film). Spoiler alert: "sabermetric" data analysis enabled a baseball team with less money to … Continue reading When is it safe to stop watching the match?
Basic text tokenisation with Alteryx
Free text analytics seems a fashionable pastime at present. The most commonly seen form in the wild might be the very basic text visualisation known as the "word cloud". Here, for instance, is the New York Times' "most searched for terms" represented in such a cloud. When confronted with a body of human-written text, one of the first steps for many text-related analytical techniques … Continue reading Basic text tokenisation with Alteryx
From restaurant-snobbery to racism: some perils of data-driven decision-making
Wired recently wrote a piece explaining how OpenTable, a leading "reserve a restaurant over the internet" service, was starting to permit customers to pay for their meal via an app at their leisure, rather than flag down a waiter and awkwardly fiddle around with credit cards. There's an obvious convenience to this for the … Continue reading From restaurant-snobbery to racism: some perils of data-driven decision-making
Exporting CSV data from SQL Server Management Studio
SQL Server Management Studio is a commonly-used bit of the Microsoft SQL Server install, and a decent enough tool for browsing, querying and managing the data. Sometimes though you might have the urge to extract a big chunk of data - most often I do this to generate a big text-file dump for import into other data … Continue reading Exporting CSV data from SQL Server Management Studio
Stephen Few’s new book “Signal” is out
Stephen Few's latest, "Signal: Understanding what matters in a world of noise" has just been released - or at least it has in the US; it seems to be stuck on pre-order on Amazon UK at present. Not many reviews seem to be floating around just yet, but the topic is ultra-fascinating: In this age of so-called Big … Continue reading Stephen Few’s new book “Signal” is out
Data science vs rude Lego
Data science moves onwards each day, helping (perhaps) solve more and more of the world's problems. But apparently there's at least one issue for which we don't have a great machine-learning/AI solution just yet - identifying penises made out of Lego. Indeed this is apparently the problem that plagued the potential-Minecraft-beater "Lego Universe" nearly … Continue reading Data science vs rude Lego
New chart types coming in Excel 2016
As far as I can recall, it has been many, many years since a new chart type of significance has found its way into an Excel update. However, for the 2016 release we're getting some new treats, as seen in this presentation from Scott Ruble. Several of those chart types are versions of what could already be done in … Continue reading New chart types coming in Excel 2016
UK election 2015: Who actually voted for the Conservative party?
Here in the UK we just had our general election, electing the government who will rule over us for the next 5 years. The results - a Conservative majority - were something of a surprise to most people, myself included. I'm sure I won't be able to hide my leanings for long, so to be clear, … Continue reading UK election 2015: Who actually voted for the Conservative party?