Free text analytics seems a fashionable pastime at present. The most commonly seen form in the wild might be the very basic text visualisation known as the "word cloud". Here, for instance, is the New York Times' "most searched for terms" represented in such a cloud. When confronted with a body of human-written text, one of the first steps for many text-related analytical techniques … Continue reading Basic text tokenisation with Alteryx
Month: June 2015
From restaurant-snobbery to racism: some perils of data-driven decision-making
Wired recently wrote a piece explaining how OpenTable, a leading "reserve a restaurant over the internet" service, is now starting to let customers pay for their meal via an app at their leisure, rather than flag down a waiter and awkwardly fiddle around with credit cards. There's an obvious convenience to this for the … Continue reading From restaurant-snobbery to racism: some perils of data-driven decision-making
Exporting CSV data from SQL Server Management Studio
SQL Server Management Studio is a commonly-used bit of the Microsoft SQL Server install, and a decent enough tool for browsing, querying and managing the data. Sometimes though you might have the urge to extract a big chunk of data - most often I do this to generate a big text-file dump for import into other data … Continue reading Exporting CSV data from SQL Server Management Studio
Stephen Few’s new book “Signal” is out
Stephen Few's latest, "Signal: Understanding what matters in a world of noise" has just been released - or at least it has in the US; it seems to be stuck on pre-order on Amazon UK at present. Not many reviews seem to be floating around just yet, but the topic is ultra-fascinating: In this age of so-called Big … Continue reading Stephen Few’s new book “Signal” is out
Data science vs rude Lego
Data science moves onwards each day, helping (perhaps) solve more and more of the world's problems. But apparently there's at least one issue for which we don't have a great machine-learning/AI solution just yet - identifying penises made out of Lego. Indeed this is apparently the problem that plagued the potential-Minecraft-beater "Lego Universe" nearly … Continue reading Data science vs rude Lego