Basic text tokenisation with Alteryx

Free text analytics seems a fashionable pastime at present. The most commonly seen form in the wild might be the very basic text visualisation known as the "word cloud". Here, for instance, is the New York Times' "most searched for terms" represented in such a cloud. When confronted with a body of human-written text, one of the first steps for many text-related analytical techniques … Continue reading Basic text tokenisation with Alteryx

From restaurant-snobbery to racism: some perils of data-driven decision-making

Wired recently wrote a piece explaining how OpenTable, a leading "reserve a restaurant over the internet" service, was now starting to permit customers to pay for their meal via an app at their leisure, rather than flag down a waiter and awkwardly fiddle around with credit cards. There's an obvious convenience to this for the … Continue reading From restaurant-snobbery to racism: some perils of data-driven decision-making

Data science vs rude Lego

Data science moves onwards each day, helping (perhaps) solve more and more of the world's problems. But apparently there's at least one issue for which we don't have a great machine-learning/AI solution just yet - identifying penises made out of Lego. Indeed, this is apparently the problem that plagued the potential-Minecraft-beater "Lego Universe" nearly … Continue reading Data science vs rude Lego