A question came up recently about variations in the age at menarche - the first occurrence of menstruation for a female human - with regard to the environment. A comparison by country seemed like a reasonable first step in checking whether there were in fact any significant, potentially environmental, differences in this age. A quick … Continue reading Average age at menarche by country
Tag: Data
data.world: the place to go for your open data needs?
Somewhere in my outrageously long list of data-related links to check out I found "data.world". Not only is that a nice URL, it also contains a worthy service that I can imagine being genuinely useful in future, if it takes off like it should. At first glance, it's a platform for hosting data - seemingly biased towards the … Continue reading data.world: the place to go for your open data needs?
#VisualizeNoMalaria: Let’s all help build an anti-Malaria dataset
As well as just being plain old fun, data can also be an enabler for "good" in the world. Several organisations are clearly aware of this; both Tableau and Alteryx now have wings specifically for doing good. There are whole organisations set up to promote beneficial uses of data, such as DataKind, and a bunch of … Continue reading #VisualizeNoMalaria: Let’s all help build an anti-Malaria dataset
Accessing Adobe Analytics data with Alteryx
Adobe Analytics (also known as SiteCatalyst, Omniture, and various other names both past and present) is a service that tracks and reports on how people use websites and apps. It's one of the leading solutions for organisations that are interested in studying how people are actually using their digital offerings. Studying real-world usage is often far more insightful, … Continue reading Accessing Adobe Analytics data with Alteryx
Kaggle now offers free public dataset and script combos
Kaggle, a company most famous for facilitating competitions that allow organisations to solicit the help of teams of data scientists to solve their problems in return for a nice big prize, recently introduced a new section useful even for the less competitive types: "Kaggle Datasets". Here they host "high quality public datasets" you can access for free. … Continue reading Kaggle now offers free public dataset and script combos
How many teachers do we need? The official Governmental model
How do we know how many teachers are required to keep the UK's schools in good working order? It's an interesting question, with obvious implications for Governmental education policy with regard to teacher compensation, incentives, training places and so on. The "official" requirements are calculated via the Government's "Teacher Supply Model", which, happily, in the … Continue reading How many teachers do we need? The official Governmental model
Are station toilets profitable?
After being charged 50p for the convenience of using a station convenience, I became curious as to whether the owners were making much money on this most annoying expression of a capitalistic monopoly high on the needs of many humans. It turns out data on those managed by Network Rail is available in the name … Continue reading Are station toilets profitable?
Microsoft Academic Graph: paper, journals, authors and more
The Microsoft Academic Graph is a heterogeneous graph containing scientific publication records, citation relationships between those publications, as well as authors, institutions, journals and conference "venues" and fields of study. Microsoft have been good enough to structure and release a bunch of web-crawled data around scientific papers, journals, authors, URLs, keywords, references between them and so on for … Continue reading Microsoft Academic Graph: paper, journals, authors and more
Free dataset: all Reddit comments available for download
As terrifying a thought as it might be, Jason from Pushshift.io has extracted pretty much every Reddit comment from 2007 through to May 2015 that isn't protected, and made it available for download and analysis. This is about 1.65 billion comments, in JSON format. It's pretty big, so you can download it via a torrent, as per the … Continue reading Free dataset: all Reddit comments available for download
Free data: Constituency Explorer – UK demographics, politics, behaviour
From some combination of the Office for National Statistics, the House of Commons and Durham library comes Constituency Explorer. Billing itself as "reliable evidence for politicians and journalists - data for everyone", it allows interactive visualisation of many interesting demographic/behavioural/political attributes by UK political constituency. It's easy to view distributions and compare between a specific constituency, the region … Continue reading Free data: Constituency Explorer – UK demographics, politics, behaviour