Sign in

Applying Machine Learning to Massive Datasets

A different sort of churn. Photo by Cassie Matias on Unsplash

Managing churn, the process of losing and gaining customers over time, is critical for subscription-based services like Spotify or Netflix. At the same time, these services are capable of collecting massive amounts of data pertaining to user interactions. This provides an opportunity to apply machine learning as a means to identify vulnerable users based on their behaviour, and take action before they choose to leave or downgrade their plans.

Critically, the sheer amount of data involved requires large-scale “big data” processing techniques. Apache Spark has been designed exactly for just such tasks. It is a modern analytics engine designed to…


Looking at trends in the Stack Overflow Annual Developer Survey 2011 — 2020.

The MacBook Pro had a lot more port variety in 2011 (Photo by Lukas from Pexels)

The field of computing has developed incredibly rapidly over the past decade, with hot new startups, frameworks and languages seemingly coming and going all the time. But with all these exciting changes, perhaps we should take a step back for a second and have a look at how the life of a developer has changed over this same time period. Here we look at aspects encompassing satisfaction at work, job preferences, occupation diversity, and how details like these are related to salaries.

Luckily, we have a great trove of information in the Stack Overflow Annual Developer Survey, which has been…

Joel B.

Passionate about Data Science. PhD in Physics.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store