A step-by-step approach to predict customer attrition using supervised machine learning algorithms in Python

Photo by Emile Perron on Unsplash

Customer attrition (a.k.a. customer churn) is one of the biggest costs for any organization. If we could figure out why and when customers leave, with reasonable accuracy, it would immensely help the organization sharpen its retention initiatives. Let’s make use of a customer transaction dataset from Kaggle to understand the key steps involved in predicting customer attrition in Python.

Supervised machine learning is the task of learning a function that maps an input to an output based on example input-output pairs. A supervised machine learning algorithm analyzes the training data and produces an inferred function, which…
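As a minimal sketch of the idea, here is a toy churn classifier in scikit-learn. The feature names and figures below are illustrative stand-ins, not the Kaggle dataset used in the article:

```python
# Toy supervised churn classifier: learn a function mapping customer
# features (inputs) to a churn label (output) from example pairs.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Made-up stand-in for a customer transaction dataset
df = pd.DataFrame({
    "tenure":          [1, 34, 2, 45, 3, 8, 22, 10, 28, 62, 13, 16],
    "monthly_charges": [70, 56, 54, 42, 71, 99, 89, 29, 104, 56, 49, 18],
    "churn":           [1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0],
})

X = df[["tenure", "monthly_charges"]]
y = df["churn"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42, stratify=y)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
preds = model.predict(X_test)
print(accuracy_score(y_test, preds))
```

In practice the same fit/predict pattern applies unchanged once the real transaction features replace the toy columns.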

Photo by Myriam Jessier on Unsplash

Objective: To programmatically retrieve Google Analytics data for marketing analytics automation.

Accessing the Google Analytics API to retrieve GA records is an essential prerequisite for building an end-to-end marketing analytics suite. We can achieve this objective through four major steps, listed below:

  1. Generate Client ID and Secret Key in Google Cloud.
  2. Update .Renviron variables.
  3. Import relevant libraries and refresh GA tokens locally.
  4. Finally, build the GA dataset in R.

Step 1. Generate Client ID and Secret Key in Google Cloud

Step 1.1. Create a Google Cloud Project: Sign in to Google Cloud Console and create a project.
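For readers working outside R, the same retrieval step can be sketched in Python against the Analytics Reporting API v4. The view ID and dates below are placeholders, and the commented-out call assumes google-api-python-client plus the OAuth credentials created in Step 1:

```python
# Build a batchGet request body for the Analytics Reporting API v4
# (sessions by date). View ID and date range are placeholders.
def build_report_request(view_id: str, start: str, end: str) -> dict:
    """Request body asking for daily session counts for one GA view."""
    return {
        "reportRequests": [{
            "viewId": view_id,
            "dateRanges": [{"startDate": start, "endDate": end}],
            "metrics": [{"expression": "ga:sessions"}],
            "dimensions": [{"name": "ga:date"}],
        }]
    }

body = build_report_request("123456789", "2020-01-01", "2020-01-31")

# With credentials obtained via the client ID/secret from Step 1:
# from googleapiclient.discovery import build
# analytics = build("analyticsreporting", "v4", credentials=creds)
# response = analytics.reports().batchGet(body=body).execute()
print(body["reportRequests"][0]["viewId"])
```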


Key points to evaluate Google’s Looker vs. Tableau/Power BI/Qlik

Photo by Rajeshwar Bachu on Unsplash

“Which Business Intelligence (BI) tool is better for my business?”

With so many key players in the Business Intelligence and Visual Analytics market, this is one of the most important and most frequently asked questions in any techno-functional department of an organization.

Google is getting deeper into enterprise software with its latest acquisition of Looker for $2.6 billion. I just had a quick look into Looker, and I think it’s definitely worth trialling for a few weeks. Most of my experience with BI tools fundamentally revolves around the traditional players in the market, like Power BI, Tableau…

To quickly identify the best performing categories in any sales dataset using quadrant analysis in Tableau.

Sales vs. Profit Quadrant Analysis in Tableau (Image by Author)


Details: A quadrant chart is a scatter plot divided into four equal sections. Each quadrant contains data points with similar characteristics. Here is a step-by-step approach to building a quadrant analysis plot in Tableau using the Superstore sales dataset to identify the best performing categories in terms of sales and profit:

Step 1: Open a new Tableau file and import the ‘superstore’ public dataset into the workbook. If you haven’t worked with Tableau before, please…
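The same quadrant logic can be reproduced outside Tableau. Here is a sketch in Python/pandas that buckets categories by whether their sales and profit sit above or below the dataset averages; the figures are made up for illustration:

```python
# Quadrant analysis: split categories into four buckets using the
# average sales and average profit as the quadrant boundaries.
import pandas as pd

df = pd.DataFrame({
    "category": ["Chairs", "Phones", "Tables", "Binders", "Paper"],
    "sales":    [330000, 315000, 206000, 200000, 78000],
    "profit":   [26000, 44500, -17700, 30200, 34000],
})

sales_ref = df["sales"].mean()    # x-axis boundary
profit_ref = df["profit"].mean()  # y-axis boundary

def quadrant(row):
    hi_sales = row["sales"] >= sales_ref
    hi_profit = row["profit"] >= profit_ref
    if hi_sales and hi_profit:
        return "Q1: high sales, high profit"  # best performers
    if hi_sales:
        return "Q2: high sales, low profit"
    if hi_profit:
        return "Q4: low sales, high profit"
    return "Q3: low sales, low profit"

df["quadrant"] = df.apply(quadrant, axis=1)
print(df[["category", "quadrant"]])
```

Tableau draws the same boundaries with reference lines on a scatter plot; the averages here play the role of those reference lines.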

Hands-on Tutorials

Photo by Markus Spiske on Unsplash

Objective: To predict forthcoming monthly sales using Autoregressive Integrated Moving Average (ARIMA) models in Python.

Details: Most business units across industries rely heavily on time-series data to analyze and predict, say, leads, sales, stock levels, web traffic, or revenue, and to make strategic business decisions from time to time. Interestingly, time-series models are gold mines of insight when we have serially correlated data points. Let’s look into such a time-stamped sales dataset from Kaggle to understand the key steps involved in time-series forecasting with ARIMA models in Python.

Here we are applying ARIMA models over a transactional sales…

Photo by Ilya Pavlov on Unsplash

Objective: To programmatically fetch the latest product prices hosted on competitors’ websites.

For the purpose of demonstration, let’s look into the websites of WeWork and Regus, two leading players in the co-working industry that compete with each other to serve hot desks, dedicated desks, and private offices across the globe. Let’s try to scrape their California pages to retrieve the latest product price listings programmatically.

There were four milestones to accomplish the objective:

  1. Web scraped Regus sites using httr/rvest packages.
  2. Cleaned the dataset and incorporated geospatial coordinates.
  3. Repeated steps 1 & 2 for WeWork websites.
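The milestones above use R’s httr/rvest; a comparable sketch in Python with only the standard library’s html.parser is shown below. The markup and class names are invented stand-ins for a fetched listing page, which in production would come from an HTTP GET:

```python
# Extract (name, price) pairs from listing markup with the stdlib parser.
from html.parser import HTMLParser

PAGE = """
<div class="location"><span class="name">Downtown LA</span>
  <span class="price">$320/mo</span></div>
<div class="location"><span class="name">San Francisco SoMa</span>
  <span class="price">$450/mo</span></div>
"""

class PriceParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self._field = None    # which span we are inside, if any
        self._current = {}
        self.rows = []

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in ("name", "price"):
            self._field = cls

    def handle_data(self, data):
        if self._field:
            self._current[self._field] = data.strip()
            if "name" in self._current and "price" in self._current:
                self.rows.append(self._current)  # one complete listing
                self._current = {}
            self._field = None

parser = PriceParser()
parser.feed(PAGE)
print(parser.rows)
```

rvest’s `html_nodes`/`html_text` pipeline plays the same role as the tag/data handlers here, just more declaratively via CSS selectors.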

Photo by Safar Safarov on Unsplash

Objective: To set up a fully operational machine learning server on a Google Cloud Compute Engine virtual machine instance.

In real-world scenarios, cloud computing and machine learning go hand-in-hand to build, transform, and scale predictive modelling projects. RStudio Server is a Linux server application and one of the best solutions that can be hosted on Google Cloud (or Amazon Web Services or Azure) to automatically process large volumes of data in SQL/R/Python in a centralized manner. Here’s a step-by-step approach to configuring a fully functional RStudio Server on Google Cloud:

  1. Configure a virtual machine…


Data Science & Analytics | srees.org
