Mohammed is a student of Marketing and Communications. Today he had a really interesting lesson in the Data Management and Business Intelligence course about computational approaches to improving the quality of marketing campaigns and loyalty strategies.

More precisely, Prof. Doe spoke about customer retention and churn rates. Customer retention measures how many customers stay loyal to a brand and return for another visit, whereas churn measures how many customers do not return to a company after making a purchase. He explained that Machine Learning (ML) approaches can compute a churn probability for each customer, that is, an estimate of how likely the customer is to come back within a specified period.
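
The two rates described above are complementary: every customer either returns or does not. A minimal sketch in Python, assuming each customer is represented by a single hypothetical flag that is True when the customer came back:

```python
def retention_and_churn(returned_flags):
    """Return (retention_rate, churn_rate) for a list of booleans.

    Each flag is True if that customer came back in the period.
    """
    total = len(returned_flags)
    if total == 0:
        raise ValueError("need at least one customer")
    retained = sum(1 for flag in returned_flags if flag)
    retention_rate = retained / total
    # Customers who did not return make up the rest.
    return retention_rate, 1.0 - retention_rate


# Hypothetical example: 3 of 4 customers came back.
retention, churn = retention_and_churn([True, True, False, True])
print(retention, churn)  # 0.75 0.25
```

These are rates over the whole customer base; the per-customer churn probability discussed in the lesson is a separate, model-based estimate.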

For this purpose, he gave all students a pre-processed dataset and asked them to apply a Machine Learning approach to predict the churn probabilities. The dataset contains information on all the loyal customers of a certain shop. The features are computed over a three-month window, and the target is a binary variable that takes the value 0 (No) if a customer did not come back in the following two weeks and 1 (Yes) if the customer made at least one new purchase.
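
The dataset Prof. Doe hands out is already labelled, but the rule behind the target can be sketched as follows. The function name, the dates, and the choice to count the two weeks from the day after the window ends are assumptions made for illustration only:

```python
from datetime import date, timedelta


def label_customer(purchase_dates, window_end, horizon_days=14):
    """Return 1 if any purchase falls in the horizon after window_end, else 0."""
    horizon_end = window_end + timedelta(days=horizon_days)
    came_back = any(window_end < d <= horizon_end for d in purchase_dates)
    return 1 if came_back else 0


# Hypothetical three-month window ending on 31 March 2024.
window_end = date(2024, 3, 31)
returned = label_customer([date(2024, 2, 10), date(2024, 4, 5)], window_end)
not_returned = label_customer([date(2024, 2, 10)], window_end)
too_late = label_customer([date(2024, 4, 20)], window_end)
print(returned, not_returned, too_late)  # 1 0 0
```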

Mohammed is happy about this homework because it gives him another chance to use the Brainiac Cloud platform.

Step 1 – Inspect the Excel File and Upload the Dataset

Mohammed looks at the Excel file. The available features are: the number of receipts in the considered period, the average receipts, the number of days since the first purchase, the number of days since the last purchase, the days between the first and last purchase, and the binary variable churn (i.e. the target variable).
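
The file is already pre-processed, so Mohammed does not need to compute these features himself. As an illustration only, here is a sketch of how they could be derived from one customer's raw receipts. It assumes that "average receipts" means the average receipt amount; the dictionary keys other than `number` and `average`, and all dates and amounts, are hypothetical:

```python
from datetime import date


def customer_features(receipts, reference_date):
    """Build the five features from a list of (purchase_date, amount) tuples."""
    if not receipts:
        raise ValueError("customer has no receipts in the window")
    dates = [d for d, _ in receipts]
    amounts = [a for _, a in receipts]
    first, last = min(dates), max(dates)
    return {
        "number": len(receipts),                   # number of receipts
        "average": sum(amounts) / len(amounts),    # assumed: average amount
        "days_since_first": (reference_date - first).days,
        "days_since_last": (reference_date - last).days,
        "days_first_to_last": (last - first).days,
    }


# Hypothetical customer with three receipts in the window.
receipts = [
    (date(2024, 1, 10), 20.0),
    (date(2024, 2, 9), 40.0),
    (date(2024, 3, 20), 30.0),
]
features = customer_features(receipts, reference_date=date(2024, 3, 31))
print(features)
```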

Mohammed saves the file in TXT format and uploads it into Brainiac Cloud.

Step 2 – Explore the Data

Mohammed is already familiar with the Data Explorer. After looking at the general behavior of each variable, he drags and drops two variables, number of receipts (number) and average receipts (average), to see whether they are correlated. After that, he selects Churn as the target variable by clicking on the variable name.
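
The Data Explorer shows the relationship visually. To put a number on it outside the platform, one option is the Pearson correlation coefficient. The sketch below is not how Brainiac Cloud computes anything; it is a plain-Python version with made-up values for the two variables:

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up values for five customers.
number = [1, 2, 3, 4, 5]
average = [10.0, 12.0, 9.0, 15.0, 14.0]
r = pearson(number, average)
print(round(r, 3))
```

A value near 1 or -1 indicates a strong linear relationship, while a value near 0 indicates little or none.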

Step 3 – Train the Model

Prof. Doe covered Logistic Regression in the lesson, so Mohammed decides to test how this approach performs on the churn dataset. He selects the Logistic Regression Classifier and hits TRAIN. Instantly, Brainiac Cloud returns a set of performance indicators for the trained model. Accuracy and Log-loss are really good; in fact, the Brainiac meter is almost fully colored.
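
For readers curious about what happens behind the TRAIN button, the sketch below fits a logistic regression with plain gradient descent and computes accuracy and log-loss. It is not Brainiac Cloud's implementation, which the platform does not expose here. The toy rows, the learning rate, the number of epochs, and the 0.5 threshold are all assumptions, and the metrics are computed on the training rows for simplicity:

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights, bias, row):
    """Probability that the row belongs to class 1."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))


def train_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log-loss."""
    n, d = len(rows), len(rows[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, y in zip(rows, labels):
            err = predict_proba(weights, bias, row) - y
            grad_b += err
            for j, x in enumerate(row):
                grad_w[j] += err * x
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


def accuracy(labels, probs, threshold=0.5):
    hits = sum(1 for y, p in zip(labels, probs) if (p >= threshold) == (y == 1))
    return hits / len(labels)


def log_loss(labels, probs, eps=1e-15):
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


# Toy rows: [number of receipts, days since last purchase / 10] (made up).
rows = [[5, 0.2], [4, 0.5], [6, 0.1], [1, 3.0], [2, 2.5], [1, 4.0]]
labels = [1, 1, 1, 0, 0, 0]  # 1 = came back, 0 = did not

weights, bias = train_logistic(rows, labels)
probs = [predict_proba(weights, bias, row) for row in rows]
train_accuracy = accuracy(labels, probs)
train_log_loss = log_loss(labels, probs)
print(train_accuracy, round(train_log_loss, 4))
```

Higher accuracy and lower log-loss both indicate a better fit. Log-loss also penalizes predictions that are confident and wrong, which accuracy alone does not capture.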

Step 4 – Use the Model and Upload the New Test Set

Mohammed plays with the ML model through the Quick Model Application. Then he uploads the test dataset provided by Prof. Doe, who will evaluate the homework based on this file.
The test dataset has the same structure and headers as the initial dataset. The only difference is that the target variable is missing from the new dataset. The Brainiac Cloud platform will predict the churn probability for each customer.
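
Conceptually, scoring the test file means applying the trained model to every row and appending a predicted probability. The sketch below illustrates the idea with a tiny in-memory file. The tab delimiter, the reduced two-column layout, the coefficient values, and the `probability` column name are all assumptions, not the platform's actual output format:

```python
import csv
import io
import math

# Hypothetical tab-delimited test data with no target column.
TEST_TXT = "number\taverage\n5\t32.5\n1\t12.0\n"

# Hypothetical coefficients standing in for a trained logistic model.
WEIGHTS = {"number": 0.8, "average": 0.02}
BIAS = -2.0


def score_rows(text, weights, bias):
    """Return each input row with an added predicted probability for class 1."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    scored = []
    for row in reader:
        z = bias + sum(w * float(row[name]) for name, w in weights.items())
        row["probability"] = round(1.0 / (1.0 + math.exp(-z)), 4)
        scored.append(row)
    return scored


scored = score_rows(TEST_TXT, WEIGHTS, BIAS)
for row in scored:
    print(row)
```

Given how the target is coded in this dataset (1 = the customer made a new purchase), the probability here is that of class 1.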

Finally, Mohammed previews the results and downloads the resulting file.