
DIY Clickstream -4: Deployment

In this concluding part of the mini-series, we will deploy our framework on a production server running Apache.

There are also a few final touches left over from the previous parts, namely:

  1. Add page load as an event type in addition to click events
  2. Send events in batches every 10 seconds to improve performance at the front end
  3. Handle these batches at the backend
1. To add page load as an event type, we just need to add the following event listener in event_capture.js (which runs on the client machine and which we developed in Part 2):

document.addEventListener('DOMContentLoaded', function (event) { /* record the page-load event here */ });

2. To send events in batches every 10 seconds, we will create an array of events in local storage.

We will add events to this array and set a timeout to send them to our server every 10 seconds.

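    // Send queued events to the server after a 10-second gap
    setTimeout(function() {
        var events = localStorage.getItem('events');
        if (events) {
            events = JSON.parse(events);
            var xhr = new XMLHttpRequest();
            xhr.open('POST', '', true); // set the URL to your events endpoint
            xhr.setRequestHeader('Content-Type', 'application/json');
            xhr.send(JSON.stringify(events));
            // Assumption: clear the queue so the same events are not sent again
            localStorage.removeItem('events');
        }
    }, 10000);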

3. To handle this at the backend, we will make some small changes to the server code. Instead of saving a single event per request, we will iterate through the array sent in the request body and save each event.
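As a rough illustration of the batch handling described above, here is a minimal sketch in plain Python. The function name `parse_event_batch` and the field names in `REQUIRED_FIELDS` are assumptions made for this example; the real event schema is whatever was defined in the earlier parts.

```python
import json

# Hypothetical field names; replace with the fields your events actually carry.
REQUIRED_FIELDS = ("event_type", "timestamp")


def parse_event_batch(raw_body):
    """Decode a request body that holds a JSON array of events.

    Returns a tuple (valid_events, rejected_count).
    """
    payload = json.loads(raw_body)
    if not isinstance(payload, list):
        # Tolerate a client that still posts a single event object
        payload = [payload]

    valid_events = []
    rejected_count = 0
    for item in payload:
        # Keep only objects that carry every required field
        if isinstance(item, dict) and all(f in item for f in REQUIRED_FIELDS):
            valid_events.append(item)
        else:
            rejected_count += 1
    return valid_events, rejected_count
```

In a Django view, one could pass `request.body` to this function and then save the returned events one by one, or in a single query with the model manager's `bulk_create`.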

This closes our first part, i.e. some basic optimizations. Of course, a lot of things like error handling and test cases are yet to be written. In the interest of time, however, we’ll now move on to deployment on a production server.


The next few steps are very similar to what we did for the test or development server in the last post:

  1. Deploy Python and Django on an AWS EC2 instance
  2. Create a MySQL database to store the events
  3. Change the code from the previous post
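For the second step, Django is pointed at the MySQL database through the `DATABASES` setting. The fragment below is only a sketch: the database name, user and environment variable name are placeholders invented for this example, not values from the project.

```python
import os

# settings.py (fragment). All names below are placeholders.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        "NAME": "clickstream",        # hypothetical database name
        "USER": "clickstream_user",   # hypothetical MySQL user
        # Hypothetical variable name; keeps the password out of the code
        "PASSWORD": os.environ.get("CLICKSTREAM_DB_PASSWORD", ""),
        "HOST": "localhost",
        "PORT": "3306",
    }
}
```

Django's MySQL backend also needs a database driver such as `mysqlclient` installed on the instance.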

Install Python, Django and MySQL on AWS EC2

I am already familiar with AWS EC2, so I will go ahead with it.

  1. This post helps with setting up an instance and installing Python.
  2. Django + MySQL: I followed this guide step by step to set up a production-level server on my Ubuntu instance on EC2. A few snags along the way:
    • Setting up the root password for the MySQL server. This post helped resolve it. Amazing how much help is out there for free.
    • After configuring and restarting Apache, I was getting a Forbidden error, when Apache should have been serving the Django home page. Apache needs every directory in the path to be executable, so for me making /home/ubuntu executable worked, since my Django installation sits under the ubuntu user's home directory.
chmod +x /home/<USERNAME>
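To find which directory is causing the Forbidden error, a small script can walk up the path and report directories without the execute bit. This is a simplified sketch: the helper name `dirs_missing_execute` is made up for this example, and it only checks the "others" execute bit, assuming the Apache user is neither the owner nor in the group of those directories.

```python
import os
import stat
from pathlib import Path


def dirs_missing_execute(path):
    """List directories on the way to `path` that lack the 'others' execute bit."""
    target = Path(path).resolve()
    # Check the target itself when it is a directory, then every ancestor
    candidates = list(target.parents)
    if target.is_dir():
        candidates.insert(0, target)

    blocked = []
    for directory in candidates:
        mode = directory.stat().st_mode
        if not mode & stat.S_IXOTH:
            blocked.append(str(directory))
    return blocked
```

Calling `dirs_missing_execute("/home/<USERNAME>/<project directory>")` returns the directories that would need `chmod +x`.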

Change the code from the local Django deployment

We need to make the following changes (from the previous post) to the Django server:

  1. Install django-cors-headers and make the required changes to the Django settings
  2. Copy the code across (both events and server), and
  3. Run the migration routine to create the database. You should get something like below
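For the first step, the django-cors-headers settings usually look something like the fragment below. It is a sketch, not the project's actual configuration: the origin in `CORS_ALLOWED_ORIGINS` is a placeholder for the site that hosts event_capture.js, and the app and middleware lists are shortened to the relevant entries.

```python
# settings.py (fragment). Existing entries are omitted for brevity.
INSTALLED_APPS = [
    # ... existing apps ...
    "corsheaders",
]

MIDDLEWARE = [
    # Place the CORS middleware as high as possible, above CommonMiddleware
    "corsheaders.middleware.CorsMiddleware",
    "django.middleware.common.CommonMiddleware",
    # ... remaining middleware ...
]

# Placeholder origin; list the sites allowed to post events
CORS_ALLOWED_ORIGINS = [
    "https://www.example.com",
]
```

Listing specific origins is safer than allowing every origin, since the events endpoint is reachable from the public internet.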

Almost there. The final step is to modify the URL endpoint to which event_capture.js sends its asynchronous HTTP request. Use http://<IP Address>/events/ as the endpoint address.

Once this runs fine, you should be able to capture all clicks like below

This brings us to the end of our nifty clickstream framework. Commercial solutions can cost thousands of dollars for an average website with 100k visits a month. Here we can manage the storage of the event stream for just $10-20 a month on commercial AWS machines.

There are a bunch of things to be improved on this, namely:

  1. Improve security, viz. CSRF and HTTPS
  2. Test cases
  3. Error handling

I am releasing this as open source in the hope that folks will find it useful and build upon it. Hit me up in the comments in case of any questions. I will also write some posts on how to analyse this data with Python or other tools like Mixpanel.

GitHub for the project: client-side JavaScript and Django server