Françoise Provencher

How to measure your e-commerce business

2/23/2016

Image credit: Tech in Asia
You are growing your e-commerce business, so what’s the next move that will get you your next sale? Exploratory data analysis to identify opportunities is key. We’ll start small, using Python to illustrate some simple and effective data explorations. Then we’ll talk about how we scaled that at Shopify, using PySpark, to automatically give data-driven advice to our 243k+ merchants.

Just to be clear: the data used in this presentation is fake, to preserve privacy.

The Jupyter Notebook that accompanies this blog post is available online, as are the slides.

1 - Get all the data
You probably have your data scattered across a few different places. For instance:
  • From your e-commerce platform provider. If you’re using Shopify, you can export that data as a CSV (comma-separated values) file from the admin, or you can automate the process using the Shopify API. Examples of useful data to have: list of customers, discount codes, abandoned checkouts, orders, etc.
  • From Google Analytics. You can download the data as CSV or automate the process using the API. This is a whole subject in and of itself, and Vanessa Sabino has already covered it brilliantly.
  • From logs. If you are hosting your own e-commerce solution, you can log your visitors’ sessions and use those logs to understand their behaviour on your site.
  • From spreadsheets. Maybe you have a spreadsheet with your products’ cost and retail price. It’s useful to combine that information with other sources (like your real orders) to get new information (like your profits).
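
As an illustration of that last point, here is a minimal sketch of combining a cost spreadsheet with orders to estimate profit. All product names, prices, and quantities are made up, and the function name `profit_by_product` is hypothetical. It ignores discounts, shipping, and taxes.

```python
# Made-up spreadsheet rows: product -> (unit cost, retail price).
PRICES = {
    "Entropy tee": (8.0, 20.0),
    "Diamond tee": (9.0, 25.0),
}

# Made-up orders: (product, quantity sold).
ORDERS = [("Entropy tee", 3), ("Diamond tee", 2), ("Entropy tee", 1)]


def profit_by_product(prices, orders):
    """Sum (retail - cost) * quantity for each product that has a price entry."""
    profits = {}
    for product, quantity in orders:
        if product not in prices:
            continue  # no cost information for this product, so skip it
        cost, retail = prices[product]
        profits[product] = profits.get(product, 0.0) + (retail - cost) * quantity
    return profits


print(profit_by_product(PRICES, ORDERS))
```

Products that appear in the orders but not in the spreadsheet are skipped rather than guessed at, which keeps the totals conservative.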
2 - Explore
For data exploration, I really like Jupyter (formerly IPython Notebook). The notebook format makes it easy to tell a story and keep track of everything we try. You can view the notebook for this presentation on Gist.
Let’s tackle two business questions.

Q1 - Which color should I use for my next t-shirt design?

I have my own online store where I sell t-shirts that I design for physicists and their friends (I’m a physicist myself). I have way too many colors on offer for each design. I did that on purpose, to give people the choice and learn which colors are the most popular with my customers. Now it’s time to narrow it down.

First, I load the data from the CSV file I downloaded from my Shopify store.
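
A minimal sketch of this loading step, using only Python's standard `csv` module (the accompanying notebook may well use a different library). The sample rows, the `Name` column, and the file name in the comment are made up for illustration. Only the "Lineitem Name" column comes from the export described in this post.

```python
import csv
import io

# Made-up sample standing in for an orders export.
SAMPLE_CSV = """Name,Lineitem Name
#1001,Shine on you crazy diamond - M / Cream / Men
#1002,Gift card
#1003,Entropy - Sea Foam / L
"""


def load_orders(csv_file):
    """Read an orders export into a list of dicts, one per row."""
    return list(csv.DictReader(csv_file))


# With a real export, pass open("orders_export.csv", newline="") instead
# (hypothetical file name).
orders = load_orders(io.StringIO(SAMPLE_CSV))
lineitem_names = [row["Lineitem Name"] for row in orders]
print(lineitem_names[0])
```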

Then, I locate the column that is useful to me. In this case it’s “Lineitem Name”, a string (i.e. text) that looks like this: “Shine on you crazy diamond - M / Cream / Men”. The first part is the product name, followed by the 3 variant options separated by slashes. But there’s a twist: not all lineitem names have variant options, and of those that do, not all have a color.

Python is good at handling exceptions with “try / except”, so we can write a function that will take in the lineitem name and return a list of variants (when variants exist), then another function that takes that list of variants and returns the color (if there’s a color in there).

We can chain those functions together, and do a final count of the values to get our most popular colors: sea foam and eggplant.
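
The two functions and the final count could look something like the sketch below. It is not the code from the notebook: the function names, the `KNOWN_COLORS` set, and the sample lineitem names are all assumptions made for illustration. It also assumes the variants come after the last " - " in the name.

```python
from collections import Counter

# Hypothetical set of colors offered in the store (lowercase for matching).
KNOWN_COLORS = {"cream", "sea foam", "eggplant"}


def get_variants(lineitem_name):
    """Return the list of variant options, or [] when there are none."""
    try:
        # Everything after the last " - " is assumed to hold the variants.
        _, variants = lineitem_name.rsplit(" - ", 1)
    except ValueError:
        # No " - " separator: the unpacking fails, so there are no variants.
        return []
    return [v.strip() for v in variants.split("/")]


def get_color(variants):
    """Return the first variant that is a known color, else None."""
    for variant in variants:
        if variant.lower() in KNOWN_COLORS:
            return variant
    return None


def count_colors(lineitem_names):
    """Chain both helpers and count how often each color was ordered."""
    colors = (get_color(get_variants(name)) for name in lineitem_names)
    return Counter(c for c in colors if c is not None)


# Made-up lineitem names; the color is not always in the same position.
names = [
    "Shine on you crazy diamond - M / Cream / Men",
    "Gift card",
    "Entropy - Sea Foam / L",
    "Entropy - S / Sea Foam / Women",
]
print(count_colors(names).most_common())
```

Matching against a set of known colors is what lets the color sit in any position among the variants. A product whose own name contains " - " but has no variants would simply yield no color.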

I like this example because it shows how useful Python can be for parsing text: in this case, extracting the color from the lineitem name even when the color is not always in the same position.

Q2 - Is there a problem with my shipping rates?

I recently wrote a whole article about this problem for the Shopify e-commerce university blog: How Nijala tweaked their shipping strategy to win more sales. Here’s the gist of it: potential customers can abandon their cart at any time during the checkout process. In particular, the ones who leave at the shipping information page are usually disappointed that there is no free shipping, that shipping rates are too expensive, or that no shipping is offered to their location. The drop-off at that page is therefore a good proxy for diagnosing shipping-related problems.

We want to see the proportion of shoppers who leave at the shipping method page. In my example (which is a typical Shopify checkout), the only possible page after the shipping page is the payment page, so the number of shoppers who drop out of the checkout funnel at the shipping page is the difference between the number of shoppers who reach the shipping page and the number of shoppers who reach the payment page.

This can be easily visualized in Google Analytics if you set up a checkout goal funnel. You can also pull the data from Google Analytics and do the calculation yourself. Once you know the number of people who reached the shipping info page (num_ship) and the number of people who reached the payment page (num_pay), the drop-off ratio at the shipping info page is (num_ship - num_pay) / num_ship.
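
As a small worked example of that formula, here is a sketch with made-up counts. The function name `shipping_drop_off` and the input checks are my own additions, not part of the original analysis.

```python
def shipping_drop_off(num_ship, num_pay):
    """Fraction of shoppers who reached the shipping page but not the payment page."""
    if num_ship <= 0:
        raise ValueError("num_ship must be positive")
    if not 0 <= num_pay <= num_ship:
        raise ValueError("num_pay must be between 0 and num_ship")
    return (num_ship - num_pay) / num_ship


# Made-up counts: 200 shoppers reached the shipping page, 150 reached payment.
print(shipping_drop_off(200, 150))  # 0.25
```

With these numbers, a quarter of the shoppers who saw the shipping page never made it to payment.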

3 - Scale

I do this kind of analysis for hundreds of thousands of merchants several times per day. The key is to build a data pipeline that gets launched by a scheduler on a regular basis. We use PySpark for the data pipeline and Oozie for the scheduler.

We basically do the same thing as in Q2 above, but we also:
  • replace the CSV file input with data already stored in HDFS (Hadoop Distributed File System), which stores huge datasets distributed across many machines on a cluster;
  • replace Jupyter with PySpark, which enables fast distributed computing on a cluster;
  • key by Shop ID, so that we can aggregate the data shop by shop;
  • add thresholds, like a minimum number of orders or a minimum confidence level for a prediction, so that we don’t report statistical flukes.
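
The last two points can be sketched in plain Python, which stands in for PySpark here so that the example runs without a cluster. This is not the Shopify pipeline: the event format, the function name, and the threshold of 100 checkouts are all assumptions. In PySpark, the counting step would typically map onto a keyed aggregation such as `reduceByKey`.

```python
from collections import defaultdict

MIN_SHIPPING_VISITS = 100  # hypothetical threshold, to be tuned to your data


def drop_off_by_shop(events, min_visits=MIN_SHIPPING_VISITS):
    """Compute the shipping drop-off ratio per shop.

    events: iterable of (shop_id, reached_shipping, reached_payment),
    one tuple per checkout, with the last two items being booleans.
    """
    counts = defaultdict(lambda: [0, 0])  # shop_id -> [num_ship, num_pay]
    for shop_id, reached_shipping, reached_payment in events:
        if reached_shipping:
            counts[shop_id][0] += 1
            if reached_payment:
                counts[shop_id][1] += 1
    # Only report shops with enough data, to avoid statistical flukes.
    return {
        shop_id: (num_ship - num_pay) / num_ship
        for shop_id, (num_ship, num_pay) in counts.items()
        if num_ship >= min_visits
    }


# Made-up events: shop_a has 200 checkouts, shop_b only 3.
events = (
    [("shop_a", True, True)] * 150
    + [("shop_a", True, False)] * 50
    + [("shop_b", True, False)] * 3
)
print(drop_off_by_shop(events))
```

Here shop_b would show a 100% drop-off, but with only 3 checkouts it falls below the threshold and is left out of the report.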

More ideas for analysis
  • Pages with high traffic and a high bounce rate: this is where your efforts will pay off most.
  • Search terms used on your store and to find your store: match the words used by your customers.
  • Other funnels, like email campaigns: find and address the bottleneck in the conversion.
  • Segments, chosen with care (beware of false positives): by content type, product category, referrer, etc.

Resources
  • “Exploring the Google Analytics API” by Vanessa Sabino: https://www.youtube.com/watch?v=YUqaCkEwr6g
  • “Statistics: Making sense of data”: https://www.coursera.org/course/introstats
  • Jupyter: http://jupyter.org
  • Set up a GA goal funnel: https://docs.shopify.com/manual/reports-and-analytics/google-analytics/google-analytics-goals-and-funnels

