“Let’s collect all data we can, and we will fish for insights later.” Have you heard this before?
That approach seldom works. On the rare occasions when it does work a little, the ROI is very low relative to the cost of collecting, processing, and storing volumes of data. Analytics yields better returns when you start with a goal.
Besides, not all analytics are equal. Fun stats are amusing. But actionable insights that can guide you to the next steps are way more valuable. The Drivetrain Approach is a four-step framework for getting to such insights:
- Define Objective: Start by defining your goal.
- Specify Levers: Specify the inputs that you control, the levers you can pull to influence the outcome.
- Collect Data: Figure out what data you need to collect for measuring the effect of pulling those levers.
- Identify Actions: Analyze data and build statistical models to compute which lever to move and how much to achieve the desired outcome.
This article applies the Drivetrain Approach to a problem end to end: from setting a goal, through collecting and analyzing data, to identifying actions that achieve that goal.
The first step in the Drivetrain Approach is to define the objective. Business problems and goals may be ambiguous initially, so refine them until they are crisp and quantifiable.
Let’s take this problem: I want my work to reach more people through social media. I want to build an audience on LinkedIn. To maximize the result of my efforts, I want to find which of my content works best.
The goal is to write technical posts that attract followers on LinkedIn. In other words, the objective is to identify characteristics of my posts (in my area of expertise) that generate maximum follower growth.
Key Performance Indicators (KPIs)
Once you have defined the objective, the next task is to specify one or more KPIs to measure performance towards that objective. So let’s understand the lifecycle of a LinkedIn post.
When you post something on LinkedIn, it shows up in some of your followers’ newsfeeds. If they interact with your post, LinkedIn shows it to their followers and to more of your followers.
The typical audience funnel for a post is as follows:
- Followers: This is the inlet size of the funnel. More followers mean more potential people who will see and interact with the post.
- Views: The fraction or multiple of your followers who see the post in their feed.
- Likes: Some of the followers may find the post interesting enough to like it.
- Shares and Comments: Some might find it helpful to share with their followers and comment on the post.
- New Followers: As more people interact with the post, more people outside the immediate circle will see it. They may check out your profile and may decide to follow it.
There are three indicators of successful content: views, engagement (likes, shares, comments), and new followers.
Views and engagement are valuable. However, new followers expand the funnel inlet for the next post. So I consider follower growth the most significant measure of the success of a social media post in building an audience.
The next step is to ask: what do you control and can change that may impact the outcome?
“God, grant me the serenity to accept the things I cannot change, courage to change the things I can, and wisdom to know the difference.”
In the case of LinkedIn posts, the levers I control are:
- WHEN do I post: Day and time of the post.
- HOW often do I post: Frequency of the post.
- WHAT do I write in the post: Domain, topic, content complexity, length, structure, embedded media (e.g., image, video, document, link).
There are studies about “when” (Tuesday to Thursday, between 9 am and noon) and “how often” (max once per weekday) to post on LinkedIn. Some studies advise not to include a link in the post. Of course, there are several posts offering tips for writing killer LinkedIn posts.
Though these studies focus on marketing posts on company pages, the results and tips seem sensible for individuals too. However, their applicability depends on the subject matter, audience, geography, etc. So, while I took some advice (e.g., max one post a day), I want to find out which levers will work for me.
In a little over the past two months, I posted 53 times. For each post, along with “lever” characteristics, I recorded KPIs:
- Followers at the time of posting, and
- Views, likes, shares, comments, and followers after 24 hours of posting.
For each post, the KPIs are the following tuple: (Followers0, Views24, Likes24, Shares24, Comments24, Followers24)
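One way to hold such a record in code is a small named tuple. This is only a sketch: the class name `PostKPIs`, the field names, and the sample numbers are made up for illustration and do not come from the actual dataset.

```python
from typing import NamedTuple


class PostKPIs(NamedTuple):
    """Raw KPI counts for one post (hypothetical field names)."""
    followers0: int   # followers at the time of posting
    views24: int      # views 24 hours after posting
    likes24: int      # likes 24 hours after posting
    shares24: int     # shares 24 hours after posting
    comments24: int   # comments 24 hours after posting
    followers24: int  # followers 24 hours after posting

    @property
    def new_followers(self) -> int:
        """Followers gained in the 24 hours after posting."""
        return self.followers24 - self.followers0


# Made-up numbers, for illustration only.
record = PostKPIs(followers0=1000, views24=2500, likes24=40,
                  shares24=5, comments24=8, followers24=1012)
print(record.new_followers)
```

The same six columns map directly to one row per post in Excel or Google Sheets.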
In the age of Big Data, 53 posts is a laughably small amount of data. While I will continue collecting more data, I will not have “big data” anytime soon as I make a maximum of one post per day. I want to explore:
- Is it feasible to glean actionable insights from small data?
- Do I need to collect other characteristics (levers) or KPIs that I did not realize earlier?
- How to do data analytics in Excel or Google Sheets? It can work as a dashboard with up-to-date analytics while I record more data daily.
Working with a small dataset also facilitates explaining nuances in interpreting analytics.
4 Types of Data Analytics
Before we jump into analyzing the data and identifying actions, let’s understand four types of data analytics:
- Descriptive Analytics: What happened?
- Diagnostic Analytics: Why did it happen?
- Predictive Analytics: What happens if?
- Prescriptive Analytics: How to make it happen?
Descriptive and Diagnostic Analytics are about examining the past, and Predictive and Prescriptive Analytics are about planning for the future.
This is the most common analytics and answers the fundamental question from past events: what happened?
In Descriptive Analytics, we aggregate data along various dimensions and identify specific patterns or trends.
Most businesses have dashboards to track essential metrics. For example:
- What are the monthly and quarterly sale numbers of various products in different geographic markets?
- Is the sale of a product increasing or decreasing?
- What are the inventory levels in different geographies, and are these rising or declining?
The next step in analytics is analyzing a trend and answering: why did it happen?
In Diagnostic Analytics, we drill down and do Root Cause Analysis. We want to find what is fueling a trend. So we slice-and-dice data to identify outliers, isolate patterns, and uncover correlations.
Business Intelligence (BI) products are for doing that. For example:
- Identify the characteristics of a customer segment fueling the sale of a profitable product.
- Compare the performance of a product in different customer segments.
- Compare the performance of different products in a customer segment.
- Explain why it is happening. Is it due to product awareness or due to intense competition? Or is it a general trend for that product category altogether?
With knowledge of patterns and correlations, analytics progresses to forecasting: what is likely to happen if …?
In Predictive Analytics, we use regression analysis, time series forecasting, and other multivariate statistical modeling techniques. The goal is to estimate some quantifiable variable at a given point in time in the future.
This is a big step up from Diagnostic Analytics. It involves a much more sophisticated analysis to forecast:
- Demand for a product line
- Revenue potential of a customer segment
- Revenue growth from an upcoming campaign
- Profit projections for the next fiscal year
Once the past is understood, and one can forecast the future, the goal shifts to shaping it: How to make it happen?
In Prescriptive Analytics, we do a “what if” analysis to estimate the impact of moving levers we control. Then, we compare the available alternatives and pick the best course of action.
This is the most challenging level and requires advanced analytics and data science skills. It is a game-changer in deciding resource distribution for maximizing profit. For example:
- Production of which products needs to be ramped up, and by how much?
- How much of each product should be stocked in various geographies?
- What product campaigns to pick, and which customer segments to target?
- Which products or services need a price adjustment, and by how much?
Most organizations are not at this level of maturity. It requires a culture of data-driven decisions to operate at this level. From the very beginning, you need to start thinking about the objective, what data is required, how to collect that data, and perform the previous levels of data analytics to reach this level.
Let’s get back to the Drivetrain approach. The last step is to analyze data and extract actionable insights. To recap:
- Levers: embedded media, post type, domain, weekday
- KPIs: views, engagement (likes, shares, comments), and follower growth
If you look at the raw data, two aspects of the KPIs stand out:
- KPIs are in vastly different scales.
- As the followers grow, all KPIs grow too, which makes them incomparable over time.
Converting KPIs to the “per Follower” basis and taking the percentage for views, likes, shares, comments, and follower growth solves both problems for this dataset:
Views per Followers (V/F): (Views24 / Followers0)
Likes per Followers (L/F): 100 * (Likes24 / Followers0)
Shares per Followers (S/F): 100 * (Shares24 / Followers0)
Comments per Followers (C/F): 100 * (Comments24 / Followers0)
Follower Growth (F+): 100 * (Followers24 - Followers0) / Followers0
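The five formulas above can be sketched as a single function. The function name, argument names, and sample numbers are hypothetical; only the formulas themselves come from the definitions above.

```python
def per_follower_kpis(followers0, views24, likes24, shares24,
                      comments24, followers24):
    """Convert raw 24-hour counts into the per-follower KPIs."""
    if followers0 <= 0:
        raise ValueError("followers0 must be positive")
    return {
        "V/F": views24 / followers0,                  # ratio, not a percentage
        "L/F": 100 * likes24 / followers0,            # percent of followers
        "S/F": 100 * shares24 / followers0,           # percent of followers
        "C/F": 100 * comments24 / followers0,         # percent of followers
        "F+": 100 * (followers24 - followers0) / followers0,
    }


# Made-up numbers, for illustration only.
example = per_follower_kpis(followers0=1000, views24=2500, likes24=40,
                            shares24=5, comments24=8, followers24=1012)
print(example)
```

In a spreadsheet, each entry is one formula column next to the raw counts.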
The dashboard below shows some of the Descriptive Analytics. It has distributions of levers across posts, aggregations of KPIs for various levers, and some characteristics over time.
For aggregation, I chose median instead of sum because levers are not equally distributed in the posts.
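A minimal sketch of that aggregation, assuming each post is a dictionary with one key per lever and per KPI (the key names and values below are made up). It returns the group size next to the median, because a median computed from one or two posts deserves much less trust than one computed from twenty.

```python
from collections import defaultdict
from statistics import median


def median_by_lever(posts, lever, kpi):
    """Group posts by a lever value; return {value: (median KPI, post count)}."""
    groups = defaultdict(list)
    for post in posts:
        groups[post[lever]].append(post[kpi])
    return {value: (median(vals), len(vals)) for value, vals in groups.items()}


# Made-up posts, for illustration only.
posts = [
    {"media": "image", "F+": 1.0},
    {"media": "image", "F+": 1.4},
    {"media": "image", "F+": 0.6},
    {"media": "gif", "F+": 2.5},
]
summary = median_by_lever(posts, "media", "F+")
print(summary)
```

Here the single GIF post "wins" on median follower growth, but its count of 1 flags that the comparison rests on very little data.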
A casual look at the dashboard suggests:
- GIFs, URL previews (i.e., posts with a link and no other media), and PDFs are the best-performing media types.
- Vision, presentation, white paper, and marketing are the best-performing post types.
- Programming, startup, and microservices are the best-performing topic domains.
- Monday, Tuesday, and Friday are the best, and Wednesday is the worst weekday to post.
However, a careful evaluation reveals that many outperforming characteristics are present in only a few posts. For example, there are only:
- Two posts with GIFs and two with URL previews
- One post each on vision and marketing
- One post each on programming, startup, and microservices
Half or more posts are about ML/DS/AI and have an image infographic. If followers increased by 65% in about that many days, those posts must have contributed something.
Does it mean that if I had posted only GIFs on the vision for programming, the follower growth would have doubled (as the indices for each are about double those of ML/DS/AI image infographics)?
There are nuances in interpreting analytics as Statistics is not Arithmetic.
Let’s try to get some answers from Diagnostic Analytics.
My first questions were:
- What is the correlation among KPIs?
- Should I focus on only Follower Growth?
- Is “per Follower” the right approach? Or shall I follow the funnel, and consider Views per Followers, Likes per Views, Shares per Likes, etc.?
So I plotted histograms of all these potential model features and computed their correlation with Follower Growth (the most significant KPI).
As you can see in the chart below, all of these features follow the Normal Distribution. Views, Likes, and Shares are strongly correlated with Follower Growth (F+) and each other. On the other hand, comments don’t have a strong correlation with F+.
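For readers who want to reproduce such a correlation outside a spreadsheet, here is a minimal sketch of the Pearson correlation coefficient. It assumes Pearson is the correlation measure in use, which the text does not state explicitly; the function name and sample values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Made-up values, for illustration only.
likes_per_follower = [1.0, 2.0, 3.0, 4.0, 5.0]
follower_growth = [2.0, 4.0, 6.0, 8.0, 10.0]
r = pearson(likes_per_follower, follower_growth)
print(round(r, 3))
```

In Excel or Google Sheets, the built-in `CORREL` function computes the same coefficient.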
Next, I wanted to identify performant posts. So I created a score by adding normalized values of strongly correlated KPIs (i.e., F+, V/F, L/F, and S/F).
For normalized values, the mean (𝞵) is 0, and the standard deviation (𝞼) is 1. Since the score is the sum of four KPIs, a value of 4 or above means a post averages at least one standard deviation above the mean on each KPI (i.e., 𝞵+𝞼), which places it roughly in the top 16% of posts.
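The scoring can be sketched as follows. This assumes "normalized" means z-scores computed with the population standard deviation; the function names and the toy values are hypothetical. The last lines show where the 16% figure comes from: it is the share of a normal distribution lying more than one standard deviation above the mean.

```python
from statistics import NormalDist, mean, pstdev


def zscores(values):
    """Standardize values to mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("cannot normalize a constant column")
    return [(v - mu) / sigma for v in values]


def composite_scores(kpi_columns):
    """Sum the z-scores of each KPI column, post by post."""
    normalized = [zscores(column) for column in kpi_columns.values()]
    return [sum(per_post) for per_post in zip(*normalized)]


# Toy data for three posts, for illustration only.
kpis = {
    "F+": [1.0, 2.0, 3.0],
    "V/F": [1.0, 2.0, 3.0],
    "L/F": [1.0, 2.0, 3.0],
    "S/F": [1.0, 2.0, 3.0],
}
scores = composite_scores(kpis)
print([round(s, 2) for s in scores])

# Share of a normal distribution above mean + 1 standard deviation.
tail_above_one_sigma = 1 - NormalDist().cdf(1.0)
print(round(tail_above_one_sigma, 4))
```

Because the four KPIs are correlated rather than identical, the share of posts scoring 4 or above will only be approximately 16%.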
The chart below has scores of all the posts. The box charts on the right are to identify outliers. The golden yellow is for the overall score, and the light blue is for F+.
There are 7 posts with scores of 4 or above, but two of them are outliers. All but one of these 7 also have a Follower Growth score in the outperformer range.
There are 3 posts (A, B, and C) with an outperforming Follower Growth score but an overall score of less than 4. One of them, C, has a negative overall score.
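Box charts commonly flag outliers with the 1.5 × IQR rule. Whether the charts here use exactly that rule is an assumption; the sketch below shows the rule itself, with a hypothetical function name and made-up scores.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    if len(values) < 4:
        raise ValueError("need at least 4 values for meaningful quartiles")
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up scores, for illustration only.
sample_scores = [1, 2, 3, 4, 5, 6, 7, 8, 30]
outliers = iqr_outliers(sample_scores)
print(outliers)
```

Spreadsheet tools may compute quartiles with a slightly different interpolation method, so borderline points can be classified differently.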
The next step is to dive into Predictive Analytics. But current data is not enough to compute Point-Biserial Correlation between levers and score or do linear regressions. So I decided to examine the 10 posts:
Outperforming Engagement but Average Follower Growth: Post 7
Weird: Post C
My observations are:
Outliers: Posts 1 and 2 have a high-value image and resource, respectively. Even Post 3 is an outlier as it shares a URL, which is known to reduce views.
Weird: Post C and Post B recorded high Follower Growth because the previous post was an outlier. I have observed that LinkedIn keeps showing very high-performing posts for up to 3 days.
Average Engagement but Outperforming Follower Growth: Post A contained slides of my talk. It seems like the deck was attractive for people to follow but not attractive enough to like or share. That is odd.
Outperforming Engagement but Average Follower Growth: Post 7 was a long post on a complex topic. I can see it being useful to engage with but too complex to follow and seek more of it.
Outperformers: Posts 4, 5, and 6 (and Post A too) are at the sweet spot of medium length (130–180 words) and medium complexity.
Though more data is needed, based on overall trends and closely examining the 10 posts, this is what I am going to do:
- Never miss posting on Monday, and take it easy on Wednesday.
- All outperforming (and outlier) posts were in big data and DS/ML/AI, so double down on those domains.
- Try out more GIFs and see if that pulls the average up, which would convert outliers into outperformers.
- Use more infographics, listicles, cheat sheets, and white papers as these performed well and are in sizable frequency in the data.
I also need to collect more data and try predictive and prescriptive analytics.
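Once there is enough data, the Point-Biserial Correlation mentioned earlier could be computed along these lines. It measures the correlation between a yes/no lever (for example, whether a post has a GIF) and a continuous score. The function name and sample values are hypothetical, and the sketch uses the population standard deviation.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(flags, scores):
    """Correlation between a binary lever and a continuous score."""
    if len(flags) != len(scores):
        raise ValueError("flags and scores must have equal length")
    with_lever = [s for f, s in zip(flags, scores) if f]
    without_lever = [s for f, s in zip(flags, scores) if not f]
    n1, n0, n = len(with_lever), len(without_lever), len(scores)
    if n1 == 0 or n0 == 0:
        raise ValueError("both groups need at least one post")
    sigma = pstdev(scores)
    if sigma == 0:
        raise ValueError("correlation is undefined for constant scores")
    return (mean(with_lever) - mean(without_lever)) / sigma * sqrt(n1 * n0 / n**2)


# Made-up data: 1 = post has a GIF, 0 = it does not.
has_gif = [1, 1, 0, 0]
post_scores = [3.0, 4.0, 1.0, 2.0]
r_pb = point_biserial(has_gif, post_scores)
print(round(r_pb, 4))
```

With only one or two posts in a group, the coefficient swings wildly from post to post, which is why it needs more data than is available so far.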
These insights might not be generic and may apply to only my kind of content. Also, there is a bias in the data as I already follow conventional best practices:
- Write to provide value
- Use simple language
- Format it for easy reading
- Include visually rich media
- Don’t embed links
The Drivetrain Approach and the four types of data analytics provide a good framework for extracting actionable insights. But remember that data analytics and data science are experimental, and it may take multiple iterations to get them right.