Visualising a vast amount of text data in public discourse about your product offering is tough.
Analysing the vast amount of text data is even harder.
While word clouds and network graphs make it a tad easier, it can still take a lot of time and technical skill to extract learnings from them.
Recently I was introduced to a very advanced, publicly accessible network graphing tool that is extremely useful for text data analysis. While a more advanced analysis still requires the more manual method, this tool provides a quick and easy way to create network graphs and analyse them.
I'll introduce the InfraNodus tool and provide an interesting use case.
Disclaimer: Though it's a public access tool, it's not free. As of Feb 2019, it requires a one-time payment of €5.
As a use case, I would like to analyse the gaps in public discourse on Twitter for the three biggest enterprise cloud solution providers: Amazon Web Services, Google Cloud and Microsoft Azure.
Via this analysis, I aim to discover the gaps in consumer understanding of these products (e.g. a particular offering is provided by both Google and Azure, but is only mentioned or discussed in #Azure tweets. Is it a lack of consumer education?)
At the same time, it can also uncover gaps in product offerings (e.g. X appears in AWS searches but not Azure. Is it a feature that is unique to AWS?)
Why do I choose this topic?
I did a quick landscape analysis of tweet hygiene on Twitter with several hashtag searches. I found that tweets with the hashtags #AWS, #GoogleCloud and #Azure are very clean: they are about the topic I'm interested in (i.e. enterprise cloud solution providers) and not about a vacation hotel named "Azure" or some SME named "AWS".
That saves a lot of time, as I can safely skip the data cleaning step.
Besides, the volume of tweets is huge and heavily concentrated among the developer crowd and cloud solution providers.
Let's get started.
InfraNodus comes packed with several ways to obtain your data. While you can use the Twitter API to get the data on your own and copy and paste it into the tool, you can also use the in-built Twitter search app to obtain the data easily.
Once you've created your account, access the list of apps and select Twitter.
On the next screen, key in "#aws" in the first text box. We'll replace it with #GoogleCloud and #Azure later.
In the second text box, key in a descriptive name, e.g. "awstwit" or "awstweet".
For the third text box, I used 10000. You can choose your own number.
Under settings, choose the first option as we would like to analyse the full text of the tweets.
Click "+settings" and untick "exclude search term from the graph".
Once you're done, click visualise. You'll be introduced to your beautiful network graph of tweets.
Though I recommend that you watch all the tutorial videos, I'll introduce some basic actions you can take on the tool.
On the right-hand side, you'll see a very, very useful summary of the communities detected in your text data. With this, you can easily ascertain the themes surrounding your text data.
It also highlights the "Most Influential Words", which are the words that have the most connections, i.e. those most often mentioned alongside a wide range of other words.
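To make the idea of "most connections" concrete, here is a minimal Python sketch that counts, for each word, how many distinct words it appears alongside. It is only an illustration under simplifying assumptions: words are split on whitespace, every pair of words within the same tweet counts as connected, and the function name and sample tweets are made up. The tool's own ranking may well use a different measure.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_degree(texts):
    """Map each word to the number of distinct words it co-occurs with."""
    neighbours = defaultdict(set)
    for text in texts:
        # Assumption: naive whitespace tokenisation, case-insensitive.
        words = set(text.lower().split())
        for a, b in combinations(sorted(words), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    return {word: len(linked) for word, linked in neighbours.items()}


# Made-up toy tweets, not real data.
tweets = [
    "lambda serverless aws",
    "aws s3 storage",
    "serverless functions",
]
degree = cooccurrence_degree(tweets)
top_word = max(degree, key=degree.get)
print(top_word, degree[top_word])
```

In this toy data, "aws" comes out on top because it sits next to the widest range of other words.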
You may click on one or several nodes on the network graph or on the summary dialogue at the right-hand side.
Doing so will highlight the connection to the node. You'll also see "tags" appearing at the top right-hand side. Clicking those or the back arrow will reset the highlight.
Sometimes, there are nodes that cloud other data (i.e. the bigger ones) or nodes that are irrelevant. You can remove them by clicking the trash icon.
If you're curious about the context in which the text appeared, you can select the node and click on the dialogue bubble icon at the top right-hand corner.
Doing so opens a sidebar on the left.
I find this very useful for contextual analysis. However, note that it is currently a simple "Find" mechanism that matches parts of words, i.e. "pineapple" will appear when you look up the word "app".
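If you want to double-check a node's context outside the tool, the difference between a simple "Find" and a whole-word match can be sketched in Python as below. The function names and sample tweets are hypothetical, and this is not how InfraNodus itself is implemented.

```python
import re


def substring_find(term, texts):
    """Simple 'Find': matches the term anywhere, even inside longer words."""
    return [t for t in texts if term.lower() in t.lower()]


def whole_word_find(term, texts):
    """Matches the term only when it stands alone as a word."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    return [t for t in texts if pattern.search(t)]


# Made-up sample tweets for illustration.
tweets = [
    "Deploying my app on Azure",
    "Pineapple pizza at the cloud meetup",
]
print(substring_find("app", tweets))   # includes the pineapple tweet
print(whole_word_find("app", tweets))  # only the tweet about the app
```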
Before proceeding to the next section, please create two similar network graphs for #GoogleCloud and #Azure. You may name them anything.
Once you're done, select the #AWS one from the hamburger menu on the top left-hand corner. Then, click the weighing scale icon at the right. The weighing scale icon enables the "Compare to Context" function.
The button should turn blue once clicked. Then, head to #GoogleCloud network graph.
You'll now see a few black nodes. Those are the nodes that appear on #AWS but do not appear on #GoogleCloud i.e. the context.
In the summary box at the right, you'll see a portion for "In [xxx] but not in [yyy]".
Those are the gaps in the public discourse of Google Cloud. That is to say, those are the topics discussed or talked about in #AWS tweets but not #GoogleCloud tweets.
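Conceptually, this "in one graph but not in the other" comparison resembles a set difference between the vocabularies of the two sets of tweets. The Python sketch below illustrates that idea only; the function names and toy tweets are assumptions made for this example, and the tool's actual comparison works on its graph nodes and may differ in the details.

```python
def vocabulary(texts):
    """Collect the distinct lower-cased words used across the texts."""
    words = set()
    for text in texts:
        # Assumption: naive whitespace tokenisation.
        words.update(text.lower().split())
    return words


def discourse_gaps(context_texts, target_texts):
    """Words present in the context (e.g. #AWS) but absent from the target."""
    return vocabulary(context_texts) - vocabulary(target_texts)


# Made-up toy tweets, not real data.
aws_tweets = ["serverless lambda pricing", "kubernetes on aws"]
gcp_tweets = ["kubernetes on googlecloud", "bigquery pricing"]

gaps = discourse_gaps(aws_tweets, gcp_tweets)
print(sorted(gaps))
```

Swapping the two arguments gives the reverse comparison, i.e. what appears in the #GoogleCloud tweets but not in the #AWS ones.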
Though this analysis is dependent on the quality and volume of tweets, it can tell us quickly about:
In my results, I found the following (meaningful) gaps in Google Cloud vs Amazon Web Services (tip: you can discover them by removing the bigger nodes):
Though Google Cloud does offer solutions for Nutanix, serverless environment and integration with Go applications, they do not appear in the tweets for #GoogleCloud.
Note: There is of course still dirty data (e.g. tweets from events, promotional tweets, tweets from enterprises promoting themselves). This is where the dialogue bubble function mentioned earlier can be useful to determine whether they are insights or just dirty data.
Do note that some of the tweets come from enterprises, which some might not consider "public discourse". I would object to that assessment in this context, as enterprises are the target users of these products and are the "influencers" in this space. Hence I believe including their tweets actually makes the analysis more complete.
Using #AWS vs #Azure with #Azure as the context shows these gaps:
While we've covered how to perform a tweet analysis using the in-built Twitter search app, there is also a way to insert your own data for analysis. The data could be the output of your Python/R code, an Excel report or a Word/text document.
To do that, open your sidebar via the hamburger menu at the top-left corner and select "+ new context".
Type the name of your context and click Enter.
Once done, you'll be shown an empty slate. In the dialogue box at the bottom left, paste your text data wholesale. Each line break is treated as the start of a new paragraph of text (e.g. in the context of tweets, a new tweet).
After clicking "save", wait a while for your data to be visualised.
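Since line breaks separate entries, it helps to flatten each tweet or record onto a single line before pasting. Here is a minimal Python sketch of that preparation step, assuming your data is already a list of strings; the function name and sample data are made up for illustration.

```python
def to_paste_ready(texts):
    """Put each text on its own line, removing line breaks inside a text."""
    lines = []
    for text in texts:
        # Collapse all runs of whitespace (including newlines) into one space.
        flattened = " ".join(text.split())
        if flattened:  # skip entries that are empty after flattening
            lines.append(flattened)
    return "\n".join(lines)


# Made-up sample data for illustration.
raw = ["First tweet\nwith a line break", "   ", "Second   tweet"]
prepared = to_paste_ready(raw)
print(prepared)
```

The printed output can then be copied straight into the dialogue box, with one entry per line.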
I hope you have found this powerful tool really useful. There are many, many use cases for InfraNodus and I'm merely scratching the surface with this post.
Do read my other post on Keyword Network Graphs if you prefer another solution that is more manual but provides more control and freedom.
All the best for your analysis and do spread the word so that this tool continues to be alive and usable.