Visualizing Web Analytics in R Part 4: Interactive Globe
This article is the fourth in a series about visualizing Google Analytics and other web analytics data using R. It focuses on using interactive globe visualizations to show where search-related website traffic originates. The series aims to show how R and interactive visualizations can help answer the following business questions:
- Which articles need work to improve search engine ranking
- Which articles are well ranked but do not get clicked, and need work on titles or metadata
- Where to focus efforts for new content
- How to use passive web search data to focus new product development
The other articles in the series are:
- Visualizing Web Analytics Data in R Part 1: the Problem
- Visualizing Web Analytics Data in R Part 2: Interactive Outliers
- Visualizing Web Analytics Data in R Part 3: Interactive 5D (3D)
- Visualizing Web Analytics Data in R Part 5: Interactive Heatmap
- Visualizing Web Analytics Data in R Part 6: Interactive Networks
- Visualizing Web Analytics Data in R Part 7: Interactive Complex
Globe Showing Relative Page Use
Geographic visualizations are increasingly important in understanding many types of data. The section that follows shows a way to visualize Google Search Console and Google Analytics data on an interactive globe using the globejs function in the threejs package. This type of visualization is especially suited to origin-destination pair data like airline or telecommunications data, but is still very useful for point data like web analytics.
The first step in this process was to find geocoded values for the ISO Region Codes provided by Google Analytics. Country data is readily available, but state or province level data is more difficult to obtain. The analysis in this article aggregates data at the country level.
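The country-level aggregation and geocoding step described above can be sketched in base R. This is a minimal illustration, not the code used for the figures: the data frame names, column names, session counts, and the rough centroid coordinates are all hypothetical stand-ins for a real Google Analytics export and a real geocoding table.

```r
# Hypothetical session export: one row per article category and ISO country code.
ga_sessions <- data.frame(
  country_iso = c("US", "US", "GB", "DE", "IN"),
  category    = c("general", "banking", "general", "consumer", "banking"),
  sessions    = c(120, 45, 30, 22, 15),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table of rough country centroids (approximate values).
centroids <- data.frame(
  country_iso = c("US", "GB", "DE", "IN"),
  lat         = c(39.8, 54.0, 51.2, 21.0),
  long        = c(-98.6, -2.0, 10.4, 78.0),
  stringsAsFactors = FALSE
)

# Sum sessions per country and category, then attach coordinates.
by_country   <- aggregate(sessions ~ country_iso + category,
                          data = ga_sessions, FUN = sum)
geo_sessions <- merge(by_country, centroids, by = "country_iso")
```

Countries missing from the lookup table are silently dropped by merge, so it is worth comparing row counts before and after the join.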
Figure 1 shows the country of origin for sessions for each of four categories of article:
- General articles are blue
- Web commerce articles are red
- Consumer articles are yellow
- Banking articles are green
Because the globejs command does not allow different arc heights, the latitude/longitude of the origin is jittered so that the different arcs do not overlap and are all visible. The session volume is scaled and applied to the line width, but this is not particularly easy to read in this visualization. This visualization makes it clear that the vast majority of traffic comes from the United States and Europe, and that “general” articles are only used in the US and Europe.
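The jittering and line-width scaling described for Figure 1 might look roughly like the following. This is a sketch under stated assumptions: the session data, the single destination point, the jitter amount, and the 1 to 5 line-width range are all hypothetical choices, and the hex colors simply follow the legend above. The globe itself is only built if the threejs package is installed.

```r
set.seed(42)  # make the jitter reproducible

# Hypothetical country-level sessions with rough centroid coordinates.
geo_sessions <- data.frame(
  category = c("general", "commerce", "consumer", "banking"),
  lat      = c(39.8, 39.8, 51.2, 21.0),
  long     = c(-98.6, -98.6, 10.4, 78.0),
  sessions = c(120, 45, 22, 15),
  stringsAsFactors = FALSE
)

# Colors follow the Figure 1 legend: blue, red, yellow, green.
category_colors <- c(general = "#0000ff", commerce = "#ff0000",
                     consumer = "#ffff00", banking = "#00ff00")
arc_colors <- unname(category_colors[geo_sessions$category])

# Jitter each origin by up to 2 degrees so arcs from the same country
# do not overlap. The destination is an assumed, hypothetical location.
arcs <- data.frame(
  origin_lat  = jitter(geo_sessions$lat, amount = 2),
  origin_long = jitter(geo_sessions$long, amount = 2),
  dest_lat    = 37.4,
  dest_long   = -122.1
)

# Rescale session counts into a line-width range of 1 to 5.
rng     <- range(geo_sessions$sessions)
arc_lwd <- 1 + 4 * (geo_sessions$sessions - rng[1]) / (rng[2] - rng[1])

if (requireNamespace("threejs", quietly = TRUE)) {
  globe <- threejs::globejs(
    lat = arcs$origin_lat, long = arcs$origin_long,
    value = 1, color = arc_colors,      # small markers at each origin
    arcs = arcs, arcsColor = arc_colors,
    arcsLwd = arc_lwd,
    arcsHeight = 0.4,                   # one height shared by all arcs
    arcsOpacity = 0.6, atmosphere = TRUE
  )
  # Type `globe` at the console to display the widget.
}
```

Because arcsHeight takes a single value for the whole plot, jittering the origins is what keeps arcs from the same country distinguishable.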
Figure 2 shows the same session data, but in a geographic bar chart format. For this particular dataset, this visualization is easier to understand and easier to generate, because it does not require artificially generating the origin-destination pairs. In Figure 2, it is clear that the US and developed world generate the vast majority of the traffic.
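A geographic bar chart in the style of Figure 2 needs only point data, so the call is shorter. The sketch below assumes hypothetical country totals and rough centroid coordinates, and scales bar heights so the busiest country gets an arbitrary height of 100. The globe is only built if the threejs package is installed.

```r
# Hypothetical country totals with rough centroid coordinates.
geo_sessions <- data.frame(
  country_iso = c("US", "GB", "DE", "IN"),
  lat         = c(39.8, 54.0, 51.2, 21.0),
  long        = c(-98.6, -2.0, 10.4, 78.0),
  sessions    = c(165, 30, 22, 15),
  stringsAsFactors = FALSE
)

# Scale bar heights relative to the busiest country (100 is arbitrary).
bar_height <- 100 * geo_sessions$sessions / max(geo_sessions$sessions)

if (requireNamespace("threejs", quietly = TRUE)) {
  globe <- threejs::globejs(
    lat = geo_sessions$lat,
    long = geo_sessions$long,
    value = bar_height,     # bar height at each point
    color = "#ffff00",
    atmosphere = TRUE
  )
  # Type `globe` at the console to display the widget.
}
```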
These figures are useful for understanding the regional patterns in the data, and they make it easy for users to combine the visualization data with their understanding of the underlying geographic and demographic information. Unfortunately, the globe visualizations cannot show article-level detail. For article-level detail, a heatmap is a better visualization; interactive heatmaps using the d3heatmap package are demonstrated in the next article in the series, Visualizing Web Analytics in R Part 5: Interactive Heatmap.
This article was written in RStudio and uses the threejs package for all graphics.