Visualizing Web Analytics in R Part 6: Interactive Networks
This article is the sixth in a series about visualizing Google Analytics and other web analytics data using R. It focuses on using interactive network visualizations to show where search-related website traffic originates. The series aims to show how R and interactive visualizations can help answer the following business questions:
- Which articles need work to improve search engine ranking
- Which articles are well ranked but do not get clicked, and need work on titles or metadata
- Where to focus efforts for new content
- How to use passive web search data to focus new product development
The other articles in the series are:
- Visualizing Web Analytics Data in R Part 1: the Problem
- Visualizing Web Analytics Data in R Part 2: Interactive Outliers
- Visualizing Web Analytics Data in R Part 3: Interactive 5D (3D)
- Visualizing Web Analytics Data in R Part 4: Interactive Globe
- Visualizing Web Analytics Data in R Part 5: Interactive Heatmap
- Visualizing Web Analytics Data in R Part 7: Interactive Complex
Network Diagram Using networkD3 and htmlwidgets
Network diagrams are useful for understanding complex datasets. The networkD3 and htmlwidgets packages provide a way to generate interactive network diagrams that can be manipulated in a web browser. Figure 1 shows the Google Analytics session data, broken down by page and country, displayed using the simpleNetwork call, while Figure 2 shows the same data displayed using the forceNetwork call. In Figure 2, the large nodes represent countries, while the smaller nodes in a ring around each country represent the pages that are referenced from that country. The size of each node is log(sessions + 1), which keeps nodes distinguishable even though the session counts vary from 1 to about 20,000.
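As a rough illustration of the simpleNetwork call and the log scaling described above, here is a minimal sketch. The data frame, its column names (country, page, sessions), and its values are hypothetical stand-ins for the Google Analytics export, not the data behind the figures.

```r
library(networkD3)

# Hypothetical sample of by-page, by-country session data (assumed shape).
sessions_df <- data.frame(
  country  = c("United States", "United States", "India", "Germany"),
  page     = c("/article-a", "/article-b", "/article-a", "/article-c"),
  sessions = c(20000, 150, 1200, 1),
  stringsAsFactors = FALSE
)

# simpleNetwork only needs a source column and a target column.
simple_widget <- simpleNetwork(sessions_df, Source = "country", Target = "page")

# log(sessions + 1) compresses the 1 to 20,000 range into a narrow band,
# and the + 1 keeps a single session from mapping to a size of zero.
sessions_df$node_size <- log(sessions_df$sessions + 1)
```

Printing simple_widget in RStudio should open the diagram in the Viewer pane; node_size is the kind of value that could later be passed to forceNetwork through its Nodesize argument.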
For this particular data set, neither of these visualizations is as useful as the heatmaps discussed previously, but they demonstrate what can be done with the network visualizations. For the behavioral flow in Google Analytics, these networks would be the best possible visualization; unfortunately, I have not yet gotten that data out of Google Analytics.
In preparing the forceNetwork visualization, it is important to remember that the indexing begins at 0, as in C, rather than at 1, as used in R. The forceNetwork call is shown in Figure 4.
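The zero-based indexing is easiest to see when the link and node tables are built by hand. The sketch below is one possible way to do it, not the code from Figure 4; the sample data, column names, and the choice to sum sessions per node are all assumptions made for illustration.

```r
library(networkD3)

# Hypothetical by-page, by-country session data (assumed shape).
sessions_df <- data.frame(
  country  = c("United States", "United States", "India", "Germany"),
  page     = c("/article-a", "/article-b", "/article-a", "/article-c"),
  sessions = c(20000, 150, 1200, 1),
  stringsAsFactors = FALSE
)

# Node table: countries first, then pages.
countries <- unique(sessions_df$country)
pages     <- unique(sessions_df$page)
nodes <- data.frame(
  name  = c(countries, pages),
  group = c(rep("country", length(countries)), rep("page", length(pages))),
  stringsAsFactors = FALSE
)

# Total sessions per node, scaled with log(sessions + 1).
node_sessions <- c(
  tapply(sessions_df$sessions, sessions_df$country, sum)[countries],
  tapply(sessions_df$sessions, sessions_df$page, sum)[pages]
)
nodes$size <- log(as.numeric(node_sessions) + 1)

# match() returns 1-based row positions, so subtract 1 to get the
# 0-based indices that forceNetwork expects in the link table.
links <- data.frame(
  source = match(sessions_df$country, nodes$name) - 1,
  target = match(sessions_df$page, nodes$name) - 1,
  value  = log(sessions_df$sessions + 1)
)

force_widget <- forceNetwork(
  Links = links, Nodes = nodes,
  Source = "source", Target = "target", Value = "value",
  NodeID = "name", Group = "group", Nodesize = "size",
  opacity = 0.8
)
```

If the subtraction is forgotten, the largest index points one row past the end of the node table, and the diagram typically renders incorrectly or not at all.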
The htmlwidgets package provides the function used to save the visualizations, saveWidget:
saveWidget(chartWid,saveName,selfcontained = TRUE)
The selfcontained = TRUE parameter puts everything into a single HTML file, which is easier to manage on some web sites. The files are minified but not compressed.
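For completeness, a small end-to-end sketch of the save step follows. The widget contents and the output path are hypothetical; chartWid and saveName simply reuse the names from the call above. Note that, depending on the htmlwidgets version, selfcontained = TRUE may require pandoc to be available (RStudio bundles a copy).

```r
library(networkD3)
library(htmlwidgets)

# A tiny hypothetical widget to save; any htmlwidget would work here.
chartWid <- simpleNetwork(data.frame(
  src = c("United States", "India"),
  dst = c("/article-a", "/article-a")
))

# Write to a temporary directory so the sketch leaves no files behind.
saveName <- file.path(tempdir(), "network.html")

# selfcontained = TRUE inlines the JavaScript and CSS dependencies
# into the single HTML file instead of a separate folder of assets.
saveWidget(chartWid, saveName, selfcontained = TRUE)
```

With selfcontained = FALSE, the dependencies are instead written to a companion directory next to the HTML file, which then has to be uploaded alongside it.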
This article was written in RStudio and uses the networkD3 package for rendering and the htmlwidgets package for saving HTML files.