In Part 1 of this series we took the first steps toward building our Sales Dashboard in R. In Part 2 we explore additional ways to display sales-related data.
In a previous post about creating Pivot Tables in R with melt and cast we covered a simple way to generate sales reports and summary tables from a data set of orders. It is often said that a picture is worth a thousand words, so in this series of posts we will focus on how to create visual representations and summaries of the same data.
Did you know that it is possible to import (read) data into R directly from the Mac OS X clipboard? It is easier than it looks, provided that you know how to address the Mac clipboard within the read.table function.
The trick is to use a pipe connection, which R creates with the pipe function. A pipe connection reads the output of a shell command, so all you need to know is the name of the command that prints the contents of the Mac clipboard, which is “pbpaste”.
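As a minimal sketch of the idea above, the helper below wraps read.table around a pipe connection. The function name read_clipboard and its cmd argument are made up for this illustration, and the tab separator is an assumption that fits cells copied from a spreadsheet; adjust it if your clipboard holds differently delimited text.

```r
# Hypothetical helper: read tabular data from the output of a shell command.
# With the default cmd = "pbpaste" it reads whatever is on the Mac clipboard.
read_clipboard <- function(cmd = "pbpaste", header = TRUE, sep = "\t", ...) {
  # pipe() creates a connection to the command's output;
  # read.table opens the connection, reads it, and closes it again.
  read.table(pipe(cmd), header = header, sep = sep, ...)
}

# Usage on a Mac, after copying a range of cells that includes a header row:
# orders <- read_clipboard()
# str(orders)
```

The cmd argument is only there so that the same function can be pointed at another command on systems without pbpaste.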
R with ggplot2 is capable of producing visually appealing charts and is definitely more versatile than Excel for the graphical representation of data. When it comes to presenting the results of an analysis, though, PowerPoint is still the most widely used application, at least in the business environment.
This article shows a workflow to bring your ggplot2 charts to PowerPoint automatically, so you can build your analysis presentation directly from an R script within RStudio.
One of the first steps when working with a fresh data set is to plot its values to identify patterns and outliers. When outliers appear, it is often useful to know which data point corresponds to them to check whether they are generated by data entry errors, data anomalies or other causes.
Unfortunately, ggplot2 does not have an interactive mode to identify a point on a chart, and one has to look for other solutions like GGobi (package rggobi) or iPlots.
However, if all that is needed is to give a “name” to the outliers, it is possible to use ggplot2's labeling capabilities for the purpose. While labeling all points would usually produce a crowded and difficult-to-read plot, we can limit the labeling to only those points that meet certain conditions, namely our outliers.
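The conditional labeling described above can be sketched as follows. The data frame, its column names, and the 1.5 × IQR rule used to flag outliers are all assumptions made for this example; any logical condition that marks your outliers would work the same way.

```r
library(ggplot2)

# Made-up data: 50 orders with two artificial outliers injected.
set.seed(42)
df <- data.frame(id = paste0("order_", 1:50), x = 1:50, y = rnorm(50))
df$y[c(10, 35)] <- c(6, -5)

# Flag outliers with the 1.5 * IQR rule (an assumption, not the only option).
q <- unname(quantile(df$y, c(0.25, 0.75)))
iqr <- q[2] - q[1]
df$is_outlier <- df$y < q[1] - 1.5 * iqr | df$y > q[2] + 1.5 * iqr

# Plot every point, but pass only the flagged rows to geom_text,
# so that only the outliers receive a label.
p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_text(data = subset(df, is_outlier), aes(label = id), vjust = -0.8)

# print(p)  # draws the chart
```

Giving geom_text its own data argument keeps the point layer untouched, so the condition that defines an outlier can be changed without affecting the rest of the chart.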