It fills the role of both guide and reference, which is, in my mind, the mark of a great technical book. Some technical books are all guidance with little reference. Others are all reference with little guidance on how or why.
Stepping back a little
I started with R about 6 months ago. I was looking for something that was a little more advanced than Excel at data visualisation. Excel is great, don’t get me wrong, but it struggles when it comes to really fine-tuning graphs and data, not to mention collecting data from multiple sources (I also delved into a number of ways of merging data from all over the place into one source, but that road is fraught with expensive solutions or complex systems). R was mentioned in a few blog posts I stumbled across. I also saw references to R in some statistical blogs and even some business blogs. I wasn’t really looking for a full-blown language to solve my graphing needs, but it seemed right. At first I tried Gephi, but it solves a different problem. It mainly focusses on being an “interactive visualization and exploration platform for all kinds of networks and complex systems, dynamic and hierarchical graphs”. Think social network visualisations more than quick analysis of business data (here’s an example). After more searching, I caved in and took a look at R.
R advertises itself as “an environment within which statistical techniques are implemented”. R is the language, but the entire ecosystem of packages makes it immensely powerful. I was a bit worried that I would have to learn a new language just to gain access to a couple of cool-looking graphs. Why would I do that? I already have many years of experience in C# and C++, among other languages, so why would I want to spend time on a new language when I could probably do the same work with languages I already knew? I was wrong.
You could use your preferred language to do what R does, but you’d either be writing your own statistical methods or hunting for specific libraries to do what you need. This is really where R and its ecosystem stand out. The collection of packages (on http://cran.r-project.org) is immense, and whatever analysis you can think of, someone has probably done it already.
Back to the book.
It’s hard to describe R without you actually looking at it. It’s kind of a functional language in a way. It works with “sets of data” (lists, arrays, data frames, vectors; there is no true scalar type in R, so a single value is just a vector of length one). R in Action is probably the best starting point. I found the online material a little too elaborate and not really focussed on the newbie, especially if you do not come from a statistical background.
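To make the “sets of data” idea a little more concrete, here is a minimal sketch in base R. The variable names and numbers are made up purely for illustration; they are not taken from the book.

```r
# A "single" number is really a numeric vector of length 1.
x <- 42
is.vector(x)   # TRUE
length(x)      # 1

# Arithmetic is vectorised: it applies to every element at once,
# so no explicit loop is needed.
prices <- c(10, 20, 30)
with_tax <- prices * 1.1

# A data frame is essentially a list of equal-length vectors (its columns).
sales <- data.frame(
  region = c("North", "South", "East"),
  units  = c(5, 3, 8)
)

# Adding a derived column is just another vector operation.
sales$revenue <- sales$units * prices
total_revenue <- sum(sales$revenue)
```

Coming from C# or C++, the part that takes some getting used to is that `prices * 1.1` needs no loop; the operation is applied across the whole vector.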
R in Action really does take you from the start: from installing the R system through to complex statistical analysis. The first section is as far as I have gone as of writing, and I’ve learned more in this section than in the 6 months of poking, prodding and online searching. Everything up to and including chapter 5 is about getting up and running, getting your data in, manipulating it, and graphing it. From chapter 6 onwards the book gets deeper into data manipulation and analysing your data.
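As a rough idea of what that "get data in, manipulate it, graph it" loop looks like, here is a small sketch of my own, not an example from the book. It assumes the `mtcars` dataset that ships with R stands in for your own data; in practice you would more likely load a file with something like `read.csv()`.

```r
# Get data in: use the built-in mtcars dataset as a stand-in for real data.
data(mtcars)

# Manipulate: keep a few columns and treat cylinder count as a category.
cars <- mtcars[, c("mpg", "cyl", "wt")]
cars$cyl <- factor(cars$cyl)

# Summarise: mean miles per gallon for each cylinder count.
mpg_by_cyl <- aggregate(mpg ~ cyl, data = cars, FUN = mean)
print(mpg_by_cyl)

# Graph: weight against fuel economy, coloured by cylinder count.
plot(cars$wt, cars$mpg,
     col  = as.integer(cars$cyl),
     pch  = 19,
     xlab = "Weight (1000 lbs)",
     ylab = "Miles per gallon",
     main = "Fuel economy vs weight")
```

A handful of lines covers the whole trip from raw table to chart, which is pretty much what I was after when I outgrew Excel.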
The other real positive: if you buy the physical book you get soft copies free, so you can take the book with you on any device.
I will post a follow-up when I get further into the book.