[R] Do you use R for data manipulation?

Warren Young warren at etr-usa.com
Wed May 13 04:48:01 CEST 2009


Farrel Buchinsky wrote:
> Is R an appropriate tool for data manipulation and data reshaping and data
> organizing? I think so but someone who recently joined our group thinks not.
> The new recruit believes that python or another language is a far better
> tool for developing data manipulation scripts that can be then used by
> several members of our research group. Her assessment is that R is useful
> only when it comes to data analysis and working with statistical models.

It's hard to shift people's individual preferences, but impressive 
objective comparisons are easy to come by.  Ask her how many lines it 
would take to do this trivial R task in Python:

	data <- read.csv('original-data.csv')
	write.csv(data * 10, 'scaled-data.csv')

R's ability to do something to an entire data structure -- or a slice of 
it, or some other subset -- in a single operation is very useful when 
cleaning up data for presentation and analysis.  Also point out how easy 
it is to get data *out* of R, as above, not just into it, so you can 
then hack on it in Python, if that's the better language for further 
manipulation.
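
To make the "slice or subset" case concrete, here is a minimal sketch. The
data frame and its column names (id, height, weight) are made up for
illustration and stand in for whatever read.csv() would return; multiplying
a whole data frame only makes sense when every column is numeric, so in
practice you often scale just some of the columns or rows:

```r
# Hypothetical data standing in for read.csv('original-data.csv')
data <- data.frame(id     = 1:3,
                   height = c(1.5, 1.7, 1.6),
                   weight = c(60, 72, 65))

# Scale only the measurement columns, leaving id untouched
scaled <- data
scaled[, c('height', 'weight')] <- data[, c('height', 'weight')] * 10

# Scale one column, but only in the rows that match a condition
heavy <- data$weight > 62
scaled_subset <- data
scaled_subset[heavy, 'weight'] <- data[heavy, 'weight'] * 10
```

Either result can then go straight to write.csv() in the same way as the
whole-frame version.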

If she gives you static about how a few more lines are no big deal, 
remind her that bug count tends to rise with line count, all else being 
equal.  This has been observed since the 1970s.

While making your points, remember that she has a good one, too: R is 
not the only good language out there.  You should learn Python while 
she's learning R.



