Copyright (c) 2012 Rexer Analytics, All rights reserved
Insights from R Users
Over the years, respondents to the 2007-2011 Data Miner Surveys have shown increasing use of R. In the 5th Annual Survey (2011) we asked R users to tell us more about their use of R. The question asked, "If you use R, please tell us more about your use of R. For example, tell us why you have chosen to use R, why you use the R interface you identified in the previous question, the pros and cons of R, or tell us how you use R in conjunction with other tools." 225 R users shared information about their use of R. They provided an enormous wealth of useful and detailed information. Below are the verbatim comments they shared.
Best variety of algorithms available, biggest mindshare in the online data mining community, free/open source. Previous con of 32-bit in-memory data limitation removed with release of 64-bit. Still suffers some compared to other solutions in handling big data. The StatET plugin gives the best IDE for R, using the widely used Eclipse framework, and is available for free. The RapidMiner add-in that supports R is extremely useful, and increasingly used as well.
The main reason for selecting R, back in the late 1990s, was that it had the algorithms I needed readily available, and it was free of charge. After getting familiar with R, I have never seen a reason to exchange it for anything else. R is open-source, and runs on many different system architectures, so distributing code and results is easy. Compared to some of the latest commercial software I've evaluated, R is sluggish for certain tasks, and can't handle very large datasets (mainly because I do not have a 64-bit machine to work with). On top of that, to be really productive with R, one needs to learn other languages, e.g., SQL, but that's just how things are. Besides, knowledge of those other languages is needed anyway.
I've migrated to R as my primary platform. The fact that it's free, robust, comprehensive, extensible, and open more than offsets its shortcomings - and those are disappearing over time anyway. I'm using Revolution's IDE because it improves my productivity. R has a significant learning curve, but once mastered, it is very elegant. R's shortcomings are its in-memory architecture and immature data manipulation operations (i.e. lack of a built-in SQL engine and inability to handle very large datasets). Nevertheless, I use R for all analytic and modeling tasks once I get my data properly prepared. I usually import flat files in CSV format, but I am now using Revolution's XDF format more frequently.
Why I use R: Initial adoption was due to the availability of the randomForest package, which was the best option for random forest research, as rated on speed, modification possibility, model information and tweaking. Since adopting R, I have further derived satisfaction from the availability of open source R packages for the newest developments in statistical learning (e.g. conditional inference forests, boosting algorithms, etc.). I use the R command line interface, as well as scripting and function creation, as this gives me maximum modification ability. I generally stay away from GUIs, as I find them restrictive to model development, application and analysis. I use R in conjunction with Matlab, through the statconnDCOM connection software, primarily due to the very slow performance of the TreeBagger algorithm in Matlab. Pros of R: Availability of state-of-the-art algorithms, extensive control of model development, access to model information, relatively easy access from Matlab. Cons of R: Lack of friendly editor (such as Matlab's Mlint; although I am considering Tinn-R, and have tried Revolution R); less detailed support than Matlab.
1. I employ R-Extension in the Rapid Miner interface 2. I use R for its graphing and data munging capability 3. I use R for its ability to scrape data from different sources (FTP, ODBC) and implement frequent and automated tasks Pros: -Highly customizable -Great potential for growth and improved efficiencies -Fantastic selection of packages - versatility -Growing number of video tutorials and blogs where users are happy to share their code CONS: -x64 not fully integrated yet -Steep learning curve -Limited number of coding and output examples in package vignettes -Difficult to set up properly (Rprofile.site examples are scarce online) -Memory constraints with i386 32-bit -Output/reporting design limitations for the business environment.
I have about 12 years experience with R. It's free, the available libraries are robust and efficient. I mostly work with the command line, but I am moving towards RStudio because it's available both as a desktop application and a browser-based client-server tool set. I occasionally use Rcmdr. The only other tool I use as heavily as R is Perl - again, lots of experience with it, it's free, and there are thousands of available library packages.
The analytics department at my company is very new. Hence we haven't yet decided which analytics tool we'll be using. R is a great interim solution as it is free and compatible with a reasonable number of commercial analytics tools. The reason I use the standard R GUI and script editor is simply because I haven't invested the time in trying out different GUIs, and I haven't really had any reason to. The advantage of using R (at least compared to SAS, which was the analytics tool at my old job) is mainly that you have much more control over your data and your algorithms. The main problem with R is that it can't really handle the data sets we need to analyze. Furthermore, your scripts can tend to be a bit messy if you are not sure what kind of analysis or models you are going to use.
I use R for the diversity of its algorithms and packages. I use Emacs for other tasks and it's natural to use it to run R, and Splus for that matter. I usually do data preparation in Splus; if the technique I want to use is available in Splus I will do all the analysis in Splus. Otherwise I'll export the data to R, do the analysis in R, and export results to Splus where I'll prepare tables and graphs for presentations of the model(s). The main drawback to R, in my opinion, is that R loads into live memory all the workspace it is linked to, which is a big waste of time and memory and makes it difficult to use R in a multi-user environment where typical projects consist of several very large data sets.
We continue to evaluate R. As yet it doesn't offer the ease of use and ability to deploy models that are required for use by our internationally distributed modeling team. "System" maintenance of R is too high a requirement at the moment and the enormous flexibility and range of tools it offers is offset by data handling limitations (on 32 bit systems) and difficulty of standardizing quick deployment solutions into our environment. But we expect to continue evaluation and training on R and other open source tools. We do, for instance, make extensive use of open source ETL tools.
I use R extensively for a variety of tasks and I find the R GUI the most flexible way to use it. On occasion I've used JGR and Deducer, but I've generally found it more convenient to use the GUI. R's strengths are its support network and the range of packages available for it and its weaknesses are its ability to handle very large datasets and, on occasion, its speed. More recently, with large or broad datasets I've been using tools such as Tiberius or Eureqa to identify important variables and then building models in R based on the identified variables.
I'm using R in order to know why R is "buzzing" in analytical areas and to discover some new algorithms. R has many problems with big data, and I don't really believe that Revolution can effectively support that. The R language is not mature for production, but really efficient for research: for my personal research, I also use SAS/IML programming (which is for me the real equivalent of R, not SAS/STAT). I'm not against R; it's a perfect tool to learn statistics, but not really for data mining: don't forget that many techniques used in data mining come from Operational Research, in convergence with statistics. Good language, but not really conceived for professional efficiency.
R is used for the whole data loading process (importing, cleaning, profiling, data preparation), for model building, as well as for creating graphical results through other technologies like Python. We also use the PL/R procedural language to do in-database analytics & plotting.
Pro: versatility, big number of algorithms. Con: needs coding. Currently learning to use it and determining usability for what I do. If I continue to use it, it will be via Rapidminer.
Utilize R heavily for survey research sampling and analysis and political data mining. The R TextMate bundle is fantastic although RStudio is quickly becoming a favorite as well. Use heavily in conjunction with MySQL databases.
I make a lot of use of the CDK (Chemistry Development Kit) implementation (rcdk). Mainly for PCA, clustering, descriptor profiling and fragmenting. Pros: reuse of scripts, flexibility, ability to load a lot of data, performance. Cons: Long learning curve.
PRO: free of charge, availability of a huge variety of algorithms - access to fine-tune method parameters, huge user base with IRC and forum help (stackoverflow). CON: takes much more time to see results of a mining project, takes a much longer learning curve to produce output if you are not familiar with the concepts. RStudio is a nice interface for developing quick and dirty solutions as well as testing some ideas. STATISTICA is much simpler to use but does not offer the variety of algorithms and access to the parameters; VB for STATISTICA has less support (e.g. examples on the web; help for VB is of limited use). For a very quick data mining task PASW Modeler is even simpler to use than STATISTICA but with very restricted access to parameters.
I use R in conjunction with Matlab mostly, programming my personalized algorithms in Matlab and using R for running statistical tests, ROC curves, and other simple statistical models. I do this since I feel more comfortable with this setting. R is a very good tool for statistical analysis, basically because there are many packages covering most statistical activities, but I still find Matlab easier to code in.
I greatly prefer to use R and do use it when working on more "research" type projects versus models that will be reviewed and put into production. Because of the type of work that I do, SAS is the main software that everyone is familiar with and the most popular one that has a strong license. Our organization needs to be able to share across departments and with vendors and governmental divisions. We can't use R as much as I would like because of how open the software is - good for sharing code, bad for ensuring regulators that data is safe.
Main reasons for use: 1) Strong and flexible programming language. 2) No cost, allowing me to also have it on my personal computer so that I can test things at home that I later use at work. I use RODBC to get the data from the server and let the server do some of the data manipulation, but control it from within R. I have also started to use RExcel with the goal of using it as a method to deploy models to analysts more familiar with Excel than R.
Personally I find R easier to use than SAS, mostly because I am not constrained in getting where I want to go. SAS has a canned approach. I see using GUIs as a "sign of weakness" and as preventing understanding the language at its core. I have not found Rattle to be particularly helpful. I have also tried JGR and Sciviews and found I could not surmount the installation learning curve. Their documentation did not produce a working environment for me.
I use R primarily for my personal educational use - I am in an Economics PhD program analyzing household surveys from developing countries. I usually load the household survey tables onto an MS SQL server and then manipulate/prepare the data to be imported into R, and then run my analysis (typically OLS/Probit/Logit regressions along with a range of descriptive statistics). I use the R GUI because I don't know the R syntax very well yet. It helps me generate the 'baseline' scripts that I tweak to get specifically what I want/need. Once I get used to the syntax, I will probably move away from the GUI altogether.
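As an illustration of the OLS/probit/logit workflow this respondent describes, here is a minimal base-R sketch on simulated data (the data and variable names are hypothetical, not from any actual survey):

```r
# Minimal sketch of an OLS / probit / logit analysis in base R.
# The data are simulated; variable names are illustrative only.
set.seed(42)
n <- 500
log_income <- rnorm(n, mean = 10, sd = 2)        # log household income
schooling  <- rpois(n, lambda = 6)               # years of schooling
owns_land  <- rbinom(n, 1, plogis(-4 + 0.3 * log_income))

ols    <- lm(log_income ~ schooling)                               # OLS
probit <- glm(owns_land ~ log_income, family = binomial("probit")) # probit
logit  <- glm(owns_land ~ log_income, family = binomial("logit"))  # logit

summary(log_income)                    # descriptive statistics
coef(ols); coef(probit); coef(logit)   # fitted coefficients
```

The same `glm` call handles both binary models; only the link function changes.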
We use R only occasionally. Some of our analysts know R from their university studies and hence are very familiar with R. Whenever we use R, we use it from within RapidMiner processes in RapidMiner (or in RapidAnalytics = RapidMiner server), because we use RapidMiner/RapidAnalytics for the overall data mining process design, automated process optimization, process sharing among team members, automated scheduled process deployment, result and report sharing, etc.
(Primary use of R dates from 2001-2006) Reason to use R: I taught R and SPSS to students + most diverse statistical package for research. Availability of diverse packages allowed me to use R in any domain, e.g. for me: sound analysis, categorization, linear models, ... Pro: the only limit of R is your knowledge; no GUI limitations; open source (checked by the community + free). Con: your starting knowledge is the limit of your analysis, which doesn't encourage advanced analytics from the beginning.
Free and so many packages. Very powerful and convenient on small data sets, but slow and a memory hog. Can't handle large data sets (often refuses to allocate memory, so a memory strategy needs to be built into the application code). Extremely inefficient even for simple tasks. For example, I rewrote the "which" function to run twice as fast as the native function. Very limited support for parallel processing.
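For illustration, here is one way a base function like `which` can be specialized for speed (a sketch only, not the respondent's actual rewrite; it assumes a plain logical vector with no NA values and no names):

```r
# A specialized stand-in for which() on plain logical vectors
# (illustrative only; assumes no NA values and no names attribute).
fast_which <- function(x) seq_along(x)[x]

v <- c(TRUE, FALSE, TRUE, TRUE, FALSE)
fast_which(v)                          # 1 3 4
identical(fast_which(v), which(v))     # TRUE
```

Whether such a rewrite is actually faster depends on the R version and the input; dropping argument checking and names handling is typically where any gain comes from.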
R has a good price point compared to other packages. R-Studio seems to be the best of the crop aside from the Revo GUI. R is mostly used here in conjunction with the python rpy2 library.
We use R because of its very powerful libraries for data mining and forecasting in finance and because of its open source nature. RapidMiner best supports the overall data mining process and seamlessly integrates R. Therefore we use RapidMiner as overall framework and R within RapidMiner whenever it adds value and functionality needed.
I use R primarily for analysis of psychology experiment results. Some data mining based on psychological factors is often necessary. I use R because it is free, powerful, script-able, and multi-platform. I use emacs because I can write my experiments (Python), process the data (Python and R), analyze the data (R), and write papers (LaTeX) from within the same environment.
It is fast to write code for quick tests (the answer to both why R and why the command line interface). It has strong visualization tools and is easy to use. Definitely a strong tool, but might not be that easy to use for people without a (functional) programming background. I use R within KNIME mostly for data manipulation, or outside KNIME for plotting.
Powerful scripting language for non-standard operations. The integration in RapidMiner enables me to use any R script within my workflows; this ensures reusability and understandability, and eases deployment a lot! I use the RapidAnalytics Community Edition for deploying everything together into productive environments.
The ease of integrating R via the JMP interface has enabled me to take advantage of several methods I would not otherwise be able to use as easily or expeditiously. The vast amount of work that has been done with R is mind boggling. Unfortunately, my R programming skills are not where I wish they were, and therefore I integrate with JMP and its strengths.
We've used R (and S) for many years. It has an extensive feature set (with add-ins) and is open source. RapidMiner is a great front-end for R in our in-depth analytics. We use WebFOCUS RStat to add predictive analytics to our management reports.
I use R because of the wide range of community support available, the ability to write my own code, its interface with Python, and the large number of packages available. Not to mention it is very stable, able to handle data sets of a moderate size better than WEKA, and the visualization tools are incredible.
R add-on packages are comprehensive and so far reliable. Documentation is also good. The command line interface gives the most natural experience for exploring models and debugging scripts. I frequently use R in conjunction with RapidMiner (the R extension).
R allows for very rapid development and a huge variety of packages. Also, it was what I learned in grad school, so when thinking about how to do a problem, R solutions jump to mind first. I use RStudio because of the Linux web service IDE. I have huge trouble with scale, but am investigating RHIPE as a solution.
Many of our newly hired university grads know R and how to write scripts in R. RapidMiner nicely integrates R and has the best GUI and highest flexibility for the overall data mining process design from ETL and modeling to automated deployment and reporting.
I tend to write large scripts/small applications in R, and the TextMate editor with an R plugin is a great way of working with moderately sized sets of scripts, libraries, etc.
I'm a bioinformatician and the Bioconductor collection of R packages is one of the best platforms for the analysis of 'omic' data. I'm comfortable with the unix-like terminal, so the command line plus a good editor (vim, textmate) is my platform of choice for developing.
R has a huge variety of algorithms originating from the active community. These algorithms can be integrated with the necessary data processing in RapidMiner and thus be easily deployed in conjunction with complex data transformation settings.
I use SPSS base for what it can do. For what it doesn't do well or offer I use R. Found learning curve a bit steep but worth the effort. Mainly confined to Cmdr & Rattle but can use the command line interface. My biggest gripe with R is getting the results into Excel [which my clients use] in a neat fashion.
I recently used R to fit our data. I used command line and script-based interfaces because I am familiar with those from Python.
I use some elements, such as factor analysis and statistical criteria. KNIME is data mining software, so statistical analysis features are not its primary focus.
R is in general powerful since you can program everything - although sometimes this is hard work. The limited reusability of parts and the missing functions for process design are cons, and here RapidMiner is much better. I have just started to use the combination with RapidMiner, but this seems to me to be the optimal combination of the pros of both.
Availability and extent of a range of methodologies. Tinn-R allows for programming and debugging.
I've been using R for marketing analytics and customer intelligence for 7 years. R Studio was released only in Feb & has become extremely popular in a short time.
I've chosen R for its flexibility and wide user base in academia. Pros: large user base to hire from; ability to script and automate. Cons: arcane interface; data management/handling is nowhere near as useful as SAS. We use R for experimental purposes and to close gaps when our primary tool cannot address a need and we don't want to pay exorbitant SAS/STAT fees.
Pros: powerful and flexible, free and open source, large community, huge number of algorithms. Cons: not friendly to the beginner, poor data import/export, lots of things have to be done in console even when using a GUI, poor handling of big data sets.
RStudio Server is a great tool for working remotely with R. In general, I like R because I can use pure code, no clicking around.
CRAN packages available for R often incorporate procedures not available elsewhere. R is clearly more clumsy and slower than its competitors for doing routine estimation on large datasets.
We're moving to R from SAS because of the cost difference, flexibility, integration with other tools (graphics/reporting/presentation) and community support. Cons are the learning curve and re-work to transition from existing tools.
R is a language in our research group due to its flexibility and good graphics. We increasingly use it in conjunction with KNIME to establish workflows and report results.
Pro: free, open-source, quite complete. Con: documentation and interface not so easy to exploit, compared to Visual C# with MSDN. Combining RapidMiner & R allows one to tweak inputs & outputs perfectly and to extend the available functions of RM.
We use R because of its extensive set of machine learning libraries. A single commercial data mining application simply does not have everything that we need. Pros - a rapidly expanding community of users. Cons - speed.
R is open source with plenty of packages for a variety of fields. It is very user friendly and easy to use in conjunction with KNIME.
Using R in preparation for data mining, i.e. primarily for descriptive statistics. Biggest pro is ease of use. Biggest con is the handling of large data sets.
I use R because a) it is free, b) it can do anything I want it to do as long as I can figure out how to do it. It doesn't do data management very well, so I have to prepare the data in other software, usually SQL Server or Access, and save it as text before I can actually use R. R has the best visualization packages and data mining algorithm packages; it's just not always easy to figure out how to use them. Most of what I would do in R I could do in other software, but it would take several tools rather than having it all in one place.
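A minimal sketch of the hand-off this respondent describes: prepare the data externally, export it as delimited text, then read it into R. The file contents and column names below are made up for illustration (the external export is simulated with a temp file):

```r
# Simulate the text file an external tool (e.g. SQL Server or Access)
# would export, then read it into R. Columns are illustrative only.
tmp <- tempfile(fileext = ".txt")
writeLines(c("id\tage\tspend",
             "1\t34\t120.5",
             "2\t41\t89.9"), tmp)

customers <- read.delim(tmp, header = TRUE, stringsAsFactors = FALSE)
str(customers)    # a data frame: 2 obs. of 3 variables (id, age, spend)
```

`read.delim` assumes tab-separated values; `read.csv` is the comma-separated equivalent.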
R is easy to use, has a lot of methods, is free. I write my scripts and use them. There are errors in some programs; so we have to be careful when using it. But I find it relevant for teaching and processing data.
Why I have chosen R: - Open Source - Huge user base - Accurate results - Based on C and C++. Cons: - The R programming language is not as intuitive as Matlab's.
R is very good for academic research. It is really good for students to learn how DM algorithms work. But for industrial and business use, R is not very suitable for me, because of its low speed and big data limitations.
I use R because it supports me in many steps of my data analysis needs: data preprocessing, descriptive analysis and plots. I used to plot data with another tool and then convert it to eps. With R I don't have to worry about it. I can focus on the analysis, not on the tool. The disadvantage is that R has a steep learning curve. Because of the many contributors, sometimes it is hard to find the best package for you.
Great programming environment in spite of steep learning curve. I complement R with KNIME for data preparation and presentation.
I use R mostly for graphic capabilities. The modeling tools are rarely used as R cannot easily handle large datasets.
Pros: free, powerful language, excellent choice of models and methods via packages. Horrible user interface.
R seems to be gaining traction in industry; more of my clients are asking me to create models using it. While I don't like the database interface tools, they are useable.
Rarely for some lesser known, nonstandard algorithms. Pros - vast selection of even exotic methods. Cons - unstable, unfriendly.
Pro: Free, easy to identify packages that do what I need. Con: Having to know the command lines.
Pro: Price, general capability, open source encourages growth. Con: Utterly unsupported, so can't base enterprise applications on it.
PROs: R has much innovation. CONs: would like web service forecasting, or Java code generation for forecasting.
Pros: large library of statistical functions. Cons: interface/syntax very ugly compared to Matlab.
Pros: Powerful, free, flexible, scalable. Cons: Does not handle large datasets well.
We use R in combination with the Pentaho BI Suite. Pentaho developed an R plugin which lets a user make selections of data to analyze which is pushed to RServe through the plugin and which pushes back the output to the Pentaho portal.
I use R because it's free and capable of drawing graphs. Now looking for a GUI to be used. Lack of Japanese interface is a con of R, at this moment.
I have used the time series, trees and tests capabilities of R. It is very powerful but: (1) the language is not so easy, and (2) you have to understand what you do; that is, it is not possible to do some quick and dirty analysis without reading some statistical materials.
Free; provides statistical methods or relevant packages; I am used to command line and do not need GUIs yet; con: visual output is not as good as in matlab.
Pro: comprehensive. Con: not supported (once called an author of some R code and he could not speak English).
Free, stable, and lots of packages, but still difficult to combine with other programming languages.
Extended range of analytic routines, all callable from within KNIME.
I'm learning how to use R efficiently, to gain access to specific algorithms. Other interfaces used: RExcel, JMP and SAS IML Studio.
Great user community and works well w/ Matlab. Anything that plugs into these 2 packages will be used.
R for sophisticated statistical analysis (analysis scripting), KNIME for the rest.
R is free and many of my measurement friends use this so I had to learn the program. I still prefer SAS.
R offers some things that SPSS Statistics doesn't offer. And SPSS interfaces with R.
RapidMiner offers access to R. The advantage of R is that a new algorithm can easily be developed -- and then be applied within RapidMiner.
Pros: * ease of graphical output for data exploration * full coding capability (conditional execution, i.e. "if"; loops; subroutines, i.e. functions) * broad availability of innovation driven by user community * support from user community * matrix language (vectorized computations) * open source technology
Con: it's not easy to use - that's why I use R from within IBM SPSS Statistics, where you can create user-defined dialog boxes for R code and you can also use the tables API to generate very good looking tables.
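A tiny illustration of the "matrix language (vectorized computations)" advantage mentioned above: an explicit loop versus the equivalent one-line vectorized expression.

```r
# Element-wise squaring: explicit loop vs. vectorized expression.
x <- 1:10

out <- numeric(length(x))        # loop version
for (i in seq_along(x)) out[i] <- x[i]^2

out_vec <- x^2                   # vectorized version: no loop needed

identical(out, out_vec)          # TRUE (^ returns double in both cases)
```

Vectorized expressions are both shorter and, for large vectors, typically much faster, since the loop runs in compiled code rather than in the interpreter.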
R has something useful not yet implemented in KNIME. The Revolution interface is pretty cool. R has a flatter learning curve than KNIME. R has specific capabilities nothing else provides. R has huge forums for help. R is sound.
A mix of ad hoc descriptive analyses and visualizations, along with more engineered code for predictive and prescriptive analytics that integrate with other systems.
R is the only software package that can, within a few lines of code, run the transformations and modeling calculations I require. Other software packages do not have the flexibility to do high-end maths and n-dimensional matrices.
No other software package has access to the latest output from the research community. The fields we work in (social network analysis, link analysis) need cutting-edge solutions so it is impossible to stay with a package such as SAS where the main algorithms haven't changed in numerous years. We would be left behind by the competition.
R combines the features of a programming language with the need for cutting edge algorithms, diagnostics, and data management. It can't be replicated elsewhere.
New data mining methods are quickly available with R packages on the internet. We use available packages to build our own data mining methods.
R is limited and not able to support enterprise requirements. Quality of R software is unacceptable.
Prototyping only, not suitable for commercial use given licensing.
I have been using several R interfaces (R Commander GUI, JGR GUI and R's command line interface and RStudio) with my personal projects. I believe that RStudio is very user friendly.
I use it on its own, and occasionally from within SAS or SPSS. RStudio is a huge improvement in ease-of-use for programming, especially with our cross-platform users and cluster users. Our users need a point-and-click interface though. If the RStudio team gets Deducer integrated into it, we'll be all set.
You can always find a library for the algorithm you want (no matter how obscure the algorithm or how new). Cost is zero. Millions of users. You can always find an answer to any doubt you might have. Manuals are always available. Dozens of books available.
I consider R the standard software in statistical matters. I learned it at University and I continue to use it. I trust it in terms of mathematical and numeric stability. I use the standard command line interface because I haven't had (yet) the opportunity to explore other user interfaces, but I would be glad to learn new GUIs (and I hope you can suggest me the best GUI through the results of your survey).
Started using R after being introduced to it in Graduate School. I like it because of a large availability of packages (without cost!). It's a software that respects the intellectual capability of the end user - doesn't dumb down things.
Excellent graphics, wide variety of routines available, runs on multiple platforms (including Linux), many graphical interfaces available (some better than others for specific purpose), flexibility of programming language and interface to various databases.
I have been using R since '99 in academic research and then for enterprise needs. Its modular structure can be very pleasant if you work on a desktop or on a remote server (e.g. in batch mode). An additional and usable GUI could be a benefit; additional love on performance (execution) is a need for the future and for big dataset analysis.
Drawbacks of mining in R are the difficulty of using surrogate splits, and poor input/output handling.
I chose R because it is open source and I work in a non-profit research organization. I like the possibility of developing software using R as the statistics engine of the application. I develop with Python and .NET. I like the graphics possibilities. I'm using R for my thesis and I am trying to introduce this software in my job.
R is open source and easy to expand by creating packages. R is a high-level language that allows one to quickly build a solution to the specific problem at hand. There is an active community that quickly encodes new algorithms into R packages.
Not supported, buggy, but comprehensive.
For any advanced analysis. To avoid buying extremely expensive software add-ons.
For decision trees and other algorithms not available in base SAS.
R integration in KNIME makes both extremely powerful together. I usually perform data shaping in KNIME, and many modelling tasks on the shaped data in R.
More options around algorithms not contained or not tweakable in SPSS. Handles different levels of data aggregation more easily. Some tasks easier to automate in R than in SPSS syntax.
Algorithms that are not available in SPSS.
Complement some missing algorithms implementation.
R offers several packages not available on Matlab.
I use it to add algorithms not available in SAS or JMP.
Because it extends the great RapidMiner tool.
Packages like 'caret' are why I use R. Nothing else will get you as far out on the cutting edge of machine learning research.
I use it just for tests.
I use R because it's free and perfect for casual, interactive, ad hoc analysis that isn't available in the RDBMS.
R is free and offers alternative methods that can easily be incorporated into other packages.
High speed, programmability and flexibility make it the best choice. Combine that with the acquisition cost and it was a no-brainer.
Great for custom code, or to modify existing procedures. Great variety of available algorithms; easy to integrate C or Fortran code for speed calculations when needed.
I think that every data miner must master "R" these days; it's great software that has a long learning curve, and it can be a good alternative for companies that can't afford commercial software.
Late in 2010, I started using R. It quickly became my analytics software of choice because of its data visualization capabilities and the great amount of available libraries.
It's hard to imagine not using R via scripts - at least for anything except very straightforward tasks. It is also hard to imagine not using R in a research environment - its flexibility is very hard to approach.
It's powerful, I can use it on any platform, it's free, it has nearly every technique that commercial software has, there is seemingly a better user community who are always willing to help, the studies being done with R are cutting edge, and it's easy to expand and modify since it is open source.
It's powerful, it's extensible, it's flexible, it's very well community supported and it's free.
Why use R: for the moment it is simply the best. For the time being, my major use of R is programming and experimenting with new methods.
R is usually good enough for what we do. It's free. We didn't need to bother with interfaces beyond the native command window.
R is widely used (=supported & developed & robust); has strong analytic capabilities (=requirement); has many GUIs (= fun and easy to use).
TeradataR (developer exchange package) provides easy-to-use, feature-rich capabilities for using R against *massive* data sets in Teradata.
Flexibility and good quality and variety of graphics/charts.
It is much easier to program repetitive tasks in R, especially when I want to automate optimization of algorithm parameters by trying all possibilities. That is something that is very difficult to automate in visual tools. The other reason is that almost every algorithm is implemented in R, or will be implemented in the future.
I started by using it to teach graduate classes and ended up growing fond of it.
Free, easy to use, many user-written algorithms.
Freeware, lots of implemented methods.
I use R primarily because of its increasingly widespread use. The software is also free, which makes it even more attractive.
I use R because it is open source, I have Linux at home, and it is very good for academic purposes.
Free fallback; double-checking; includes kitchen sink.
Free libre open source software.
I was using R on the university, and it is free of charge and easily extensible.
Low cost. Wide functionality. Access to source/code and authors.
I have chosen to use R because it allows free access to the tool. It also incorporates a large quantity of algorithms and it can manage large databases.
It is free open source and so are all add-on libraries. Graphical facilities, high graphical quality. Ease of combining existing tools with our own code and tools. Possibility to use huge number of already optimized processes.
It's free, and there is a large number of algorithms not directly available in SAS/SPSS.
It is free, but there are many limitations that are not easily overcome, so I only use it in specific situations.
It's powerful and free.
It's free and everyone is starting to use it.
Chosen because free and popular among colleagues.
Cost and capability.
Cost! Also much more able to customize analysis as needed.
Free and has a large set of functions.
It is a free open source tool.
Being free software is a competitive advantage of R.
All types of datasets can be read in R, so I like it.
As a simple exploratory tool before more serious mining.
I am better at programming with R than with SAS.
Mostly for Copulas.
Graphics, ggplot2, automation of tasks, marketing experimental design.
Pros: superior graphics tools, many packages, object-oriented programming capabilities, open source, free, excellent community for sharing ideas/problems.
Just to incorporate borrowed algorithms when needed.
Large base of contributed analysis packages (CRAN).
SVM, Boosting, and Bagging.
Various time series routines.
I use R together with IBM SPSS Statistics to create time series models using ARCH/GARCH.
Also use R scripts inside STATISTICA (to augment analysis flows).
Use scripts that are already available for running R routines from JMP.
Mostly distribution fitting routines, some variance estimation routines.
Mostly simulation related functions.
I use R from KNIME. I am not an R expert. So I do the data manipulation in KNIME and only the statistics in R.
I chose R to deploy time series models. I learned to program the command line interface.
Basically for data regression.
Very specific preprocessing steps can be quickly coded.
Mostly time series and specialized variance estimation procedures.
Number of models and graphics.
I started to use R for monthly time series forecasts but expanded the use to statistical modelling etc.
Use some open source modules.
Open system, many good algorithms, good programming language.
Power and versatility of available methods/libraries.
Powerful, universally available, free.
R has a wonderfully large collection of algorithms and tools to work with. Papers often accompany these tools explaining the ideas underlying them. The python like nature of R allows one to easily modify data and automate analysis.
Variety of algorithms.
Variety of user contributed modules are available.
Wide breadth of models to test.
Wide variety of algorithms, free.
I also use R from within KNIME.
I use R through KNIME.
Vendor independence; more innovation.
Script-based R is actually very useful, but the learning curve is steeper. Time could be better spent analyzing outputs.
Because R is free and powerful for data manipulation.
Open source and do not have to deal with licensing issues. Pros and cons - too much to write down here!
Reputation in academic circles, availability of different packages.
Rapid prototyping tool, occasional need for sophisticated graphics.
We currently use R because the subset of our clients doing the modeling already knows it.
Preliminary scoping out of models, for articulation to developers.
Primarily for social network analysis not available in commercial tools I use.
I don't use other interfaces because I think they could be outdated.
Very rarely, just for R&D or a one-off report.
When I really have to use that crap.
I'd like to, but the learning curve is rather too steep at the moment - especially for neophyte researchers - e.g. research students.
I use add-ons. I think they are R, not really sure.
Occasionally use R for a quick look at data.
Just trying it as another tool, but not my primary one.
R is just another tool.
Currently evaluating R; don't know much about the available interfaces.
I would like to use R with STATISTICA, but just haven't explored this.