Natural language processing has come a long way since its foundations were laid in the 1940s and 1950s (for an introduction see, e.g., Jurafsky and Martin (2008): Speech and Language Processing, Pearson Prentice Hall). This CRAN task view collects relevant R packages that support computational linguists in analyzing speech and language on a variety of levels, with a focus on words, syntax, semantics, and pragmatics.

In recent years, within the R community, we have developed a framework for packages that process written material: the package tm. Extension packages in this area are strongly encouraged to interface with tm’s foundational routines, and useRs are encouraged to join the discussion on further development of this framework package. For those getting started with natural language processing, the cRunch service and its collection of tutorials may be helpful.

