Wolfram Technology Guide: High-Level String Computation
Immediate Textual Analysis
Mathematica allows efficient analysis of large-scale textual data. This example plots the Zipf-like word frequency distribution of Darwin's Origin of Species, taken from the built-in sample texts.
In[1]:=
ListLogLogPlot[
 Reverse[Sort[
   Last /@ Tally[ExampleData[{"Text", "OriginOfSpecies"}, "Words"]]]],
  Joined -> True, Filling -> Axis, PlotRange -> All, Frame -> True]
Out[1]= [log-log plot of word counts, sorted from most to least frequent]
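As a rough follow-up, the same counts can be fitted with a straight line in log-log space to estimate how closely the distribution follows a power law. This is only a sketch: the names counts and logData and the fit variable x are introduced here for illustration and are not part of the original example. A fitted slope near -1 would be consistent with classic Zipf behavior.
In[2]:=
(* word counts in descending order, exactly as plotted above *)
counts = Reverse[Sort[
    Last /@ Tally[ExampleData[{"Text", "OriginOfSpecies"}, "Words"]]]];
(* pair each rank with its count and take logarithms of both *)
logData = N[Log[Transpose[{Range[Length[counts]], counts}]]];
(* least-squares line; the coefficient of x estimates the exponent *)
Fit[logData, {1, x}, x]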