Wolfram Language

Text & Language Processing

Generate and Verify Stemmed Words

Version 11 includes new tools for finding word stems by stripping plurals, inflections and similar endings. A stem still carries the meaning of the original word, but it is frequently not a dictionary word itself. This example shows both cases: stems that are dictionary words and stems that are not.

Generate a list of 30 random English words with RandomWord.

In[1]:=
Short[words = RandomWord[30]]
Out[1]//Short=

Construct their respective stemmed forms with WordStem.

In[2]:=
Short[wordstems = WordStem[words]]
Out[2]//Short=
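
Because RandomWord returns different words on each evaluation, a fixed list can make the behavior easier to inspect. The following is a minimal sketch, not part of the original example: the words in `sample` are hand-picked for illustration, and the names `sample` and `stems` are hypothetical. It pairs each word with its stem and with the result of DictionaryWordQ on that stem.

```wolfram
(* Hand-picked sample words; illustrative only, not from the original example *)
sample = {"running", "houses", "happiness", "cat"};

(* WordStem accepts a list of strings and returns the list of stems *)
stems = WordStem[sample];

(* Pair each word with its stem and with whether that stem is a dictionary word *)
Transpose[{sample, stems, DictionaryWordQ /@ stems}]
```

Each row of the result has the form {word, stem, True | False}, which shows at a glance which stems are dictionary words.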

Remove the words that are identical to their stemmed forms.

In[3]:=
list = DeleteCases[Transpose[{words, wordstems}], {w_, w_}];

Emphasize in blue the stemmed forms that are also words in the English dictionary used by the new function DictionaryWordQ.

In[4]:=
list = Replace[list, {w_, sw_?DictionaryWordQ} :> {w, Style[sw, Blue]}, {1}];
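
As a complement to the styling step, the pairs can also be split into two explicit groups, which makes the "both situations" mentioned in the introduction directly countable. This is a sketch under assumptions: the variable names `pairs`, `inDictionary` and `notInDictionary` are hypothetical and do not appear in the original example, and the code regenerates its own random words so that it stands alone.

```wolfram
(* Self-contained: generate words and keep only those that differ from their stems *)
words = RandomWord[30];
pairs = DeleteCases[Transpose[{words, WordStem[words]}], {w_, w_}];

(* Stems that are themselves dictionary words *)
inDictionary = Select[pairs, DictionaryWordQ[Last[#]] &];

(* Stems that are not dictionary words *)
notInDictionary = Select[pairs, ! DictionaryWordQ[Last[#]] &];

(* Number of pairs in each group *)
{Length[inDictionary], Length[notInDictionary]}
```

The two counts vary from run to run because the word list is random.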

Visualize each pair in a text grid.

In[5]:=
TextGrid[
  Prepend[
    Partition[Flatten@list, UpTo[4]],
    {Style["Word", Bold, Italic], Style["Stem", Bold, Italic],
     Style["Word", Bold, Italic], Style["Stem", Bold, Italic]}
  ],
  Spacings -> {2, 1},
  Dividers -> {{1 -> True, 3 -> True, 5 -> True}, {1 -> True, 2 -> True, -1 -> True}}
]
Out[5]=
