35 Natural Language Understanding

We saw earlier how to use ctrl+= to enter natural language input. Now we’re going to talk about how to set up functions that understand natural language.
Interpreter is the key to much of this. You tell Interpreter what type of thing you want to get, and it will take any string you provide, and try to interpret it that way.
Interpret the string "nyc" as a city:
In[1]:=
Out[1]=
“The big apple” is a nickname for New York City:
In[2]:=
Out[2]=
Interpret the string "hot pink" as a color:
In[3]:=
Out[3]=
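The inputs for these three examples are not reproduced in this text. A minimal sketch of what they might look like, assuming the interpreter types are named "City" and "Color":

```wolfram
(* Interpret a string as a city; the result is a city entity *)
Interpreter["City"]["nyc"]

(* A nickname is interpreted with the same "City" interpreter *)
Interpreter["City"]["the big apple"]

(* Interpret a string as a color; the result is a color object *)
Interpreter["Color"]["hot pink"]
```

In each case Interpreter["type"] is itself a function, which is then applied to the string.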
Interpreter converts natural language to Wolfram Language expressions that you can compute with. Here’s an example involving currency amounts.
Interpret various currency amounts:
In[4]:=
Out[4]=
Compute the total, doing conversions at current exchange rates:
In[5]:=
Out[5]=
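As a rough sketch of the two steps just described (the particular amounts and the name amounts are made up for illustration, not taken from the original inputs):

```wolfram
(* Applying the interpreter to a list interprets each string separately;
   each result is a quantity carrying its own currency unit *)
amounts = Interpreter["CurrencyAmount"][
   {"4.25 dollars", "34 russian rubles", "5 euros", "85 cents"}]

(* Adding quantities in different currencies converts them to a common one;
   the conversion needs current exchange rates, so the total varies over time *)
Total[amounts]
```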
Here’s another example, involving locations.
Interpreter gives the geo location of the White House:
In[6]:=
Out[6]=
In[7]:=
Out[7]=
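A possible version of these two inputs, assuming the second one plots the interpreted location on a map (the name loc is only for illustration):

```wolfram
(* Interpret a place name as a geo position *)
loc = Interpreter["Location"]["White House"]

(* Draw a map around that position *)
GeoGraphics[loc]
```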
Interpreter handles many hundreds of different types of objects.
Interpret names of universities (which “U of I” is picked depends on geo location):
In[8]:=
Out[8]=
Interpret names of chemicals:
In[9]:=
Out[9]=
Interpret names of animals, then get images of them:
In[10]:=
Out[10]=
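Here is a sketch of how the last three examples could be written. The input strings and the name animals are illustrative assumptions; the interpreter types are assumed to be "University", "Chemical" and "Animal".

```wolfram
(* Universities; an ambiguous abbreviation such as "U of I" is resolved
   using what is known of your geo location *)
Interpreter["University"][{"Harvard", "Stanford", "U of I"}]

(* Chemicals, given either as formulas or as names *)
Interpreter["Chemical"][{"H2O", "aspirin", "CO2"}]

(* Animals: interpret the names first, then ask each entity for its image *)
animals = Interpreter["Animal"][{"cheetah", "tiger", "elephant"}];
EntityValue[animals, "Image"]
```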
Interpreter interprets whole strings. TextCases, on the other hand, tries to pick out instances of what you request from a string.
Pick out the nouns in a piece of text:
In[11]:=
Out[11]=
Pick out currency amounts:
In[12]:=
Out[12]=
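A small sketch of both uses of TextCases, with made-up sentences standing in for the original inputs:

```wolfram
(* Pick out the words tagged as nouns *)
TextCases["A solitary dog crossed the street", "Noun"]

(* Pick out substrings that look like currency amounts *)
TextCases["Take 2 eggs that cost $5 and 3 pancakes that cost 3 dollars.",
 "CurrencyAmount"]
```

Unlike Interpreter, TextCases returns the matching pieces of the string rather than an interpretation of the whole string.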
You can use TextCases to pick out particular kinds of things from a piece of text. Here we pick out instances of country names in a Wikipedia article.
In[13]:=
Out[13]=
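One way this could look, assuming the text comes from WikipediaData (the article name "ocelot" is just an example, since the text does not name the article used):

```wolfram
(* Get the plain text of a Wikipedia article, then pick out country names.
   This needs a network connection and may take a while on long articles. *)
TextCases[WikipediaData["ocelot"], "Country"]
```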
TextStructure shows you the whole structure of a piece of text.
Find how a sentence of English can be parsed into grammatical units:
In[14]:=
Out[14]=
An alternative representation, as a graph:
In[15]:=
Out[15]=
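A minimal sketch of both representations, using an example sentence of our own (the name sentence is illustrative):

```wolfram
sentence = "You can do so much with the Wolfram Language.";

(* Nested view of the sentence, broken into phrases and parts of speech *)
TextStructure[sentence]

(* The same parse shown as a graph, one graph per sentence *)
TextStructure[sentence, "ConstituentGraphs"]
```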
WordList[] gives a list of common words. WordList["Noun"], etc. give lists of words that can be used as particular parts of speech.
Give the first 20 in a list of common verbs in English:
In[16]:=
Out[16]=
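This can be sketched as:

```wolfram
(* WordList["Verb"] is an ordinary list of strings, so Take works on it *)
Take[WordList["Verb"], 20]
```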
It’s easy to study properties of words. Here are histograms comparing the length distributions of nouns, verbs and adjectives in the list of common words.
Make histograms of the lengths of common nouns, verbs and adjectives:
In[17]:=
Out[17]=
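One way to make these histograms, as a sketch rather than the exact input used for the picture:

```wolfram
(* For each part of speech: get the word list, measure each word's length,
   and make a histogram of the lengths *)
Histogram[StringLength[WordList[#]]] & /@ {"Noun", "Verb", "Adjective"}
```

The result is a list of three histograms, in the same order as the parts of speech.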
So far we’ve only talked about English. But the Wolfram Language also knows about other languages. For example, WordTranslation gives translations of words.
Translate “hello” into French:
In[18]:=
Out[18]=
Translate into Korean:
In[19]:=
Out[19]=
Translate into Korean, then transliterate to the English alphabet:
In[20]:=
Out[20]=
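The three steps above might be written like this. The sketch assumes WordTranslation returns a list of possible translations, so Transliterate is mapped over that list.

```wolfram
(* Possible French translations of "hello" *)
WordTranslation["hello", "French"]

(* Possible Korean translations, in Korean script *)
WordTranslation["hello", "Korean"]

(* Transliterate each Korean translation into the English alphabet *)
Transliterate /@ WordTranslation["hello", "Korean"]
```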
If you want to compare lots of different languages, give All as the language for WordTranslation. The result is an association which gives translations for different languages, with the languages listed roughly in order of decreasing worldwide usage.
In[21]:=
Out[21]=
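A sketch of this, looking only at the first few entries because the full association is long (the name all is illustrative):

```wolfram
(* Keys are languages; values are lists of translations *)
all = WordTranslation["hello", All];

(* Take works on associations, so this shows the first five languages *)
Take[all, 5]
```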
Let’s take the top 100 languages, and look at the first character in the first translation for “hello” that appears. Here’s a word cloud that shows that among these languages, “h” is the most common letter to start the word for “hello”.
For the top 100 languages, make a word cloud of the first characters in the word for “hello”:
In[22]:=
Out[22]=
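A possible way to build this word cloud is sketched below. It assumes each value in the association is a list of translation strings; the names translations and firstChars are illustrative, and the original input may have been written differently.

```wolfram
(* Keep the first 100 languages in the association *)
translations = Take[WordTranslation["hello", All], 100];

(* For every language whose first translation is a string, keep its first
   character; entries that do not fit this pattern are skipped *)
firstChars = Cases[Values[translations],
   {first_String, ___} :> StringTake[first, 1]];

(* Characters that occur more often are drawn larger *)
WordCloud[firstChars]
```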
Interpreter["type"]                   specify a function to interpret natural language
TextCases["text","type"]              find cases of a given type of object in text
TextStructure["text"]                 find the grammatical structure of text
WordTranslation["word","language"]    translate a word into another language
35.1 Use Interpreter to find the location of the Eiffel Tower.
35.2 Use Interpreter to find a university referred to as “U of T”.
35.3 Use Interpreter to find the chemicals referred to as C2H4, C2H6 and C3H8.
35.4 Use Interpreter to interpret the date “20140108”.
35.7 Find cities that can be referred to by permutations of the letters a, i, l and m.
35.8 Make a word cloud of country names in the Wikipedia article on “gunpowder”.
35.9 Find all nouns in “She sells seashells by the sea shore.”
35.10 Use TextCases to find the number of nouns, verbs and adjectives in the first 1000 characters of the Wikipedia article on computers.
35.11 Find the grammatical structure of the first sentence of the Wikipedia article about computers.
35.12 Find the 10 most common nouns in ExampleData[{"Text", "AliceInWonderland"}].
35.13 Make a community graph plot of the graph representation of the text structure of the first sentence of the Wikipedia article about language.
35.14 Make a list of numbers of nouns, verbs, adjectives and adverbs found by WordList in English.
35.15 Generate a list of the translations of numbers 2 through 10 into French.
What possible types of interpreters are there?
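One way to explore this yourself, assuming the system variable $InterpreterTypes is available in your version of the Wolfram Language:

```wolfram
(* How many interpreter types are there? *)
Length[$InterpreterTypes]

(* Look at the first few names in the list *)
Take[$InterpreterTypes, 10]
```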
Does Interpreter need a network connection?
In simple cases, such as dates or basic currency, no. But for full natural language input, yes.
When I say “4 dollars”, how does it know if I want US dollars or something else?
It uses what it knows of your geo location to tell what kind of dollars you’re likely to mean.
Can Interpreter deal with arbitrary natural language?
If something can be expressed in the Wolfram Language, then Interpreter should be able to interpret it. Interpreter["SemanticExpression"] takes any input, and tries to understand its meaning so as to get a Wolfram Language expression that captures it. What it’s doing is essentially the first stage of what Wolfram|Alpha does.
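A small sketch of trying this out; the input string is just an illustration, and the form of the result depends on how the input is understood:

```wolfram
(* Try to turn free-form English into a Wolfram Language expression.
   This typically needs a network connection. *)
Interpreter["SemanticExpression"]["plot sin x"]
```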
Can I add my own interpreters?
Yes. GrammarRules lets you build up your own grammar, making use of whatever existing interpreters you want.
Can I find the meaning of a word?
WordDefinition gives dictionary definitions.
Can I find what part of speech a word is?
PartOfSpeech tells you all the parts of speech a word can correspond to. So for “fish” it gives noun and verb. Which of these is correct in a given case depends on how the word is used in a sentence, and that’s what TextStructure figures out.
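Both of these functions can be tried on the same word, as in this short sketch:

```wolfram
(* Dictionary definitions of the word *)
WordDefinition["fish"]

(* All the parts of speech the word can correspond to *)
PartOfSpeech["fish"]
```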
Can I translate whole sentences as well as words?
TextTranslation does this for some languages, usually by calling an external service.
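A minimal sketch, with a sentence and target language chosen only for illustration:

```wolfram
(* Translate a whole sentence; this usually calls an external service,
   so it needs a network connection *)
TextTranslation["Where is the library?", "French"]
```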
What languages does WordTranslation handle?
  • TextStructure requires complete grammatical text, but Interpreter uses many different techniques to also work with fragments of text.
  • When you use ctrl+= you can resolve ambiguous input interactively. With Interpreter you have to do it programmatically, using the option AmbiguityFunction.
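As a sketch of handling ambiguity programmatically, assuming the setting AmbiguityFunction -> All asks for every possible interpretation instead of a single best guess (the input string is an arbitrary example):

```wolfram
(* Many cities share this name; with this setting the candidates
   come back together, wrapped in an AmbiguityList *)
Interpreter["City", AmbiguityFunction -> All]["springfield"]
```

You can then pick from the candidates with your own code, which plays the role of the interactive choice you get with ctrl+=.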