35 Natural Language Understanding
We saw earlier how to use ctrl+= to enter natural language input. Now we're going to talk about how to set up functions that understand natural language.
Interpreter is the key to much of this. You tell Interpreter what type of thing you want to get, and it will take any string you provide, and try to interpret it that way.
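As a minimal sketch of this pattern, the input below asks Interpreter for a city. The string "nyc" is an illustrative assumption and may differ from the example originally shown at this point.

```wolfram
(* Build an interpreter for the type "City", then apply it to a string *)
Interpreter["City"]["nyc"]
```

The result should be a symbolic city entity rather than a plain string, so it can be used in further computations.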
"The big apple" is a nickname for New York City:
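A plausible form of this input, assuming the "City" interpreter type is the one intended:

```wolfram
(* A nickname should be interpreted as the same city entity as the formal name *)
Interpreter["City"]["the big apple"]
```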
Interpreter converts natural language to Wolfram Language expressions that you can compute with. Here's an example involving currency amounts.
Interpret various currency amounts:
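A sketch of what this might look like, assuming the "CurrencyAmount" interpreter type. The specific amounts are illustrative assumptions, and the total depends on exchange rates at the time of evaluation.

```wolfram
(* Interpret a list of strings as currency quantities *)
amounts = Interpreter["CurrencyAmount"][{"4.25 dollars", "34 russian rubles", "5 euros", "85 cents"}]

(* Because the results are quantities, they can be added; mixed currencies are converted *)
Total[amounts]
```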
Interpreter gives the geo location of the White House:
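A minimal sketch, assuming the "Location" interpreter type. The street-address input is an added illustration, not necessarily what the original page showed.

```wolfram
(* Interpret a landmark name as a geo position *)
Interpreter["Location"]["White House"]

(* A street address can be tried the same way *)
Interpreter["Location"]["1600 Pennsylvania Avenue, Washington, DC"]
```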
Interpreter handles many hundreds of different types of objects.
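Two illustrative cases, assuming the "University" and "Chemical" interpreter types. The input strings are assumptions chosen for the sketch.

```wolfram
(* Interpret names of universities *)
Interpreter["University"][{"Harvard", "Stanford"}]

(* Interpret names and formulas of chemicals *)
Interpreter["Chemical"][{"H2O", "aspirin", "CO2"}]
```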
Interpreter interprets whole strings. TextCases, on the other hand, tries to pick out instances of what you request from a string.
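To illustrate the difference, here is a sketch that picks out the nouns in a sentence. The sentence is an assumption made for the example.

```wolfram
(* TextCases scans the text and returns every substring it identifies as a noun *)
TextCases["A sentence is a linguistic construct", "Noun"]
```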
Pick out currency amounts:
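A possible input for this, with an illustrative sentence that is an assumption rather than the original example:

```wolfram
(* Return the substrings of the text that look like currency amounts *)
TextCases["Choose between $5, €5 and ¥5000", "CurrencyAmount"]
```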
You can use TextCases to pick out particular kinds of things from a piece of text. Here we pick out instances of country names in a Wikipedia article.
Generate a word cloud of country names from the Wikipedia article on the EU:
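One way this could be written, assuming that WikipediaData["European Union"] returns the article text and that "Country" is the content type wanted. It needs a network connection and may take a while on a long article.

```wolfram
(* Fetch the article text, extract country names, and size each name by its frequency *)
WordCloud[TextCases[WikipediaData["European Union"], "Country"]]
```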
TextStructure shows you the whole structure of a piece of text.
Find how a sentence of English can be parsed into grammatical units:
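A minimal sketch, using a sentence chosen here purely for illustration:

```wolfram
(* Show the nested grammatical units (phrases, parts of speech) of a sentence *)
TextStructure["You can do so much with the Wolfram Language."]
```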
WordList[] gives a list of common words. WordList["Noun"], etc. give lists of words that can be used as particular parts of speech.
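For instance, a sketch that looks at the start of the list of common verbs. The choice of "Verb" and of 20 items is an assumption for the example.

```wolfram
(* Take the first 20 entries from the list of common verbs *)
Take[WordList["Verb"], 20]
```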
It's easy to study properties of words. Here are histograms comparing the length distributions of nouns, verbs and adjectives in the list of common words.
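One way such histograms might be produced, sketched under the assumption that the three part-of-speech lists come from WordList:

```wolfram
(* For each part of speech, get the word list, measure each word's length, and plot a histogram *)
Histogram[StringLength[WordList[#]]] & /@ {"Noun", "Verb", "Adjective"}
```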
Translate "hello" into French:
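A sketch of the corresponding input. The second line, with Korean, is an added illustration of a language with a different script.

```wolfram
(* Translate a single word; the result is a list of possible translations *)
WordTranslation["hello", "French"]

(* The same call works for languages written in other scripts *)
WordTranslation["hello", "Korean"]
```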
If you want to compare lots of different languages, give All as the language for WordTranslation. The result is an association which gives translations for different languages, with the languages listed roughly in order of decreasing worldwide usage.
Give translations of "hello" into the 5 most common languages in the world:
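A minimal sketch, relying on the ordering of the association described above:

```wolfram
(* Ask for all languages, then keep the first 5 entries of the resulting association *)
Take[WordTranslation["hello", All], 5]
```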
Let's take the top 100 languages, and look at the first character in the first translation for "hello" that appears. Here's a word cloud that shows that among these languages, "h" is the most common letter to start the word for "hello".
For the top 100 languages, make a word cloud of the first characters in the word for "hello":
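The steps just described could be sketched as follows. This assumes each of the first 100 languages has at least one translation; if a language has none, First would fail for that entry.

```wolfram
(* Keep the first 100 languages of the association of translations *)
translations = Take[WordTranslation["hello", All], 100];

(* For each language, take the first translation and its first character *)
firstCharacters = Values[StringTake[First[#], 1] & /@ translations];

(* Size each character by how often it occurs *)
WordCloud[firstCharacters]
```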
Interpreter["type"]                   specify a function to interpret natural language
TextCases["text","type"]              find cases of a given type of object in text
TextStructure["text"]                 find the grammatical structure of text
WordTranslation["word","language"]    translate a word into another language
35.1 Use Interpreter to find the location of the Eiffel Tower.
35.2 Use Interpreter to find a university referred to as "U of T".
35.3 Use Interpreter to find the chemicals referred to as "C2H4", "C2H6" and "C3H8".
35.4 Use Interpreter to interpret the date "20140108".
35.5 Find universities that can be referred to as "U of X", where X is any letter of the alphabet.
35.7 Find cities that can be referred to by permutations of the letters a, i, l and m.
35.9 Find all nouns in "She sells seashells by the sea shore."
35.10 Use TextCases to find the number of nouns, verbs and adjectives in the first 1000 characters of the Wikipedia article on computers.
35.11 Find the grammatical structure of the first sentence of the Wikipedia article about computers.
35.12 Find the 10 most common nouns in ExampleData[{"Text", "AliceInWonderland"}].
35.15 Generate a list of the translations of numbers 2 through 10 into French.
What possible types of interpreters are there?
Does Interpreter need a network connection?
In simple cases, such as dates or basic currency, no. But for full natural language input, yes.
When I say "4 dollars", how does it know if I want US dollars or something else?
Can Interpreter deal with arbitrary natural language?
If something can be expressed in the Wolfram Language, then Interpreter should be able to interpret it. Interpreter["SemanticExpression"] takes any input, and tries to understand its meaning so as to get a Wolfram Language expression that captures it. What it's doing is essentially the first stage of what Wolfram|Alpha does.
Yes. GrammarRules lets you build up your own grammar, making use of whatever existing interpreters you want.
WordDefinition gives dictionary definitions.
Can I find what part of speech a word is?
PartOfSpeech tells you all the parts of speech a word can correspond to. So for "fish" it gives noun and verb. Which of these is correct in a given case depends on how the word is used in a sentence, and that's what TextStructure figures out.
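A minimal sketch of the call described here:

```wolfram
(* List every part of speech the word can correspond to *)
PartOfSpeech["fish"]
```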
Can I translate whole sentences as well as words?
TextTranslation does this for some languages, usually by calling an external service.
What languages does WordTranslation handle?
It can translate lots of words for the few hundred most common languages. It can translate at least a few words for well over a thousand languages. LanguageData gives information on over 10,000 languages.
 