35 | Natural Language Understanding |
We saw earlier how to use ctrl+= to enter natural language input. Now we’re going to talk about how to set up functions that understand natural language.
Interpreter is the key to much of this. You tell Interpreter what type of thing you want to get, and it will take any string you provide and try to interpret it that way.
“The big apple” is a nickname for New York City:
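A minimal sketch of this, assuming the "City" interpreter type is the one wanted here (the exact form of the result may depend on your version):

```wolfram
(* Build an interpreter for cities, then apply it to a string *)
Interpreter["City"]["the big apple"]
```

The result is a city entity rather than a string, so it can be used directly in further computations.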
Interpreter converts natural language to Wolfram Language expressions that you can compute with. Here’s an example involving currency amounts.
Interpret various currency amounts:
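One way this could look, as a sketch: the input strings below are illustrative choices, and the interpreter is mapped over the list so that each string is interpreted on its own.

```wolfram
(* Interpret each string as a currency amount *)
Interpreter["CurrencyAmount"] /@
  {"4.25 dollars", "34 russian rubles", "5 euros", "85 cents"}
```

Each result should be a quantity with a currency unit, which you can then add, convert or compare.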
Interpreter handles many hundreds of different types of objects.
Interpreter interprets whole strings.
TextCases, on the other hand, tries to pick out instances of what you request from a string.
Pick out currency amounts:
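As a sketch, with a made-up sentence as input, TextCases can be asked for the same "CurrencyAmount" type:

```wolfram
(* Return the substrings that look like currency amounts *)
TextCases["Choose between $5, €5 and ¥5000", "CurrencyAmount"]
```

Unlike Interpreter, this does not require the whole string to be a currency amount; it returns only the pieces that match.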
You can use TextCases to pick out particular kinds of things from a piece of text. Here we pick out instances of country names in a Wikipedia article.
Generate a word cloud of country names from the Wikipedia article on the EU:
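A possible version, assuming the article is available under the title "EU" through WikipediaData and that a network connection is present:

```wolfram
(* Fetch the article text, pick out country names, and size each
   name in the cloud by how often it appears *)
WordCloud[TextCases[WikipediaData["EU"], "Country"]]
```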
TextStructure shows you the whole structure of a piece of text.
Find how a sentence of English can be parsed into grammatical units:
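For instance, with an arbitrary sentence chosen for illustration:

```wolfram
(* Show the sentence broken down into phrases and parts of speech *)
TextStructure["You can do so much with the Wolfram Language."]
```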
WordList[] gives a list of common words.
WordList["Noun"], etc. gives lists of words that can be used as particular parts of speech.
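A quick way to get a feel for these lists is to look at just a few entries. This is a sketch; the choice of 10 words is arbitrary.

```wolfram
(* First 10 entries of the list of common words *)
Take[WordList[], 10]

(* First 10 entries of the list of words that can be nouns *)
Take[WordList["Noun"], 10]
```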
It’s easy to study properties of words. Here are histograms comparing the length distributions of nouns, verbs and adjectives in the list of common words.
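One way to produce such histograms, sketched with a pure function mapped over the three part-of-speech names:

```wolfram
(* For each part of speech: get the word list, compute the length
   of every word, and make a histogram of those lengths *)
Histogram[StringLength[WordList[#]]] & /@ {"Noun", "Verb", "Adjective"}
```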
Translate “hello” into French:
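A minimal sketch using WordTranslation with the language given by name:

```wolfram
(* Returns a list of possible French translations *)
WordTranslation["hello", "French"]
```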
If you want to compare lots of different languages, give All as the language for WordTranslation. The result is an association that gives translations for different languages, with the languages listed roughly in order of decreasing worldwide usage.
Give translations of “hello” into the 5 most common languages in the world:
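Since the association is ordered roughly by usage, a sketch of this is simply to take its first 5 entries:

```wolfram
(* Take keeps the first 5 language -> translations pairs *)
Take[WordTranslation["hello", All], 5]
```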
Let’s take the top 100 languages, and look at the first character in the first translation for “hello” that appears. Here’s a word cloud that shows that among these languages, “h” is the most common letter to start the word for “hello”.
For the top 100 languages, make a word cloud of the first characters in the word for “hello”:
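A sketch of how this could be put together, assuming every one of the first 100 languages has at least one translation listed:

```wolfram
(* Take the first 100 languages, keep the first character of the
   first translation for each, and build a word cloud of them *)
WordCloud[
 Values[StringTake[First[#], 1] & /@
   Take[WordTranslation["hello", All], 100]]]
```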
Interpreter["type"] | specify a function to interpret natural language
TextCases["text","type"] | find cases of a given type of object in text
TextStructure["text"] | find the grammatical structure of text
WordTranslation["word","language"] | translate a word into another language
35.1 Use Interpreter to find the location of the Eiffel Tower.
35.2 Use Interpreter to find a university referred to as “U of T”.
35.3 Use Interpreter to find the chemicals referred to as C2H4, C2H6 and C3H8.
35.5 Find universities that can be referred to as “U of X”, where X is any letter of the alphabet.
35.7 Find cities that can be referred to by permutations of the letters a, i, l and m.
35.9 Find all nouns in “She sells seashells by the sea shore.”
35.10 Use TextCases to find the number of nouns, verbs and adjectives in the first 1000 characters of the Wikipedia article on computers.
35.11 Find the grammatical structure of the first sentence of the Wikipedia article about computers.
35.12 Find the 10 most common nouns in ExampleData[{"Text", "AliceInWonderland"}].
35.15 Generate a list of the translations of numbers 2 through 10 into French.
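What possible types of interpreters are there?
It’s a long list. Evaluate $InterpreterTypes to see it.
Does Interpreter need a network connection?
In simple cases, such as dates or basic currency, no. But for full natural language input, yes.
When I say “4 dollars”, how does it know if I want US dollars or something else?
It uses what it knows of your geo location to decide which kind of dollars you most likely mean.
Can I add my own interpreters?
Yes. GrammarRules lets you build up your own grammar, making use of whatever existing interpreters you want.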
Can I find what part of speech a word is?
PartOfSpeech tells you all the parts of speech a word can correspond to. So for “fish” it gives noun and verb. Which of these is correct in a given case depends on how the word is used in a sentence, and that’s what TextStructure figures out.
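As a small sketch of the difference, with a made-up sentence for the second call:

```wolfram
(* All parts of speech the isolated word can have *)
PartOfSpeech["fish"]

(* In context, TextStructure assigns one part of speech per word *)
TextStructure["They fish in the river."]
```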
Can I translate whole sentences as well as words?
TextTranslation does this for some languages, usually by calling an external service.
How many languages does WordTranslation handle?
It can translate lots of words for the few hundred most common languages. It can translate at least a few words for well over a thousand languages.
LanguageData gives information on over 10,000 languages.