Generate Random Pronounceable Words
Generate random English “words” where the probability of each letter is given by the likelihood that the letter occurs after the previous two letters somewhere in the dictionary.
code
trigrams =
Append[#[[1, 1, ;; 2]] -> (#[[All, 2]] -> #[[All, 1, 3]]) & /@
GatherBy[
Tally[Flatten[
Partition[Join[{"A", "A"}, #, {"Z"}], 3, 1] & /@
Characters[DictionaryLookup[RegularExpression["[a-z]+"]]],
1]], #[[1, ;; 2]] &],
{__String} -> {"e"}];
randomWord[] :=
StringJoin[
NestWhile[Append[#, RandomChoice[Take[#, -2] /. trigrams]] &,
{"A", "A"}, #[[-1]] =!= "Z" &][[3 ;; -2]]]
englishWordQ[word_] := DictionaryLookup[word] =!= {}
CloudDeploy[APIFunction[{},
With[{word = randomWord[]},
Pane[Style[word, FontFamily -> "Times", FontSize -> 48,
FontColor -> If[englishWordQ[word], Darker[Red], Black]], 1000,
ImageMargins -> 100]
] &, "PNG"], Permissions -> "Public"]
Make random words using bigrams instead of trigrams.
bigrams =
Append[#[[1, 1, ;; 1]] -> (#[[All, 2]] -> #[[All, 1, 2]]) & /@
GatherBy[
Tally[Flatten[
Partition[Join[{"A"}, #, {"Z"}], 2, 1] & /@
Characters[DictionaryLookup[RegularExpression["[a-z]+"]]],
1]], #[[1, 1]] &],
{_String} -> {"e"}];
bigramRandomWord[] :=
StringJoin[
NestWhile[Append[#, RandomChoice[Take[#, -1] /. bigrams]] &,
{"A"}, #[[-1]] =!= "Z" &][[2 ;; -2]]]
how it works
The data that guides the generation of random “words” is contained in trigrams, a list of rules like the one below. It indicates that “aa” is followed by “h” once in the dictionary, by “r” 4 times, and so on. The right side of the first arrow is in precisely the format required by RandomChoice to make a weighted choice from a list.
{"a", "a"} -> {1, 4, 2, 1, 1, 1, 2, 4} -> {"h", "r", "Z", "e", "i",
"s", "l", "m"}
To build the complete list of trigrams, first find all triples of characters in the dictionary, tally them, and gather together those that have the same first two letters. Here is an example of the result when the first two letters are “aa”.
GatherBy[Tally[
Flatten[Partition[#, 3, 1] & /@
Characters[DictionaryLookup[RegularExpression["[a-z]+"]]],
1]], #[[1, ;; 2]] &][[1]]
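To see the intermediate structure without loading the whole dictionary, here is a minimal sketch of the same steps on a three-word list. The list toyWords and the names triples and groups are hypothetical stand-ins chosen for this example.

```wolfram
(* hypothetical three-word list standing in for the dictionary *)
toyWords = {"aah", "aahed", "aardvark"};

(* all overlapping triples of characters, flattened into one list *)
triples = Flatten[Partition[#, 3, 1] & /@ Characters[toyWords], 1];

(* count each triple, then group the counts by their first two letters *)
groups = GatherBy[Tally[triples], #[[1, ;; 2]] &];

(* the first group collects every triple that starts with "a", "a" *)
First[groups]
(* {{{"a", "a", "h"}, 2}, {{"a", "a", "r"}, 1}} *)
```

Each element of a group pairs a triple with its count, which is why the full definition of trigrams extracts the counts with #[[All, 2]] and the third letters with #[[All, 1, 3]].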
Using the convention that “AA” indicates the beginning of a word and “Z” the end, special entries in trigrams handle word boundaries: the entry for {"A", "A"} gives the frequencies of first letters, and an entry such as {"A", "a"} gives the frequencies of second letters in words that begin with “a”. A “Z” element means “no subsequent letter”. A default case is appended to trigrams to avoid errors. This is the complete list of trigrams:
trigrams =
Append[#[[1, 1, ;; 2]] -> (#[[All, 2]] -> #[[All, 1, 3]]) & /@
GatherBy[
Tally[Flatten[
Partition[Join[{"A", "A"}, #, {"Z"}], 3, 1] & /@
Characters[DictionaryLookup[RegularExpression["[a-z]+"]]],
1]], #[[1, ;; 2]] &],
{__String} -> {"e"}];
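To see where the boundary entries come from, the following sketch pads a single word with the markers and lists its triples. The word "cat" and the name padded are arbitrary choices for illustration.

```wolfram
(* pad one word with the start markers "A", "A" and the end marker "Z" *)
padded = Join[{"A", "A"}, Characters["cat"], {"Z"}]

(* overlapping triples of the padded word *)
Partition[padded, 3, 1]
(* {{"A", "A", "c"}, {"A", "c", "a"}, {"c", "a", "t"}, {"a", "t", "Z"}} *)
```

The first triple contributes to the {"A", "A"} entry (first letters), the second to the {"A", "c"} entry (second letters of words that begin with “c”), and the last records that “at” can end a word.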
To generate a random word using the trigrams, start with “AA” and repeatedly add random characters guided by the weights in trigrams until a “Z” is produced.
randomWordCharacters =
NestWhile[
Append[#, RandomChoice[Take[#, -2] /. trigrams]] &, {"A",
"A"}, #[[-1]] =!= "Z" &]
Extract the third through next-to-last characters, which drops the “A”, “A” and “Z” markers, and join them together to make a pronounceable word.
StringJoin[randomWordCharacters[[3 ;; -2]]]
Package those steps as the function randomWord.
randomWord[] :=
StringJoin[
NestWhile[Append[#, RandomChoice[Take[#, -2] /. trigrams]] &,
{"A", "A"}, #[[-1]] =!= "Z" &][[3 ;; -2]]]
Test randomWord by making a list of random words.
Column[Table[randomWord[], {20}]]
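The whole pipeline can also be tried end to end on a tiny word list, which makes it easy to inspect the table by hand. This is only a sketch under stated assumptions: toyWords, toyTrigrams and toyRandomWord are hypothetical names introduced here, and the five words and the seed are arbitrary.

```wolfram
(* hypothetical word list; the real code uses DictionaryLookup instead *)
toyWords = {"banana", "bandana", "cabana", "canal", "banal"};

(* same construction as trigrams above, applied to the toy list *)
toyTrigrams =
  Append[#[[1, 1, ;; 2]] -> (#[[All, 2]] -> #[[All, 1, 3]]) & /@
    GatherBy[
     Tally[Flatten[
       Partition[Join[{"A", "A"}, #, {"Z"}], 3, 1] & /@
        Characters[toyWords], 1]], #[[1, ;; 2]] &],
   {__String} -> {"e"}];

(* same generation loop as randomWord, driven by the toy table *)
toyRandomWord[] :=
 StringJoin[
  NestWhile[Append[#, RandomChoice[Take[#, -2] /. toyTrigrams]] &,
    {"A", "A"}, #[[-1]] =!= "Z" &][[3 ;; -2]]]

SeedRandom[1]; (* arbitrary seed so the run is repeatable *)
Table[toyRandomWord[], {5}]
```

With so little data, every generated word begins with “b” or “c” and is stitched together from fragments of the five source words. The full dictionary gives far more varied results.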
englishWordQ tests whether a word is in the dictionary.
englishWordQ[word_] := DictionaryLookup[word] =!= {}
This selects the English words in a sample of 20 random words:
Select[Table[randomWord[], {20}], englishWordQ]
You can estimate the likelihood of getting an actual word with randomWord experimentally:
Row[{Length[Select[Table[randomWord[], {10000}], englishWordQ]]/
10000*100., "%"}]
Here’s how to make a web page that gives you a different, nicely formatted word each time you visit it.
Format a word using Style to set the font and size and Pane to add some white space around the word. Set the color to red if it is an English word.
With[{word = randomWord[]},
Pane[Style[word, FontFamily -> "Times", FontSize -> 48,
FontColor -> If[englishWordQ[word], Darker[Red], Black]],
ImageMargins -> 100]
]
Deploy that to the cloud with APIFunction and CloudDeploy, specifying Permissions -> "Public" so that everyone has access.
CloudDeploy[APIFunction[{},
With[{word = randomWord[]},
Pane[Style[word, FontFamily -> "Times", FontSize -> 48,
FontColor -> If[englishWordQ[word], Darker[Red], Black]], 1000,
ImageMargins -> 100]
] &, "PNG"], Permissions -> "Public"]