Wolfram Language


Determine the Author of a Text

Make a function that determines the author of a text.


code

Othello = Import["http://www.gutenberg.org/cache/epub/2267/pg2267.txt"];
Hamlet = Import["http://www.gutenberg.org/cache/epub/2265/pg2265.txt"];
TheImportanceOfBeingEarnest = Import["http://www.gutenberg.org/cache/epub/844/pg844.txt"];
ThePictureofDorianGray = Import["http://www.gutenberg.org/cache/epub/174/pg174.txt"];
LesMiserables = Import["http://www.gutenberg.org/cache/epub/135/pg135.txt"];
NotreDamedeParis = Import["http://www.gutenberg.org/cache/epub/2610/pg2610.txt"];
author = Classify[<|
  "William Shakespeare" -> {Othello, Hamlet},
  "Oscar Wilde" -> {TheImportanceOfBeingEarnest, ThePictureofDorianGray},
  "Victor Hugo" -> {LesMiserables, NotreDamedeParis}
|>]

how it works

Import two works each by William Shakespeare, Oscar Wilde, and Victor Hugo from Project Gutenberg to use as training examples:

Othello = Import["http://www.gutenberg.org/cache/epub/2267/pg2267.txt"];
Hamlet = Import["http://www.gutenberg.org/cache/epub/2265/pg2265.txt"];
TheImportanceOfBeingEarnest = Import["http://www.gutenberg.org/cache/epub/844/pg844.txt"];
ThePictureofDorianGray = Import["http://www.gutenberg.org/cache/epub/174/pg174.txt"];
LesMiserables = Import["http://www.gutenberg.org/cache/epub/135/pg135.txt"];
NotreDamedeParis = Import["http://www.gutenberg.org/cache/epub/2610/pg2610.txt"];
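
Import typically returns $Failed when a download does not succeed, so a quick check before training may be worthwhile. The following is a minimal sketch, assuming the six variables above have been evaluated; the name trainingTexts is introduced here for illustration only and is not part of the original example:

```wolfram
(* Hypothetical association of title -> imported text, for inspection only *)
trainingTexts = <|
  "Othello" -> Othello,
  "Hamlet" -> Hamlet,
  "The Importance of Being Earnest" -> TheImportanceOfBeingEarnest,
  "The Picture of Dorian Gray" -> ThePictureofDorianGray,
  "Les Miserables" -> LesMiserables,
  "Notre-Dame de Paris" -> NotreDamedeParis
|>;

(* True for every title whose import produced a string *)
StringQ /@ trainingTexts

(* Character count of each text; entries that are not strings are marked Missing *)
If[StringQ[#], StringLength[#], Missing["NotAvailable"]] & /@ trainingTexts
```

Any entry that is not a string should be re-imported before it is passed to Classify.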

Train a classifier function on these texts, using each author's name as the class label:

author = Classify[<|
  "William Shakespeare" -> {Othello, Hamlet},
  "Oscar Wilde" -> {TheImportanceOfBeingEarnest, ThePictureofDorianGray},
  "Victor Hugo" -> {LesMiserables, NotreDamedeParis}
|>]
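
The result is a ClassifierFunction, which can be queried before it is applied to new text. A small sketch, assuming a Wolfram Language version in which Information accepts classifier properties (older releases provided ClassifierInformation for the same purpose):

```wolfram
(* Class labels the classifier can return; here, the three author names *)
Information[author, "Classes"]

(* Method that Classify selected automatically for this training set *)
Information[author, "Method"]
```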

Test the classifier on texts not in the training set:

Macbeth = Import["http://www.gutenberg.org/cache/epub/2264/pg2264.txt"];
AnIdealHusband = Import["http://www.gutenberg.org/files/885/885-0.txt"];
TheManWhoLaughs = Import["http://www.gutenberg.org/cache/epub/12587/pg12587.txt"];
author[{Macbeth, AnIdealHusband, TheManWhoLaughs}]
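
Beyond a single predicted label, the classifier can report how strongly it favors each author, and its predictions can be scored against known labels. The following sketch assumes the three test texts above were imported successfully; the name testSet is hypothetical. With only one text per author, the accuracy figure is a rough sanity check rather than a reliable estimate.

```wolfram
(* Probability assigned to each author for a single text *)
author[Macbeth, "Probabilities"]

(* Hypothetical labeled test set, in the same association form used for training *)
testSet = <|
  "William Shakespeare" -> {Macbeth},
  "Oscar Wilde" -> {AnIdealHusband},
  "Victor Hugo" -> {TheManWhoLaughs}
|>;

(* Fraction of test texts attributed to the correct author *)
ClassifierMeasurements[author, testSet, "Accuracy"]
```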