Wolfram Language

Systems-Level Functionality

Create a Shakespearean Corpus with FileSystemMap

This example uses a directory containing text files of all of Shakespeare's works. Start by importing the books with FileSystemMap, keeping only the text of each file and discarding the file names.

In[1]:=
booksdir = FileNameJoin[{$HomeDirectory, "Books", "Shakespeare"}]
In[2]:=
works = Values[FileSystemMap[Import, booksdir, 1, FileNameForms -> "*.txt"]]
Out[2]=
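
To see the shape of what FileSystemMap returns, the following sketch builds a small scratch directory and maps Import over it. The directory, file names, and file contents are invented for illustration and are not part of the original example.

```wolfram
(* Hypothetical scratch directory with two tiny text files, for illustration only *)
dir = CreateDirectory[];
Export[FileNameJoin[{dir, "a.txt"}], "To be, or not to be", "Text"];
Export[FileNameJoin[{dir, "b.txt"}], "All the world's a stage", "Text"];

(* FileSystemMap returns an association keyed by file name *)
assoc = FileSystemMap[Import, dir, 1, FileNameForms -> "*.txt"];

(* Values discards the file names and keeps only the imported text *)
sample = Values[assoc]

(* Clean up the scratch directory *)
DeleteDirectory[dir, DeleteContents -> True];
```

Because the result is an association, Keys[assoc] would give the file names if they are needed later, for example to label each work.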

Construct a single corpus using StringJoin.

In[3]:=
corpus = StringJoin[works]
Out[3]=
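
StringJoin concatenates the texts with nothing between them, so the last word of one work and the first word of the next can fuse into a single token. A minimal sketch of one possible alternative, assuming a newline separator is acceptable for later processing (the toy strings are invented for illustration):

```wolfram
(* Toy stand-in for the imported works; these strings are invented for illustration *)
toyWorks = {"the rest is silence", "Now is the winter"};

(* StringJoin fuses the words on either side of the boundary *)
joined = StringJoin[toyWorks]

(* StringRiffle inserts a separator between consecutive texts *)
separated = StringRiffle[toyWorks, "\n"]
```

Whether the separator matters depends on the downstream analysis; for entity recognition over a large corpus, the effect of a few fused boundary words is likely small.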

The corpus can now be treated as a single searchable string, so standard text-processing functions apply to it directly. Use TextCases to find the countries mentioned in these works, converting the matches to lowercase and removing duplicates.

In[4]:=
countries = ToLowerCase[TextCases[corpus, "Country"]] // DeleteDuplicates
Out[4]=
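
The order of the two cleanup steps matters: lowercasing has to happen before duplicates are removed, otherwise differently cased variants survive as separate entries. A small sketch on an invented list that stands in for the raw TextCases output:

```wolfram
(* Invented matches standing in for the raw TextCases output *)
rawMatches = {"France", "england", "FRANCE", "England", "Denmark"};

(* Lowercase first so that differently cased variants collapse to one entry *)
normalized = DeleteDuplicates[ToLowerCase[rawMatches]]

(* Reversing the order leaves case variants behind as duplicates *)
wrongOrder = ToLowerCase[DeleteDuplicates[rawMatches]]
```

Here normalized has three entries, while wrongOrder still has five because no two of the original strings were identical before lowercasing.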

Remove the spurious match "world", which is not a country.

In[5]:=
countries = DeleteCases[countries, "world"];

Construct a GeoListPlot of the countries referred to in the works of Shakespeare.

In[6]:=
GeoListPlot[Interpreter["Country"] /@ countries]
Out[6]=
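
Some extracted strings may not be recognized as countries, in which case Interpreter returns a Failure object instead of an entity. As a hedged sketch, such results can be filtered out before plotting. The list toyNames is invented for illustration, and the code assumes access to the Wolfram knowledgebase.

```wolfram
(* Invented names; the last one is deliberately not a country *)
toyNames = {"france", "denmark", "not a country"};

(* Interpret each name; unrecognized strings come back as Failure objects *)
entities = Interpreter["Country"] /@ toyNames;

(* Keep only the successful interpretations *)
validEntities = Select[entities, Not @* FailureQ];

GeoListPlot[validEntities]
```

The same filter could be applied to the countries list from the example above in place of toyNames.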
