Assignment on authorship attribution created for the course Text and Multimedia Mining, Radboud University, Nijmegen.
The file authorship.py provides a template with the steps that are commonly used for training and evaluating a classifier.
The feature extraction phase is left out. Can you train and evaluate a reliable classifier for authorship attribution?
If you think this code can help you in another project, for example another classification task, feel free to (re)use it.
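
To make the train-and-evaluate workflow concrete, here is a minimal sketch that uses only the Python standard library. It is not the contents of authorship.py: the toy texts, the FUNCTION_WORDS list, and the function names (extract_features, train, predict, accuracy) are all hypothetical. It assumes relative function-word frequencies as a deliberately simplistic feature set and a nearest-centroid rule as a stand-in classifier. A serious solution to the assignment would need richer features and real data.

```python
import math
from collections import Counter

# Hypothetical toy data, NOT the course data set: (text, author) pairs.
TRAIN = [
    ("the cat sat on the mat of the house", "A"),
    ("the end of the road is the start of the hill", "A"),
    ("i think you know i like you", "B"),
    ("you and i should go and you can drive", "B"),
]
TEST = [
    ("the top of the tower", "A"),
    ("i hope you and i agree", "B"),
]

# Assumed feature set: relative frequency of a few function words.
FUNCTION_WORDS = ["the", "of", "and", "i", "you"]


def extract_features(text):
    """Map a text to a vector of relative function-word frequencies."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    total = max(len(tokens), 1)  # avoid division by zero on empty input
    return [counts[word] / total for word in FUNCTION_WORDS]


def train(samples):
    """Nearest-centroid training: average the feature vectors per author."""
    sums = {}
    n_texts = Counter()
    for text, author in samples:
        vec = extract_features(text)
        acc = sums.setdefault(author, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        n_texts[author] += 1
    return {a: [v / n_texts[a] for v in acc] for a, acc in sums.items()}


def predict(centroids, text):
    """Return the author whose centroid is closest in Euclidean distance."""
    vec = extract_features(text)
    return min(centroids, key=lambda a: math.dist(vec, centroids[a]))


def accuracy(centroids, samples):
    """Fraction of held-out texts attributed to the correct author."""
    correct = sum(predict(centroids, text) == author for text, author in samples)
    return correct / len(samples)


if __name__ == "__main__":
    model = train(TRAIN)
    print("accuracy on held-out texts:", accuracy(model, TEST))
```

Note that the evaluation texts are kept separate from the training texts. Scoring a classifier on the data it was trained on would overstate how reliable it is.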
The code is built for Python 3. You can install the required packages with pip, for example with pip3 install -r requirements.txt.