
File Format

The corpus is stored in XML format.

The corpus is divided into three sub-corpora: messages from the morning, the afternoon and the evening.

The metadata for each message consists of the time it was sent, the anonymised username and the sub-corpus ID. There is no additional annotation.
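
As a rough illustration of how such a file might be read, the following Python sketch parses a small inline sample with the standard library. The element name `message` and the attribute names `time`, `user` and `subcorpus` are assumptions made for this example only; the actual schema is not specified here and may differ, so check the names against the real file before use.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample data. The tag and attribute names below are
# assumptions for illustration, not the corpus's documented schema.
SAMPLE = """<corpus>
  <message time="08:15:00" user="user001" subcorpus="morning">Good morning</message>
  <message time="14:30:00" user="user002" subcorpus="afternoon">Lunch was good</message>
  <message time="21:05:00" user="user001" subcorpus="evening">Good night</message>
</corpus>"""


def read_messages(xml_text):
    """Return one dict per message with its three metadata fields and text."""
    root = ET.fromstring(xml_text)
    return [
        {
            "time": m.get("time"),
            "user": m.get("user"),
            "subcorpus": m.get("subcorpus"),
            "text": (m.text or "").strip(),
        }
        for m in root.iter("message")
    ]


messages = read_messages(SAMPLE)

# Count how many messages fall into each sub-corpus.
counts = Counter(m["subcorpus"] for m in messages)
```

For a file on disk, `ET.parse(path).getroot()` could replace `ET.fromstring(xml_text)`; the rest of the sketch would stay the same under the same naming assumptions.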
