File Format
The corpus is distributed as an XML file.
The corpus is divided into three sub-corpora: messages sent in the morning, in the afternoon and in the evening.
The metadata for each message consists of the time it was sent, the anonymised username and the sub-corpus ID. There is no additional annotation.
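
As an illustration of how the metadata described above might be read, here is a minimal Python sketch. The actual element and attribute names are not specified here, so `corpus`, `message`, `time`, `user` and `subcorpus` are hypothetical placeholders, as is the sample data. Adjust them to match the real file.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample: the element names, attribute names and values
# below are assumptions for illustration, not the real corpus schema.
SAMPLE = """<corpus>
  <message time="08:15:02" user="user_001" subcorpus="morning">hello</message>
  <message time="14:40:10" user="user_002" subcorpus="afternoon">hi there</message>
  <message time="21:05:33" user="user_001" subcorpus="evening">good night</message>
</corpus>"""


def read_messages(xml_text):
    """Return one dict per message with its three metadata fields and text."""
    root = ET.fromstring(xml_text)
    messages = []
    for elem in root.iter("message"):
        messages.append({
            "time": elem.get("time"),            # time the message was sent
            "user": elem.get("user"),            # anonymised username
            "subcorpus": elem.get("subcorpus"),  # sub-corpus ID
            "text": (elem.text or "").strip(),
        })
    return messages


def count_by_subcorpus(messages):
    """Count how many messages fall into each sub-corpus."""
    return Counter(m["subcorpus"] for m in messages)


messages = read_messages(SAMPLE)
counts = count_by_subcorpus(messages)
```

For the real corpus, `ET.parse(path).getroot()` could replace `ET.fromstring(xml_text)`, where `path` points at the distributed XML file.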