Good Morning, Twitter! - Language researchers discover social networks as a rich source of data

Image: A man wearing dark glasses has fallen asleep with his head on a table. In one hand he holds a cup of coffee. In the foreground is an alarm clock on the table.
Good Morning, Twitter! Photo: Fotolia/katie martynova.

The ringing of the alarm clock is merciless. People who have difficulty getting out of bed but love to stay up late are called “owls” in chronobiology, the field of science that deals with biological rhythms and the physiological processes that accompany them. “Larks,” in contrast, find it easy to get up in the morning but are tired in the evening. Left to themselves, daylight and the biological clock determine when a person’s day begins and when it ends. Anyone who has to go to work or school enjoys this luxury only on the weekend, though. The “owl” chronotype in particular suffers from the daily rhythm set by the alarm clock. Researchers call this “social jet lag.” Because “owls” can’t fall asleep early enough but have to wake up on time anyway, they build up a sleep deficit that they often have to make up for on the weekend.

Tatjana Scheffler is not a chronobiologist. She is a computational linguist, but one who is interested in the problem of “social jet lag.” She came to the topic through the research of her colleague Christopher Kyba, who works on light pollution at the German Research Center for Geosciences (GFZ); artificial light also affects people’s sleep-wake cycles. “Normally, researchers collect data about this in sleep studies that are conducted in laboratories, or with surveys,” says Scheffler. However, surveys and their subsequent evaluation mean a lot of work. Scheffler has a solution at hand for precisely this problem.

The solution has two parts: the short message service Twitter, and a computer program that can analyze text automatically. The idea is that a tweeted “Good morning!” marks the moment when a Twitter user wakes up. This would let researchers see when thousands of Twitter users wake up, day after day.

To see whether the idea held up, the researchers conducted a comprehensive study, which began with collecting Tweets. To do this, Scheffler used a programming interface that allows her to retrieve Tweets automatically. She enters search criteria such as keywords, hashtags, or user names, which narrows the mass of messages down to the relevant ones. Then the filtering is refined. “If for example I only want German Tweets, then I use a language filter. If I’m only interested in certain language or sentence structures, then I can search for just that,” says Scheffler. To examine when Twitter users wake up, she selects all Tweets containing the phrase “Good morning!” and associates them with the times at which they were posted.
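The selection step described here can be pictured with a small Python sketch. It is only an illustration and not the code used in the study: the tweet records, their field names ("user", "lang", "text", "created_at"), and the function wake_up_times are assumptions made for this example, and no real Twitter interface is called.

```python
from datetime import datetime

# Hypothetical tweet records. The field names are assumptions for this
# illustration, not the schema of any real Twitter interface.
TWEETS = [
    {"user": "a", "lang": "en", "text": "Good morning! Coffee first.",
     "created_at": "2015-03-01T07:42:00"},
    {"user": "b", "lang": "de", "text": "Guten Morgen zusammen",
     "created_at": "2015-03-01T08:05:00"},
    {"user": "c", "lang": "en", "text": "Lunch time",
     "created_at": "2015-03-01T12:10:00"},
    {"user": "d", "lang": "en", "text": "good morning twitter",
     "created_at": "2015-03-02T06:15:00"},
]


def wake_up_times(tweets, phrase="good morning", lang="en"):
    """Return (user, datetime) pairs for tweets in `lang` that contain `phrase`."""
    phrase = phrase.lower()
    hits = []
    for tweet in tweets:
        # Language filter: skip tweets in other languages.
        if tweet["lang"] != lang:
            continue
        # Case-insensitive phrase match, paired with the posting time.
        if phrase in tweet["text"].lower():
            hits.append((tweet["user"], datetime.fromisoformat(tweet["created_at"])))
    return hits


for user, posted in wake_up_times(TWEETS):
    print(user, posted.strftime("%A %H:%M"))
```

Calling wake_up_times(TWEETS, "guten morgen", "de") would apply the same idea to German Tweets. A plain substring match like this also catches messages that merely mention the phrase, which is one reason a real study would refine the filter further.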

Before computer programs can identify and analyze gigantic volumes of linguistic information, they have to be trained. To do this, researchers manually add descriptive attributes to a certain volume of text, a step called “annotating.” “That is the most time-consuming step of them all,” says Scheffler. On this basis, the program learns to detect and categorize certain text features on its own. This machine learning process enables large volumes of text to be analyzed in a very short time.
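How a program can learn from annotated text may be easier to see in a toy example. The article does not say which learning method was used, so the sketch below swaps in a simple naive Bayes classifier with add-one smoothing as a stand-in. The example sentences, the labels "greeting" and "other", and the function names train and classify are all invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Hand-annotated examples (invented): each text carries a label added by a person.
ANNOTATED = [
    ("good morning everyone", "greeting"),
    ("good morning twitter", "greeting"),
    ("morning all time for coffee", "greeting"),
    ("the meeting this morning was long", "other"),
    ("good game last night", "other"),
    ("traffic was bad again", "other"),
]


def train(annotated):
    """Count words per label; these counts are all the model 'learns'."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in annotated:
        tokens = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def classify(model, text):
    """Return the label with the highest naive Bayes log-probability."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # words never seen in training carry no evidence
            # Add-one (Laplace) smoothing avoids zero probabilities.
            score += math.log((word_counts[label][token] + 1)
                              / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(ANNOTATED)
print(classify(model, "good morning friends"))
print(classify(model, "traffic was bad"))
```

Six sentences are far too few for reliable results; the point is only that the manual labels are what the program generalizes from, which is why annotation takes so much of the effort.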

The results of the “Good morning!” study show that this process works. “There are really lots of people who have their cell phone next to their bed, and the first thing that they do in the morning is to tweet about it,” says Scheffler. She collected all Tweets with the phrase “Good morning!” for one year. Overall, she evaluated about 1.5 million Tweets from about 200,000 users. She was particularly interested in the differences between wake-up times on workdays, when the alarm clock gives the signal to awaken, and on Sundays, when the wake-up time is determined more by natural factors. “In autumn and spring, wake-up times on Sundays are very close to when the sun comes up,” she explains. “This has also been shown in sleep studies.” This is because in these seasons the internal clock that gives the signal to awaken agrees most closely with natural light signals. In contrast, wake-up times in summer and winter deviate from sunrise times, which the Twitter data also showed. “We see this as a confirmation of our method,” says Scheffler. There are already plans for a joint research project with a chronobiologist in which the researchers will apply the new method.
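The comparison between workdays and Sundays can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis from the study: the timestamps are invented, the median is chosen here as one plausible summary, and the function names are hypothetical.

```python
from datetime import datetime
from statistics import median

# Invented "Good morning!" timestamps, for illustration only.
STAMPS = [
    datetime(2015, 3, 1, 8, 30),   # Sunday
    datetime(2015, 3, 8, 9, 0),    # Sunday
    datetime(2015, 3, 2, 6, 30),   # Monday
    datetime(2015, 3, 3, 6, 45),   # Tuesday
    datetime(2015, 3, 4, 7, 0),    # Wednesday
]


def minutes_after_midnight(ts):
    """Convert a timestamp to minutes since midnight of the same day."""
    return ts.hour * 60 + ts.minute


def median_wake_up(timestamps):
    """Median wake-up time in minutes after midnight, for Sundays and workdays.

    Raises statistics.StatisticsError if either group is empty.
    """
    # datetime.weekday(): Monday is 0, Sunday is 6.
    sundays = [minutes_after_midnight(t) for t in timestamps if t.weekday() == 6]
    workdays = [minutes_after_midnight(t) for t in timestamps if t.weekday() < 5]
    return {"sunday": median(sundays), "workday": median(workdays)}


result = median_wake_up(STAMPS)
print(result)
print("Sunday minus workday (minutes):", result["sunday"] - result["workday"])
```

To compare against sunrise, as the study did, the same minutes-after-midnight value could be subtracted from a sunrise time for the relevant date and place. That lookup is left out here because it would require data the article does not provide.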

Analyzing language from social networks can also benefit other research, says Scheffler, because there is actual communication going on there, not just posts. The structure of these conversations is very interesting for many researchers. What are people talking about? Are they communicating with people who have different opinions? What does the formation of political opinion look like in social media? These questions are of interest above all to social scientists. Computational linguists are supplying the tools needed to tap into these potential data sources, which can also be useful for political, communication, and media scholars.

 

The Scholars

Dr. Tatjana Scheffler studied computational linguistics in Saarbrücken and completed her doctorate at the University of Pennsylvania (USA). She has been conducting research at the University of Potsdam since 2013.

Contact

University of Potsdam
Department of Linguistics
Karl-Liebknecht-Str. 24-25
D-14476 Potsdam
Email: tatjana.scheffler@uni-potsdam.de

Text: Heike Kampe
Translation: Dr. Lee Holt 
Published online by: Agnetha Lang
Contact for the editorial office: onlineredaktion@uni-potsdam.de