Datenanalyse und Stochastische Modellierung - Dr. Philipp Meyer
Exercise 5

Political approval ratings

In this exercise we examine the predictability (short-time autocorrelations) of political polls. The scaling of the time-averaged mean squared displacement (TAMSD) with the time lag indicates the type of dynamics: steeper-than-linear scaling implies positively correlated increments, linear scaling implies uncorrelated increments (random walk, not predictable), and slower-than-linear scaling implies anti-correlated increments.
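
For reference, the TAMSD of a discrete series x_1, ..., x_N at time lag \Delta is commonly defined as shown here. The symbols N, \Delta and \alpha are introduced for illustration only and are an assumption about the convention intended in this exercise:

\[ \overline{\delta^2(\Delta)} = \frac{1}{N - \Delta} \sum_{t=1}^{N - \Delta} \left( x_{t+\Delta} - x_t \right)^2 \]

A power-law scaling

\[ \overline{\delta^2(\Delta)} \propto \Delta^{\alpha} \]

appears as a straight line with slope \alpha in a log-log plot, so \alpha > 1 corresponds to steeper than linear, \alpha = 1 to linear and \alpha < 1 to slower than linear.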

We use polling data for the approval ratings of Angela Merkel during her time as chancellor of Germany. The data are available from the website of Forschungsgruppe Wahlen (https://www.forschungsgruppe.de/Umfragen/Politbarometer/Langzeitentwicklung_-_Themen_im_Ueberblick/Politik-Archiv_1/Legislatur_2017_-_2021/Arbeit_Merkel_2021.xlsx). Import the data using the following Python code, with the URL as filename:

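from io import BytesIO
import urllib.request

import numpy as np
import openpyxl

# URL of the Excel file given above
filename = "https://www.forschungsgruppe.de/Umfragen/Politbarometer/Langzeitentwicklung_-_Themen_im_Ueberblick/Politik-Archiv_1/Legislatur_2017_-_2021/Arbeit_Merkel_2021.xlsx"

# download the workbook and open it from memory
datasource = urllib.request.urlopen(filename).read()
file = openpyxl.load_workbook(filename=BytesIO(datasource), data_only=True)
tabelle = file["Tabelle1"]

# convert the sheet into a 2D array of cell values
allCells = np.array([[cell.value for cell in row] for row in tabelle.iter_rows()])
data = allCells[9:296, 1:4]  # cells in which the data can be found

x = data[:, 1]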

Alternatively, you can save the data in a .txt-file and use the function np.loadtxt to import it.
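
A minimal sketch of the np.loadtxt route, assuming a text file with one approval value per line. The file name and the four numbers are made-up placeholders (not real polling data), written to a temporary directory only so that the example runs on its own:

```python
import os
import tempfile

import numpy as np

# Placeholder values standing in for the approval ratings (NOT real data)
values = np.array([0.71, 0.69, 0.73, 0.70])

# Hypothetical file name; in practice this would be the .txt file you saved
path = os.path.join(tempfile.gettempdir(), "approval_example.txt")
np.savetxt(path, values)

# Read the column back as a 1D float array
x = np.loadtxt(path)
print(x)
```

If the file contains several columns, the `usecols` argument of np.loadtxt selects the one holding the ratings.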

  • plot the approval ratings against time
  • calculate the TAMSD and display the result in a log-log plot
  • for long times, the TAMSD saturates because the percentage values are confined; we therefore concentrate on the short-time scaling, i.e. the first 5 points of the TAMSD; fit a power law (a linear fit in the log-log plot) to these first 5 values and plot the fitted scaling together with the TAMSD in one figure
  • polling data carry an uncertainty because the sample of people interviewed is small compared to the whole population; in our case, the sample size is 1250, which leads to white noise whose variance depends on the polling result x (expressed as a fraction between 0 and 1): \[ \sigma^2 = \frac{x(1 - x)}{1250} \]
  • calculate the TAMSD of the noise term using the average value of x
  • subtract the TAMSD of the noise from the TAMSD of the data, following the superposition principle; fit the first 5 points of the residual TAMSD and plot both the residual TAMSD and the fitted scaling; is the approval rating predictable on short time scales?
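
The steps in the list above can be sketched as follows. This is one possible approach under stated assumptions, not a reference solution: the function names tamsd, fit_scaling and noise_tamsd are hypothetical, the series is synthetic (a mean-reverting signal with arbitrary parameters plus sampling noise) rather than the real polling data, and x is assumed to be a fraction between 0 and 1. The noise term uses the fact that independent white noise with variance \sigma^2 contributes a constant 2\sigma^2 to the TAMSD at every lag.

```python
import numpy as np


def tamsd(x, max_lag):
    """Time-averaged mean squared displacement for lags 1..max_lag."""
    x = np.asarray(x, dtype=float)
    lags = np.arange(1, max_lag + 1)
    values = np.array([np.mean((x[lag:] - x[:-lag]) ** 2) for lag in lags])
    return lags, values


def fit_scaling(lags, values, n_points=5):
    """Fit values ~ prefactor * lags**alpha to the first n_points (log-log line)."""
    slope, intercept = np.polyfit(
        np.log(lags[:n_points]), np.log(values[:n_points]), 1
    )
    return slope, np.exp(intercept)


def noise_tamsd(x_mean, sample_size):
    """TAMSD of white noise with variance x(1-x)/sample_size: 2*sigma^2 at every lag."""
    return 2.0 * x_mean * (1.0 - x_mean) / sample_size


# Synthetic stand-in for the polling series (NOT the real data): a mean-reverting
# signal around 0.6 plus sampling noise; all parameters are arbitrary assumptions.
rng = np.random.default_rng(0)
n_polls, sample_size = 287, 1250
signal = np.empty(n_polls)
signal[0] = 0.6
for t in range(1, n_polls):
    signal[t] = 0.6 + 0.95 * (signal[t - 1] - 0.6) + rng.normal(0.0, 0.03)
signal = np.clip(signal, 0.01, 0.99)
x = signal + rng.normal(0.0, np.sqrt(signal * (1.0 - signal) / sample_size))

# TAMSD of the noisy series and short-time fit to its first 5 points
lags, msd = tamsd(x, max_lag=50)
alpha_raw, prefactor_raw = fit_scaling(lags, msd)

# Subtract the constant noise contribution and fit the residual
residual = msd - noise_tamsd(np.mean(x), sample_size)
alpha_residual, prefactor_residual = fit_scaling(lags, residual)

print(f"exponent of raw TAMSD:      {alpha_raw:.2f}")
print(f"exponent of residual TAMSD: {alpha_residual:.2f}")
```

With the real data, replace the synthetic x by the imported ratings (divided by 100 if they are stored in percent). The curves can then be drawn with matplotlib, for example plt.loglog(lags, msd) together with plt.loglog(lags[:5], prefactor_raw * lags[:5] ** alpha_raw). If the residual becomes zero or negative at short lags, the logarithm is undefined and the fit cannot be carried out on those points.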