Raw data, script and explanations for: So nah stehen sich die Parteien - und so nah ihre Kandidaten (How close the parties are to each other, and how close their candidates are).
The Wahl-O-Mat is a database published by the “Bundeszentrale für politische Bildung” (German Federal Agency for Civic Education) that is used to assess the positions of the political parties taking part in the Bundestagswahl 2017 (German federal election 2017). It compiles the parties’ positions on current political questions as statements that each party can approve, disapprove or stay neutral towards. This year, 32 of the 42 parties gave their position on the 38 statements about what should change in Germany’s future.
First, we manually gathered the parties’ positions and dummy coded them in a data set: approval = 1, disapproval = -1, neutrality = 0. We then cleaned, restructured and analyzed the data with the statistical programming environment R.
# install.packages("needs") # run once if the package is not installed yet
library(needs)
needs(tidyverse,lsa,magrittr) # load the needed packages
parteien <- read.csv("https://interaktiv.morgenpost.de/parteien-bundestagswahl-2017/data/wahlomat_raw.csv", sep=";", encoding='UTF-8', stringsAsFactors = F, check.names = F) # load data
d_parteien <- parteien %>% select(-2) %>% gather(Partei, Wert, 2:33) %>% spread(These, Wert) # manipulate structure
head(d_parteien[1:3])
## Partei Alle Banken sollen verstaatlicht werden.
## 1 AfD -1
## 2 Allianz Deutscher Demokraten -1
## 3 B* 1
## 4 BGE 0
## 5 BP -1
## 6 BüSo -1
## Alle Bürgerinnen und Bürger sollen bei gesetzlichen Krankenkassen versichert sein müssen.
## 1 -1
## 2 -1
## 3 1
## 4 1
## 5 1
## 6 -1
We calculated the agreement for every possible pair of parties. The loop we used to count matching opinions looks overwhelming, but it works as intended: it compares the answers of party 1 with the answers of party 2. Each identical answer counts as a full match. If one party approved or disapproved a statement while the other stayed neutral, we counted half a match, because the two do not directly contradict each other. The result is a matrix of agreements in which every value is the percentage agreement between two parties.
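proximity <- function(data){
  # result table: one row and one column per party, filled pairwise below
  # note: `<<-` writes `pr` to the global environment, which the later calls rely on
  pr <<- data.frame(data$Partei, matrix(NA, ncol = nrow(data), nrow = nrow(data)))
  colnames(pr) <<- c("party", data$Partei)
  for(j in 1:nrow(pr)){
    for(k in 1:nrow(pr)){
      a <- mean(data[j,-1] == data[k,-1], na.rm=T)*100 # share of identical answers
      b <- 0.5*mean(data[j,-1] != data[k,-1] & (data[j,-1] == 0 | data[k,-1] == 0), na.rm=T)*100 # half credit where exactly one side is neutral
      pr[j,k+1] <<- ifelse(is.na(sum(a,b, na.rm=T)), 0, sum(a,b, na.rm=T))
    }
  }
}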
proximity(d_parteien)
head(pr[1:4])
## party AfD Allianz Deutscher Demokraten
## 1 AfD 100.00000 53.94737
## 2 Allianz Deutscher Demokraten 53.94737 100.00000
## 3 B* 38.15789 50.00000
## 4 BGE 32.89474 44.73684
## 5 BP 75.00000 60.52632
## 6 BüSo 44.73684 61.84211
## B*
## 1 38.15789
## 2 50.00000
## 3 100.00000
## 4 81.57895
## 5 52.63158
## 6 59.21053
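To make the scoring rule concrete, here is a minimal sketch in base R that applies it to a single pair of made-up answer vectors. The helper name `agreement` and the vectors `party_a` and `party_b` are hypothetical and not part of the original script; the function only mirrors the two terms `a` and `b` from the loop above.

```r
# Hypothetical helper (not in the original script): agreement score for one pair.
# x and y are answer vectors coded 1 (approve), -1 (disapprove), 0 (neutral).
agreement <- function(x, y) {
  full <- mean(x == y, na.rm = TRUE)                      # identical answers
  half <- mean(x != y & (x == 0 | y == 0), na.rm = TRUE)  # exactly one side neutral
  100 * (full + 0.5 * half)
}

# Made-up answers to four statements
party_a <- c(1, -1, 0,  1)
party_b <- c(1,  0, 0, -1)

agreement(party_a, party_b)  # 2 full matches + 1 half match out of 4 = 62.5
agreement(party_a, party_a)  # 100
```

Statements 1 and 3 match fully, statement 2 is a half match (disapproval against neutral), and statement 4 is a direct contradiction that contributes nothing.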
Our interactive application lets the user take a closer look at the percentage agreement of all parties with a selected party. As an example, we visualize the agreement with the right-wing populist party “AfD”. The plot shows that right-wing parties like “Die Rechte” and “NPD” have a lot in common with the “AfD”, while the left-wing party “Die Linke” mostly disagrees.
ggplot(pr) +
geom_point(aes(AfD, reorder(party, AfD))) +
scale_x_reverse() +
ggtitle("Percentage agreement with position of AfD")
In the same way, we calculated and examined the political proximity between the six established parties within topic clusters such as “refugees and integration”.
d_topics <- parteien %>% filter(as.character(Thema) %in% "Flüchtlinge und Integration") %>% select(c(1,3:8)) %>% gather(Partei, Wert, 2:7) %>% spread(These, Wert) # get data of the six parties with topic "refugees and integration"
head(d_topics[1:3])
## Partei
## 1 AfD
## 2 CDU
## 3 FDP
## 4 Grüne
## 5 Linke
## 6 SPD
## Anerkannten Flüchtlingen, die sich Integrationsmaßnahmen verweigern, sollen die Leistungen gekürzt werden können.
## 1 1
## 2 1
## 3 1
## 4 0
## 5 -1
## 6 1
## Für die Aufnahme von neuen Asylsuchenden soll eine jährliche Obergrenze gelten.
## 1 1
## 2 0
## 3 -1
## 4 -1
## 5 -1
## 6 -1
The result is a similar matrix showing how closely the parties agree on statements about refugees and integration.
proximity(d_topics)
head(pr[1:4])
## party AfD CDU FDP
## 1 AfD 100.0 75.0 50.0
## 2 CDU 75.0 100.0 75.0
## 3 FDP 50.0 75.0 100.0
## 4 Grüne 37.5 62.5 87.5
## 5 Linke 25.0 50.0 75.0
## 6 SPD 50.0 75.0 100.0
To analyze the political proximity between the parties’ candidates, we used a different approach. First, we took a look at the raw data as we received it via mail.
d_kandidaten <- read.csv('https://interaktiv.morgenpost.de/parteien-bundestagswahl-2017/data/kandidaten_raw.tsv', sep="\t", encoding='UTF-8', stringsAsFactors = F)
head(d_kandidaten[1:4])
## Username Thesennummer Bereich
## 1 tino-guenther 1 Umwelt
## 2 tino-guenther 2 Flüchtlinge
## 3 tino-guenther 3 Ernährung
## 4 tino-guenther 4 Rente
## 5 tino-guenther 5 Miete
## 6 tino-guenther 6 Lobbyismus
## These
## 1 Dieselfahrzeuge sollen wegen ihres hohen Schadstoffausstoßes aus den Innenstädten verbannt werden.
## 2 Die Politik soll festlegen, wie viele Flüchtlinge Deutschland jedes Jahr aufnimmt.
## 3 Massentierhaltung muss reduziert werden, auch wenn das höhere Fleischpreise bedeutet.
## 4 Zur Vermeidung von Altersarmut müssen die Renten deutlich erhöht werden.
## 5 Vermieter sollen ohne eine staatliche Mietpreisbremse entscheiden können, wie viel Miete sie verlangen.
## 6 Es muss ein verbindliches Lobbyregister geben, in dem u.a. Kontakte zwischen Interessenvertretern und Politikern veröffentlicht werden.
Again, we dummy coded the candidates’ answers as numerical variables.
d_kandidaten$Auswahl[d_kandidaten$Auswahl %in% "nein"] <- -1
d_kandidaten$Auswahl[d_kandidaten$Auswahl %in% "ja"] <- 1
d_kandidaten$Auswahl[d_kandidaten$Auswahl %in% "neutral"] <- 0
d_kandidaten$Auswahl <- as.numeric(d_kandidaten$Auswahl)
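The same coding can also be written as a single lookup, which may be easier to read when there are more than a few labels. This is only an alternative sketch on made-up answers, not the method used above; the names `answers`, `codes` and `coded` are hypothetical.

```r
# Made-up answers using the labels from the raw data ("ja", "nein", "neutral")
answers <- c("ja", "nein", "neutral", "ja")

# Named lookup vector: label -> numeric code
codes <- c(nein = -1, neutral = 0, ja = 1)

# Index the lookup vector by label; labels missing from `codes` would become NA
coded <- unname(codes[answers])
coded  # 1 -1 0 1
```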
We then removed candidates who answered every statement neutrally, candidates without a username, and candidates with missing answers. Next, we restructured the data frame so that each row is a candidate and each column a statement.
d_kandidaten %<>% group_by(Username) %>% filter(!(all(Auswahl == 0)))
d_kandidaten %<>% select(c(1,4,5)) %>% unique() %>% spread(These, Auswahl) %>% filter(!(Username %in% "")) %>% na.omit()
head(d_kandidaten[1:2])
## # A tibble: 6 x 2
## # Groups: Username [6]
## Username
## <chr>
## 1 abuzar-erdogan
## 2 achim-czylwick
## 3 achim-kessler
## 4 achim-kohler
## 5 achim-post
## 6 adrian-assenmacher
## # ... with 1 more variables: `Afghanistan ist ein sicheres Herkunftsland,
## # in das Abschiebungen möglich sein müssen.` <dbl>
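The effect of the first filter can be illustrated on a tiny made-up table. The sketch below uses base R instead of the `group_by()` and `filter()` calls above; the data frame `toy` and the names `all_neutral` and `kept` are hypothetical.

```r
# Made-up long-format answers: two statements per candidate
toy <- data.frame(
  Username = c("cand_a", "cand_a", "cand_b", "cand_b", "cand_c", "cand_c"),
  Auswahl  = c(1, -1, 0, 0, 0, 1),
  stringsAsFactors = FALSE
)

# For each candidate: were all answers neutral?
all_neutral <- tapply(toy$Auswahl == 0, toy$Username, all)

# Keep only the rows of candidates with at least one non-neutral answer
kept <- toy[!toy$Username %in% names(all_neutral)[all_neutral], ]
unique(kept$Username)  # "cand_a" "cand_c"
```

Only `cand_b`, who stayed neutral on everything, is dropped; `cand_c` stays because one answer is not neutral.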
We calculated the political proximity as a cosine dissimilarity matrix (1 minus the cosine similarity of two candidates’ answer vectors) and then applied classical multidimensional scaling to place the candidates relative to each other in a two-dimensional space.
dist_kandidaten <- 1 - cosine(t(as.matrix(d_kandidaten[2:23])))
fit_kandidaten <- cmdscale(dist_kandidaten, k=2)
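As an illustration of these two steps, the following sketch runs them on four made-up candidates. It computes the cosine similarity by hand in base R as a stand-in for the `cosine()` call, so that each part of the calculation is visible; the matrix `answers` and the names `cos_sim`, `cos_dist` and `coords` are hypothetical.

```r
# Made-up answers: 4 candidates (rows) x 4 statements (columns)
answers <- rbind(
  cand_a = c(1,  1, -1, -1),
  cand_b = c(1, -1,  1, -1),
  cand_c = c(1, -1, -1,  1),
  cand_d = c(1,  1, -1,  0)   # differs from cand_a in one answer only
)

# Cosine similarity between rows: dot products divided by the product of row norms
norms   <- sqrt(rowSums(answers^2))
cos_sim <- (answers %*% t(answers)) / (norms %o% norms)

# Dissimilarity: 0 = same direction, 1 = unrelated, 2 = opposite
cos_dist <- 1 - cos_sim

# Classical multidimensional scaling: two coordinates per candidate
coords <- cmdscale(cos_dist, k = 2)
coords
```

In this toy data, `cand_a`, `cand_b` and `cand_c` are pairwise unrelated (dissimilarity 1), while `cand_a` and `cand_d` have the smallest dissimilarity, about 0.13.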
We then merged the coordinates with additional information like the candidates’ party affiliation.
fit_kandidaten <- as.data.frame(cbind(fit_kandidaten, d_kandidaten$Username))
names(fit_kandidaten) <- c('scatter_x', 'scatter_y', 'id')
d_kandidaten_zusatz <- read.csv('https://interaktiv.morgenpost.de/parteien-bundestagswahl-2017/data/kandidaten_2017.csv', sep=';', encoding='UTF-8', stringsAsFactors = F)
d_kandidaten <- merge(d_kandidaten_zusatz,fit_kandidaten,by="id",all.x=T)
d_kandidaten$scatter_x <- as.numeric(as.character(d_kandidaten$scatter_x))
d_kandidaten$scatter_y <- -as.numeric(as.character(d_kandidaten$scatter_y))
With party colors added, we plotted the candidates’ similarities. The plot shows how widely the candidates of each party spread in their opinions compared with the other parties, and where the boundaries between parties become blurred.
d_kandidaten$color <- "grey"
d_kandidaten$color[d_kandidaten$partei %in% "Linke"] <- "#96276E"
d_kandidaten$color[d_kandidaten$partei %in% "FDP"] <- "#F6BB00"
d_kandidaten$color[d_kandidaten$partei %in% "AfD"] <- "#34A3D2"
d_kandidaten$color[d_kandidaten$partei %in% "SPD"] <- "#DB4240"
d_kandidaten$color[d_kandidaten$partei %in% "CSU"] <- "#373737"
d_kandidaten$color[d_kandidaten$partei %in% "CDU"] <- "#373737"
d_kandidaten$color[d_kandidaten$partei %in% "Grüne"] <- "#4BA345"
# plot
ggplot(d_kandidaten, aes(scatter_x, scatter_y, color=color)) +
geom_point(alpha=0.7) +
scale_colour_identity()