In this take-home exercise, I extend and improve on the work of my classmate, Yeo Kim Siang.
We will use the following packages in this exercise.
packages <- c('tidyverse', 'patchwork')
for (p in packages) {
  # Install the package only if it cannot be loaded
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}
From the existing bar charts of householdSize and haveKids, we can speculate that a household of size 3 includes a kid. We will use a grouped bar chart to verify that.
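# Grouped bar chart of household size, split by haveKids
ggplot(part_data, aes(fill = haveKids, x = householdSize)) +
  geom_bar(position = "dodge", stat = "count") +
  # Dodge the count labels so each one sits above its own bar
  geom_text(stat = 'count', aes(label = after_stat(count)),
            position = position_dodge(width = 0.9), vjust = -0.5, size = 3)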
From the bar chart above, we can confirm that there are no kids in households with only 1 or 2 members.
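The same check can also be made numerically with a cross-tabulation. The sketch below is only illustrative: `toy_data` is a small made-up data frame standing in for `part_data`, and its values are assumptions rather than the real data. Only the column names `householdSize` and `haveKids` come from the chart above.

```r
# Hypothetical toy data standing in for part_data (values are made up)
toy_data <- data.frame(
  householdSize = c(1, 1, 2, 2, 2, 3, 3),
  haveKids      = c(FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE)
)

# Cross-tabulate household size against haveKids
size_by_kids <- table(
  householdSize = toy_data$householdSize,
  haveKids      = toy_data$haveKids
)
print(size_by_kids)

# Household sizes in which no participant has kids
kids_counts <- size_by_kids[, "TRUE"]
sizes_without_kids <- names(kids_counts[kids_counts == 0])
print(sizes_without_kids)
```

Running the same `table()` call on `part_data` should show zeros in the `TRUE` column for household sizes 1 and 2 if the reading of the chart is correct.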
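# Jitter plots of haveKids against educationLevel and interestGroup
p1 <- ggplot(part_data, aes(x = haveKids, y = educationLevel)) +
  geom_jitter()
p2 <- ggplot(part_data, aes(x = haveKids, y = interestGroup)) +
  geom_jitter()
# Show the two plots side by side with patchwork
p1 + p2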
# Overlaid, semi-transparent densities of joviality for each haveKids group
ggplot(part_data, aes(fill = haveKids, x = joviality)) +
  geom_density(position = "identity", alpha = 0.5)
From the plot above, we can see that for participants with kids, the density rises as joviality approaches 1. For participants without kids, the density is higher at lower joviality values.
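A per-group summary can back up what the density plot suggests. The sketch below again uses a made-up `toy_data` frame in place of `part_data`, so the numbers are assumptions chosen for illustration. Only the column names `haveKids` and `joviality` come from the plot.

```r
# Hypothetical toy data standing in for part_data (values are made up)
toy_data <- data.frame(
  haveKids  = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE),
  joviality = c(0.9, 0.7, 0.8, 0.2, 0.4, 0.3)
)

# Mean and median joviality for each haveKids group
mean_by_group   <- tapply(toy_data$joviality, toy_data$haveKids, mean)
median_by_group <- tapply(toy_data$joviality, toy_data$haveKids, median)

print(mean_by_group)
print(median_by_group)
```

If the group with kids really does lean toward higher joviality in `part_data`, its mean and median should come out above those of the group without kids.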