In 2012, the first World Happiness Report was published. The goal of this report was to measure and evaluate the overall happiness of the world. Recently, the 2017 version of the World Happiness Report was published with updated findings. According to this report, of the many metrics used to identify progress and success throughout the world, happiness is gaining popularity.
The goal of this project was to 1) map the data from the 2017 World Happiness Report so that the happiness distribution around the world could quickly be assessed and 2) test the hypothesis that a country’s population density and Ethnic Fractionalization Index (EFI) are correlated with happiness. The hypothesis is that a country that is either too densely or too sparsely populated, or whose EFI score is too high or too low, will have lower happiness. The EFI score for a country is a measure of how ethnically diverse it is. If a country has a score of 1, there are two or more completely unrelated languages present, and if it has a score of 0, all people within the country speak the same language.
Concept inspiration for this project came directly from the World Happiness Report itself. Within the report is a choropleth map that shows the global distribution of country happiness scores. This map easily reveals the distribution of scores, but offers no information on other variables that may be correlated with the happiness distribution. In addition, the map uses a two color family palette, which is potentially misleading as there is only one variable present.
An example of a successful use of a one color family choropleth map comes from Nikhil Sonnad’s Dude Map. This interactive map shows the U.S. frequency distribution of the use of five different nouns. This map is clear, easy to understand and read, and uses a relatively lighthearted primary subject matter to draw attention to linguistic variations throughout the country.
Finally, Kyle Kim’s visualizations of the California drought, which were published in the LA Times, were used as inspiration for how a small multiples format can be used effectively in mapping, as well as how a red-yellow color palette works by drawing on the associative power of those colors (red = bad, yellow = better).
Data for this project were gathered from three different sources. The happiness data were downloaded from worldhappiness.report. The current population density data by country were collected from worldometers.info. Finally, the EFI information was collected by James Fearon and made available on chartsbin.com.
These data were cleaned in Microsoft Excel and mapped in Carto. The final renderings of static maps were produced in Adobe Photoshop.
The data from the World Happiness Report were the primary dataset for this project because this was the smallest set in terms of the number of countries for which there were data, and therefore was the dataset that all others were compared against. The World Happiness Report generates each country’s happiness score as follows:
Our analysis of the levels, changes, and determinants of happiness among and within nations continues to be based chiefly on individual life evaluations, roughly 1,000 per year in each of more than 150 countries, as measured by answers to the Cantril ladder question: “Please imagine a ladder, with steps numbered from 0 at the bottom to 10 at the top. The top of the ladder represents the best possible life for you and the bottom of the ladder represents the worst possible life for you. On which step of the ladder would you say you personally feel you stand at this time?” We will, as usual, present the average life evaluation scores for each country, based on averages from surveys covering the most recent three-year period, in this report including 2014-2016.
The data from the World Happiness Report were imported into Excel and organized in alphabetical order by country name. The same was then done with the population density data and EFI data. Because there were population and EFI data present for countries that did not have a happiness score, a VLOOKUP formula was used to eliminate all countries that did not have at least 2 of the 3 variables. Ultimately, there were a few countries for which there were happiness and population density data, but not EFI data. These countries were kept in the dataset.
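The lookup step described above can be sketched outside of Excel. The following is a minimal illustration, not the workflow actually used: it assumes each dataset is a mapping from country name to value, treats the happiness data as the primary key (as described earlier), and keeps a country only if it has at least 2 of the 3 variables. The country names, values, and function names are hypothetical placeholders.

```python
# Hypothetical toy records; names and values are placeholders, not report data.
happiness = {"Country A": 7.1, "Country B": 5.4, "Country C": 3.2}
density = {"Country A": 15.0, "Country B": 120.0, "Country C": 8.0, "Country D": 300.0}
efi = {"Country A": 0.10, "Country C": 0.79, "Country D": 0.45}


def join_on_happiness(happiness, density, efi):
    """Keep only countries with a happiness score; missing values become None."""
    rows = []
    for country in sorted(happiness):  # alphabetical, as in the spreadsheet
        rows.append({
            "country": country,
            "happiness": happiness[country],
            "density": density.get(country),  # None if the country is absent
            "efi": efi.get(country),          # None if the country is absent
        })
    return rows


def has_at_least_two(row):
    """True when at least 2 of the 3 variables are present."""
    values = (row["happiness"], row["density"], row["efi"])
    return sum(v is not None for v in values) >= 2


merged = [r for r in join_on_happiness(happiness, density, efi) if has_at_least_two(r)]
for row in merged:
    print(row)
```

In this sketch, "Country D" is dropped because it has no happiness score, while "Country B" is kept with a missing EFI value, mirroring the countries that were retained without EFI data.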
The first map that was produced was an interactive map based on happiness score. A single color family palette was used because there is only one variable being displayed, and colors were chosen based on associative properties. Red was used for those countries with the lowest happiness scores, and progressively lighter and more yellow hues were used for higher happiness scores. The countries for which there were no data were left white to separate them from those with data. In addition, users can zoom into the map and click on individual countries to reveal the absolute values for happiness, EFI, and population density. This map can be reviewed here.
In an effort to help users visually assess the hypothesis, a triptych of static maps was created. Here users see a static version of the above happiness map along with two other maps displaying population density and EFI distributions. Because the hypothesis is that a population density or an EFI score that is either too low or too high correlates with a low happiness score, both the high and low ends of the scales on these two maps were colored red, with hues adjusting progressively to light yellow in the center. The same color palette was used on all three maps to help users visually examine potential correlations, and legends were included to make clear what variable is being displayed. Initially, the EFI map introduced an error, as Carto interpreted cells with a null value as 0 and colored them as if they had an EFI score of 0. Because 0 is a valid EFI score, these null cells were coded as a negative number in the dataset and filtered out using a frequency histogram widget. This rendered the countries with no EFI score as white, rather than red. The three maps were assembled in Photoshop and can be viewed below.
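The two ideas in this paragraph, coloring both ends of a scale red and keeping "no data" distinct from a legitimate 0, can be illustrated with a short sketch. This is not how Carto implements its styling; the palette, the `folded_color` function, and the use of `None` for missing values are assumptions made for illustration.

```python
# Hypothetical five-step palette: light yellow (near center) to dark red (far from center).
PALETTE = ["#ffffb2", "#fecc5c", "#fd8d3c", "#f03b20", "#bd0026"]
NO_DATA = "#ffffff"  # white for countries without a value


def folded_color(value, low, high):
    """Color a value by its distance from the midpoint of [low, high].

    None means "no data" and is handled before any arithmetic, so it can
    never be confused with a valid score of 0.
    """
    if value is None:
        return NO_DATA
    mid = (low + high) / 2
    half_range = (high - low) / 2
    # 0.0 at the center of the scale, 1.0 at either end.
    distance = min(abs(value - mid) / half_range, 1.0)
    index = min(int(distance * len(PALETTE)), len(PALETTE) - 1)
    return PALETTE[index]


print(folded_color(0.0, 0.0, 1.0))   # lowest score: darkest red
print(folded_color(0.5, 0.0, 1.0))   # center of the scale: light yellow
print(folded_color(1.0, 0.0, 1.0))   # highest score: darkest red
print(folded_color(None, 0.0, 1.0))  # missing score: white
```

Representing missing values explicitly, rather than with a sentinel number, avoids the situation described above where a null was read as an EFI score of 0.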
The interactive happiness map reveals the same distribution of scores as the map contained within the World Happiness Report, which is logical as they utilized the same (or at least a very similar) data set. From this map we can see that Africa has the highest concentration of low happiness scores and that, in general, predominantly Western countries, such as the U.S., Canada, Australia, and several northern European countries, have the highest happiness scores. Using the pop-up feature we can see that the country with the lowest happiness score (Central African Republic) is sparsely populated (8 pp/sq km) and has a high EFI score (0.791), while the country with the highest happiness score (Norway) has almost double the population density (15 pp/sq km), but a far lower EFI score (0.1). Based on these two extremes, it is tempting to conclude that EFI and happiness score are negatively correlated; however, the static maps reveal that it is not so simple.
Attempting to visually correlate country happiness score, EFI score, and population density quickly becomes complicated, with the potential for misleading visualizations. Because of the lack of precision in attributing a color to a given score, as well as the lack of information available for what numerical range each color represents, the maps become too vague for detailed analysis. In addition, the colors in the population density map are evenly spaced across the scale, but the underlying data are not, creating a skewed representation where the average is pushed far to one side of the scale. Another issue with these visualizations is that they are not especially intuitive, which could lead users unfamiliar with them to misread the maps.
These issues aside, based on these maps there doesn’t seem to be a strong correlation between happiness and population density, and there is perhaps a small negative correlation between EFI score and happiness. There are enough visible exceptions, however, that the argument is not very strong from this analysis alone.
Other directions for study based on this project could be to identify other variables that may correlate more strongly with the world happiness distribution and map those to support or refute their correlation. In addition, calculations could be made to combine EFI score and population density in different ways to see how happiness score correlates as a function of both EFI score and population density together. It would also be beneficial to include more traditional correlation visualizations, such as a scatter plot, to support the maps and avoid potential user errors when visually correlating variables by color.
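The skew described for the population density map can be made concrete by comparing two ways of assigning values to color classes. The sketch below is illustrative only: the density values are invented to be deliberately skewed, and both function names are hypothetical. Equal-interval classes cut the numeric range into equal widths, while quantile classes put roughly the same number of countries in each class.

```python
def equal_interval_class(value, low, high, n_classes):
    """Class index when the range [low, high] is cut into equal-width bins."""
    if high == low:
        return 0
    index = int((value - low) / (high - low) * n_classes)
    return min(index, n_classes - 1)


def quantile_classes(values, n_classes):
    """Class index per value so each class holds roughly the same number of items."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    classes = [0] * len(values)
    for rank, i in enumerate(order):
        classes[i] = min(rank * n_classes // len(values), n_classes - 1)
    return classes


# Hypothetical, deliberately skewed densities (people per sq km); not real country data.
densities = [3, 8, 15, 30, 60, 90, 150, 400, 1200, 7000]
equal = [equal_interval_class(d, min(densities), max(densities), 5) for d in densities]
quant = quantile_classes(densities, 5)
print(equal)  # [0, 0, 0, 0, 0, 0, 0, 0, 0, 4]
print(quant)  # [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
```

With equal intervals, a single extreme value pushes nearly every other country into the lowest class, which is one way a map can look uniform even when the data vary. Quantile classes spread the colors across countries, at the cost of hiding how far apart the underlying values are.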
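As one possible starting point for such a follow-up, the sketch below computes a Pearson correlation coefficient directly and also tests the "too high or too low" form of the hypothesis by correlating happiness with each country's distance from the median of the other variable. This was not part of the project; the function names and the toy values (shaped like an inverted U) are hypothetical, and using the median as the "center" is an assumption.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def distance_from_center(values):
    """Absolute distance of each value from the median of the values."""
    ordered = sorted(values)
    n = len(ordered)
    median = ordered[n // 2] if n % 2 else (ordered[n // 2 - 1] + ordered[n // 2]) / 2
    return [abs(v - median) for v in values]


def complete_pairs(xs, ys):
    """Drop pairs where either value is missing (None)."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    return [p[0] for p in pairs], [p[1] for p in pairs]


# Hypothetical values shaped like an inverted U; not real country data.
efi = [0.0, 0.25, 0.5, 0.75, 1.0, None]
happiness = [3.0, 5.0, 7.0, 5.0, 3.0, 6.0]

x, y = complete_pairs(efi, happiness)
linear_r = pearson(x, y)                        # near zero: a straight line misses the pattern
folded_r = pearson(distance_from_center(x), y)  # strongly negative: extremes pair with low scores
print(linear_r, folded_r)
```

The toy data show why a plain linear correlation could understate the hypothesized relationship: if happiness peaks in the middle of the EFI range, the linear coefficient can be close to zero even though the folded variable is strongly related to happiness.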