2022 Rewind Repeat | Next - Issue #56
What Goodhart said 46 years ago in Sydney was this, which he jokingly termed Goodhart’s Law: “Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.” In other words, as the British anthropologist Marilyn Strathern later boiled it down: “When a measure becomes a target, it ceases to be a good measure.”
Related:
What role did the neighbourhood you grew up in play in shaping your economic opportunities? Raj Chetty, an economist at Harvard, and a team of researchers sought to estimate upward mobility across socioeconomic lines. Their research showed that outcomes for children could vary widely even when their neighbourhoods were as little as a mile apart. Furthermore, the age at which someone moves profoundly affects their future earnings, but only up to a limit.
Author Aaron Williams tells a data story about migration, community, and returning to his roots. Check it out!
Tools that simplify some basic tasks in using R for exploring and analyzing a dataset in a matrix or data.frame that contains data on demographics (e.g., counts of residents in poverty) and local environmental indicators (e.g., an air quality index), with one row per spatial location (e.g., Census block group).
Hosts Alberto Cairo and Simon Rogers will explore the latest in data journalism. You will meet the world’s top data journalists and find out how they do what they do.
This blog post shows how to create a beautiful infographic using just ggplot2.
Elizabeth Esarove, AT&T; RStudio
Elizabeth Esarove is a data scientist in People Analytics at AT&T. In her role, Elizabeth is part of a larger team focused on embedding data and analytics into the root of decision-making and transforming insights into actionable solutions that improve employee outcomes and drive business value.
People are the face, heart, and hands of a company. In people analytics, we analyze data to reveal actionable insights that provide evidence for decisions regarding employees, work, and business objectives. This talk covers the use of data science for people analytics projects such as workforce planning, improving employee engagement, and retaining talent.
Probably the best example of the power of Quarto: programming in R, Python, and Julia in a single document. The example shows how to use Tweet data to create a word cloud. Check this blog post to learn how to do this.
sdmTMB is an R package that fits spatial and spatiotemporal predictive-process GLMMs (Generalized Linear Mixed Effects Models) using Template Model Builder (TMB), R-INLA, and Gaussian Markov random fields. One common application is for species distribution models (SDMs). See also the documentation site.
Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund
This chapter covers essential parts of Base R commands like lists, [[]], and $. They can simplify many tasks, like selecting a column from a data frame, organising objects together, and more.
Have you ever thought about exploring your Tweets? As the year draws to a close, maybe it is time. This notebook is a starting point for processing tweets from your Twitter archive, which you can download here. The website uses a local file input, so your data doesn't get uploaded anywhere and stays private.
Are people happy at work? The American Time Use Survey asks people to score their happiness from 0 to 6, where 0 is not happy, and 6 is very happy. Here’s how people answered.
An unusual lottery result made the news recently: on October 1, 2022, the PCSO Grand Lotto in the Philippines, which draws six numbers from 1 to 55 at random, managed to draw the numbers 9, 18, 27, 36, 45, 54 (though the balls were actually drawn in the order 9, 45, 36, 27, 18, 54). In other words, they drew exactly six multiples of nine from 1 to 55.
Numberphile called Terence Tao "The World's Best Mathematician". In his blog post, he works out how probable such a rare event is.
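For a rough sense of scale, here is a minimal Python sketch that counts the possible six-number draws from 1 to 55 and the chance of any one specific set coming up. It only covers this simplest calculation and is not a reproduction of the blog post's full argument; the variable names are my own.

```python
from math import comb

# Number of ways to choose 6 distinct numbers from 1..55 (order ignored).
total_combinations = comb(55, 6)

# Chance that one specific set, such as the multiples of nine, is drawn.
p_specific = 1 / total_combinations

# The multiples of nine between 1 and 55: there are exactly six of them,
# so only one possible draw consists entirely of multiples of nine.
multiples_of_nine = [n for n in range(1, 56) if n % 9 == 0]

print(f"Possible draws: {total_combinations:,}")
print(f"Chance of one specific draw: {p_specific:.2e}")
print(f"Multiples of nine: {multiples_of_nine}")
```

Every specific combination is equally unlikely. This one stands out only because the pattern is easy for us to notice.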
The Good Country Index is an effort to highlight and rank the countries that are doing good for the rest of the world. The Good Country Index measures what countries contribute to the world outside their borders and what they take away: it’s their balance sheet towards humanity and the planet. Select the metrics that are pertinent to you, and the ranks adjust accordingly.
They also have a podcast: People, Places, Power: The Podcast
Robert is a man of tiny stature from the United States who wears glasses and likes solving puzzles. Is he more likely to be (a) a truck driver or (b) a computer science student at Stanford?
If your mind was drawn to answer (b), that is the base rate fallacy at work. You probably figured that the description of Robert wouldn't fit too well with your image or stereotype of truck drivers. But if you were to take a step back and think about the bigger picture, there are clearly far more truck drivers in the United States than students at one university in California. Hence, it follows that Robert is more likely to be a truck driver!
This is a nice visualization of this bias.
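To see why the base rate dominates, here is a small Python sketch that applies Bayes' rule to the Robert puzzle. Every number in it is a made-up assumption for illustration (the group sizes and how well the description fits each group), and the function name `posterior` is my own.

```python
def posterior(group_sizes, fit_probability):
    """Return P(group | description) for each group via Bayes' rule."""
    # Expected number of people in each group who match the description.
    matching = {g: group_sizes[g] * fit_probability[g] for g in group_sizes}
    total = sum(matching.values())
    return {g: count / total for g, count in matching.items()}

# Hypothetical base rates: how many people are in each group.
group_sizes = {"truck driver": 3_500_000, "Stanford CS student": 1_000}

# Hypothetical likelihoods: share of each group fitting Robert's description.
fit_probability = {"truck driver": 0.01, "Stanford CS student": 0.50}

result = posterior(group_sizes, fit_probability)
for group, probability in result.items():
    print(f"{group}: {probability:.1%}")
```

Even if the description fits a Stanford student fifty times better under these assumed numbers, the sheer number of truck drivers means most people matching it are still truck drivers.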
In this lab session, I share how to use the apriori algorithm for association mining. The goal is to find useful causal and association rules that can help design company promotions. Plus, you get to see what's served at an Indian cafe.
I use arules and arulesViz for analysis and visualization, respectively.
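If you want a feel for what the algorithm does before opening the lab, here is a minimal pure-Python sketch of apriori. It is not the arules API and not the code from the lab session; the cafe orders, thresholds, and function names are all hypothetical.

```python
from itertools import combinations

# Hypothetical cafe orders; each set is one transaction.
transactions = [
    {"chai", "samosa"},
    {"chai", "samosa", "pakora"},
    {"chai", "pakora"},
    {"coffee", "sandwich"},
    {"chai", "samosa", "sandwich"},
]

def support(itemset, transactions):
    """Share of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def apriori(transactions, min_support):
    """Return {frequent itemset: support}, growing itemsets one item at a time."""
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items
               if support(frozenset([i]), transactions) >= min_support]
    frequent, k = {}, 1
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset, transactions)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into size-k candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent for s in combinations(c, k - 1))}
        current = [c for c in candidates if support(c, transactions) >= min_support]
    return frequent

def association_rules(frequent, min_confidence):
    """Return (lhs, rhs, support, confidence) tuples for rules lhs -> rhs."""
    rules = []
    for itemset, sup in frequent.items():
        for size in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(itemset, size)):
                confidence = sup / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((set(lhs), set(itemset - lhs), sup, confidence))
    return rules

frequent = apriori(transactions, min_support=0.4)
rules = association_rules(frequent, min_confidence=0.7)
for lhs, rhs, sup, confidence in rules:
    print(f"{lhs} -> {rhs} (support={sup:.2f}, confidence={confidence:.2f})")
```

Support says how often the items appear together, and confidence says how often the right-hand side shows up when the left-hand side does. Neither tells you that one purchase causes the other.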
That's a wrap!
Hope you had a wonderful 2022 and are excited about 2023. Here are some questions to ask yourself. If you plan to craft your New Year's resolution, I suggest trying a theme rather than a goal.
Remember, I'm taking a break for the next two months. The next Next will hit your inbox in February 2023. Happy holidays! 🎄🎅
Harsh