Five Data Science + R People | Next - Issue #54
Five Stories
Vicky Boykis is a machine learning engineer at Duo. Her blog posts mix intelligent technical rants about data science in production with reflections on the implications of these technologies. I have followed her blog for a long time and always enjoy going back to it for a fresh perspective.
Some of my favourite posts are:
Cédric writes about data visualization. His posts concentrate on simple methods that can enhance your plots tenfold. Currently, he is an independent data visualization specialist based in Berlin, Germany.
Some of my favourite posts from his blog are:
Jesse Mostipak hosts the Data + Curiosity podcast, among other things. On her blog, she posts about the tricks she has picked up on her machine learning and data science journey.
Some of her videos and blog posts I like:
Twitter asked, Alison and Allison Answered - Palmer Penguins (Video)
Creating a bump plot using {ggbump} and Parks Access data (Blog)
Business Science produces courses on data science. They aim to narrow the gap between data scientists' skill sets and business objectives. It is run by Matt Dancho, creator of the tidyquant and timetk packages, along with Haley Dancho, who manages the company's finances. Their blog posts are well-illustrated and detailed. Long-time readers will recall the many Business Science posts I've included in this letter.
Some of my favourite posts:
Laura Ellis is the Vice President of Engineering and Platform Analytics for Rapid7.
She is a data leader with the technical skills to implement analytics projects and the soft skills to drive project success and speak analytics to a non-technical audience. Her mission is to make data science and analytics accessible to everyone in a secure and scalable manner.
In particular, I enjoy her three-minute #FunDataFridays, each a quick scroll through a cool data resource. She has a good collection of blog posts as well. Some of my favourites:
Four Packages
tidyquant is a package that makes the xts, zoo, quantmod, TTR and PerformanceAnalytics packages easy to use with the tidyverse. GitHub.
timetk is a package for visualising, wrangling and feature-engineering time-series data stored in data frames. The vignette lists its benefits over other time-series packages. GitHub.
inspectdf is a collection of utility functions for exploring data frames. Functions can summarise missing values, categories, distribution, correlation, etc. Vignette.
esquisse is an RStudio add-in which can help you explore a data frame by creating numerous visualisations with almost no code. Vignette.
Three Jargons
A functional in R is a function that takes a function as an input and returns a vector as output. Examples include apply(), lapply(), tapply(), etc.
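To make that concrete, here is a minimal sketch using only base R. The data (scores, values, groups, m) is made up purely for illustration.

```r
# Toy data: three numeric vectors of different lengths (made up for illustration).
scores <- list(a = c(1, 2, 3), b = c(10, 20), c = c(5, 5, 5, 5))

# lapply() applies mean() to each element and always returns a list.
means_list <- lapply(scores, mean)

# sapply() does the same, but simplifies the result to a named numeric vector.
means_vec <- sapply(scores, mean)

# tapply() applies a function to groups of values defined by a grouping vector.
values <- c(4, 6, 10, 20)
groups <- c("x", "x", "y", "y")
group_sums <- tapply(values, groups, sum)

# apply() works over the rows (1) or columns (2) of a matrix.
m <- matrix(1:6, nrow = 2)
row_totals <- apply(m, 1, sum)
```

In each call the function (mean or sum) is passed as an argument, and that is what makes these functionals.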
S3 is R's simplest and most widely used object-oriented system, and the one the base and stats packages rely on. A class has no formal definition: it is just an attribute attached to an object, and that informality makes it surprisingly useful.
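A small sketch of what "no formal definition" looks like in practice. The penguin class, the new_penguin() constructor and the describe() generic are hypothetical names invented for this example.

```r
# Constructor: an S3 object is just a base object (here a list) with a class attribute.
new_penguin <- function(name, mass_g) {
  structure(list(name = name, mass_g = mass_g), class = "penguin")
}

# A method for the existing print() generic, found via the generic.class naming scheme.
print.penguin <- function(x, ...) {
  cat("<penguin>", x$name, "-", x$mass_g, "g\n")
  invisible(x)
}

# A new generic with a default method and a penguin-specific method.
describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) "an ordinary R object"
describe.penguin <- function(x, ...) paste(x$name, "is a penguin")

p <- new_penguin("Pingu", 4200)
print(p)        # dispatches to print.penguin()
describe(p)     # dispatches to describe.penguin()
describe(1:3)   # falls back to describe.default()
```

Nothing here declares the class up front. Setting the class attribute and following the naming scheme is all that dispatch needs.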
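R6 is a package for object-oriented programming in R. Its classes have reference semantics, so objects are modified in place, and it is a lighter alternative to base R's reference classes.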
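Here is a minimal sketch of those reference semantics, assuming the R6 package is installed. The Counter class is a hypothetical example, not something from the package itself.

```r
library(R6)

# A tiny class with one public field and one public method.
Counter <- R6Class("Counter",
  public = list(
    count = 0,
    add = function(n = 1) {
      self$count <- self$count + n
      invisible(self)  # returning self allows method chaining
    }
  )
)

a <- Counter$new()
b <- a           # no copy is made: a and b refer to the same object
b$add(5)
a$count          # the change made through b is visible through a

c2 <- a$clone()  # clone() creates an independent copy
c2$add()
```

After the last line, c2$count has moved on by one while a$count is unchanged. With ordinary R objects, the assignment b <- a would already behave like a copy.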
Two Tweets
One Meme
That's a wrap!
I hope you enjoyed today’s letter. If you enjoy the newsletter and would like to support it, you can buy me a coffee here. See you next week!
Harsh