Next — Today I Learnt About Data Science

How Duolingo’s AI Learns What You Need to Learn

Next | Issue #59

Harshvardhan
Feb 15
Hi there!

Today, I will share several interesting stories: hidden patterns in Albanian street names, a dark RStudio theme, a tool to embed Python in R, how to make better bar plots, and of course, how Duolingo’s AI works. The Duolingo article was fascinating to read: it goes in depth while staying sufficiently general.

Thanks for reading Next — Today I Learnt About R! Subscribe for free to receive new posts and support my work.

There is also a small poll at the end that I’d love for you to respond to. Dive in!

Five Stories

1. Hidden Patterns in Street Names

Dea Bardhoshi

Dea explores the gender distribution of street names in Tirana, Albania, using data from OpenStreetMap and hand-labelling (hats off for the effort!). She found that only 3.3% of the streets were named after women, and most of those women were foreign. She also used natural language processing techniques to analyze the most common words and topics in the street names and found that they reflect the history and culture of Albania.

Dea said she’ll be detailing more learnings in her upcoming newsletter, which I recommend. The Jupyter notebook and dataset are available on her GitHub.

Read now

2. night-owlish: An RStudio Theme

Mara Averick

This project adapts Night Owl, a popular VS Code theme created by @sdras, to other editors such as Ace and RStudio. The theme is easy on the eyes if you’re a dark-mode person. The one-liner below installs and applies it. (To install without applying, set apply = FALSE.)

rstudioapi::addTheme("https://raw.githubusercontent.com/batpigandme/night-owlish/master/rstheme/night-owlish.rstheme", apply = TRUE)

Check GitHub

3. Embedding Python in R

Sage Bionetworks

Using Python from R is a common task for professional data scientists. This package provides a standalone Python installation for embedding in R, which keeps the system installation separate from the one R uses. Quite handy for handling all those conflicting package versions.

Check GitHub

4. How Duolingo’s AI Learns What You Need to Learn

Klinton Bicknell, Claire Brust and Burr Settles, Duolingo AI Team

Duolingo is a language-learning app that uses a gamelike approach with sophisticated AI systems to guide users through a curriculum that leads to language proficiency. One of the AI systems, called Birdbrain, uses algorithms based on decades of research in educational psychology and recent advances in machine learning to continuously improve the learner's experience.

The company's ambitions go beyond language learning: it recently launched apps covering childhood literacy and third-grade mathematics. Duolingo's founders were inspired by the 2-sigma problem identified by educational psychologist Benjamin Bloom. They aimed to make an easy-to-use online language tutor that could approximate the supercharging effect of individual tutoring. Duolingo uses machine learning and other technologies to automate three critical attributes of good tutors: expertise, keeping learners engaged, and personalizing lessons.
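For a sense of scale on the 2-sigma problem: Bloom reported that students tutored one-to-one performed about two standard deviations better than students taught in a conventional classroom. Here is a minimal sketch of what that means as a percentile. It assumes test scores are normally distributed, which is a simplification, and the function name `percentile_of_shift` is made up for this illustration.

```python
from statistics import NormalDist


def percentile_of_shift(sigmas):
    """Percentile (within the baseline group) of a score that sits
    `sigmas` standard deviations above the baseline mean.

    Assumption: scores follow a normal distribution.
    """
    return NormalDist(mu=0.0, sigma=1.0).cdf(sigmas) * 100


two_sigma_percentile = percentile_of_shift(2.0)
print(f"A 2-sigma gain lands at about the {two_sigma_percentile:.1f}th percentile")
```

Under that assumption, the average tutored student would outscore roughly 98% of the classroom group, which is the "supercharging effect" an app like Duolingo is trying to approximate.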

This is a fascinating read with sufficient technical details. Do check it out.

Read on

5. Bar plot checklist

Albert Rapp

Bar plots are the plots I use most often. Common modifications I make: flipping the coordinate axes, arranging the bars in decreasing order, and sometimes changing colours to show deviation.

In this post, Albert covers a bunch of such techniques. You should bookmark it for the next time you create a bar plot in R.

Read on

Four Packages

rcrossref is the R interface to CrossRef’s API. GitHub.

ezsummary provides convenient functions for wrangling data. GitHub.

sparkline provides jQuery-based sparklines, which can also be used in R Markdown documents. GitHub.

rticles provides R Markdown and LaTeX templates for various journals. You can see the list. GitHub.

Three Jargons

Convolutional Neural Network (CNN): A type of neural network commonly used for image recognition, which uses convolutional layers to detect features in the input image.

Ensemble Learning: A technique where multiple machine learning models are trained and combined to improve the system's overall performance.

Bias-Variance Tradeoff: A fundamental concept in machine learning. A model with high bias is too simple and underfits; a model with high variance is too sensitive to the training data and overfits. The goal is to find a model that balances the two.
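The last two terms can be made concrete with a small simulation. This is only a toy sketch, not a real training pipeline: each "model" is simulated as the true value plus random noise (unbiased but high-variance), and the ensemble simply averages several of them. The constants `TRUE_VALUE` and `NOISE_SD` and the function names are made up for illustration.

```python
import random
import statistics

TRUE_VALUE = 10.0  # the quantity every model tries to predict (made up)
NOISE_SD = 2.0     # spread of a single model's error (made up)


def single_model_prediction(rng):
    """One unbiased but noisy model: the truth plus Gaussian noise."""
    return rng.gauss(TRUE_VALUE, NOISE_SD)


def ensemble_prediction(rng, n_models=25):
    """Average the predictions of n_models independent noisy models."""
    return statistics.fmean(
        single_model_prediction(rng) for _ in range(n_models)
    )


rng = random.Random(42)  # fixed seed so the run is repeatable
single = [single_model_prediction(rng) for _ in range(2000)]
ensemble = [ensemble_prediction(rng) for _ in range(2000)]

single_var = statistics.variance(single)
ensemble_var = statistics.variance(ensemble)
print(f"variance of one model:    {single_var:.3f}")
print(f"variance of the ensemble: {ensemble_var:.3f}")
```

Averaging leaves the bias unchanged (both stay centred on the true value) but shrinks the variance, which is one reason ensembles often generalise better than any single member. Real models are rarely fully independent, so the reduction in practice is smaller than in this sketch.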

Two Tweets

Jerrick Hoang @jhoang314
I've always wanted to work in AI/ML but the college I attended was a liberal arts college and did not have an ML course. It took me 1.5 years out of college to transition from a backend to ML and here's the exact list of resources I used,
5:56 PM ∙ Feb 12, 2023
4 Likes ∙ 2 Retweets

Alex Xu @alexxubyte
/1 How does ChatGPT work? Disclaimer: since OpenAI hasn't provided all the details, some parts of the diagram may be inaccurate. @sama, we would love to hear your feedback. We attempted to explain how it works in the diagram below. The process can be broken down into two parts.
4:45 PM ∙ Jan 31, 2023
2,498 Likes ∙ 710 Retweets

One Meme

Bonus

The Cultural Tutor @culturaltutor
Have you ever noticed that the save icon is a floppy disk, even though they became obsolete twenty years ago? That's called a "skeuomorph" - when something new takes on the appearance of what it replaced. And once you start to look, they're everywhere...
10:17 PM ∙ Feb 6, 2023
154,525 Likes ∙ 21,355 Retweets

© 2023 Harshvardhan