Next — Today I Learnt About Data Science

RMarkdown/Quarto Tips and Tricks

Next | Issue #58

Harshvardhan
Feb 8

Yesterday, Microsoft announced the new Bing. It’s an exciting time for search. Speaking of AI, you should check out futuretools.io: a directory of free, freemium and paid AI tools.

Let’s dive in!

Five Stories

1. RMarkdown/Quarto Tips and Tricks

Indrajeet Patil

This website provides several tips and tricks for working with R Markdown and Quarto files. My favourite ones:

  • Displaying data frames side-by-side

  • Viewing the live document while editing, with no knitting needed

  • Tufte-style margin content

Indrajeet also posted an R function daily on Twitter for a year, which I always looked forward to.

2. Microsoft kickstarts the AI arms race

Casey Newton

Search hasn’t changed much in the past twenty years. It was a collection of links, and it is a collection of links. The collection may have improved, but it’s still a collection of links. Microsoft has partnered with OpenAI to bring an improved version of ChatGPT to the masses. The new version of Bing, powered by ChatGPT, promises a new paradigm in search. Microsoft executives were exultant at the launch event as they demonstrated how the reimagined Bing could instantly generate results.

We’re walking closer and closer to a world organised by search.

The new Bing will be integrated closely with Edge, providing services such as summarising a PDF with a click, generating LinkedIn posts from prompts, converting a StackOverflow answer written in Python to R, and much more. Let’s see what Google’s Bard has to show us. (Google’s special event is today.)

3. Principal Components Analysis (in Python)

Steven Morse

Principal Components Analysis (PCA) is a staple of the data science world, yet it is often misunderstood and underappreciated. With multiple definitions and approaches, it is easy to get lost in the jumble of information. In this post, Steven Morse takes a fresh look at PCA, exploring it in a way that is informative and easy to understand.

In the final section, he also shows its performance with image compression, which is really cool!
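
To make the compression idea concrete, here is a minimal pure-Python sketch. It is not code from Steven's post: the function names (first_principal_component, compress, decompress) and the toy points are hypothetical, and it finds only the first component using power iteration on the covariance matrix instead of the library SVD or eigendecomposition you would normally reach for.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (means, direction, variance) for the first principal component.

    Assumption: power iteration on the sample covariance matrix. It can fail
    if the starting vector happens to be orthogonal to the top direction.
    """
    n, d = len(rows), len(rows[0])
    # Centre every column on its mean.
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centred = [[row[j] - means[j] for j in range(d)] for row in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Repeatedly multiply by the covariance matrix and renormalise.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance along the current vector
            break
        v = [x / norm for x in w]
    # Variance captured along the direction: v^T C v.
    cov_v = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
    variance = sum(v[i] * cov_v[i] for i in range(d))
    return means, v, variance


def compress(row, means, direction):
    """Reduce a point to a single number: its coordinate along the direction."""
    return sum((x - m) * u for x, m, u in zip(row, means, direction))


def decompress(score, means, direction):
    """Approximate the original point from its single score."""
    return [m + score * u for m, u in zip(means, direction)]


# Toy data lying exactly on the line y = 2x, so one component captures everything.
points = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
means, direction, variance = first_principal_component(points)
scores = [compress(p, means, direction) for p in points]
restored = [decompress(s, means, direction) for s in scores]
print(direction, variance)
print(restored)
```

Each two-number point is stored as one score and then rebuilt from it. Image compression with PCA follows the same pattern with many more dimensions and several components kept.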

4. Open source in pharma from five perspectives

Posit PBC.

The pharmaceutical industry uses R extensively. Posit joined hands with five pharmaceutical stakeholders to discuss investment in and long-term commitment to open-source software, particularly in R as a statistical platform.

  • Shifting to an Open-Source Backbone in Clinical Trials with Roche

  • R at AstraZeneca

  • Data Science Hangout with Christina Fillmore (GSK)

  • Data Science Hangout with Eric Nantz (Eli Lilly)

  • Data Science Hangout with Mike Smith (Pfizer)

5. My stripper earnings per shift over four years

u/nerdydancing

Reddit is rightly called the front page of the internet. After seeing this post, you’d agree. u/nerdydancing tracked her earnings on each shift for four years. If any dataset promised stories behind each data point, it is probably this one.

Four Packages

MakeItTalk is a Python package developed by Adobe that converts a picture to an audio-led animated video. Marlene Mhangami showed us an example!

GPfit is an R package for fitting computationally stable Gaussian process (GP) models to a deterministic simulator.

ggdist is an R package for visualising uncertainty. It’s an extension of the ggplot2 world. Matt Dancho shows how to create raincloud plots with it.

graphlayouts is an R package that provides graph layout algorithms that are missing from igraph.

Three Jargons

GPT stands for Generative Pre-trained Transformer. It is a deep-learning model by OpenAI that is trained to generate human-like text by predicting the next word in a sequence of words, given a prompt.

Transformer is a deep-learning architecture introduced by Google researchers for natural-language processing. It uses self-attention to process text and generate meaningful representations of words and phrases.

Self-attention is an ML technique that helps a model understand the relationships between different elements in a sequence. For each word in a sentence, it helps the model determine which other words matter most for that word’s meaning. The model does this by assigning a weight to every word, indicating how much attention should be paid to it.
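
The weighting step can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not how any production model is implemented: real Transformers first pass each word through learned query, key and value projections and use several attention heads, all of which are omitted here. The function names and the toy two-dimensional "embeddings" are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with identity projections.

    Assumption: queries, keys and values are all the input vectors
    themselves (no learned weight matrices).
    Returns (outputs, weights); weights[i][j] is how much word i attends
    to word j.
    """
    d = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Similarity of this word to every word, scaled by sqrt(dimension).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
                  for key in vectors]
        weights = softmax(scores)
        # New representation: weighted average of all word vectors.
        output = [sum(w * vec[j] for w, vec in zip(weights, vectors))
                  for j in range(d)]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights


# Three toy "word" vectors; the first and third are identical.
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(words)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of weights sums to 1, and similar vectors attend to each other more strongly than dissimilar ones.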

Two Tweets

Marlene Mhangami @marlene_zw
I recently saw a thread of an AI tool (D-ID) that can turn a single image into a video when given text or an audio file✨ I decided to see if I could build something similar using only #python and #opensource models🐍💕 I managed to do it in a few hours!!! Here's the result👩🏾‍💻
6:02 PM ∙ Feb 4, 2023
1,768 Likes ∙ 269 Retweets

Dickie Bush 🚢 @dickiebush
After 20 Hours With ChatGPT, I Found These 7 "Goals" To Be The Best Instructions When Asking It To Rewrite Something:
12:58 PM ∙ Feb 1, 2023
4,009 Likes ∙ 692 Retweets

One Meme

That’s a wrap!

I hope you enjoyed today’s letter. See you next week!

Harsh

© 2023 Harshvardhan