2  Workflow: RStudio, Quarto Markdown, Help, Projects, …

Author
Affiliation

Ryan McShane, Ph.D.

The University of Chicago

Published

Oct. 2nd, 2024

Modified

Oct. 25th, 2024

2.0.1 Overview

  1. Markdown
  2. Quarto
  3. \(\LaTeX\)
  4. R4DS Wrap-up:
    1. Ch 28: Quarto
    2. Ch 29: Quarto formats
    3. Ch 2: Workflow: Basics
    4. Ch 8: Getting Help
  5. Ch 6: Scripts and Projects

2.1 Markdown

2.1.1 “Quarto-flavored” Markdown

Markup

**Bold-face**, *Italic*, ~~Striking Through Text~~

Here is the start of the next paragraph where **this text is bold**, whereas *this text is in italics*. You can even ~~strike through text~~.

Itemized List

Here is an itemized list. Leave a blank line before the list. Indent using at least 4 spaces.

* item 1
* item 2
    + sub-item 1
    + sub-item 2
  • item 1
  • item 2
    • sub-item 1
    • sub-item 2

Ordered List

Here is an ordered list. Leave a blank line before the list. Indent using at least 4 spaces.

1. item 1
2. item 2
    a. sub-item 1
    b. sub-item 2
  1. item 1
  2. item 2
    a. sub-item 1
    b. sub-item 2

This Markdown syntax also works on GitHub (“GitHub-Flavored Markdown”).

2.1.2 Markdown Tables

Vanilla Table

Table Header  | Second Header
------------- | --------
Table Cell 1  | Cell 2
Cell 3        | Cell 4 
| Table Header | Second Header |
|--------------|---------------|
| Table Cell 1 | Cell 2        |
| Cell 3       | Cell 4        |

Column Justification

You can center, right-, or left-justify text in your tables as you like.

Centered | Right-Justified | Left-Justified
:-------:|----------------:|:----------------
A        | 24              |  My Friend's Name
B        |  5              |  My Name
CC       | 167             |  Firstname Lastname
| Centered | Right-Justified | Left-Justified     |
|:--------:|----------------:|:-------------------|
|    A     |              24 | My Friend’s Name   |
|    B     |               5 | My Name            |
|    CC    |             167 | Firstname Lastname |

This Markdown syntax also works on GitHub (“GitHub-Flavored Markdown”).

2.2 Quarto

2.2.1 A Quick Note on our Data

```{r}
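# install.packages(c("palmerpenguins", "mosaic", "knitr"))
library(palmerpenguins) # provides the penguins data
library(mosaic)         # provides favstats() and loads ggformula (gf_* plots)
library(knitr)          # provides kable()
favstats(data = penguins, ~ bill_length_mm) |> kable()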
```
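min  | Q1     | median | Q3   | max  | mean     | sd       | n   | missing
---- | ------ | ------ | ---- | ---- | -------- | -------- | --- | -------
32.1 | 39.225 | 44.45  | 48.5 | 59.6 | 43.92193 | 5.459584 | 342 | 2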
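
To make the columns of that summary concrete, here is a rough base-R sketch of the same statistics. The helper name `fav_summary` and the data values are made up for illustration (they are not part of mosaic or of the penguins data), and the sketch assumes R's default quantile method, which may not match favstats() in every case.

```r
# Hypothetical helper (not part of mosaic): favstats-style summary in base R
fav_summary <- function(x) {
  # Five-number summary, ignoring missing values
  q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1), na.rm = TRUE, names = FALSE)
  data.frame(
    min = q[1], Q1 = q[2], median = q[3], Q3 = q[4], max = q[5],
    mean = mean(x, na.rm = TRUE),
    sd = sd(x, na.rm = TRUE),
    n = sum(!is.na(x)),       # number of non-missing values
    missing = sum(is.na(x))   # number of NA values
  )
}

bill_lengths <- c(39.1, 39.5, NA, 40.3, 36.7) # made-up values for illustration
result <- fav_summary(x = bill_lengths)
result
```

Note that `n` counts only the non-missing values, which is why the penguins summary reports 342 observations plus 2 missing.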

2.2.2 Tidying your code

Best practice is to write your code so that you don’t overflow the margins.

Good

```{r}
#| warning: false
#| code-line-numbers: true
penguins |>
  gf_density(~ bill_length_mm) |> 
  gf_labs(
    x = "Bill Length in millimeters", 
    y = "Density"
  )
```

Bad

Some dynamic output formats (like reveal.js and other HTML formats) offer horizontal scrolling, but you should not rely on it (especially not in a presentation context).

```{r}
#| warning: false
gf_density(penguins, ~ bill_length_mm) |> gf_labs(x = "Bill Length in millimeters", y = "Density")
```

We’ll discuss code style more later!

2.2.3 How to Adjust the Sizes of Plots (R Markdown style)

You can resize your plots by adjusting fig.width and fig.height in the code chunk. Both fig.width and fig.height are measured in inches. Recall that letter-size paper is 8.5 \(\times\) 11 inches.

```{r fig.width=5.5, fig.height=2.5}
gf_density(penguins, ~ bill_length_mm)
```

2.2.4 How to Adjust the Sizes of Plots (Quarto style)

```{r}
#| fig-width: 4.5
#| fig-height: 2.5
#| warning: false
gf_density(penguins, ~ bill_length_mm)
```

Please ALWAYS check that your plots are sized appropriately in your homework submission. The default settings are usually fine, but sometimes you will need to resize plots for readability.

2.2.5 Center, Right-justify, or Left-justify Plots

You can adjust the alignment of plots by setting fig-align (fig.align in R Markdown style) to "center", "right", or "left". The default alignment is left.

Center (in column)

```{r}
#| warning: false
#| fig-align: "center"
gf_density(penguins, ~ bill_length_mm)
```

Right (in column)

```{r}
#| warning: false
#| fig-align: "right"
gf_density(penguins, ~ bill_length_mm)
```

2.2.6 Two Plots Placed Side-by-Side

We generally don’t have the two-column format in reveal.js available to us as in Section 2.2.5, so we may need the cowplot package to combine plots:

```{r}
#| warning: false
#| fig-width: 6.5
p1 = gf_density(penguins, ~ bill_length_mm)
p2 = gf_density(penguins, ~ flipper_length_mm)
cowplot::plot_grid(plotlist = list(p1, p2))
```

2.2.7 There are many more code chunk options!

Read here for R Markdown-style code chunk options, and read here for Quarto style code chunk (“cell”) options. Note! Many of these options are output format-dependent. We can’t produce animations in PDFs, for example, and fig-pos is only meaningful in PDF documents.

2.2.8 Section Citation

In Section 2.2.8, we learned how to reference a section header (often a problem number in this course).

## Section Citation {#sec-meta-citation}

In @sec-meta-citation, we

In Section 2.1.2, we discussed Markdown tables.

## Tables {#sec-table-citation}

In @sec-table-citation, we

Read more on Quarto and all of the things you can cross-reference (tables, equations, sections, code listings, theorems and proofs, OH MY).

2.3 \(\LaTeX{}\)

2.3.1 \(\LaTeX{}\) in Quarto

Quarto supports \(\LaTeX\) math symbols & expressions and can produce output with pretty math equations like \[f(x)= \frac{1}{\sqrt{2\pi}\sigma}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),\quad -\infty<x<\infty.\]

¡COMMON RENDER ISSUE!: Spaces at ends of \(\LaTeX\) equations

Good:

$x$

\(x\)

Bad:

$x $ or $ x$ or $ x $

(fails to render or shows math as \(\LaTeX{}\) code)

2.3.2 Inline Math Symbols & Expressions

Inline math symbols & expressions are placed between $...$.

You can write Greek letters: $\alpha$, $\beta$, $\delta$, $\epsilon$, $\varepsilon$

\[\alpha, \;\beta, \;\delta, \;\epsilon, \; \varepsilon ...\]

Subscripts are written as $a_b$ \(a_b\) and superscripts as $a^b$ \(a^b\).

Note that multiple characters in a subscript need to be grouped by {...}: $x_ij$ vs $x_{ij}$ \[x_ij \text{ vs } x_{ij}\]

Simple linear regression model: $Y_{ij}=\beta_0+\beta_1 X +\varepsilon_{ij}$ \[Y_{ij}=\beta_0+\beta_1 X +\varepsilon_{ij}\]

2.3.3 Display Math Mode

Place two dollar signs around math expressions in display mode like $$... $$

Fractions in inline math mode: $\frac{a}{b}$ \(\frac{a}{b}\) and in display math mode: $$\frac{a}{b}$$ \[\frac{a}{b}\]

Display fractions in inline math mode: $\dfrac{a}{b}$ \(\dfrac{a}{b}\)

Logistic regression model, $$P(Y=1) = \frac{e^{\alpha+\beta x}}{1+e^{\alpha+\beta x}}$$

\[P(Y=1) = \frac{e^{\alpha+\beta x}}{1+e^{\alpha+\beta x}}\]
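
As a quick numerical check of the formula just typeset, here is a minimal R sketch. The values of `alpha`, `beta`, and `x` are made up for illustration; `plogis()` is base R's logistic distribution function and should agree with the hand-written expression.

```r
# Hypothetical parameter values, chosen so that alpha + beta * x = 0
alpha <- -1
beta <- 0.5
x <- 2

# The logistic regression formula, written out as in the equation above
p_manual <- exp(alpha + beta * x) / (1 + exp(alpha + beta * x))

# The same quantity using base R's logistic distribution function
p_builtin <- plogis(q = alpha + beta * x)

c(manual = p_manual, builtin = p_builtin)
```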

2.3.4 Text mid-equation

Use \text{}: $f(x) = 0 \text{ for all } x$ \[f(x) = 0 \text{ for all } x\]

And use \quad or \; to add space for readability

$f(x) = 0 \; \text{ for all } x$

$f(x) = 0 \quad \text{ for all } x$:

\[f(x) = 0 \; \text{ for all } x\] \[f(x) = 0 \quad \text{ for all } x\]

2.3.5 Common Math Functions

Named math functions: Logarithm, Exponent, and Square root

$\log(x)$, $\exp(x)$, $\sqrt{x}$

\[\log(x), \exp(x), \sqrt{x}\]

Be sure to include the backslash \ before each function name; otherwise, they render like this:

\[log(x), exp(x), sqrt{x}\]

The \binom command

$\binom{n}{k}$

\[\binom{n}{k}\]
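
As a small worked use of \binom together with the superscripts from earlier, here is the standard binomial probability mass function in display mode. The formula is a textbook result rather than something derived in this section.

```latex
$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}, \quad k = 0, 1, \ldots, n$$
```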

2.3.6 Custom Definitions

You may sometimes see new shortcut definitions at the top of a .qmd

\newcommand{\E}{\mathrm{E}}

\newcommand{\Var}{\mathrm{Var}}

These define helpful shortcuts. E.g., type $\E[X]$ instead of $\mathrm{E}[X]$:

\[\mathrm{E}[X]\]
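
Here is a short sketch that puts both shortcuts to work in one line, using the standard variance identity. The two definitions are repeated only so the snippet is self-contained; in a real .qmd, define each shortcut once.

```latex
\newcommand{\E}{\mathrm{E}}
\newcommand{\Var}{\mathrm{Var}}

$$\Var(X) = \E[X^2] - (\E[X])^2$$
```

Without the shortcuts, the same line would need \mathrm{...} typed three times.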

This can be used for “ill” as well:

\newcommand{\Loki}{\mathsf{L}\mathbb{O}\mathscr{K}\mathbf{I}}

\[\text{Loki: }\mathsf{L}\mathbb{O}\mathscr{K}\mathbf{I}\]

\newcommand{\Twitter}{\colorbox{black}{\textcolor{white}{\reflectbox{$\mathbb{X}$}}}}

(Not quite the Twitter logo, but many speculated it was \(\LaTeX\) at the time of the switch to “X”)

2.3.7 align* environment: multiline sequence of equations

\begin{align*}
(5x + 3y) - (4x - 7y +3y) &= (5x + 3y) - (4x - 4y)\\
&= 5x + 3y - 4x + 4y \\
&= x + 7y
\end{align*}

\[\begin{align*} (5x + 3y) - (4x - 7y +3y) &= (5x + 3y) - (4x - 4y)\\ &= 5x + 3y - 4x + 4y \\ &= x + 7y \end{align*}\]

\begin{align*}
(5x + 3y) - (4x - 7y +3y) &= (5x + 3y) - (4x - 4y) & \text{grouping like terms}\\
&= 5x + 3y - 4x + 4y & \text{distributing negative}\\
&= x + 7y & \text{grouping like terms}
\end{align*}

\[\begin{align*} (5x + 3y) - (4x - 7y +3y) &= (5x + 3y) - (4x - 4y) & \text{grouping like terms}\\ &= 5x + 3y - 4x + 4y & \text{distributing negative}\\ &= x + 7y & \text{grouping like terms} \end{align*}\]

2.3.8 The array environment – tables the manual way

$$
\begin{array}{c|ccccc}
x & 0 & 1 & 2 & 3 & \text{o/w} \\\hline
p(x) & 0.6 & 0.25 & 0.1 & 0.05 & 0
\end{array}
$$

\[ \begin{array}{c|ccccc} x & 0 & 1 & 2 & 3 & \text{o/w} \\\hline p(x) & 0.6 & 0.25 & 0.1 & 0.05 & 0 \end{array} \]

2.3.9 The cases environment – piecewise definitions

$$
p(x)=
\begin{cases}
0.6 & \text{if }x =0 \\
0.25 & \text{if }x =1 \\
0.1 & \text{if }x =2 \\
0.05 & \text{if }x =3 \\
0 & \text{if } x\notin \{0, 1, 2, 3\}
\end{cases}
$$

\[ p(x)= \begin{cases} 0.6 & \text{if }x =0 \\ 0.25 & \text{if }x =1 \\ 0.1 & \text{if }x =2 \\ 0.05 & \text{if }x =3 \\ 0 & \text{if } x\notin \{0, 1, 2, 3\} \end{cases} \]

2.3.10 Summation and Integration

Inline versions:

$\sum_{i=1}^{n} x_ip(x_i)$

\(\sum_{i=1}^{n} x_ip(x_i)\)

$\int_a^b xf(x)dx$

\(\int_a^b xf(x)dx\)

Display versions:

$$\sum_{i=1}^{n} x_ip(x_i)$$

\[\sum_{i=1}^{n} x_ip(x_i)\]

$$\int_a^b xf(x)dx$$

\[\int_a^b xf(x)dx\]
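
To connect the notation to a computation, here is a minimal R sketch that evaluates the sum for the probability table from the array and cases examples. The integral is evaluated for a Uniform(0, 1) density, which is an assumption chosen only for illustration.

```r
# Probability mass function from the array/cases examples
x <- 0:3
p <- c(0.6, 0.25, 0.1, 0.05)

total_probability <- sum(p)   # a valid pmf sums to 1
expected_value <- sum(x * p)  # the sum of x_i * p(x_i)

# The integral of x * f(x), with f assumed to be the Uniform(0, 1) density
uniform_mean <- integrate(
  f = function(x) x * dunif(x, min = 0, max = 1),
  lower = 0,
  upper = 1
)$value

c(total = total_probability, mean_discrete = expected_value, mean_uniform = uniform_mean)
```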

2.3.11 Resizing Parentheses

Instead of \((\dfrac{5}{3})(\dfrac{3}{8})\), we can use \left( and \right) to produce \(\left( \dfrac{5}{3}\right)\left(\dfrac{3}{8} \right)\)!

$\left( \dfrac{5}{3}\right)$

\(\left( \dfrac{5}{3}\right)\)

2.3.12 Symbol decorations

\dot

$X ~ \dot \sim ~ \mathrm{Bin}(n, p)$

\(X ~ \dot \sim ~ \mathrm{Bin}(n, p)\)

\bar

$\bar{x}$

\(\bar{x}\)

\overline

$\frac{1}{7} = 0.\overline{142857}$

\(\frac{1}{7} = 0.\overline{142857}\)

\hat

$\hat \mu$

\(\hat \mu\)

\(\LaTeX{}\) Cheatsheet

Here’s my \(\LaTeX\) for R Markdown (and thus Quarto) cheatsheet. It summarizes everything we’ve discussed here.

2.4 R4DS Wrap-up (Ch 28, 29, 2, 8)

2.4.1 Ch 28 and 29: Quarto and Quarto Formats

  • Quarto is the next generation of R Markdown (which in turn succeeded Sweave).
  • Posit will continue to roll out features for Quarto and R Markdown will only be maintained by the community – now is a good time to jump on the Quarto train.
  • We’ve already discussed the basic features of Quarto, Markdown, LaTeX, code chunks, and citations in this lecture.
  • For further (and up-to-date) information on Quarto, go to quarto.org/docs/guide/, which will tell you the basics for getting started in a certain output format (PDF, HTML, Bookdown, Article, etc) and quarto.org/docs/reference/ for every YAML (and code cell) option, syntax for cross-references, etc.
  • If you have used minimal output-dependent syntax and customization, Quarto is meant to be able to change from one output (multiple web pages) to another (Bookdown website) by simply changing the output format in the YAML header (and perhaps adding a standalone YAML file to organize traffic).
  • If you are outputting to PDF (most assignments in this course), I recommend not using the visual editor.
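
A minimal sketch of what "changing the output format in the YAML header" can look like for a single document. The title is a placeholder, and this assumes the document uses no format-specific syntax.

```yaml
---
title: "Homework 1"   # hypothetical title
format: pdf           # swap for html or revealjs to change the output
---
```

Changing the one `format:` line and re-rendering is usually enough for simple documents; heavier customization tends to need format-specific options.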

2.4.2 Ch 2: Workflow: Basics

Important notes from the chapter:

  • Surround assignment operators with spaces!
    • Good: x = 5 or x <- 5
    • Bad: x=5 or x<-5
  • R4DS recommends snake_case for variable names, rather than camelCase or PascalCase or using.periods, etc. We will stick to snake_case where possible, although not every programmer (e.g., the former students that wrote the Qwixx program) follows these conventions, and sometimes (e.g., with kable), it is difficult to make use of snake_case.
  • Use named arguments!
    • Good: seq(from = 1, to = 10)
    • Bad: seq(1, 10)
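
The three conventions above can be combined in a few lines. This is only a sketch; the variable names are made up for illustration.

```r
# snake_case names, spaces around the assignment operator, and named arguments
step_size <- 3
sequence_by_three <- seq(from = 1, to = 10, by = step_size)
sequence_by_three

# The positional form gives the same values but hides what each number means
identical(sequence_by_three, seq(1, 10, 3))
```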

2.4.3 Ch 8: Getting Help

Much of this advice is germane, although I suggest the following sequence for troubleshooting:

  1. Use help(FUNCTION_NAME) first
  2. Package documentation sometimes is better organized than help pages (see, for example, any package that uses the pkgdown package to organize their web documentation: dplyr, bayesplot, mkin, and many, many more).
  3. If you are looking for advanced features that the documentation does not discuss, you may have some luck reviewing the GitHub repo, where you will find issues (both closed and open), pull requests, and the source code. pkgdown pages helpfully link to the public-facing GitHub repo (usually top right).
    • Consider dplyr’s issues. Here, we can see (as of this writing) 60 open issues and 4,886 closed issues. If you’re looking for help, the closed issues are often more informative than the open ones – just click “4,886 Closed” and change the search query to is:issue is:closed.
    • A reminder not to post on GitHub unless you have exhausted all other possible options, and even then, there are usually contributing guidelines!
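
For step 1, here is a minimal sketch of looking up help from code, using only base R. The topic `seq` and the package `stats` are arbitrary choices for illustration.

```r
# Look up the help page for a function; this is the same lookup as ?seq
seq_help <- help(topic = "seq")
print(seq_help) # displays the help page (in RStudio, it opens in the Help pane)

# Look up the index of every documented topic in a package
stats_index <- help(package = "stats")
```

Printing `stats_index` shows the package index, which is a quick way to browse what a package offers before turning to its website.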

2.4.4 Ch 8: Getting Help

  3. (cont’d)
    • Pull requests, open or closed, can be similarly helpful.
    • Finally, you can check out the source code for a function by navigating to the code tab and usually the R folder. From here, package organization varies, but you’ll find the exported functions in one or more of the *.R scripts here!
  4. Stack Overflow (or similar pages, such as Posit Community) is a good place to see users and programmers mingling, although there are pretty strict posting guidelines here, so stick to searching and reading at first to get a feel for etiquette.
  5. If you do want to post, create a reprex (reproducible example) as described in Ch 8. Here’s a successful example for mosaic::xqt.
  6. Google is fine if you’ve exhausted the first five options (or if you use Google to find one of the first five options). After that, you start running into blogs, tweets (X), toots (Mastodon), etc., and your mileage will really vary.
  7. A reminder that the R programming produced by ChatGPT, GitHub Copilot, etc., is built on said blogs, and will rapidly deteriorate in quality after one or two lines of code (and repeats bad habits of other programmers). DON’T USE IT during this course!

2.5 Ch 6: Scripts and Projects

2.5.1 Scripts

  • Every code chunk in a .qmd is essentially a miniature R script.
  • As in Hands-On Programming with R, the Console is where we can interactively code, submitting one line at a time.
  • Testing code out or typing single-use code (install.packages) in the Console is fine, but it’s better to keep track of code in a reproducible format. We’ve been using .qmd files (and will continue to do so), but one can also write longer sets of code in R Scripts.
  • R Scripts are simply plain text files (with a .R file extension) with R code in them.

Here is an example script called my_script.R:

"Hello World"
2+2
library(ggplot2)
ggplot(data = cars, aes(x = speed, y = dist)) + 
  geom_point()
# This is a comment
  • On its own, it’s plain text. But in R or RStudio or by reference elsewhere, we can run it line-by-line or all at once.

If we have set our working directory (automatically handled by a .qmd or .Rproj), then we could run the entire script with a single line:

base::source(file = "my_script.R")
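
To see what running a script with `source()` does, here is a self-contained sketch. It writes a tiny script to a temporary folder (a stand-in for my_script.R, with made-up contents) and runs it in its own environment so nothing leaks into the workspace.

```r
# Write a two-line script to a temporary location (a stand-in for my_script.R)
script_path <- file.path(tempdir(), "my_script.R")
writeLines(
  text = c('greeting <- "Hello World"', "total <- 2 + 2"),
  con = script_path
)

# Run the whole script at once; objects it creates land in script_env
script_env <- new.env()
source(file = script_path, local = script_env)

script_env$greeting
script_env$total
```

Leaving out `local` runs the script in the global environment, which matches what happens when you source a script from the Console.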

2.5.2 Projects: Preliminaries

  • Make sure the option to save your workspace to .RData on exit is set to Never in Tools -> Global Options....
  • Run usethis::use_blank_slate() first to make this choice effective.
  • This will help ensure that your work is self-contained and reproducible.

2.5.3 Projects: Setup

  • When using GitHub Desktop and RStudio, here’s how to create a project for a repo (after you’ve cloned your repo to your machine):
  1. Open a new RStudio window. You should see this at the top right of the window (if you’ve never used a project):

  2. Click the little dropdown arrow and select “New Project”. You’ll have three options. Assuming you have cloned your repo but haven’t set up Git with RStudio, the correct option is “Existing Directory”.
  3. Click “Browse…” and navigate to the top-level folder of the GitHub repo. Click “open”, check the “Open in new session” box at the bottom left, and finally click the “Create Project” box.
  • A new window is now open! In the files tab, you should see all of the GitHub repo files as well as a new file with the .Rproj extension.

  • Now, when you want to open a repo (after you’ve closed the RStudio window), you should:

    1. Open GitHub Desktop and navigate to the appropriate repo at the top left (“Current repository”).
    2. Click on “Show in [Explorer/Finder]”.
    3. And open the *.Rproj file!
  • This will open a new RStudio window dedicated to just this repo!