In this piece, I explore the nuances of reading code in different programming languages, focusing on R. Along the way I noticed that some features are not as intuitive as they could be, specifically in the {glue} package and the functions built on it. This is not meant to criticize the design choices made by the tidyverse team, who have undoubtedly written far more code than I have and carefully considered their approaches. Instead, I believe it is valuable to challenge our assumptions and examine how we interpret code.

The blog post announcing the latest updates to the tidyverse's {scales} package demonstrates its new features well. One function in particular caught my attention, label_glue(), which is used as follows:

label_glue("The {x} penguin")(c("Gentoo", "Chinstrap", "Adelie"))

In this code snippet, the label_glue function takes a string formatted with placeholders and generates a labeling function. This function is then called with a vector of penguin species, effectively producing labels like:

# The Gentoo penguin
# The Chinstrap penguin
# The Adelie penguin

For those with a background in Python, it's interesting to note that {glue} is R's equivalent of Python's f-strings and works in much the same way. For example, consider the following:

## R:
name <- "Jonathan"
glue::glue("My name is {name}")
# My name is Jonathan

## Python:
>>> name = 'Jonathan'
>>> f"My name is {name}"
# 'My name is Jonathan'

While the mechanics behind the label_glue()() function call may not seem magical, they warrant deeper examination, especially when we encounter unexpected results when reading code. To clarify things further, let us discuss a simplified version of label_glue:

tmp_label_glue <- function(pattern = "{x}") {
  function(x) {
    glue::glue_data(list(x = x), pattern)
  }
}

This function returns another function that takes a single argument. Evaluating it yields:

tmp_label_glue("The {x} penguin")
# function(x) {
#   glue::glue_data(list(x = x), pattern)
# }

We can then create a named function with this result:

penguin_label <- tmp_label_glue("The {x} penguin")
penguin_label(c("Gentoo", "Chinstrap", "Adelie"))
# The Gentoo penguin
# The Chinstrap penguin
# The Adelie penguin

The versatility of this approach is noteworthy, as different {glue} strings can yield various functions. However, if we are only utilizing one specific pattern, it may feel counterintuitive to invoke it inline as shown earlier with label_glue("The {x} penguin")(c("Gentoo", "Chinstrap", "Adelie")).
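To see why different patterns give different functions, it can help to look at what each returned function remembers. The sketch below reuses the simplified tmp_label_glue from above (a stand-in, not the real {scales} implementation) and inspects the enclosing environment of two generated functions. The name species_label and its pattern are only illustrative.

```r
# Simplified stand-in from earlier in the post, not the actual {scales} code
tmp_label_glue <- function(pattern = "{x}") {
  function(x) {
    glue::glue_data(list(x = x), pattern)
  }
}

penguin_label <- tmp_label_glue("The {x} penguin")
species_label <- tmp_label_glue("Species: {x}")  # illustrative second pattern

# Each returned function is a closure: it carries the pattern it was created with
penguin_pattern <- environment(penguin_label)$pattern
species_pattern <- environment(species_label)$pattern

penguin_pattern
species_pattern
```

The two functions have identical bodies; only the captured pattern differs, which is what makes the generator reusable.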

From a coding perspective, one might expect that we could consolidate all our arguments into a single function call like this:

label_glue("The {x} penguin", c("Gentoo", "Chinstrap", "Adelie"))

However, this does not work, because label_glue does not accept the labels as an argument; it only returns a function. The reason for this design is that the scale_* functions in {ggplot2} accept a function for their labels argument, which allows for a form of lazy evaluation. We do not need to know the values that will be passed to the generated function at the point where we write the call; those values might only be determined during the plotting process.
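A minimal sketch of that deferred labelling, using only base R. Both draw_axis and penguin_labeller are hypothetical names invented for illustration; this is not how {ggplot2} or {scales} are implemented, only the general shape of "pass a function now, call it later".

```r
# Hypothetical stand-in for a scale function: the caller supplies `labels`
# as a function, and the values are only discovered inside this function.
draw_axis <- function(data, labels = identity) {
  breaks <- sort(unique(data))  # values are not known until "plot time"
  labels(breaks)                # only now is the labelling function called
}

# Base-R labeller so the sketch has no package dependencies
penguin_labeller <- function(x) paste0("The ", x, " penguin")

species <- c("Gentoo", "Adelie", "Gentoo", "Chinstrap")
axis_labels <- draw_axis(species, labels = penguin_labeller)
axis_labels
```

The caller never computes the unique species; it only describes how a label should look, and the stand-in scale decides what to feed into that description.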

Furthermore, it is advantageous to allow the scale_* function to manage the extraction and computation of labels. For instance, in the context of a plot, it may look something like this:

library(ggplot2)
library(palmerpenguins)

p <- ggplot(penguins[complete.cases(penguins), ]) +
  aes(bill_length_mm, species) +
  geom_point()

p + scale_y_discrete(labels = penguin_label)

Here, the labels argument is assigned a function, penguin_label, which we defined earlier. Alternatively, we could write:

p + scale_y_discrete(labels = label_glue("The {x} penguin"))

Both approaches effectively achieve the same result, but the choice of using a named function versus inline representation can significantly influence readability and comprehension.

Exactly what gets passed to the generated function can be a bit murky. However, one can reasonably expect that the supplied function will eventually be called with the available labels as its argument. My intuition here is shaped by Haskell, where every function takes exactly one argument, which leads to a different way of reading function application.

In Haskell, a function that seems to take multiple arguments actually consists of layers of functions, each accepting one argument. The same structure can be written out by hand in R:

do_thing <- function(x) {
  function(y) {
    function(z) {
      x + y + z
    }
  }
}

By utilizing this layered function approach, we can effectively peel off layers to generate new functions. Understanding this framework can illuminate why functions, such as those for labeling, are excellent candidates for this form of partial application.
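As a small illustration of peeling off those layers, the sketch below calls do_thing one argument at a time. The intermediate names add_to_1 and add_to_3 are made up for this example.

```r
do_thing <- function(x) {
  function(y) {
    function(z) {
      x + y + z
    }
  }
}

add_to_1 <- do_thing(1)         # a function still waiting for y, then z
add_to_3 <- add_to_1(2)         # a function still waiting for z
result_stepwise <- add_to_3(3)  # 1 + 2 + 3

# The same computation written inline, one pair of parentheses per layer
result_inline <- do_thing(1)(2)(3)
```

Both routes give the same value; the inline form is exactly the f()() shape that label_glue("The {x} penguin")(...) has.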

Coming from that perspective, I would expect a two-argument label_glue(pattern, labels) for which label_glue(pattern) alone yields a function still awaiting its labels argument. R does not apply functions partially in this way: label_glue takes only the pattern and explicitly returns a function, which is why the inline call label_glue(pattern)(labels) can look confusing.
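For comparison, here is a rough sketch of what a labeller with that Haskell-like behaviour could look like in R. label_pattern is a hypothetical helper and not part of {scales} or {glue}; it also swaps in a plain sub() replacement of the literal text "{x}" instead of real glue interpolation, purely to keep the example dependency-free.

```r
# Hypothetical helper, not part of {scales} or {glue}
label_pattern <- function(pattern, labels) {
  if (missing(labels)) {
    # No labels yet: return a function that waits for them
    return(function(labels) label_pattern(pattern, labels))
  }
  # Base-R stand-in for glue: replace the literal "{x}" with each label
  vapply(
    labels,
    function(label) sub("{x}", label, pattern, fixed = TRUE),
    character(1),
    USE.NAMES = FALSE
  )
}

species <- c("Gentoo", "Chinstrap", "Adelie")
all_at_once <- label_pattern("The {x} penguin", species)
in_two_steps <- label_pattern("The {x} penguin")(species)
```

With this shape, both calling styles produce the same labels, and label_pattern("The {x} penguin") could still be handed to anything that expects a labelling function.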

In functional programming, passing around functions is a popular practice. For example, when using the sapply function in R:

sapply(mtcars[, 1:4], mean)

Here we pass the first four columns of the mtcars dataset along with the mean function itself, and sapply returns the mean of each column as a named vector. This practice becomes even more powerful when partial application is available, allowing for shorthand functions like:

add_5 <- \(x) x + 5
sapply(1:10, add_5)
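R has no built-in partial application, but the layered-function style gets close. In the sketch below, add is a hypothetical curried adder written for this example; supplying its first argument produces a function equivalent to add_5.

```r
# Hypothetical curried adder: takes x now and y later
add <- function(x) function(y) x + y

add_5 <- add(5)                    # behaves like \(x) x + 5
plus_five <- sapply(1:10, add_5)

# A different first argument gives a different function, with no new definition
plus_ten <- sapply(1:10, add(10))
```

Note that add(10) appears with parentheses inside sapply and is still a function, which is the same reading puzzle as labels = label_glue("The {x} penguin").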

Ultimately, as I engage with these concepts, I recognize that my current pattern recognition may be overly influenced by the idea that in R, "no parentheses" indicates a function while "parentheses" indicates a result. This realization sparked a discussion about naming conventions and clarity in coding practices. In conclusion, while I appreciate the ingenuity behind the {scales} approach, I continue to ponder whether my perception of the arg = fun() design is unique or if others share similar sentiments.

If you have thoughts or insights on this topic, I invite you to share them on Mastodon or in the comment section below.

devtools::session_info()