Area Under the Curve (AUC) in R

Author

Michael Großbach

Published

2022-11-22

1 Introduction

In pharmacokinetics, levels of a drug or a hormone in the blood are often measured repeatedly over a fixed, pre-planned period, following a sampling schedule. These measurements describe how the concentration develops over time, e.g. following ingestion or some treatment. To compare these profiles either within subjects (between different conditions) or across groups (between treatment and control groups), the area under the curve (AUC) is often employed.

2 Methods

The most prominent algorithm applied to obtain the AUC is the trapezoidal rule, which yields an estimate of the definite integral of the curve between the first and the last sampling point.
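To illustrate why the trapezoidal rule yields an estimate rather than the exact integral, the following minimal sketch applies it to a toy curve whose integral is known. The helper `trapezoid()` and the curve `f()` are assumptions made for this illustration only and are not part of the function presented later.

```r
# Hypothetical helper: trapezoidal estimate of the definite integral of y over x.
trapezoid <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  n <- length(y)
  # width of each interval times the mean of its two boundary values
  sum(diff(x) * (y[-1] + y[-n]) / 2)
}

f <- function(t) t^2  # toy curve; its exact integral on [0, 1] is 1/3

coarse_t <- seq(0, 1, length.out = 5)    # 5 sampling points
fine_t   <- seq(0, 1, length.out = 101)  # 101 sampling points

coarse <- trapezoid(coarse_t, f(coarse_t))
fine   <- trapezoid(fine_t, f(fine_t))

abs(coarse - 1 / 3)  # error with sparse sampling
abs(fine - 1 / 3)    # error with dense sampling (smaller)
```

With few sampling points, as is typical for hormone profiles, the estimate depends noticeably on when the samples were taken.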

2.1 Area Under the Curve

The practice of using the AUC to estimate the average release of a substance into a bodily fluid over a period of time has been around since at least 1990 (Matthews et al., 1990). Pruessner et al. (2003) called for a standardization of AUC usage across laboratories to make comparisons between outcomes easier. Their equations are reported in the following.

2.1.1 Total Area Under the Curve (\(AUC_G\))

\[AUC_G = \sum_{i=1}^{n-1} \frac{(m_{i+1} + m_i) \cdot t_i}{2} \tag{1}\]

with \(n\), the number of repeated measurements; \(t_i\), the duration of the \(i\)th time interval, i.e. the time between the \(i\)th and the \((i+1)\)th measurement; and \(m_i\), the \(i\)th measured value (cortisol in Pruessner et al., 2003). For simplified versions of this equation, for instance for isochronous intervals, we refer you to Pruessner et al. (2003).
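Equation 1 can be sketched in vectorized form as follows. The helper `auc_g_sketch()`, its argument names, and the example values are assumptions made for illustration; they are not taken from Pruessner et al. (2003).

```r
# Hypothetical helper implementing Equation 1 directly.
# m:   measurements m_1, ..., m_n
# t_i: durations t_1, ..., t_(n-1) of the intervals between measurements
auc_g_sketch <- function(m, t_i = rep(1, length(m) - 1)) {
  stopifnot(is.numeric(m), is.numeric(t_i),
            length(m) >= 2, length(t_i) == length(m) - 1)
  n <- length(m)
  # (m_(i+1) + m_i) * t_i / 2, summed over all n - 1 intervals
  sum((m[-1] + m[-n]) * t_i / 2)
}

# Made-up values: four samples taken 10, 20, and 30 minutes apart
m   <- c(2, 4, 6, 4)
t_i <- c(10, 20, 30)
auc_g_sketch(m, t_i)  # 3 * 10 + 5 * 20 + 5 * 30 = 280
```

Leaving out `t_i` corresponds to the isochronous case with intervals of length 1.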

2.1.2 Increase From Baseline (\(AUC_I\))

Individual response levels to the same treatment can vary, so Pruessner et al. (2003) suggested using only the increase over some pre-treatment baseline to compare between conditions or groups. The equation's first term calculates the entire area under the curve as in Equation 1. The second term is the area of the rectangle beneath the baseline, i.e. the first measurement \(m_1\) multiplied by the total duration of all intervals, and it is subtracted from the first:

\[AUC_I = \left( \sum_{i=1}^{n-1}\frac{(m_{i+1} + m_i) \cdot t_i}{2} \right) - \left( m_1 \cdot \sum_{i=1}^{n-1}t_i \right) \tag{2}\]

This equation was later ratified in the expert consensus guidelines for the assessment of the cortisol awakening response (Stalder et al., 2016) as a standardized method.
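As a minimal sketch of Equation 2, the two terms can be computed separately and then subtracted. The helper `auc_i_sketch()` and the example values are hypothetical and serve only to illustrate the arithmetic.

```r
# Hypothetical helper implementing Equation 2 term by term.
auc_i_sketch <- function(m, t_i = rep(1, length(m) - 1)) {
  stopifnot(is.numeric(m), is.numeric(t_i),
            length(m) >= 2, length(t_i) == length(m) - 1)
  n <- length(m)
  auc_g    <- sum((m[-1] + m[-n]) * t_i / 2)  # first term: Equation 1
  baseline <- m[1] * sum(t_i)                 # second term: rectangle under m_1
  auc_g - baseline
}

# Made-up values: four samples taken 10, 20, and 30 minutes apart
m   <- c(2, 4, 6, 4)
t_i <- c(10, 20, 30)
auc_i_sketch(m, t_i)  # 280 - 2 * 60 = 160

# Same result from trapezoids over the baseline-subtracted values
d <- m - m[1]
sum((d[-1] + d[-length(d)]) * t_i / 2)  # 160
```

The second computation shows that \(AUC_I\) is the trapezoidal area of the curve after shifting it down by \(m_1\), so it becomes negative when the measurements lie mostly below baseline.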

2.2 Implementation of Function auc() in \(\text{R}\)

The following code chunk shows the code of the function auc(), which you can copy and paste into your \(\text{R}\) script.

Code
#' Area Under the Curve.
#' 
#' Estimates the definite integral with the trapezoidal rule.
#'
#' @param x A long-format data.frame (or an R object coercible to one) with
#'  columns for individual IDs, measurement times, and the 
#'  dependent variable.
#' @param DV A character string; column name of the dependent variable.
#' @param Time A character string; column name of the measurement times.
#'  If `NULL` (default), all intervals are assumed to have length `1`.
#' @param total A logical value.
#'  * `TRUE` (default) returns the total area under the curve (Pruessner et al.'s 
#'  AUC_G).
#'  * `FALSE` returns the increase in AUC over baseline, i.e. the first 
#'  measurement (Pruessner et al.'s AUC_I).
#' @return A single numeric value, the estimated area under the curve.
#' @export
#' @references Pruessner, J. C., Kirschbaum, C., Meinlschmid, G., and 
#' Hellhammer, D. H. (2003). Two formulas for computation of the area under the 
#' curve represent measures of total hormone concentration versus time-dependent 
#' change. Psychoneuroendocrinology 28, 916–931. 
#' doi: 10.1016/s0306-4530(02)00108-7.
#' @author Michael Großbach, michael.grossbach@hmtm-hannover.de
#' @examples
#' concentration <- c(1, 2, 3, 3, 2)
#' auc(concentration) # total concentration over time
#' auc(concentration, total = FALSE) # change in conc from baseline
#' mydat <- data.frame(C = concentration, T = c(1, 2, 4, 5, 6))
#' auc(mydat, DV = "C", Time = "T")
#' auc(mydat, DV = "C", Time = "T", total = FALSE)
auc <- function(x, DV = NULL, Time = NULL, total = TRUE) {
  if (!is.null(x)) {
    if (!is.data.frame(x)) {
      x <- as.data.frame(x)
      if (ncol(x) == 1) {
        names(x) <- "DV"
        DV <- "DV"
      }
    }
  } else {
    stop("Error: x must be a data.frame (or an R object coercible to one).")
  }
  stopifnot(is.character(DV),
            is.character(Time) || is.null(Time),
            is.logical(total),
            nrow(x) >= 2) # at least two measurements are needed for one interval
  if (is.null(Time)) {
    x$Interval <- 1 # Add isochronous intervals
  } else {
    x$Interval <- c(0, # Calc intervals from Time
                    diff(unlist(x[Time]))) 
  }
  Interval <- "Interval"
  n_measurements <- nrow(x)
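  # One trapezoid area per pair of adjacent measurements
  # (despite its name, `concentration` holds areas, not concentrations)
  concentration <- vector(mode = "double",
                         length = n_measurements - 1)
  for (i in seq_len(n_measurements - 1)) {
    concentration[i] <- unlist(((x[(i + 1), DV] + x[i, DV]) * x[(i + 1), Interval]) / 2)
  }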
  if (total) {
    ret <- sum(concentration)
  } else {
    # Equation 2: subtract the baseline rectangle, i.e. the first measurement
    # multiplied by the total duration of all intervals
    ret <- sum(concentration) - (x[1, DV] * sum(x[2:n_measurements, Interval]))
  }
  return(ret)
}

Let us assume that we have taken five successive measurements of a salivary hormone concentration:

Code
measured <- c(17.8, 20, 26.7, 27.5, 21.3)

Then we submit this vector to the function auc() above:

Code
auc(measured)
[1] 93.75

A quick manual check confirms the result:

(20 + 17.8) / 2 + (26.7 + 20) / 2 + (27.5 + 26.7) / 2 + (21.3 + 27.5) / 2 = 93.75

The increase in concentration since baseline amounts to:

Code
auc(measured, total = FALSE)
[1] 22.55

which can be checked by subtracting four times the baseline value, i.e. the first measurement, from the total area (because there are four intervals, each of length 1; see Equation 2):

93.75 - (17.8 * 4) = 22.55

In the previous example, we assumed isochronous intervals between measurements and therefore could use a vector as input to auc(). The next example showcases different intervals between measurements:

Code
my_data <- data.frame(Concentration = measured, # re-use the vector from above
                      Time = c(0, 15, 60, 70, 75)) # measured 0, 15, 60, 70, and 75 minutes after baseline
Code
auc(my_data, DV = "Concentration", Time = "Time")
[1] 1727.25

A quick check of the results:

(20 + 17.8) / 2 * 15 + (26.7 + 20) / 2 * 45 + (27.5 + 26.7) / 2 * 10 + (21.3 + 27.5) / 2 * 5 = 1727.25
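When a long-format data frame holds several subjects, one AUC per subject is usually wanted. The following sketch shows one possible way to get there in base R. The helper `trapezoid_auc()`, the two-subject data, and the column names `ID`, `Time`, and `Concentration` are assumptions made for illustration, and rows are assumed to be sorted by time within each subject.

```r
# Hypothetical helper: trapezoidal AUC from measurement times and values.
trapezoid_auc <- function(time, value) {
  stopifnot(length(time) == length(value), length(time) >= 2)
  n <- length(value)
  sum(diff(time) * (value[-1] + value[-n]) / 2)
}

# Cross-check with the data used above
measured <- c(17.8, 20, 26.7, 27.5, 21.3)
minutes  <- c(0, 15, 60, 70, 75)
trapezoid_auc(minutes, measured)  # 1727.25

# Made-up long-format data for two subjects
long_data <- data.frame(
  ID            = rep(c("s01", "s02"), each = 3),
  Time          = rep(c(0, 30, 60), times = 2),
  Concentration = c(10, 14, 12, 8, 9, 13)
)

# Split the rows by subject, then compute one AUC per piece
auc_by_id <- sapply(
  split(long_data, long_data$ID),
  function(d) trapezoid_auc(d$Time, d$Concentration)
)
auc_by_id  # named vector: s01 = 750, s02 = 585
```

The resulting named vector can then be compared between conditions or groups.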

3 References

Matthews, J. N., Altman, D. G., Campbell, M. J., and Royston, P. (1990). Analysis of serial measurements in medical research. BMJ 300, 230–235. doi: 10.1136/bmj.300.6719.230.
Pruessner, J. C., Kirschbaum, C., Meinlschmid, G., and Hellhammer, D. H. (2003). Two formulas for computation of the area under the curve represent measures of total hormone concentration versus time-dependent change. Psychoneuroendocrinology 28, 916–931. doi: 10.1016/s0306-4530(02)00108-7.
Stalder, T., Kirschbaum, C., Kudielka, B. M., Adam, E. K., Pruessner, J. C., Wüst, S., et al. (2016). Assessment of the cortisol awakening response: Expert consensus guidelines. Psychoneuroendocrinology 63, 414–432. doi: 10.1016/j.psyneuen.2015.10.010.