Commit

Merge pull request #128 from amices/dev
Just some documentation and a small bugfix (colors in `plot_pred()` #125)
hanneoberman authored Dec 21, 2023
2 parents 6a69f13 + 6b5863c commit 3210413
Showing 9 changed files with 50 additions and 20 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,6 +1,6 @@
Package: ggmice
Title: Visualizations for 'mice' with 'ggplot2'
Version: 0.1.0
Version: 0.1.0.9000
Authors@R: c(
person("Hanne", "Oberman", email = "[email protected]", role = c("aut", "cre"),
comment = c(ORCID = "0000-0003-3276-2141")),
14 changes: 13 additions & 1 deletion NEWS.md
@@ -1,3 +1,15 @@
# ggmice (development version)

## Bug fixes

* Correct labeling of 'exclusion-restriction' variables in `plot_pred()` (#128)

## Minor changes

* Miscellaneous documentation and vignette updates (#128)

---

# ggmice 0.1.0

## Breaking changes
@@ -25,7 +37,7 @@
* Input validation for `data` argument `plot_*` functions (#85)
* Input validation for `vrb` argument `plot_*` functions (#80)
* Input validation for `mapping` argument `ggmice()` (#34, #90)
* Vignette updates (PRs #31, #35, #38) and other documentation (#45, #51)
* Vignette updates (#31, #35, #38) and other documentation (#45, #51)
* The `plot_pattern()` function creates missing data pattern plot with more informative labels (#59, #111)

---
14 changes: 11 additions & 3 deletions R/ggmice.R
@@ -3,11 +3,19 @@
#' @param data An incomplete dataset (of class `data.frame`), or an object of class [`mice::mids`].
#' @param mapping A list of aesthetic mappings created with [ggplot2::aes()].
#'
#' @return An object of class [`ggplot2::ggplot`].
#' @return An object of class [`ggplot2::ggplot`]. The [`ggmice::ggmice`] function returns output
#' equivalent to [`ggplot2::ggplot`] output, with a few important exceptions:
#'
#' - The theme is set to [`ggmice::theme_mice`].
#' - The color scale is set to the [`mice::mdc`] colors.
#' - The `colour` aesthetic is set to `.where`, an internally defined variable which distinguishes
#' observed data from missing data or imputed data (for incomplete and imputed data, respectively).
#'
#' @examples
#' dat <- mice::nhanes
#' ggmice(dat, ggplot2::aes(x = age, y = bmi)) + ggplot2::geom_point()
#' imp <- mice::mice(dat, print = FALSE)
#' ggmice(imp, ggplot2::aes(x = age, y = bmi)) + ggplot2::geom_point()
#' @seealso See the `ggmice` vignette to use the `ggmice()` function on
#' [incomplete data](https://amices.org/ggmice/articles/ggmice.html#the-ggmice-function)
#' or [imputed data](https://amices.org/ggmice/articles/ggmice.html#the-ggmice-function-1).
@@ -99,8 +107,8 @@ ggmice <- function(data = NULL,
.imp = 0,
.id = rownames(data$data),
data$data
)[!miss_xy, ],
data.frame(.where = "imputed", mice::complete(data, action = "long"))[where_xy, ]
)[!miss_xy,],
data.frame(.where = "imputed", mice::complete(data, action = "long"))[where_xy,]
),
.where = factor(
.where,
10 changes: 5 additions & 5 deletions R/plot_pred.R
@@ -50,13 +50,13 @@ plot_pred <-
ind = matrix(data, nrow = p * p, byrow = TRUE)
) %>% dplyr::mutate(clr = factor(
.data$ind,
levels = c(-2, 0, 1, 2, 3),
levels = c(-3, -2, 0, 1, 2),
labels = c(
"inclusion-restriction variable",
"cluster variable",
"not used",
"predictor",
"random effect",
"inclusion-restriction variable"
"random effect"
),
ordered = TRUE
))
@@ -78,11 +78,11 @@ plot_pred <-
) +
ggplot2::scale_fill_manual(
values = c(
"inclusion-restriction variable" = "orangered",
"cluster variable" = "lightyellow",
"not used" = "grey90",
"predictor" = "palegreen3",
"random effect" = "deepskyblue",
"inclusion-restriction variable" = "orangered"
"random effect" = "deepskyblue"
)
) +
ggplot2::labs(
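
The recoding in the `R/plot_pred.R` hunks above relies on `factor()` pairing each numeric code with a label by position. A minimal base R sketch of that mapping follows; the `codes` vector is made up for illustration and is not taken from the package. It also shows that a code missing from `levels` becomes `NA`, which is one plausible reading of why the level set matters for the labels.

```r
# Illustrative codes (hypothetical values, not from ggmice or mice)
codes <- c(1, 0, -2, 2, -3, 1)

# Same levels and labels as the updated hunk: position i in `levels`
# is paired with position i in `labels`
lvls <- c(-3, -2, 0, 1, 2)
lbls <- c(
  "inclusion-restriction variable",
  "cluster variable",
  "not used",
  "predictor",
  "random effect"
)

clr <- factor(codes, levels = lvls, labels = lbls, ordered = TRUE)
print(as.character(clr))

# A code that does not appear in `levels` is silently turned into NA
unmatched <- factor(3, levels = lvls, labels = lbls, ordered = TRUE)
print(is.na(unmatched))
```

Because the labels are matched by position, reordering `levels` without reordering `labels` in the same way would attach the wrong label to a code.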
10 changes: 4 additions & 6 deletions R/utils.R
@@ -1,6 +1,6 @@
# Functions for internal use

# util functions
# Shorthand 'not in' for code readability
`%nin%` <- Negate(`%in%`)
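
As a quick illustration of the `%nin%` shorthand defined above, the sketch below redefines it locally and applies it to two made-up character vectors (`vars` and `requested` are hypothetical names, not package objects).

```r
# Local copy of the helper shown in the diff
`%nin%` <- Negate(`%in%`)

# Hypothetical inputs for illustration only
vars <- c("age", "hgt", "reg")
requested <- c("age", "wgt")

# TRUE for every element of `requested` that is absent from `vars`
not_found <- requested %nin% vars
print(not_found)

# Typical use: pick out the names that failed the lookup
print(requested[not_found])
```

The result is the element-wise negation of `requested %in% vars`, so it reads more naturally than `!(requested %in% vars)` in validation code.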

#' Pipe operator
@@ -19,11 +19,6 @@
#' @return The result of calling `rhs(lhs)`.
NULL

# suppress undefined global functions or variables note
utils::globalVariables(c(".id", ".imp", ".where", ".id", "where", "name", "value"))

# Alias a function with `foo <- function(...) pkgB::blah(...)`

#' Utils function to validate data argument inputs
#'
#' @param data The input supplied to the 'data' argument.
@@ -104,3 +99,6 @@ verify_data <- function(data,
}
}
}

# suppress undefined global functions or variables note
utils::globalVariables(c(".id", ".imp", ".where", ".id", "where", "name", "value"))
1 change: 1 addition & 0 deletions README.Rmd
@@ -18,6 +18,7 @@ knitr::opts_chunk$set(
<!-- badges: start -->
[![CRAN_Status_Badge](https://www.r-pkg.org/badges/version/ggmice)](https://cran.r-project.org/package=ggmice)
[![Total CRAN downloads](https://cranlogs.r-pkg.org/badges/grand-total/ggmice)](https://cranlogs.r-pkg.org/badges/grand-total/ggmice)
[![r-universe status badge](https://amices.r-universe.dev/badges/ggmice)](https://amices.r-universe.dev/ggmice)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.6532702.svg)](https://doi.org/10.5281/zenodo.6532702)

[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-blue.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)
2 changes: 2 additions & 0 deletions README.md
@@ -8,6 +8,8 @@
[![CRAN_Status_Badge](https://www.r-pkg.org/badges/version/ggmice)](https://cran.r-project.org/package=ggmice)
[![Total CRAN
downloads](https://cranlogs.r-pkg.org/badges/grand-total/ggmice)](https://cranlogs.r-pkg.org/badges/grand-total/ggmice)
[![r-universe status
badge](https://amices.r-universe.dev/badges/ggmice)](https://amices.r-universe.dev/ggmice)
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.6532702.svg)](https://doi.org/10.5281/zenodo.6532702)

[![Lifecycle:
11 changes: 10 additions & 1 deletion man/ggmice.Rd

6 changes: 3 additions & 3 deletions vignettes/ggmice.Rmd
@@ -90,7 +90,7 @@ The `mapping` argument in `ggmice()` cannot be empty. An `x` or `y` mapping (or

## Incomplete data

If the object supplied to the `data` argument in `ggmice()` is a `data.frame`, the visualization will contain observed data in blue and missing data in red. Since missing data points are by definition unobserved, the values themselves cannot be plotted. What we *can* plot are sets of variable pairs. Any missing values on one variable can be displayed on top of the axis of the other. This provides a visual cue that the missing data is distinct from the observed values, but still displays the observed value of the other variable.
If the object supplied to the `data` argument in `ggmice()` is a `data.frame`, the visualization will contain observed data in blue and missing data in red. Since missing data points are by definition unobserved, the values themselves cannot be plotted. What we *can* plot are sets of variable pairs. Any missing values in one variable can be displayed on the axis of the other. This provides a visual cue that the missing data is distinct from the observed values, but still displays the observed value of the other variable.

For example, the variable `age` is completely observed, while there are some missing entries for the height variable `hgt`. We can create a scatter plot of these two variables with:

@@ -99,7 +99,7 @@ ggmice(dat, aes(age, hgt)) +
geom_point()
```

The `age` of cases with missing `hgt` are plotted on top of the horizontal axis. This is in contrast to a regular `ggplot()` call with the same arguments, which would leave out all cases with missing `hgt`. So, with `ggmice()` we lose less information, and may even gain valuable insight into the missingness in the data.
The `age` of cases with missing `hgt` are plotted on the horizontal axis. This is in contrast to a regular `ggplot()` call with the same arguments, which would leave out all cases with missing `hgt`. So, with `ggmice()` we lose less information, and may even gain valuable insight into the missingness in the data.

Another example of `ggmice()` in action on incomplete data is when one of the variables is categorical. The incomplete continuous variable `hgt` is plotted against the incomplete categorical variable `reg` with:

@@ -108,7 +108,7 @@ ggmice(dat, aes(reg, hgt)) +
geom_point()
```

Again, missing values are plotted on top of the axes. Cases with observed `hgt` and missing `reg` are plotted on top of the vertical axis. Cases with observed `reg` and missing `hgt` are plotted on top of the horizontal axis. There are no cases where neither is observed, but otherwise these would be plotted on the intersection of the two axes.
Again, missing values are plotted on the axes. Cases with observed `hgt` and missing `reg` are plotted on the vertical axis. Cases with observed `reg` and missing `hgt` are plotted on the horizontal axis. There are no cases where neither is observed, but otherwise these would be plotted on the intersection of the two axes.
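
The placement rule described in this paragraph can be sketched in base R by classifying each case according to which of the two variables is missing. The `toy` data frame below is hypothetical and only borrows the column names `reg` and `hgt` from the vignette; it is not the vignette's `dat` object and does not reproduce how `ggmice()` is implemented.

```r
# Toy data (hypothetical values, for illustration only)
toy <- data.frame(
  reg = c("north", NA, "city", "south", "city"),
  hgt = c(50.1, 62.3, NA, 88.0, NA)
)

# Classify each case by its missingness pattern on the two variables
pattern <- ifelse(
  is.na(toy$reg) & is.na(toy$hgt), "both missing",
  ifelse(is.na(toy$reg), "reg missing",
    ifelse(is.na(toy$hgt), "hgt missing", "both observed")
  )
)
print(table(pattern))

# Cases observed on both variables, i.e. the only ones a plot
# restricted to complete pairs could display
n_complete <- sum(complete.cases(toy))
print(n_complete)
```

Following the text, the "reg missing" cases would land on the vertical axis and the "hgt missing" cases on the horizontal axis, while a plot limited to complete pairs would show only the "both observed" cases.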

The 'grammar of graphics' makes it easy to adjust the plots programmatically. For example, we could be interested in the differences in growth data between the city and other regions. Add facets based on a clustering variable with:

