Fix typo in definition of tidy data in README #1521

Merged: 2 commits, Nov 1, 2023

mfgeary
Contributor

@mfgeary mfgeary commented Sep 27, 2023

The README files (README.md and README.Rmd) have a typo in the definition of tidy data. I changed the text from "Every column is variable" to "Every column is a variable".

It's a very minor change but important because these sentences have different meanings. Without the "a", it is more difficult to understand what tidy data is, and this is the first thing you see when you go to tidyr's homepage.

@matthewjnield
Contributor

matthewjnield commented Oct 2, 2023

@mfgeary Good catch!

I'm adding an idea here for the maintainers' consideration:

Should the definition of tidy data in the tidyr README and website be updated to match the one that is presented in "R for Data Science (2e)", or perhaps vice-versa? The two definitions are very close, and I think they get the same point across, but tidy data is a core concept in the tidyverse, and having a uniform definition (whichever of these two makes the concept clearer) in these two places may be a good idea.

hadley merged commit 11aa98f into tidyverse:main on Nov 1, 2023
@hadley
Member

hadley commented Nov 1, 2023

Thanks @mfgeary!

@matthewjnield isn't it the same definition?

@matthewjnield
Contributor

@hadley they are very close; I am only wondering if the extra wording in R for Data Science (2e) gets the point across slightly better and should be added to the tidyr README.

Here are the parts from each source I am referring to:

R for Data Science (2e):

  1. Each variable is a column; each column is a variable.
  2. Each observation is a row; each row is an observation.
  3. Each value is a cell; each cell is a single value.

tidyr README:

  1. Every column is a variable.
  2. Every row is an observation.
  3. Every cell is a single value.
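
As a rough illustration of what both versions of the three rules ask for, here is a minimal Python sketch. It is not taken from tidyr or R for Data Science: the table contents, the column names, and the `to_tidy` helper are all hypothetical and only meant to show the idea.

```python
# Hypothetical untidy table: the "1999" and "2000" column names are really
# values of a "year" variable, so not every column is a variable.
wide = [
    {"country": "A", "1999": 745, "2000": 2666},
    {"country": "B", "1999": 37737, "2000": 80488},
]


def to_tidy(rows, id_col, names_to, values_to):
    """Reshape wide rows so that each variable is a column,
    each observation is a row, and each cell holds a single value."""
    tidy = []
    for row in rows:
        for key, value in row.items():
            if key == id_col:
                continue  # the identifier is copied into every output row
            tidy.append({id_col: row[id_col], names_to: key, values_to: value})
    return tidy


tidy = to_tidy(wide, id_col="country", names_to="year", values_to="cases")
for record in tidy:
    print(record)
```

After reshaping, every row is one country-year observation and every column (`country`, `year`, `cases`) is one variable. In tidyr itself, this kind of reshaping is what `pivot_longer()` is for; the helper above only mimics the idea.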

@hadley
Member

hadley commented Nov 2, 2023

Oh I see. Do you want to do a PR?

@mfgeary
Contributor Author

mfgeary commented Nov 2, 2023

@matthewjnield @hadley I chose the shorter definition to be consistent across tidyr, as it is also used in the tidy data vignette. If we switch to the definition from R4DS, I suggest switching all definitions of tidy data in tidyr to be consistent.

@matthewjnield
Contributor

@hadley Certainly; could you please take a look at #1530 to determine whether that change will be merged or not, before I add another pull request affecting the README for this? That way I can keep the unrelated changes in separate PRs, and keep them coming from a single branch to avoid conflicts in case the first change isn't merged.

@mfgeary Thanks for pointing out the appearance of the definition in the vignette.

@matthewjnield
Contributor

This is ready, in #1532.
