Add R package for users
almahmoud committed Jan 10, 2025
1 parent 7fb0a0b commit 3368d0d
Showing 11 changed files with 238 additions and 0 deletions.
5 changes: 5 additions & 0 deletions BiocHubsIngestR/.Rbuildignore
@@ -0,0 +1,5 @@
^.*\.Rproj$
^\.Rproj\.user$
^_pkgdown\.yml$
^docs$
^pkgdown$
5 changes: 5 additions & 0 deletions BiocHubsIngestR/.gitignore
@@ -0,0 +1,5 @@
.Rproj.user
.Rhistory
.RData
.Ruserdata
docs
20 changes: 20 additions & 0 deletions BiocHubsIngestR/DESCRIPTION
@@ -0,0 +1,20 @@
Package: BiocHubsIngestR
Title: Upload Data to Bioconductor Hubs Ingest Endpoints
Version: 0.99.0
Authors@R:
    person("Alexandru", "Mahmoud", email = "[email protected]", role = c("aut", "cre"))
Description: Facilitates uploading data to temporary S3 endpoints created by the Bioconductor Hubs Ingest stack.
    Provides simple functions to configure credentials and upload local data directories.
License: Artistic-2.0
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.2
Imports:
    aws.s3
Suggests:
    testthat (>= 3.0.0),
    knitr,
    rmarkdown
Config/testthat/edition: 3
biocViews: Infrastructure
VignetteBuilder: knitr
4 changes: 4 additions & 0 deletions BiocHubsIngestR/NAMESPACE
@@ -0,0 +1,4 @@
# Generated by roxygen2: do not edit by hand

export(auth)
export(upload)
30 changes: 30 additions & 0 deletions BiocHubsIngestR/R/credentials.R
@@ -0,0 +1,30 @@
#' Set AWS credentials for Bioconductor Hubs Ingest endpoint
#'
#' @param username Character string of the username provided by administrator
#' @param password Character string of the password/key provided by administrator
#' @param endpoint Optional custom endpoint URL. If NULL, constructs from username
#'
#' @return Invisible NULL, sets environment variables as side effect
#' @export
#'
#' @examples
#' \dontrun{
#' BiocHubsIngestR::auth("myusername", "mypassword")
#' }
auth <- function(username, password, endpoint = NULL) {
    # Require single, non-missing strings so sprintf() and Sys.setenv() behave
    is_string <- function(x) is.character(x) && length(x) == 1L && !is.na(x)
    if (!is_string(username) || !is_string(password))
        stop("Username and password must be character strings")

    # Without an explicit endpoint, derive it from the username
    if (is.null(endpoint)) {
        endpoint <- sprintf("https://%s.hubsingest.bioconductor.org", username)
    }

    Sys.setenv(
        AWS_ACCESS_KEY_ID = username,
        AWS_SECRET_ACCESS_KEY = password,
        AWS_DEFAULT_REGION = "",
        AWS_S3_ENDPOINT = endpoint
    )

    invisible(NULL)
}
44 changes: 44 additions & 0 deletions BiocHubsIngestR/R/upload.R
@@ -0,0 +1,44 @@
#' Upload a local file or directory to the Bioconductor Hubs Ingest S3 endpoint
#'
#' @param path Character string path to local directory or file to upload
#' @param bucket Character string of bucket name, defaults to "userdata"
#'
#' @return Invisible list of the object keys (relative paths) that were uploaded
#' @export
#'
#' @examples
#' \dontrun{
#' BiocHubsIngestR::upload("path/to/data")
#' }
upload <- function(path, bucket = "userdata") {
    if (!file.exists(path))
        stop("Path does not exist: ", path)

    if (!aws.s3::bucket_exists(bucket)) {
        message("Creating bucket: ", bucket)
        aws.s3::put_bucket(bucket)
    }

    if (dir.exists(path)) {
        # Paths relative to `path` become the object keys. Listing them
        # directly avoids treating `path` as a regular expression.
        rel_paths <- list.files(path, recursive = TRUE)
        files <- file.path(path, rel_paths)
    } else {
        rel_paths <- basename(path)
        files <- path
    }

    uploaded <- lapply(seq_along(files), function(i) {
        message("Uploading: ", rel_paths[i])
        aws.s3::put_object(
            file = files[i],
            object = rel_paths[i],
            bucket = bucket
        )
        rel_paths[i]
    })

    invisible(uploaded)
}
26 changes: 26 additions & 0 deletions BiocHubsIngestR/man/auth.Rd

(Generated file; diff not rendered.)

24 changes: 24 additions & 0 deletions BiocHubsIngestR/man/upload.Rd

(Generated file; diff not rendered.)

4 changes: 4 additions & 0 deletions BiocHubsIngestR/tests/testthat.R
@@ -0,0 +1,4 @@
library(testthat)
library(BiocHubsIngestR)

test_check("BiocHubsIngestR")
22 changes: 22 additions & 0 deletions BiocHubsIngestR/tests/testthat/test-credentials.R
@@ -0,0 +1,22 @@
test_that("auth sets environment variables correctly", {
    old_env <- Sys.getenv(c("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY",
                            "AWS_DEFAULT_REGION", "AWS_S3_ENDPOINT"))
    on.exit(do.call(Sys.setenv, as.list(old_env)))

    username <- "testuser"
    password <- "testpass"
    auth(username, password)

    expect_equal(Sys.getenv("AWS_ACCESS_KEY_ID"), username)
    expect_equal(Sys.getenv("AWS_SECRET_ACCESS_KEY"), password)
    expect_equal(Sys.getenv("AWS_DEFAULT_REGION"), "")
    expect_equal(
        Sys.getenv("AWS_S3_ENDPOINT"),
        "https://testuser.hubsingest.bioconductor.org"
    )
})

test_that("auth validates inputs", {
    expect_error(auth(123, "pass"))
    expect_error(auth("user", 123))
})
54 changes: 54 additions & 0 deletions BiocHubsIngestR/vignettes/BiocHubsIngestR.Rmd
@@ -0,0 +1,54 @@
---
title: "Using BiocHubsIngestR"
author: "Alexandru Mahmoud"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Using BiocHubsIngestR}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## Introduction

The `BiocHubsIngestR` package provides functionality for uploading data to temporary S3 endpoints created by the Bioconductor Hubs Ingest stack. Before proceeding with data upload, you must:

1. Have an S3 ingestion endpoint generated by a Bioconductor admin
2. Receive access credentials (username and password) from the admin

## Authentication

First, set up your credentials using the `auth()` function:

```{r eval=FALSE}
library(BiocHubsIngestR)
# Replace with your actual credentials provided by the admin
auth(username = "your_username", password = "your_password")
```
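
As a rough sketch of what this call does, the base R snippet below mirrors the body of `auth()` as it appears in `R/credentials.R`: it derives the default endpoint from the username and stores the values in environment variables. The credentials are placeholders, and passing a custom `endpoint` to `auth()` would replace the derived URL.

```r
# Placeholder credentials, for illustration only
username <- "your_username"
password <- "your_password"

# Default endpoint pattern used when `endpoint` is not supplied
endpoint <- sprintf("https://%s.hubsingest.bioconductor.org", username)

Sys.setenv(
    AWS_ACCESS_KEY_ID = username,
    AWS_SECRET_ACCESS_KEY = password,
    AWS_DEFAULT_REGION = "",
    AWS_S3_ENDPOINT = endpoint
)

# Inspect the endpoint that later calls in this session will see
Sys.getenv("AWS_S3_ENDPOINT")
```

Because the values live in environment variables, they last only for the current R session and need to be set again after a restart.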

## Uploading Data

Once authenticated, you can upload either individual files or entire directories:

```{r eval=FALSE}
# Upload a single file
upload("path/to/your/file.txt")
# Upload an entire directory
upload("path/to/your/data/directory")
```

The `upload()` function creates the bucket (`userdata` by default) if it does not already exist, then transfers each file to the S3 endpoint. A progress message is printed for every file as it is uploaded.
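
To preview how a directory maps to object names, the sketch below lists files relative to the directory root, which is how `upload()` names objects in the bucket. The directory and file names are made up for illustration, and nothing is sent over the network.

```r
# Build a small throwaway directory (hypothetical layout)
data_dir <- tempfile("ingest_demo_")
dir.create(file.path(data_dir, "sub"), recursive = TRUE)
writeLines("a", file.path(data_dir, "top.txt"))
writeLines("b", file.path(data_dir, "sub", "nested.txt"))

# Paths relative to the directory root; nested folders are kept in the name
keys <- list.files(data_dir, recursive = TRUE)
print(keys)
```

For a single file, only the file name (its `basename()`) is used as the object name.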

## Important Notes

1. After completing your data upload, notify the Bioconductor admin you are coordinating with.
2. Once you inform the admin that the upload is complete, the endpoint will be locked for processing and no further uploads will be possible.
3. Make sure all your data is uploaded completely before notifying the admin.
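
One way to check completeness before notifying the admin is to compare local files against the names already in the bucket. The helper `missing_uploads()` below is hypothetical and not part of the package; the remote listing is simulated here, and the commented `aws.s3::get_bucket_df()` call is only an assumption about how you might fetch it with working credentials.

```r
# Hypothetical helper: which local files are absent from a set of remote keys?
missing_uploads <- function(path, remote_keys) {
    local_keys <- if (dir.exists(path)) {
        list.files(path, recursive = TRUE)
    } else {
        basename(path)
    }
    setdiff(local_keys, remote_keys)
}

# Demo with a throwaway directory and a made-up remote listing
demo_dir <- tempfile("ingest_check_")
dir.create(demo_dir)
writeLines("x", file.path(demo_dir, "a.txt"))
writeLines("y", file.path(demo_dir, "b.txt"))
still_missing <- missing_uploads(demo_dir, remote_keys = "a.txt")
print(still_missing)

# With real credentials, the remote keys might come from the bucket listing
# (assumption; large buckets may need the `max` argument raised):
# remote_keys <- aws.s3::get_bucket_df("userdata")$Key
```

An empty result suggests every local file has a matching object name, though it does not verify file contents.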

For questions or issues, please contact the Bioconductor team or open an issue on the package's GitHub repository.
