provision S3 buckets and credentials for temp space to hold 1) the incoming content to run through speech-to-text generation, and 2) the generated transcript/caption files. #6

jmartin-sul · 2024-09-12T02:42:05Z

Ultimately we'll want these defined and provisioned via Terraform, but it's fine to do initial experimentation by manually creating things in the AWS web console or using one-off AWS CLI commands. Whatever feels like the most comfortable path forward is fine (i.e. it's also fine to do it the hard way and try to use Terraform the whole time). See the guidelines at the top of #1

We want this for stage to start, at least. We can add prod and QA as we move closer to productionizing.
We probably want to assume we'll use AWS via Cardinal Cloud (see ops) for this.
But as an experiment, try doing this in GCP (same Cardinal Cloud caveat!) and connecting with the Ruby AWS SDK gem. The gem should work with any S3-compatible service (we use it with IBM for preservation, and we expect to use it with GCP when we migrate our IBM archived content to GCP sometime next year).


edsu · 2024-09-12T17:01:59Z

I think we want to be doing this in Terraform?

edsu · 2024-09-16T15:08:25Z

I created the stage and dev buckets manually using our sul-developers credentials:

aws --profile sul-developers s3 mb s3://sul-speech-to-text-stage
aws --profile sul-developers s3 mb s3://sul-speech-to-text-dev

s3://sul-speech-to-text-stage
s3://sul-speech-to-text-dev
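For anyone repeating this for another environment, the two commands above could be wrapped in a small loop. This is only a sketch: the profile and bucket prefix come from the commands above, while the `DRY_RUN` switch and the `bucket_name` and `make_bucket` helpers are made up for illustration. It defaults to printing the commands rather than running them.

```shell
#!/usr/bin/env bash
# Sketch: create one temp-space bucket per environment.
# Defaults to a dry run that only prints the commands; set DRY_RUN=0 to run them.
set -eu

PROFILE="${PROFILE:-sul-developers}"   # profile name from the commands above
PREFIX="sul-speech-to-text"            # bucket prefix from the commands above
DRY_RUN="${DRY_RUN:-1}"                # hypothetical switch, not an aws CLI option

# Print the bucket name for a given environment, e.g. "stage".
bucket_name() {
  printf '%s-%s\n' "$PREFIX" "$1"
}

# Create the bucket for one environment, or print the command in dry-run mode.
make_bucket() {
  local bucket
  bucket="$(bucket_name "$1")"
  if [ "$DRY_RUN" = "1" ]; then
    echo "aws --profile $PROFILE s3 mb s3://$bucket"
  else
    aws --profile "$PROFILE" s3 mb "s3://$bucket"
  fi
}

for env in stage dev; do
  make_bucket "$env"
done
```

Run as-is it only echoes the two `aws ... s3 mb` commands; running with `DRY_RUN=0` would actually create the buckets, so that assumes the named profile has permission to do so.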

I think ultimately ops will want to create the production bucket with prod credentials and Terraform.

jmartin-sul · 2024-09-24T18:37:46Z

Marking this done and tracking the Terraform work in https://github.com/sul-dlss/terraform-aws/issues/1170

jmartin-sul mentioned this issue Sep 12, 2024: [EPIC] Prototype workflow for generating and accessioning speech-to-text extraction #1 (Open, 37 tasks)

andrewjbtw assigned andrewjbtw and edsu and unassigned andrewjbtw Sep 18, 2024

jmartin-sul closed this as completed Sep 24, 2024
