provision S3 buckets and credentials for temp space to hold 1) the incoming content to run through speech-to-text generation, and 2) the generated transcript/caption files. #6

Closed
Tracked by #1
jmartin-sul opened this issue Sep 12, 2024 · 3 comments

@jmartin-sul
Member

jmartin-sul commented Sep 12, 2024

Ultimately we'll want these defined and provisioned via Terraform, but it's fine to do initial experimentation by manually creating things in the AWS web console or using one-off aws CLI commands. Whatever feels like the most comfortable path forward is fine (i.e., it's also fine to do it the hard way and use Terraform the whole time). See the guidelines at the top of #1.

  • we want this for stage to start, at least. We can add prod and QA as we move closer to productionizing.
  • we probably want to assume we'll use AWS via Cardinal Cloud (see ops) for this.
  • but as an experiment, try doing this in GCP (same Cardinal Cloud caveat!), and connecting with the Ruby AWS SDK gem. The gem should work with any S3-compatible service (we use it with IBM for preservation, and we expect to use it with GCP when we migrate our IBM archived content to GCP sometime next year).
@edsu
Contributor

edsu commented Sep 12, 2024

I think we want to be doing this in Terraform?

jmartin-sul changed the title from "provision S3 buckets and credentials for temp space to hold 1) the incoming content to be transcribed/captioned, and 2) the generated transcript/caption files." to "provision S3 buckets and credentials for temp space to hold 1) the incoming content to run through speech-to-text generation, and 2) the generated transcript/caption files." on Sep 13, 2024
@edsu
Contributor

edsu commented Sep 16, 2024

I created the stage and dev buckets manually using our sul-developers credentials:

aws --profile sul-developers s3 mb s3://sul-speech-to-text-stage
aws --profile sul-developers s3 mb s3://sul-speech-to-text-dev
  • s3://sul-speech-to-text-stage
  • s3://sul-speech-to-text-dev
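
For anyone repeating the manual step above, here is a minimal sketch of the same two commands wrapped in a loop, assuming the bucket names follow the `sul-speech-to-text-<env>` pattern shown. The `DRY_RUN` and `ENVIRONMENTS` variables and the `make_buckets` function are hypothetical names introduced for illustration, not part of this project. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: create one temp-space bucket per environment.
# ENVIRONMENTS, DRY_RUN and make_buckets are hypothetical names for this example.
set -eu

PROFILE="sul-developers"                   # profile used in the comment above
ENVIRONMENTS="${ENVIRONMENTS:-stage dev}"  # space-separated environment names
DRY_RUN="${DRY_RUN:-1}"                    # 1 = print commands, 0 = run them

make_buckets() {
  for env in $ENVIRONMENTS; do
    bucket="sul-speech-to-text-${env}"
    if [ "$DRY_RUN" = "1" ]; then
      # Print the command instead of touching AWS.
      echo "aws --profile $PROFILE s3 mb s3://$bucket"
    else
      aws --profile "$PROFILE" s3 mb "s3://$bucket"
    fi
  done
}

make_buckets
```

Running it with `DRY_RUN=0` would issue the real `aws s3 mb` calls, which need valid credentials for the named profile. This is a stopgap for experimentation; the Terraform route discussed in this thread remains the intended long-term path.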

I think ultimately ops will want to create the production bucket with prod credentials and Terraform.

andrewjbtw assigned andrewjbtw and edsu and unassigned andrewjbtw on Sep 18, 2024
@jmartin-sul
Member Author

marking this done and tracking the terraform work in https://github.com/sul-dlss/terraform-aws/issues/1170
