Add samples for Headless PrPr (single node) #156

Open
wants to merge 1 commit into
base: main

sfc-gh-dhung (Collaborator) commented Jan 21, 2025

Add guides for getting started with Headless Runtime PrPr

# Headless Container Runtime Jobs

## Setup

Collaborator

Let's add an intro section here to put these steps into context. Something like "The remote ML execution framework relies on SPCS jobs (link to doc) to execute the user's code inside the Container Runtime environment..."

Collaborator

Also, let's tell the user that the steps below set up their environment for using our remote execution framework.

Collaborator Author

Let's do that in a separate document. I'm envisioning this README as a super concise "getting started" guide with the assumption that the user already wants to use headless. Overall we'd probably want each of these as separate documents:

  1. Overview - provide background, motivations, and introduce concepts. Similar to https://docs.snowflake.com/en/developer-guide/snowflake-ml/container-runtime-ml
  2. API Reference - e.g. https://docs.snowflake.com/en/developer-guide/snowpark-ml/reference/latest/index
  3. Quick start (i.e. this README) - short technical guide to get the user up and running in <5 minutes, with pointers to additional resources for more advanced usage
  4. Tutorials (e.g. pytorch-cifar10/README.md and xgb-loan-apps/README.md) - full end-to-end walkthroughs

The intro sections would go in No.1 (Overview). WDYT?

@@ -0,0 +1,260 @@
# Headless Container Runtime Jobs
Collaborator

Fine for now, but let's talk to marketing and consider a more appropriate name.

INSTANCE_FAMILY = CPU_X64_S -- See https://docs.snowflake.com/en/sql-reference/sql/create-compute-pool
```

### Function Dispatch
Collaborator

I'd also include an intro here.

Maybe you can pull from the PRD. Something like "Users who want the benefits of Snowflake's Container Runtime for ML (link to docs), including flexibility with packages, a choice of CPUs or GPUs, and the ability to use distributed APIs to scale their workloads, but who want to work from their own IDE, can instrument their code to execute remotely..."

)
```

### Airflow Integration
Collaborator

Here as well. Let's provide an intro explaining how this differs from any other Airflow integration, i.e. building an ML pipeline so that steps in the workflow can execute in the Container Runtime () with benefits such as ...


2 participants