Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: expose interface for getting info at building #1267

Open
cutecutecat opened this issue Dec 5, 2022 · 11 comments
Open

feat: expose interface for getting info at building #1267

cutecutecat opened this issue Dec 5, 2022 · 11 comments

Comments

@cutecutecat
Copy link
Member

cutecutecat commented Dec 5, 2022

Description

We introduce a global module env with a getter function os() in starlark API

This can exposed some variables for subsequent building procedure, especially for some library function developing, they can automatically detect os version or language, and be compatible with them without user input these information again.
There could be more getters implementation in this module in the future.

Examples

def build():
    base(os="ubuntu18.04", language="python3.8")
    print(env.os())
    # output "ubuntu18.04"
def build():
    base(os="ubuntu20.04", language="python3.8")
    print(env.os())
    # output "ubuntu20.04"

PS. It would be better to be a lazy-evaluate variable env.os for design, but starlark doesn't support lazy-evaluate!

ref: tensorchord/envdlib#14


Message from the maintainers:

Love this enhancement proposal? Give it a 👍. We prioritise the proposals with the most 👍.

@cutecutecat
Copy link
Member Author

/assign

@cutecutecat cutecutecat changed the title feat: Exposed interface for getting info of building feat: expose interface for getting info at building Dec 5, 2022
@VoVAllen
Copy link
Member

VoVAllen commented Dec 5, 2022

We had some discussions here #1132 (comment) about the interface. I think it's better to make it as a key-value interface instead of methods.

@kemingy
Copy link
Member

kemingy commented Dec 5, 2022

We had some discussions here #1132 (comment) about the interface. I think it's better to make it as a key-value interface instead of methods.

What's the advantage of key-value interface? I think this info is not static, thus a method would be suitable.

@VoVAllen
Copy link
Member

VoVAllen commented Dec 5, 2022

@kemingy So that every function can provide such information. For example envdlib can provide torch as function. Contributor can also provide torch version in the function. Other packages relying on torch can get such information.

@kemingy
Copy link
Member

kemingy commented Dec 5, 2022

@kemingy So that every function can provide such information. For example envdlib can provide torch as function. Contributor can also provide torch version in the function. Other packages relying on torch can get such information.

Do you mean that it's also mutable and can be modified with assignment statements?

@VoVAllen
Copy link
Member

VoVAllen commented Dec 5, 2022

Yes. Something like env.get("os/version", "20.04"), env.get("os/name", "ubuntu") or env["package/torch"]="1.12.0"

env.get(key, default_value). The same syntax as python dict

@cutecutecat
Copy link
Member Author

cutecutecat commented Dec 5, 2022

I like this way, it could act as a hack of lazy-evaluate, if the os is a value, it would be more pythonic.

def get_os():
    info_dict = get_build_info()
    return info_dict["os"]

As it actually convey a copy/value of graph, not reference of graph, people may mistakenly think they can write the graph.

Apart from that, we couldn't provide the exact graph, but only parts of it and flatten one, as starlark don't support class/struct.

Rather than named get_build_graph, get_build_status/get_build_data/get_build_info might be better.

As it is a dict of fixed keys, We can use TypedDict for Python type hint:

from typing import TypedDict

class Movie(TypedDict):
    name: str
    year: int

movie: Movie = {'name': 'Blade Runner',
                'year': 1982}

ref: https://peps.python.org/pep-0589/

@VoVAllen
Copy link
Member

VoVAllen commented Dec 5, 2022

@cutecutecat Actually I think it's fine for the users to write into the key-value graph. For example, some library will depends on tensorrt version, and tensorrt is only provided in envdlib. User may want to check the tensorrt version when installing other libraries

@cutecutecat
Copy link
Member Author

cutecutecat commented Dec 5, 2022

@cutecutecat Actually I think it's fine for the users to write into the key-value graph. For example, some library will depends on tensorrt version, and tensorrt is only provided in envdlib. User may want to check the tensorrt version when installing other libraries

@VoVAllen @kemingy

If we support write to build graph and let it build, than user or envdlib developer are able to overwrite the key set in build procedure, which destroys envd encapsulation, like:

def build():
    base(os="ubuntu18.04", language="python3.8")
    info_dict = get_build_graph()
    info_dict["os"] = "centos6.5"
    info_dict["language"] = "R"

If these setters occurs in different extensions, it is difficult to debug where the change happens.

I think we should only support read to build graph, as it's primary element, could be revised by envd api only.

Obviously, for envdlib developer to set some persistent variables, we could introduce another API, like environment variable, it supports both read and write, and may work like this

env.set('tensorchord.envdlib.tensorrt.version', "1.0")
version = env.get('tensorchord.envdlib.tensorrt.version')

and the build graph API likes this:

os = env.graph.os()

@cutecutecat
Copy link
Member Author

There might need more discussion before proceeding this.

/unassign

@VoVAllen
Copy link
Member

VoVAllen commented Dec 6, 2022

Some thoughts:

  • Override can be a feature instead of a bug. Just like dynamic language as Python, you can monkey patch any methods on your own. For now, I don't know whether we need to let users do this.
  • To solve the overwrite problem, we can simply raise errors when the program writes to an existing key. Thus env can be a special dictionary, with some internal constraint. We can also add protection by prefixes such as envd/os/version, key starts with envd/ is protected. The only positive point I can see about env.os() over env['os'] is that method seems guaranteed to be accessible to the users, but the key might be missing in the dictionary. However, I still think the flexibility provided key-value like behavior is needed in this proposal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants