[BUG] No default image is assigned to an agent running a containerized task #6157

Open
2 tasks done
JiangJiaWei1103 opened this issue Jan 10, 2025 · 2 comments
Labels
bug Something isn't working

Comments

@JiangJiaWei1103
Contributor

JiangJiaWei1103 commented Jan 10, 2025

Describe the bug

When developing and testing a new agent locally by following this guide, we found a case in which no default image is assigned:

[Screenshot 2025-01-10 at 9:35:47 PM]

Possible Patch

In the execute method of AsyncAgentExecutorMixin, the image_config field of SerializationSettings isn't properly initialized, as shown here. Hence, the error is raised when both of the following hold:

  1. container_image isn't specified in the corresponding task and
  2. ctx.serialization_settings is None

SyncAgentExecutorMixin may have the same issue.
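
A minimal sketch of the fallback this patch suggests, assuming the fix is to supply a default image whenever both conditions hold. It uses simplified stand-in classes rather than flytekit's real ones: `DEFAULT_IMAGE`, `resolve_image`, and the two dataclasses below are hypothetical and only model the failing case, so the actual change in flytekit may look different.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical placeholder; not a real flytekit default image.
DEFAULT_IMAGE = "example.registry/flytekit:placeholder"


@dataclass
class ImageConfig:
    """Stand-in for an image config that may lack a default image."""
    default_image: Optional[str] = None


@dataclass
class SerializationSettings:
    """Stand-in for serialization settings carrying an image config."""
    image_config: ImageConfig = field(default_factory=ImageConfig)


def resolve_image(
    container_image: Optional[str],
    settings: Optional[SerializationSettings],
) -> str:
    """Pick the image for a task, falling back to a default when needed."""
    # Case 1: the task specifies container_image explicitly, so use it.
    if container_image is not None:
        return container_image
    # Case 2: no settings at all (the local-run case from this issue),
    # or settings without a default image; build settings with a default.
    if settings is None or settings.image_config.default_image is None:
        settings = SerializationSettings(
            image_config=ImageConfig(default_image=DEFAULT_IMAGE)
        )
    return settings.image_config.default_image
```

With this shape, a task that omits `container_image` during a local run resolves to the default image instead of raising an error, while an explicitly specified image still takes precedence.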

Expected behavior

The local test should pass even if a user doesn't explicitly specify the container_image parameter in the corresponding task.

Additional context to reproduce

Test the Slurm agent locally with the following example:

import os

from flytekit import workflow
from flytekitplugins.slurm import Slurm, SlurmTask


echo_job = SlurmTask(
    name="echo-job-name",
    task_config=Slurm(
        slurm_host="slurm",
        # Specify a remote Slurm batch script
        batch_script_path="/home/abaowei/echo.slurm",
        batch_script_args=["--debug"],
        sbatch_conf={
            "partition": "debug",
            "job-name": "tiny-slurm",
        }
    )
    # No container_image is specified here
)


@workflow
def wf() -> None:
    echo_job()


if __name__ == "__main__":
    from flytekit.clis.sdk_in_container import pyflyte
    from click.testing import CliRunner

    runner = CliRunner()
    path = os.path.realpath(__file__)

    # Local run
    print(">>> LOCAL EXEC <<<")
    result = runner.invoke(pyflyte.main, ["run", path, "wf"])
    print(result.output)

Screenshots

No response

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
@JiangJiaWei1103 JiangJiaWei1103 added bug Something isn't working untriaged This issues has not yet been looked at by the Maintainers labels Jan 10, 2025
@eapolinario eapolinario removed the untriaged This issues has not yet been looked at by the Maintainers label Jan 16, 2025
@eapolinario
Contributor

@JiangJiaWei1103, are you looking into this? (I'm asking because I see you working on flyteorg/flytekit#3005.)

Let me know if that's not the case, ok?

@JiangJiaWei1103
Contributor Author

Hi @eapolinario,

Yes, I'm currently working on the Slurm agent and encountered this issue when running a local agent test. I'll look into it and fix it.

@davidmirror-ops davidmirror-ops moved this from Backlog to Assigned in Flyte Issues/PRs maintenance Jan 23, 2025