Issue when setting up a flow with concurrency limits #2709

Closed
peauc opened this issue Dec 12, 2023 · 9 comments

peauc commented Dec 12, 2023

Explain the bug

There might be an issue with the latest Kestra release.
I modified an existing flow and added the new concurrency limits (https://kestra.io/docs/developer-guide/concurrency).

I deployed my changes with a CI/CD pipeline (https://kestra.io/docs/developer-guide/cicd).
However, executions can still start concurrently, even though the flow definition shown in the UI is up to date.

After I made a minor change to the flow and saved it, the concurrency limit was enforced.
I have reproduced this bug twice, on two different Kestra deployments.

Environment Information

  • Kestra Version: 13.0
  • Operating System and Java Version (if not using Kestra Docker image): docker image 13.0-full

peauc added the bug label Dec 12, 2023
anna-geller added this to the v0.15.0 milestone Dec 12, 2023

anna-geller (Member) commented Dec 12, 2023

are you suggesting that concurrency limits are not active when you deploy your flow from CI? what kind of CI do you use - Terraform, GitHub Actions, something else?

can you provide a reproducer, i.e. a specific flow for which it happened, plus context on how exactly you deployed the flow? did you see concurrency limits being configured in the flow after deploying it from CI?

it might be some issue with the trigger or some way executions are triggered that used an old version of the flow - can you validate and provide an exact flow version for which this happened?

So far I couldn't reproduce it with a Terraform deployment of this flow:

id: concurrency_limited_flow
namespace: dev

concurrency:
  behavior: QUEUE # or CANCEL or FAIL
  limit: 1

tasks:
  - id: bash
    type: io.kestra.plugin.scripts.shell.Commands
    commands:
      - sleep 10

[image]
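
To make the three `behavior` values concrete, here is a toy Python model of a flow-level concurrency limit. It only illustrates the semantics implied by the flow above (QUEUE holds a new execution until a slot frees up, while CANCEL and FAIL end it immediately). The `ConcurrencyLimiter` class, its method names, and the state strings are hypothetical; this is not Kestra's implementation or API.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ConcurrencyLimiter:
    """Toy model of a per-flow concurrency limit (not Kestra's implementation)."""

    limit: int
    behavior: str = "QUEUE"  # or "CANCEL" or "FAIL", as in the flow above
    running: set = field(default_factory=set)
    queued: deque = field(default_factory=deque)

    def submit(self, execution_id: str) -> str:
        """Return the (illustrative) state the new execution ends up in."""
        if len(self.running) < self.limit:
            self.running.add(execution_id)
            return "RUNNING"
        if self.behavior == "QUEUE":
            self.queued.append(execution_id)
            return "QUEUED"
        if self.behavior == "CANCEL":
            return "CANCELLED"
        if self.behavior == "FAIL":
            return "FAILED"
        raise ValueError(f"unknown behavior: {self.behavior}")

    def finish(self, execution_id: str) -> None:
        """Mark an execution as done and start the oldest queued one, if any."""
        self.running.discard(execution_id)
        if self.queued and len(self.running) < self.limit:
            self.running.add(self.queued.popleft())


limiter = ConcurrencyLimiter(limit=1, behavior="QUEUE")
first = limiter.submit("exec-1")   # takes the only slot
second = limiter.submit("exec-2")  # limit reached, so it is queued
limiter.finish("exec-1")           # frees the slot; exec-2 starts
print(first, second, sorted(limiter.running))
```

With `limit: 1` and `behavior: QUEUE`, the second execution should wait rather than run alongside the first, which is the behavior the reporter expected after deploying.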

peauc (Author) commented Dec 13, 2023

Hello, thanks for your answer.

This is my flow


id: XXX
namespace: XXX
description: XXX
variables:
  crontab: "XXX"

concurrency:
  behavior: QUEUE
  limit: 1

tasks:
  - id: call-sentry-start-process
    type: io.kestra.core.tasks.flows.Flow
    namespace: admin
    flowId: post-cron-sentry-call
    inputs:
      crontab: "{{ vars.crontab }}"
    wait: true
    outputs:
      checkindId: "{{outputs.returnCheckinId.value}}"

  - id: XXX
    type: io.kestra.plugin.jdbc.postgresql.Query
    url: "{{ envs.postgis_jdbc_url }}"
    username: "postgres"
    password: "{{ secret('PGPASSWORD') }}"
    sql: XXX
  - id: XXX
    type: io.kestra.plugin.jdbc.postgresql.Query
    url: "{{ envs.postgis_jdbc_url }}"
    username: "postgres"
    password: "{{ secret('PGPASSWORD') }}"
    sql: XXX
  - id: XXX
    type: io.kestra.core.tasks.flows.Parallel
    concurrent: 2
    tasks:
      - id: XXX
        type: io.kestra.plugin.jdbc.postgresql.Query
        url: "{{ envs.postgis_jdbc_url }}"
        username: "postgres"
        password: "{{ secret('PGPASSWORD') }}"
        sql: XXX
      - id: XXX
        type: io.kestra.plugin.jdbc.postgresql.Query
        url: "{{ envs.postgis_jdbc_url }}"
        username: "postgres"
        password: "{{ secret('PGPASSWORD') }}"
        sql: XXX
  - id: "XXX"
    type: "io.kestra.plugin.fs.http.Request"
    uri: "{{ envs.XXX }}{{ envs.XXX }}"
    method: "GET"
  - id: launch
    type: io.kestra.core.tasks.flows.Flow
    namespace: varnish
    flowId: XXX
    wait: true
    transmitFailed: true
  - id: ingest
    type: io.kestra.plugin.scripts.shell.Commands
    runner: PROCESS
    warningOnStdErr: false
    commands:
      - XXX

  - id: call-sentry-ok
    type: io.kestra.core.tasks.flows.Flow
    namespace: admin
    flowId: put-cron-sentry-call
    inputs:
      status: "ok"
      checkinId: "{{ outputs['call-sentry-start-process'].outputs.checkindId }}"

triggers:
  - id: schedule
    type: io.kestra.core.models.triggers.types.Schedule
    cron: "XXXX"
  - id: webhook
    type: io.kestra.core.models.triggers.types.Webhook
    key: XXXX

listeners:
  - tasks:
      - id: call-sentry-error
        type: io.kestra.core.tasks.flows.Flow
        namespace: admin
        flowId: put-cron-sentry-call
        inputs:
          status: "error"
          checkinId: "{{ outputs['call-sentry-start-process'].outputs.checkindId }}"
      - id: mail
        type: io.kestra.plugin.notifications.slack.SlackExecution
        url: "{{ envs.backoffice_slack_channel_webhook }}"
        customMessage: "[{{ envs.env }}]"
    conditions:
      - type: io.kestra.core.models.conditions.types.ExecutionStatusCondition
        in:
          - FAILED

For context:
I use Git to version my code and a GitLab CI/CD pipeline to deploy it.
The flow already existed in my Kestra deployment, so this was only an update.
My Kestra instance did not enforce the concurrency limit of 1, and I could start many executions at once.
I then modified the flow definition in Kestra's UI: I changed the limit from 1 to 2, saved, and changed it back from 2 to 1. After that, the concurrency limit was enforced and my new executions were properly queued.
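
One way to confirm a report like this is to compute the peak number of overlapping executions from their start and end times and compare it with the configured limit. The sketch below does that in plain Python. The `peak_concurrency` helper and the sample timestamps are hypothetical illustrations, not data from this issue and not part of Kestra.

```python
from datetime import datetime


def peak_concurrency(intervals):
    """Return the maximum number of intervals that overlap at any instant.

    Each interval is a (start, end) pair. An execution that ends at the same
    instant another one starts is not counted as overlapping.
    """
    events = []
    for start, end in intervals:
        events.append((start, 1))   # an execution starts
        events.append((end, -1))    # an execution ends
    # Sort by time; at equal times, ends (-1) are processed before starts (+1).
    events.sort(key=lambda event: (event[0], event[1]))
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


def at(minutes_seconds):
    """Build a timestamp on an arbitrary, made-up day."""
    return datetime.fromisoformat(f"2023-12-12T10:{minutes_seconds}")


# Hypothetical sample data: the first two executions overlap, the third does not.
executions = [
    (at("00:00"), at("00:10")),
    (at("00:05"), at("00:15")),
    (at("00:15"), at("00:25")),
]
limit = 1
peak = peak_concurrency(executions)
print(f"peak={peak}, limit respected: {peak <= limit}")
```

A peak above the configured `limit` would indicate that the limit was not enforced for those executions.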

anna-geller (Member) commented

thank you! it seems that there must have been some transient issue with the revision update. Can you schedule a meeting with @Ben8t to try to reproduce it live in a call? (we have no GitLab account, so it is hard for us to reproduce)

there is no difference in how flow metadata is stored in the backend for code deployed via CI vs. from the UI, so the only possibility seems to be the revision that was used for that execution.

peauc (Author) commented Dec 13, 2023

I had a call with @Ben8t. He escalated it internally.

Ben8t (Member) commented Dec 13, 2023

I'm able to reproduce with GitLab too 👍

anna-geller (Member) commented

It's fascinating; it really could be something with our GitLab CI setup. I was trying multiple variations of deploying with Terraform and the concurrency limit was always respected regardless of how I made updates:

[image]

Ben8t (Member) commented Dec 14, 2023

FYI, I tested with only the CLI (not GitLab) and it shows the same issue!
cc @Skraye

Skraye (Member) commented Dec 14, 2023

Fixed by #2714

Skraye closed this as completed Dec 14, 2023

peauc (Author) commented Dec 15, 2023

Thanks 💪 great work.

anna-geller modified the milestones: v0.15.0, v0.14.0 Jan 14, 2024