[release-0.9] Use Pod IP for peer communication #234

Conversation

openshift-cherrypick-robot

This is an automated cherry-pick of #220

/assign slintes

clobrano added 5 commits July 9, 2024 22:15
- re-enable and fix api check log tests in e2e test
  - use service IP for killing API connection
  - kill API connection on SNR DS pod
  - add peer check server logs and use them for test which can't
    get logs from unhealthy node's SNR agent pod
  - wait for pod deletion only, not restart (restart is caused by
    reboot, not SNR)
- refactor / cleanup e2e tests
- fix owner check / node name / machine name in peer check server
  and agent reconciler
- update sort-imports, which ignores generated files now

At startup (though it can happen at other times too), some peers' Pod
IPs can still be empty, which means that until the next peers update we
cannot check the connection with those peers.

Return an error in case a peer's Pod IP is empty.
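
The behavior described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `Peer` type, the `ErrEmptyPodIP` variable, and the `peerAddresses` function are hypothetical names introduced here, and the real change may be structured differently.

```go
package main

import (
	"errors"
	"fmt"
)

// Peer is a hypothetical stand-in for the per-peer pod data the agent keeps.
type Peer struct {
	NodeName string
	PodIP    string
}

// ErrEmptyPodIP signals a peer whose Pod IP has not been assigned yet.
// (Hypothetical name, not taken from the PR.)
var ErrEmptyPodIP = errors.New("peer has empty Pod IP")

// peerAddresses returns the Pod IP of every peer. It fails as soon as one
// peer has no IP, so the caller can retry after the next peers update
// instead of trying to contact an empty address.
func peerAddresses(peers []Peer) ([]string, error) {
	addrs := make([]string, 0, len(peers))
	for _, p := range peers {
		if p.PodIP == "" {
			return nil, fmt.Errorf("node %q: %w", p.NodeName, ErrEmptyPodIP)
		}
		addrs = append(addrs, p.PodIP)
	}
	return addrs, nil
}

func main() {
	peers := []Peer{
		{NodeName: "worker-0", PodIP: "10.128.0.12"},
		{NodeName: "worker-1"}, // Pod IP not assigned yet
	}
	if _, err := peerAddresses(peers); err != nil {
		fmt.Println("skipping peer check:", err)
	}
}
```

Wrapping a sentinel error lets callers distinguish "IP not ready yet" from other failures with `errors.Is`, which may or may not match how the actual code reports it.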

Signed-off-by: Carlo Lobrano <[email protected]>
Member

slintes commented Jul 9, 2024

why does this have merge conflicts? let's try again...

/close

Contributor

openshift-ci bot commented Jul 9, 2024

@slintes: Closed this PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci openshift-ci bot closed this Jul 9, 2024
Member

slintes commented Jul 9, 2024

argh, it doesn't create a new PR because of the merge conflict. Let's fix it here...

/reopen

Contributor

openshift-ci bot commented Jul 9, 2024

@slintes: Reopened this PR.


Member

slintes commented Jul 9, 2024

/lgtm
/approve

@openshift-ci openshift-ci bot added the lgtm label Jul 9, 2024
Contributor

openshift-ci bot commented Jul 9, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: openshift-cherrypick-robot, slintes

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Jul 9, 2024
Member

slintes commented Jul 10, 2024

4.12, 4.13 same issue

  "container create failed:
  time="2024-07-10T00:00:01Z" level=warning msg="cgroup: subsystem does not exist"
  time="2024-07-10T00:00:01Z" level=warning msg="cgroup: subsystem does not exist"
  time="2024-07-10T00:00:01Z" level=warning msg="cgroup: subsystem does not exist"
  time="2024-07-10T00:00:01Z" level=error msg="runc create failed: unable to start container process: exec: \"/manager\": stat /manager: no such file or directory"
  "

4.14 is new:

  [FAILED] Failed after 4.522s.
  Expected
      <time.Time>: 0001-01-01T00:00:00Z
  to be ==
      <time.Time>: 2024-07-09T22:56:51Z

seems we need to check the result of getBootTime more carefully

however, in general this is working
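
One possible shape for a stricter getBootTime check is sketched below. The expected value in the failure above, 0001-01-01T00:00:00Z, is Go's zero `time.Time`, so the sketch rejects a zero result as well as an explicit error. The signature `func() (time.Time, error)` and the names `bootTimeFunc`, `checkedBootTime`, `okLookup`, and `zeroLookup` are assumptions made for illustration; the real e2e helper may look different.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// bootTimeFunc is a hypothetical signature standing in for the e2e helper
// getBootTime; the real helper may take different arguments.
type bootTimeFunc func() (time.Time, error)

// checkedBootTime calls get and rejects both an explicit error and a zero
// time.Time, so a failed lookup cannot be mistaken for a real boot time.
func checkedBootTime(get bootTimeFunc) (time.Time, error) {
	t, err := get()
	if err != nil {
		return time.Time{}, fmt.Errorf("getting boot time: %w", err)
	}
	if t.IsZero() {
		return time.Time{}, errors.New("getting boot time: got zero time")
	}
	return t, nil
}

// exampleBoot reuses the timestamp from the failure output above.
var exampleBoot = time.Date(2024, 7, 9, 22, 56, 51, 0, time.UTC)

// okLookup simulates a successful boot time lookup.
func okLookup() (time.Time, error) { return exampleBoot, nil }

// zeroLookup simulates a lookup that silently yields the zero value.
func zeroLookup() (time.Time, error) { return time.Time{}, nil }

func main() {
	if _, err := checkedBootTime(zeroLookup); err != nil {
		fmt.Println("lookup rejected:", err)
	}
	if t, err := checkedBootTime(okLookup); err == nil {
		fmt.Println("boot time:", t.Format(time.RFC3339))
	}
}
```

With a check like this, a test would fail (or retry) at the point where the lookup goes wrong, rather than later when the zero value is compared against a real boot time.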

/override ci/prow/4.12-openshift-e2e
/override ci/prow/4.13-openshift-e2e
/override ci/prow/4.14-openshift-e2e

Contributor

openshift-ci bot commented Jul 10, 2024

@slintes: Overrode contexts on behalf of slintes: ci/prow/4.12-openshift-e2e, ci/prow/4.13-openshift-e2e, ci/prow/4.14-openshift-e2e


@openshift-merge-bot openshift-merge-bot bot merged commit 45688b3 into medik8s:release-0.9 Jul 10, 2024
26 checks passed