Add option to skip blocking pod startup if driver is not ready to create a request yet #20

munnerz · 2022-02-18T15:23:16Z

This will help facilitate cert-manager/csi-driver#17 and any other use case where we want to permit the pod to start up even if the CSI driver is not yet ready.

This feature comes with the caveat that user applications/pods MUST be designed to tolerate certificate/private key data not being available when the pod first starts up (as the driver will no longer block pod startup until the volume is ready to request a certificate).

jetstack-bot · 2022-02-18T15:23:19Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: munnerz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [munnerz]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

manager/interfaces.go

7ing

overall lgtm, thanks @munnerz for this effort.

7ing · 2022-02-18T17:33:50Z

storage/memory.go

@@ -146,5 +146,10 @@ func (m *MemoryFS) ReadFiles(volumeID string) (map[string][]byte, error) {
 	if !ok {
 		return nil, ErrNotFound
 	}
-	return vol, nil
+	// make a copy of the map to ensure no races can occur


Could we use sync.RWMutex and RLock() here to avoid the read race condition?

Because we return this map, that isn't possible: we can't enforce that call sites take a mutex. Pushing this onus onto the caller adds a fair bit of extra complexity, and I'm not convinced the performance gains justify it.

got your idea. thanks.

7ing · 2022-02-18T18:03:20Z

driver/nodeserver.go

+	// Only wait for the volume to be ready if it is in a state of 'ready to request'
+	// already. This allows implementors to defer actually requesting certificates
+	// until later in the pod lifecycle (e.g. after CNI has run & an IP address has been
+	// allocated, if a user wants to embed pod IPs into their requests).
+	isReadyToRequest := ns.manager.IsVolumeReadyToRequest(req.GetVolumeId())
+	if isReadyToRequest || !ns.continueOnNotReady {


Could we simplify the logic here so that we wait for volumeReady only if continueOnNotReady is false?
Here, if a driver uses the default AlwaysReadyToRequest func, isReadyToRequest may always return true. In that case, does continueOnNotReady become useless?

Correct - the logic is that if we are ready to request straight away, we wait/block. This was intentional, so that we only skip waiting if we aren't 'ready to request' (and the driver has been configured to skip in these cases).

got it. okay to me.
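The condition discussed in this thread can be written out as a small truth table. The sketch below only restates the boolean expression from the diff; the function name `shouldWaitForVolume` is hypothetical.

```go
package main

// shouldWaitForVolume restates the condition from the diff:
//
//	isReadyToRequest || !ns.continueOnNotReady
//
// isReadyToRequest | continueOnNotReady | wait/block?
// -----------------+--------------------+------------
// true             | false              | yes
// true             | true               | yes
// false            | false              | yes
// false            | true               | no (pod startup continues)
func shouldWaitForVolume(isReadyToRequest, continueOnNotReady bool) bool {
	return isReadyToRequest || !continueOnNotReady
}
```

Only the last row skips the wait, which matches the reply: with a ReadyToRequest function that always returns true, continueOnNotReady has no effect.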

JoshVanL

PR makes sense to me and gives users a way to begin implementing these features, which are blocked on Kubernetes' order of operations.

The main thing that sticks out to me is that there is no way for a ReadyToRequest to signal early that it is ready; instead it has to wait for the check at the next backoff step (currently a maximum of 1 minute). Because the backoff is exponential, the lack of this signal could significantly increase the time to issuance. Just missing the ~30 second step, for example, and then having to wait another ~minute is quite painful. Even the fact that it starts at 2 seconds could be quite significant when added up over a large number of volumes/pods.

Perhaps this is a non-issue, as ReadyToRequestFunc is expected to return true after a brief period anyway, but it's something to keep in mind.

JoshVanL · 2022-03-29T08:56:35Z

PR is looking good to me!

/lgtrm

JoshVanL · 2022-03-29T08:56:38Z

/lgtm

munnerz requested a review from JoshVanL February 18, 2022 15:23

jetstack-bot added the dco-signoff: yes label Feb 18, 2022

jetstack-bot added the approved and size/L labels Feb 18, 2022

munnerz commented Feb 18, 2022

7ing reviewed Feb 18, 2022

JoshVanL requested changes Feb 21, 2022

munnerz force-pushed the continue-on-error branch 3 times, most recently from 38e3cb2 to 24ecb2c Compare March 24, 2022 14:23

munnerz added 6 commits March 29, 2022 09:28

Add option to skip blocking pod startup if driver is not ready to create a request yet

82b787e

Signed-off-by: James Munnelly <[email protected]>

fix race in in-memory storage implementation

16fe316

Signed-off-by: James Munnelly <[email protected]>

Add reason text to ReadyToRequest function

4ec29c9

Signed-off-by: James Munnelly <[email protected]>

Fix flakey test that expects storage to be cleaned up in a timely manner

556e0be

Signed-off-by: James Munnelly <[email protected]>

Reduce context timeout to 1s for faster tests

f3a7f8a

Signed-off-by: James Munnelly <[email protected]>

test bug: remember to assign err

e152da4

Signed-off-by: James Munnelly <[email protected]>

munnerz force-pushed the continue-on-error branch from 24ecb2c to e152da4 Compare March 29, 2022 08:49

jetstack-bot assigned JoshVanL Mar 29, 2022

jetstack-bot added the lgtm label Mar 29, 2022

jetstack-bot merged commit 69abbbc into cert-manager:main Mar 29, 2022

munnerz deleted the continue-on-error branch March 29, 2022 08:57

munnerz mentioned this pull request Mar 31, 2022

storage: create data directory when RegisterMetadata is called #21

Merged

mnaser mentioned this pull request Jan 8, 2024

ability to specify pod IP in volume attributes cert-manager/csi-driver#17

Open
