Use bulk flag for all odf data pools for performance gain #2980
base: main
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: malayparida2000. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files.
Approvers can indicate their approval by writing /approve in a comment.
@@ -173,6 +173,7 @@ func (r *StorageClusterReconciler) newCephObjectStoreInstances(initData *ocsv1.S
 				EnableCrushUpdates: true,
 				FailureDomain:      initData.Status.FailureDomain,
 				Replicated:         generateCephReplicatedSpec(initData, "data"),
+				Parameters:         map[string]string{"bulk": "true"},
 			},
 			MetadataPool: cephv1.PoolSpec{
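The diff above covers only the object store data pool. As a rough sketch of how the same parameter could be applied uniformly to the other ODF data pools the title mentions, a small helper along these lines might work; withBulk and the package name are hypothetical and not part of this PR, and the cephv1 import is the Rook PoolSpec API already used in the diff:

package storagecluster

import (
	cephv1 "github.com/rook/rook/pkg/apis/ceph.rook.io/v1"
)

// withBulk is a hypothetical helper (not part of this PR) that adds the
// "bulk" parameter to a data pool spec so the PG autoscaler starts the pool
// at its maximum PG count instead of growing it gradually.
func withBulk(spec cephv1.PoolSpec) cephv1.PoolSpec {
	if spec.Parameters == nil {
		spec.Parameters = map[string]string{}
	}
	spec.Parameters["bulk"] = "true"
	return spec
}

Call sites such as newCephObjectStoreInstances could then wrap their data pool specs in withBulk instead of repeating the Parameters line for each pool type.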
@BlaineEXE Should we be setting the bulk flag on the object and filesystem metadata pools, in addition to the data pools?
Anthony and Kyle seemed to be saying yes when we chatted. It sounds a bit counterintuitive to me as well. My best guess is that the performance gains from increased parallelism must end up being a bigger positive effect than the performance loss from having info split into more chunks.
It might be good if we could test the performance effects of bulk on the meta pools to see if there's an impact either way.
/cc @travisn @BlaineEXE
Are there plans to allow this to be configured via any means? I trust Kyle and Anthony implicitly, but we still don't have a measure of how the bulk flag impacts ODF specifically. Further, we don't know whether ODF's common 3-OSD/4-OSD clusters will be impacted positively or negatively by this change. My vague understanding is that 3-/4-OSD clusters are much less common for RHCS than for ODF, and my intuition says that cluster size may play an important part in the performance impacts.
I would recommend having some mechanism for enabling/disabling this behavior. We don't have to document it for users, but I think it will still be useful to us internally. That configurability will make it easier for the performance team to do A:B testing to quantify the performance changes, and it will give us an easy way to disable this change close to ODF release, if we have to, by changing the "business logic" instead of the code. It will also give us the ability to disable/enable it for customer environments in cases where that is important to them.
Travis is also asking an important question here. His and my intuition both suggest that the RGW metadata pools might respond to the bulk flag differently than other pools. Thus, I would also suggest allowing the CephObjectStore metadata pools to have bulk enabled/disabled independently from the rest of the cluster, essentially for the same reasons as above.
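One possible shape for the enable/disable mechanism suggested above, sketched purely as an illustration; the annotation key, helper name, and package name are hypothetical, not an existing ocs-operator or Rook API:

package storagecluster

// bulkDisableAnnotation is a hypothetical opt-out knob on the StorageCluster CR.
const bulkDisableAnnotation = "ocs.openshift.io/disable-bulk-pg-autoscaling"

// bulkEnabled reports whether the "bulk" parameter should be set on data
// pools. It takes the annotations map directly so this sketch stays
// independent of the ocsv1 import path; bulk stays enabled by default and
// the annotation only turns it off.
func bulkEnabled(annotations map[string]string) bool {
	return annotations[bulkDisableAnnotation] != "true"
}

The Parameters line in the diff could then be guarded on bulkEnabled(initData.GetAnnotations()), and a second, separately named annotation could cover the CephObjectStore metadata pools independently.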
Here's my guess about what we will find with bulk flag performance testing:
- I think we will find that the bulk flag has no impact on 3-OSD clusters on the whole
- I suspect there will be noticeable positive impact on clusters with 9-12 or more OSDs
- There is a risk we find S3 metadata operation slowdown with 3-OSD clusters, possibly resulting in minor S3 I/O loss until the cluster has 6-9 OSDs, where we will break even and then see gradual improvements
Ref: https://issues.redhat.com/browse/RHSTOR-6774
The bulk flag makes the autoscaler start with the max number of PGs, and the autoscaler then decreases the PG count only if the PG usage starts to skew too much. This could improve performance for ODF users due to the greater amount of parallelism during reads/writes on large clusters.
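As a closing sketch, a minimal unit test for the hypothetical withBulk helper shown earlier (again not part of this PR) that pins the expected data pool parameters, including the nil-map case:

package storagecluster

import (
	"testing"

	cephv1 "github.com/rook/rook/pkg/apis/ceph.rook.io/v1"
)

// TestWithBulkAddsParameter checks that the hypothetical withBulk helper
// sets bulk=true even when the incoming spec has a nil Parameters map.
func TestWithBulkAddsParameter(t *testing.T) {
	got := withBulk(cephv1.PoolSpec{})
	if got.Parameters["bulk"] != "true" {
		t.Fatalf("expected bulk=true on data pool parameters, got %q", got.Parameters["bulk"])
	}
}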