Proposed refactor with ~90% performance improvement on Export-TargetResource (for affected resources) #5615

Open
niwamo opened this issue Jan 9, 2025 · 1 comment


niwamo commented Jan 9, 2025

Description of the issue

The interaction between Export-TargetResource and Get-TargetResource is highly inefficient for multiple-instance resources (e.g. AADUser or other resources where multiple instances are expected). In pseudo-code:

function Export-TargetResource
{
    Connect-ToWorkload
    SendTelemetry
    $instances = GetAllResources
    foreach ($instance in $instances)
    {
        Get-TargetResource -params
    }
}

function Get-TargetResource
{
    Connect-ToWorkload
    SendTelemetry
    FetchResource
    FormatResourceAndFetchAdditionalProperties 
    return
}

There are three significant performance impacts:

  1. Calling New-M365DSCConnection for every iteration of Get-TargetResource. While it avoids reconnecting unnecessarily, it still results in thousands of extraneous + identical API calls for Get-AcceptedDomain when it is invoked for the EXO workload (and I have verified this with several Fiddler traces). Even if that could be fixed, there's no need to call the function, and invoking it thousands of times unnecessarily should be avoided.
  2. Sending telemetry on every iteration of Get-TargetResource is enormously impactful to the performance of this module. Without any other changes, I determined that the same export took 8,925 seconds with telemetry enabled and 1,400 seconds without. In other words, telemetry - just telemetry - accounted for ~84% of my export's runtime. I would by no means suggest removing telemetry altogether, but it does seem highly redundant to log the invocation of Export-TargetResource and each invocation of Get-TargetResource from within the export. If there is a desire to track the number of Get-TargetResource calls or the average function time, I would suggest tracking these items within the export function and sending that data as a single telemetry call. There's just no good reason for anyone to keep telemetry enabled when it causes an export to run more than 6x slower; reducing the telemetry without reducing the useful information captured seems like a win for everyone.
  3. For many resources, the Get-TargetResource function actually re-fetches the same instance information from the same API as Export-TargetResource. This should clearly be avoided, and has been for many modules: Export-TargetResource declares $Script:exportedInstances and Get-TargetResource checks if this variable exists, then filters the array to find the right instance. This is better but can be further improved. Why not declare $Script:exportedInstance inside of the export's foreach loop and avoid the filtering inside of Get-TargetResource? For resources with large numbers of instances, this makes a meaningful impact.
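The single-telemetry-call idea from point 2 could look roughly like the sketch below. Everything in it is a hypothetical stand-in so that it runs on its own: Send-ExampleTelemetry, Get-ExampleResource and Export-ExampleResource are not M365DSC functions, and the event fields are only an assumption about what might be worth recording.

```powershell
# Hypothetical stand-ins so the sketch runs on its own; these are not M365DSC functions.
$script:telemetryEvents = [System.Collections.Generic.List[hashtable]]::new()

function Send-ExampleTelemetry
{
    param ([hashtable] $Data)
    # Placeholder for the one telemetry call made by the export.
    $script:telemetryEvents.Add($Data)
}

function Get-ExampleResource
{
    param ([string] $Id)
    # Placeholder for Get-TargetResource; no telemetry is sent from here.
    return @{ Id = $Id; Ensure = 'Present' }
}

function Export-ExampleResource
{
    param ([string[]] $InstanceIds)

    $timer = [System.Diagnostics.Stopwatch]::new()
    $getCalls = 0

    foreach ($id in $InstanceIds)
    {
        # Accumulate only the time spent inside the Get calls.
        $timer.Start()
        $null = Get-ExampleResource -Id $id
        $timer.Stop()
        $getCalls++
    }

    # One event summarising every Get call made during this export.
    $averageMs = if ($getCalls -gt 0) { $timer.Elapsed.TotalMilliseconds / $getCalls } else { 0 }
    Send-ExampleTelemetry -Data @{
        Command      = 'Export-ExampleResource'
        GetCallCount = $getCalls
        AverageGetMs = $averageMs
    }
}

Export-ExampleResource -InstanceIds 'a', 'b', 'c'
```

With three instances this records one event instead of four (one for the export plus one per Get call), while still carrying the call count and average duration.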

Altogether, my code changes the relevant section of a typical Get-TargetResource from this:

New-M365DSCConnection -Workload 'MicrosoftGraph' `
        -InboundParameters $PSBoundParameters

Confirm-M365DSCDependencies

#region Telemetry
$ResourceName = $MyInvocation.MyCommand.ModuleName -replace 'MSFT_', ''
$CommandName = $MyInvocation.MyCommand
$data = Format-M365DSCTelemetryParameters -ResourceName $ResourceName `
    -CommandName $CommandName `
    -Parameters $PSBoundParameters
Add-M365DSCTelemetryEvent -Data $data
#endregion

$nullReturn = $PSBoundParameters
$nullReturn.Ensure = 'Absent'
try
{
    # fetch resource
    if ($null -eq $resource) { return $nullReturn }
    # format and enrich
    return
}
catch
{
    # log
}

to this:

try
{
    if (-not $Script:exportedInstance)
    {
        New-M365DSCConnection -Workload 'MicrosoftGraph' `
                -InboundParameters $PSBoundParameters
        
        Confirm-M365DSCDependencies
        
        #region Telemetry
        $ResourceName = $MyInvocation.MyCommand.ModuleName -replace 'MSFT_', ''
        $CommandName = $MyInvocation.MyCommand
        $data = Format-M365DSCTelemetryParameters -ResourceName $ResourceName `
            -CommandName $CommandName `
            -Parameters $PSBoundParameters
        Add-M365DSCTelemetryEvent -Data $data
        #endregion
        
        $nullReturn = $PSBoundParameters
        $nullReturn.Ensure = 'Absent'

        # fetch resource
        if ($null -eq $resource) { return $nullReturn }
    }
    else
    {
        $resource = $Script:exportedInstance
    }
    # format and enrich
    return
}
catch
{
    # log
}
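For context, the export side of this hand-off might be sketched as follows. The function names and the sample data are hypothetical placeholders, and resetting the variable in a finally block is my assumption about how to keep later standalone Get-TargetResource calls from picking up a stale instance, not something the existing resources are confirmed to do.

```powershell
# Hypothetical stand-in for Get-TargetResource, following the pattern above.
function Get-ExampleResource
{
    param ([string] $Id)

    if (-not $Script:exportedInstance)
    {
        # Standalone call: connection, telemetry and the per-instance fetch would happen here.
        $resource = @{ Id = $Id; Source = 'IndividualFetch' }
    }
    else
    {
        # Called from the export: reuse the instance that was already retrieved in bulk.
        $resource = $Script:exportedInstance
    }
    return $resource
}

function Export-ExampleResource
{
    # Placeholder for the bulk retrieval done once by the export.
    $instances = @(
        @{ Id = 'a'; Source = 'BulkFetch' }
        @{ Id = 'b'; Source = 'BulkFetch' }
    )

    try
    {
        foreach ($instance in $instances)
        {
            # Hand the current instance over directly; no filtering needed in the Get function.
            $Script:exportedInstance = $instance
            Get-ExampleResource -Id $instance.Id
        }
    }
    finally
    {
        # Assumption: reset so later standalone calls do not reuse a stale instance.
        $Script:exportedInstance = $null
    }
}

$exported = @(Export-ExampleResource)
$standalone = Get-ExampleResource -Id 'a'
```

Here both entries in $exported come from the bulk data, while the standalone call afterwards falls back to the individual fetch path because the variable has been cleared.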

I have made these changes locally for 61 resources. In my regression testing I found no "breakage" and actually found that this model fixed two bugs I was previously unaware of. (In both cases, the relevant API - strangely - returned more properties for bulk-retrieved items than individually retrieved items, so leveraging the exportedInstance resulted in keeping previously missing properties.)

My performance test results with identical configurations (runtime, measured in seconds):

| Telemetry Enabled | Dev Branch without changes | Dev Branch with recommended changes |
| --- | --- | --- |
| True | 8925 | 865 |
| False | 1400 | 650 |
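As a quick check of the percentages, the four runtimes above can be plugged into a few lines of PowerShell. The variable names are just labels for this sketch; the numbers are the ones from my test runs.

```powershell
# Runtimes in seconds, taken from the results above.
$baselineTelemetryOn  = 8925   # dev branch without changes, telemetry enabled
$baselineTelemetryOff = 1400   # dev branch without changes, telemetry disabled
$changedTelemetryOn   = 865    # dev branch with recommended changes, telemetry enabled
$changedTelemetryOff  = 650    # dev branch with recommended changes, telemetry disabled

# Share of the unmodified export's runtime attributable to telemetry.
$telemetrySharePct = [math]::Round(($baselineTelemetryOn - $baselineTelemetryOff) / $baselineTelemetryOn * 100, 1)

# How many times slower the unmodified export ran with telemetry enabled.
$telemetrySlowdown = [math]::Round($baselineTelemetryOn / $baselineTelemetryOff, 1)

# Runtime reduction from the recommended changes, with and without telemetry.
$improvementOnPct  = [math]::Round(($baselineTelemetryOn - $changedTelemetryOn) / $baselineTelemetryOn * 100, 1)
$improvementOffPct = [math]::Round(($baselineTelemetryOff - $changedTelemetryOff) / $baselineTelemetryOff * 100, 1)

$telemetrySharePct   # 84.3
$telemetrySlowdown   # 6.4
$improvementOnPct    # 90.3
$improvementOffPct   # 53.6
```

The ~90% in the issue title corresponds to the telemetry-enabled case; with telemetry disabled the same changes still cut the runtime by a little more than half.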

Since this would be a large PR, I am hoping to get initial feedback from the maintainers before proceeding.
@NikCharlebois @ykuijs

Microsoft 365 DSC Version

DEV / 1.24.1218.1

Which workloads are affected

Azure Active Directory (Entra ID), Exchange Online, Security & Compliance Center, SharePoint Online, Office 365 Admin

The DSC configuration

Verbose logs showing the problem

Environment Information + PowerShell Version

@FabienTschanz
Contributor

I love it. I've been thinking about speed improvements as well, because I wasn't very satisfied with the performance in my environment either, and your changes look great to me. I'll take a look at the telemetry function and why it could take so long.
