简体   繁体   中英

powershell - combine array properties in new one

I have some array with lot of rows (thousands and more). It have columns like 'Column XXXXX '

Row example:

Group_name = "proxy_users"
Column1   = "domain\Igor"
Column2   = null
.......
Column989 = 'domain\Andrew'
Column999 = 'domain\Mike'

What is the right and FAST way to create new variable that will be the sum of 'ColumnXXX' ignoring 'null' values?

Like "domain\\igor, domain\\Andrew, domain\\mike"

I can use smth like $group | select -Property "Column*"...but how to sum and how to ignore null ?

You can list all properties using ex. $_.psobject.properties , filter out the ones you want and use -join to combine the values. Ex

$o = [pscustomobject]@{
    Group_Name = "Test"
    Column1 = "Foo"
    Column2 = $null
    Column3 = "1"
}, [pscustomobject]@{
    Group_Name = "Test2"
    Column1 = $null
    Column2 = "Bar"
    Column3 = "2"
}

#Add property as constant
$o | ForEach-Object {
    $_ | Add-Member -NotePropertyName Summed -NotePropertyValue (($_.psobject.Properties | Where-Object { $_.Name -ne 'Group_name' -and $_.Value -ne 'null' } | Select-Object -ExpandProperty Value) -join '' )
}
$o | ft

Or you can use a ScriptProperty to calculate the value on every call

#Remember to exclude itself to avoid infinite recursion
$o | Add-Member -MemberType ScriptProperty -Name Summed -Value {($this.psobject.Properties | Where-Object { $_.Name -ne 'Group_name' -and $_.Name -ne 'Summed' -and $_.Value -ne 'null' } | Select-Object -ExpandProperty Value) -join '' }
$o | ft

Result:

Group_Name Column1 Column2 Column3 Summed
---------- ------- ------- ------- ------
Test       Foo             1       Foo1
Test2              Bar     2       Bar2

As an alternative, you can use $_.Name -like 'Column*' -and $_.Value -ne 'null' as the property-filter if they are actually called ColumnXXX .

To complement Frode F.'s elegant ScriptProperty-based solution with a more concise PSv4+ version that also performs better due to avoiding the use of cmdlets (a pipeline ) in the property-definition script block in favor of operators (with a large number of input objects, this may matter):

$o | Add-Member -PassThru -MemberType ScriptProperty -Name Joined -Value { 
 @($this.psobject.Properties.Where({$_.Name -match 'Column*'}).Value) -ne 'null' -join ', ' 
}

Note the use of:

  • The .Where() collection operator (PSv4+), a faster alternative to the Where-Object cmdlet.

  • Member enumeration (PSv3+), where accessing a property at the collection level yields an array of the elements' property values;
    eg, $this.psobject.Properties.Name yields the .Name property values of all elements of the $this.psobject.Properties collection.

  • Applying comparison operator -ne to an array-valued LHS, in which case the operator acts as a filter : the operator is applied to each element , and the matching elements are returned as an array; note the @(...) around the LHS, which ensures that it is treated as an array even if it happens to return only a single value.

With the sample data in the question, the above yields (look for property Joined ):

Group_name : proxy_users
Column1    : domain\Igor
Column2    : null
Column989  : domain\Andrew
Column999  : domain\Mike
Joined     : domain\Igor, domain\Andrew, domain\Mike

With the above optimizations, you could even consider a simpler Select-Object solution that constructs new custom-object instances that contain all of the input object's properties ( * ) plus a new calculated property that contains the combined column values of interest:

$o | Select-Object *, @{ n='Joined'; e={ 
  @(($_.psobject.Properties.Where({$_.Name -match 'Column*'})).Value) -ne 'null' -join ', ' } 
}

This solution is slower than the one above, but not by much.
A caveat is that if you collect all output in memory, the newly constructed objects take up space in addition to the originals.


Optional reading: Performance comparison; the impact of pipeline use:

The code at the bottom times the various approaches from Frode's and my answer.

Here's a sample timing from my machine - the absolute numbers aren't important, but their ratios are (though I don't know how factors such as CPU core count and disk speed would come into play) - input set size is 1000 objects:

Approach                                 TotalSeconds
--------                                 ------------
add-member w/ scriptproperty - operators 0.35
select-object w/ noteproperty            0.40
add-member w/ scriptproperty - pipeline  0.72
add-member w/ noteproperty - pipeline    0.98

Conclusions:

  • The solutions that use pipelines in the property definition are noticeably slower, by a factor of about 2, which doesn't seem to vary even with larger input counts.

  • The optimized Select-Object -based solution that creates new objects (rather than adding properties to the input objects) is only slightly slower than the optimized Add-Member solution.

Source code for the tests:

# The number of input objects to test with.
# Adjust to experiment.
$objCount = 1000

Write-Verbose -Verbose "# of input objects: $objCount"

$ndx = 0; $approaches = 'select-object w/ noteproperty', 
                        'add-member w/ scriptproperty - operators', 
                        'add-member w/ scriptproperty - pipeline', 
                        'add-member w/ noteproperty - pipeline'
$tempFile = New-TemporaryFile # requires PSv5+
$(foreach($approach in $approaches) {

  # Create test collection (recreate it in every iteration to support repeated Add-Member operations)
  $al = [System.Collections.ArrayList]::new()
  for ($i = 0; $i -lt $objCount; ++$i) {
    $null = $al.add([pscustomobject] @{ one = 1; Column1 = 'col1'; Column2 = 'col2'; Column3 = 'null'; Column4 = 'col4' })
  }

  Measure-Command {

    Write-Verbose -Verbose "Running: $approach..."
    switch ($ndx) {
      0 { # select-object w/ noteproperty
        $al | Select-Object *, @{ n='Joined'; e={ @(($_.psobject.Properties.Where({ $_.Name -match 'Column*'})).Value) -ne 'null' -join ', ' } } | 
          Export-Csv $tempFile  
        break
      }

      1 { # add-member w/ scriptproperty - operators
        $al | Add-Member -PassThru -MemberType ScriptProperty -Name Joined -Value { @($this.psobject.Properties.Where({ $_.Name -match 'Column*'}).Value) -ne 'null' -join ', ' } |
          Export-Csv $tempFile          
        break
      }

      2 { # add-member w/ scriptproperty - pipeline
        $al | Add-Member -PassThru -MemberType ScriptProperty -Name Joined -Value { ($this.psobject.Properties | Where-Object { $_.Name -match 'Column*'  -and $_.Value -ne 'null' } | Select-Object -ExpandProperty Value) -join ', ' }  | 
         Export-Csv $tempFile
         break
        }

        3 { # add-member w/ noteproperty - pipeline; requires an intermediate ForEach-Object call
          $al | ForEach-Object {
            $_ | Add-Member -PassThru -NotePropertyName Joined -NotePropertyValue (($_.psobject.Properties | Where-Object { $_.Name -match 'Column*' -and $_.Value -ne 'null' } | Select-Object -ExpandProperty Value) -join ', ' )
          } |
            Export-Csv $tempFile
        break
      }
      default { Throw "What are you doing here?" }
    }

    # Import-Csv $tempFile | Select-Object -First 1 Joined | Write-Verbose -Verbose

    ++$ndx

  } | Select-Object @{ n='Approach'; e={ $approach }}, @{ n='TotalSeconds'; e={ '{0:N2}' -f $_.TotalSeconds } }

}) | Sort-Object { [double] $_.TotalSeconds }

Remove-Item $tempFile

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM