Is there a way to get the number of groups created by the 'Group-Object' cmdlet?

I'm pretty sure the answer is no, but it keeps bugging me.

I have been tasked with finding duplicate files in a certain location, recursively. I can do that with no problem. But since some of the files have 3 or 4 duplicates, I cannot answer the question "How many files are originals?" without resorting to editing the output in Excel.

Code:

gci -path $path -recurse -file -erroraction silentlycontinue|
Select @{l='Original Filename';e={$_.PSChildName}}, @{l='Compare Filename';e={$_.BaseName.replace('_','*').replace(' ','*').replace('-','*')}}, @{l="Path";e={$_.PSParentPath.Substring(38,$_.PSParentPath.Length-38)}}, @{l="Link";e={$_.FullName}}|
group -Property 'Compare Filename'|
Where {$_.count -ge 2}|
%{$_.group}|
Export-Csv -Path $path2 -NoTypeInformation

Path variables are irrelevant, so I will not be listing them.

To reliably count the number of groups (Microsoft.PowerShell.Commands.GroupInfo instances) that Group-Object outputs, use either of the following:

  • Pipeline-based, as suggested by zett42; while slower, this results in streaming processing that doesn't require collecting all Group-Object output in memory first:
(1, 1, 1 | Group-Object | Measure-Object).Count  # -> 1 (group)
  • Concise, expression-based, as suggested by Lee Dailey; note that this involves collecting all output objects in memory first:
@(1, 1, 1 | Group-Object).Count   # -> 1 (group)

# Alternative
(1, 1, 1 | Group-Object).Length   # -> 1 (group)
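As a rough sketch of how the two approaches above might apply to the duplicate-file scenario in the question: the $names array below is hypothetical sample data standing in for the 'Compare Filename' values, and the variable names are illustrative only.

```powershell
# Hypothetical sample data standing in for the 'Compare Filename' values.
$names = 'report', 'report', 'report', 'invoice', 'invoice', 'notes'

# Pipeline-based (streaming) count of all groups.
$groupCount = ($names | Group-Object | Measure-Object).Count

# Expression-based count of only those groups that have duplicates.
# @() keeps .Count referring to the array even if a single group is returned.
$duplicateGroupCount = @($names | Group-Object | Where-Object Count -ge 2).Count

"$groupCount groups in total, $duplicateGroupCount of them with duplicates"
```

With this sample input there are 3 groups in total, 2 of which contain duplicates.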

Note:

  • To count all original (non-duplicate) objects, i.e. those that are in a group of their own, simply append | Where-Object Count -eq 1 to Group-Object above.

  • The use of @(), the array-subexpression operator, is crucial in this case: it ensures that the Group-Object output is treated as an array even if only a single group happens to be output.

    • This ensures that it is the array's .Count property that is queried rather than a single GroupInfo instance's own .Count property, which reflects the number of members in the group and would be 3 in the example above (try (1, 1, 1 | Group-Object).Count).
  • Alternatively, using .Length instead of .Count bypasses this naming conflict. .Length and .Count are aliases of each other, and both are provided as intrinsic properties even on scalars (single objects), as part of PowerShell's unified handling of scalars and collections. That is, PowerShell gives even a single object .Length / .Count properties that indicate the count of that object, which by definition is 1, unless they are preempted by a type-native property of the same name.

    • The intrinsic .Length property therefore works as expected, given that GroupInfo has no .Length property.

    • The inverse scenario can be demonstrated with a string scalar: 'foo'.Length is 3, the value of the type-native .Length property reflecting the character count, whereas 'foo'.Count is 1, the intrinsic .Count property that "counts" the single object.
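The points in the note can be tried out with a short sketch like the following; the $values array is hypothetical sample input and the variable names are illustrative only.

```powershell
# Hypothetical input: one value that occurs three times, two that occur once.
$values = 1, 1, 1, 2, 3

# Count the groups that have exactly one member (values without duplicates).
$uniqueCount = @($values | Group-Object | Where-Object Count -eq 1).Count

# The single-group pitfall: only one GroupInfo object is returned here.
$single = 1, 1, 1 | Group-Object
$memberCount  = $single.Count      # GroupInfo's own .Count: members in the group
$wrappedCount = @($single).Count   # the array's .Count: number of groups
$lengthCount  = $single.Length     # intrinsic .Length: number of groups

# The inverse case with a string scalar.
$charCount   = 'foo'.Length        # type-native .Length: character count
$objectCount = 'foo'.Count         # intrinsic .Count: the single object

"$uniqueCount unique; $memberCount members; $wrappedCount / $lengthCount group(s); $charCount chars; $objectCount object"
```

Here $uniqueCount is 2, $memberCount is 3, and both $wrappedCount and $lengthCount are 1, which is why wrapping in @() or using .Length is the safer way to count groups.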

