
How to transpose data in PowerShell

I have a file that looks like this:
a,1
b,2
c,3
a,4
b,5
c,6
(...repeat 1,000s of lines)

How can I transpose it into this?
a,b,c
1,2,3
4,5,6

Thanks

Here's a brute-force one-liner from hell that will do it:

PS> Get-Content foo.txt | 
      Foreach -Begin {$names=@();$values=@();$hdr=$false;$OFS=',';
                      function output { if (!$hdr) {"$names"; $global:hdr=$true}
                                        "$values";
                                        $global:names=@();$global:values=@()}} 
              -Process {$n,$v = $_ -split ',';
                        if ($names -contains $n) {output};
                        $names+=$n; $values+=$v } 
              -End {output}
a,b,c
1,2,3
4,5,6

It's not what I'd call elegant, but it should get you by. Entered as a single line, it should copy/paste correctly as-is. However, if you keep the line breaks shown above, you will need to put a back-tick after the closing curly brace of both the Begin and Process scriptblocks. This script requires PowerShell 2.0 because it relies on the -split operator, which was new in that version.

This approach makes heavy use of the Foreach-Object cmdlet. Normally when you use Foreach-Object (alias is Foreach) in the pipeline, you specify just one scriptblock, like so:

Get-Process | Foreach {$_.HandleCount}

That prints out the handle count for each process. This usage of Foreach-Object uses the -Process scriptblock implicitly, which means it executes once for each object it receives from the pipeline. Now what if we want to total up the handle counts across all processes? Ignore the fact that you could just use Measure-Object HandleCount -Sum to do this; I'll show you how Foreach-Object can do it. As you see in the original solution to this problem, Foreach can take both a Begin scriptblock that is executed once, before the first pipeline object is processed, and an End scriptblock that executes when there are no more objects in the pipeline. Here's how you can total the handle count using Foreach-Object:

gps | Foreach -Begin {$sum=0} -Process {$sum += $_.HandleCount } -End {$sum}
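To see the Begin/Process/End pattern with a predictable result, here is a small sketch that sums a fixed list of numbers instead of live process data. The variable names ($total, $viaMeasure) are just for illustration and are not part of the original solution.

```powershell
# Begin runs once up front, Process runs once per input object,
# End runs once after the last object.
$total = 1..5 | ForEach-Object -Begin { $sum = 0 } -Process { $sum += $_ } -End { $sum }
$total   # 15

# The same total using Measure-Object, for comparison.
$viaMeasure = (1..5 | Measure-Object -Sum).Sum
```

Both approaches give the same number; Measure-Object is the simpler choice when all you need is a sum.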

Relating this back to the problem solution, in the Begin scriptblock I initialize some variables to hold the arrays of names and values, as well as a bool ($hdr) that tells me whether or not the header has been output (we only want to output it once). The next mildly mind-blowing thing is that I also declare a function (output) in the Begin scriptblock, which I call from both the Process and End scriptblocks to output the current set of data stored in $names and $values.

The only other trick is that the Process scriptblock uses the -contains operator to check whether the current line's field name has already been seen. If so, it outputs the current names and values and resets those arrays to empty. Either way, it then stashes the name and value in the appropriate arrays so they can be output later.
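If the one-liner is hard to follow, the same logic can be sketched as an ordinary function. This is a rough rewrite under a few assumptions, not a drop-in replacement: ConvertTo-TransposedCsv is a hypothetical name, each line is assumed to be a single "name,value" pair, and a repeated name is assumed to mark the start of a new row.

```powershell
# Hypothetical helper: turns "name,value" lines into a header row plus value rows.
function ConvertTo-TransposedCsv {
    param([string[]]$Lines)

    $names = @()
    $values = @()
    $headerWritten = $false

    foreach ($line in $Lines) {
        # Split into at most two parts: field name and value.
        $name, $value = $line -split ',', 2

        # A name we have already seen means the current row is complete.
        if ($names -contains $name) {
            if (-not $headerWritten) { $names -join ','; $headerWritten = $true }
            $values -join ','
            $names = @()
            $values = @()
        }

        $names += $name
        $values += $value
    }

    # Flush the final row.
    if ($names.Count -gt 0) {
        if (-not $headerWritten) { $names -join ',' }
        $values -join ','
    }
}

$result = ConvertTo-TransposedCsv -Lines 'a,1','b,2','c,3','a,4','b,5','c,6'
```

With the sample data, $result holds the three lines a,b,c, 1,2,3 and 4,5,6. For a real file you would call it as ConvertTo-TransposedCsv -Lines (Get-Content foo.txt). Because everything lives inside one function, no scope modifiers are needed.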

BTW, the reason the output function needs to use the global: specifier on the variables is that PowerShell takes a "copy-on-write" approach when a nested scope modifies a variable defined outside its scope: the assignment creates a new local variable instead of changing the outer one. When we really want the modification to occur at the higher scope, we have to tell PowerShell so by using a modifier like global: or script:.
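A small sketch of that scoping behavior, assuming it is run as a script file or typed at the prompt. The names $counter, Add-Local and Add-Script are made up for the demonstration.

```powershell
$counter = 0

# Assigning without a modifier creates a new variable local to the function.
function Add-Local { $counter = $counter + 1 }

# The script: modifier writes to the variable in the enclosing script scope.
function Add-Script { $script:counter = $counter + 1 }

Add-Local
$afterLocal = $counter    # still 0, the outer variable was not touched

Add-Script
$afterScript = $counter   # now 1
```

Reading $counter inside either function finds the outer value; only the write needs the modifier.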
