Transpose repeating data from rows into columns in Excel
I have a data set of basic housing data in the following format:
Existing data format:
That format is the same and repeats for hundreds of properties. I would like to transform it into a table like the following example:
Property Type | Price | Location | Region | Additional info | Area |
---|---|---|---|---|---|
House | 252000 | London | Kensington | | 4500 square meters |
... | ... | ... | ... | ... | ... |
In other words, I want the text before the ":" symbol to become the column name and the text after it to become the data in the corresponding cell, and to repeat that for hundreds of sites. Additional info is usually empty, but sometimes it contains data. I am not sure which program is best for this. Excel is what comes to mind so far, but if there is an easier way I will be glad to use it.
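Since the question is open to tools other than Excel, here is a minimal Python sketch of the transformation described above. It rests on assumptions that are not confirmed by the post: every source line holds exactly one "Label: value" pair, and a new property starts whenever the first label ("Property Type" here) reappears. The function name `records_from_lines` and the sample lines are hypothetical.

```python
def records_from_lines(lines, first_label="Property Type"):
    """Group 'Label: value' lines into one dict per property."""
    records = []
    for line in lines:
        if ":" not in line:
            continue  # skip blank or malformed lines
        label, _, value = line.partition(":")
        label, value = label.strip(), value.strip()
        # Assumption: a new property starts whenever the first label reappears.
        if label == first_label or not records:
            records.append({})
        records[-1][label] = value
    return records


# Hypothetical sample: one "Label: value" pair per line, two properties.
sample = [
    "Property Type: House",
    "Price: 252000",
    "Location: London",
    "Region: Kensington",
    "Additional info:",
    "Area: 4500 square meters",
] * 2

rows = records_from_lines(sample)
```

Each dict in `rows` is one table row keyed by column name, so it could be written out with `csv.DictWriter` and opened in Excel. If several pairs share a line, the lines would need to be split further before this step.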
As per my screenshot, I used the following formulas in Excel 365.
C2=FILTERXML("<t><s>"&SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,1,4)),": ","</s><s>")&"</s></t>","//s[last()]")
D2=FILTERXML("<t><s>"&SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,2,4)),": ","</s><s>")&"</s></t>","//s[last()]")
E2=FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,3,4)),",","</s><s>"),":","</s><s>")&"</s></t>","//s[2]")
F2=FILTERXML("<t><s>"&SUBSTITUTE(SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,3,4)),",","</s><s>"),":","</s><s>")&"</s></t>","//s[last()-1]")
H2=FILTERXML("<t><s>"&SUBSTITUTE(INDEX($A:$A,SEQUENCE(COUNTA($A:$A)/4,1,4,4)),": ","</s><s>")&"</s></t>","//s[last()]")
If you are not on Excel 365, you can try:
=FILTERXML("<t><s>"&SUBSTITUTE(INDEX($A:$A,ROW($A1)+(ROW($A1)-1)*3),": ","</s><s>")&"</s></t>","//s[last()]")
Basically, =ROW(A1)+(ROW(A1)-1)*3 generates a sequence of row numbers as the formula is filled down, and INDEX($A:$A,ROW($A1)+(ROW($A1)-1)*3) returns the values from column A at those rows. FILTERXML() then returns the value selected by the XPath parameter.
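The row arithmetic can be checked outside Excel. The short Python sketch below mirrors the expression ROW(A1)+(ROW(A1)-1)*3 for a block of four source rows and uses a plain string split as a rough stand-in for the "//s[last()]" XPath. The helper names `source_row` and `last_segment`, and the sample cell text, are illustrative assumptions, not part of the formulas above.

```python
def source_row(n, block=4):
    """Row of column A read when the formula sits in output row n.

    Mirrors ROW(A1)+(ROW(A1)-1)*3 for block=4, which simplifies to 4*n - 3.
    """
    return n + (n - 1) * (block - 1)


def last_segment(cell, delimiter=": "):
    """Rough stand-in for '//s[last()]': the text after the last delimiter."""
    return cell.split(delimiter)[-1]


first_rows = [source_row(n) for n in range(1, 5)]  # 1, 5, 9, 13
value = last_segment("Price: 252000")              # hypothetical cell content
```

So the formula filled down from the first output row reads rows 1, 5, 9, 13 and so on, which is the first line of each four-line block. The SEQUENCE(...,1,1,4) call in the Excel 365 version produces the same row numbers in a single spilled array.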
To learn how FILTERXML() works, you can read this article from JvdV. It is a fantastic article for FILTERXML() lovers.
You can obtain your desired output using Power Query, available in Windows Excel 2010+ and Office 365 Excel.
Data => Get&Transform => From Table/Range
Home => Advanced Editor
Explore the Applied Steps window to better understand the algorithm and steps.
Note: The fnPivotAll function is a custom function that enables a method of creating a non-aggregated Pivot Table where there are multiple values per Pivot Column. From the UI, you add this as a New Query from Blank, and just paste that M-code in place of what's there.
M-Code (for main query)
let
    //Read in data
    //Change table name in next line to your actual table name
    Source = Excel.CurrentWorkbook(){[Name="Table1_2"]}[Content],
    //Split by comma into new rows
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(Source, {{"Column1",
        Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv),
        let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1"),
    //Remove the blank rows
    #"Filtered Rows" = Table.SelectRows(#"Split Column by Delimiter", each ([Column1] <> "" and [Column1] <> " ")),
    //Split by the rightmost colon only into new columns
    #"Split Column by Delimiter1" = Table.SplitColumn(#"Filtered Rows", "Column1",
        Splitter.SplitTextByEachDelimiter({":"}, QuoteStyle.Csv, true), {"Column1.1", "Column1.2"}),
    //Split by the remaining colon into new rows
    // So as to have empty rows under "Additional data"
    //Then Trim the columns to remove leading/trailing spaces
    #"Split Column by Delimiter2" = Table.ExpandListColumn(Table.TransformColumns(#"Split Column by Delimiter1", {{"Column1.1",
        Splitter.SplitTextByDelimiter(":", QuoteStyle.Csv),
        let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1.1"),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter2",{{"Column1.1", type text}, {"Column1.2", type text}}),
    #"Trimmed Text" = Table.TransformColumns(#"Changed Type",{{"Column1.1", Text.Trim, type text}, {"Column1.2", Text.Trim, type text}}),
    //Create new column processing "Additional data" to show a blank
    // and Price to just show the numeric value, splitting from "EUR"
    #"Added Custom" = Table.AddColumn(#"Trimmed Text", "Custom", each if [Column1.1] = "Additional data" then " "
        else if [Column1.1] = "Price" then Text.Split([Column1.2]," "){1} else [Column1.2]),
    //Remove unneeded column
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Column1.2"}),
    //non-aggregated pivot
    pivot = fnPivotAll(#"Removed Columns","Column1.1","Custom"),
    //set data types (frequently a good idea in PQ)
    #"Changed Type1" = Table.TransformColumnTypes(pivot,{
        {"Property type", type text},
        {"Location", type text},
        {"region", type text},
        {"Additional data", type text},
        {"Area", type text},
        {"Price", Currency.Type}})
in
    #"Changed Type1"
M-Code (for custom function)
Be sure to rename this query: fnPivotAll
//credit: Cam Wallace https://www.dingbatdata.com/2018/03/08/non-aggregate-pivot-with-multiple-rows-in-powerquery/
(Source as table,
 ColToPivot as text,
 ColForValues as text)=>
let
    PivotColNames = List.Buffer(List.Distinct(Table.Column(Source,ColToPivot))),
    #"Pivoted Column" = Table.Pivot(Source, PivotColNames, ColToPivot, ColForValues, each _),
    TableFromRecordOfLists = (rec as record, fieldnames as list) =>
        let
            PartialRecord = Record.SelectFields(rec,fieldnames),
            RecordToList = Record.ToList(PartialRecord),
            Table = Table.FromColumns(RecordToList,fieldnames)
        in
            Table,
    #"Added Custom" = Table.AddColumn(#"Pivoted Column", "Values", each TableFromRecordOfLists(_,PivotColNames)),
    #"Removed Other Columns" = Table.RemoveColumns(#"Added Custom",PivotColNames),
    #"Expanded Values" = Table.ExpandTableColumn(#"Removed Other Columns", "Values", PivotColNames)
in
    #"Expanded Values"
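For readers less familiar with M, the idea behind a non-aggregated pivot can be sketched in a few lines of Python. This is an analogy under stated assumptions, not a translation of fnPivotAll: the input is a list of (column name, value) pairs in source order, the function name `pivot_all` and the sample values are hypothetical, and shorter columns are padded with `None`, much as a table built from unequal-length columns would show nulls.

```python
from itertools import zip_longest


def pivot_all(pairs):
    """Turn (column, value) pairs into rows without aggregating values."""
    columns = {}
    for name, value in pairs:
        # Collect every value for a column, in order of appearance.
        columns.setdefault(name, []).append(value)
    names = list(columns)  # column order = first appearance
    # The i-th value of each column becomes the i-th output row.
    return [dict(zip(names, values)) for values in zip_longest(*columns.values())]


# Hypothetical input: two properties, two attributes each.
pairs = [
    ("Property type", "House"),
    ("Price", "252000"),
    ("Property type", "Flat"),
    ("Price", "180000"),
]
table = pivot_all(pairs)
```

A regular pivot would need an aggregation such as sum or count to collapse the two "Price" values into one cell. Keeping each column's values as a list and lining the lists up row by row avoids that, at the cost of relying on every property listing its attributes in the same order.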