How can I speed up summing an array of structs grouped by two properties - macOS Swift
The code I am using is shown below, but it seems very slow to calculate the sum: around 20 seconds. Any suggestions for how to speed this up?
Actually, it's a bit more complicated, since I need to create a final result object that includes all the original properties, with count updated to the sum.
struct PAData: Equatable, Hashable {
    let pCode: String
    let aCode: String
    let otherProperty1: String // Unique to pCode
    let otherProperty2: String // Unique to pCode
    let count: Int
    static func == (lhs: PAData, rhs: PAData) -> Bool {
        return
            lhs.pCode == rhs.pCode &&
            lhs.aCode == rhs.aCode
    }
    func hash(into hasher: inout Hasher) {
        hasher.combine(pCode)
        hasher.combine(aCode)
    }
}
// Group by aCode and pCode and sum count
func calcSum() {
    // Find the unique records based on the pCode/aCode properties - very fast, takes 0.1 second
    let unique = Set<PAData>(paData)
    // Now find the sum for each pCode/aCode group - too slow, takes 20 seconds to complete - how to speed this up?
    // Really only needs to be done for keys that have more than one record?
    for key in unique {
        let sum = paData.filter({$0.pCode == key.pCode && $0.aCode == key.aCode}).map({$0.count}).reduce(0, +)
        // Carry the other properties over from the unique record
        let summary = PAData(pCode: key.pCode, aCode: key.aCode, otherProperty1: key.otherProperty1, otherProperty2: key.otherProperty2, count: sum)
        resultArray.append(summary)
    }
}
It seems that using Dictionary(grouping:by:) is fast, but then combining the results is slow again, still taking some 35 seconds.
struct PAData: Equatable, Hashable {
    let pCode: String
    let aCode: String
    let count: Int
    var key: String {
        return pCode + ":" + aCode
    }
    static func == (lhs: PAData, rhs: PAData) -> Bool {
        return
            lhs.pCode == rhs.pCode &&
            lhs.aCode == rhs.aCode
    }
    func hash(into hasher: inout Hasher) {
        hasher.combine(pCode)
        hasher.combine(aCode)
    }
}
// Group by aCode and pCode and sum count
func calcSum() {
    let grouped = Dictionary(grouping: paData, by: {$0.key})
    struct Item {
        let key: String
        let sum: Int
    }
    let resultArray = grouped.keys.map { (key) -> Item in
        let value = grouped[key]!
        return Item(key: key, sum: value.map{$0.count}.reduce(0, +))
    }
    // Find the unique records based on the pCode/aCode properties - very fast, takes 0.1 second
    let unique = Set<PAData>(paData)
    // Now combine the two so we have the original properties as well as the sum, but no duplicates - slow
    let results = unique.map({ rec -> PAData in
        let sum = resultArray.first(where: {$0.key == rec.key})?.sum ?? rec.count
        return PAData(pCode: rec.pCode, aCode: rec.aCode, count: sum)
    })
}
This should do the trick:
func calcSum2(_ paData: [PAData]) -> [PAData] {
    let grouped = Dictionary(grouping: paData) { aPaData in
        // The count (-2) is a placeholder: only pCode and aCode matter here, which takes advantage of your Hashable
        return PAData(pCode: aPaData.pCode, aCode: aPaData.aCode, count: -2)
    }
    // map is enough here, since the closure never returns nil
    let resultArray = grouped.map { (key: PAData, values: [PAData]) -> PAData in
        let sum = values.reduce(into: 0, { $0 += $1.count })
        return PAData(pCode: key.pCode, aCode: key.aCode, count: sum)
    }
    return resultArray
}
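Since the question also asks to keep the other properties in the result, here is a minimal sketch of a related single-pass variant. It is not the code above: GroupKey and sumByGroup are hypothetical names introduced for illustration, and it assumes the extra properties are identical within a group, so the first record seen can supply them.

```swift
// Record type with the extra properties from the question.
struct PAData {
    let pCode: String
    let aCode: String
    let otherProperty1: String
    let otherProperty2: String
    let count: Int
}

// Hypothetical key type holding only the two grouping properties.
struct GroupKey: Hashable {
    let pCode: String
    let aCode: String
}

// Hypothetical helper: one pass over the input, accumulating counts per key.
func sumByGroup(_ records: [PAData]) -> [PAData] {
    var totals: [GroupKey: PAData] = [:]
    for record in records {
        let key = GroupKey(pCode: record.pCode, aCode: record.aCode)
        if let existing = totals[key] {
            // Keep the first record's other properties and add the count.
            totals[key] = PAData(pCode: existing.pCode,
                                 aCode: existing.aCode,
                                 otherProperty1: existing.otherProperty1,
                                 otherProperty2: existing.otherProperty2,
                                 count: existing.count + record.count)
        } else {
            totals[key] = record
        }
    }
    return Array(totals.values)
}

let sample = [
    PAData(pCode: "a", aCode: "b", otherProperty1: "x", otherProperty2: "y", count: 2),
    PAData(pCode: "a", aCode: "b", otherProperty1: "x", otherProperty2: "y", count: 3),
    PAData(pCode: "d", aCode: "c", otherProperty1: "z", otherProperty2: "w", count: 2),
]
let summed = sumByGroup(sample)
```

The order of the returned array is not defined, because it comes from dictionary values; sort it afterwards if the order matters.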
What's wrong with your code:
let unique = Set<PAData>(paData)
for key in unique {
    let sum = paData.filter({$0.pCode == key.pCode && $0.aCode == key.aCode}).map({$0.count}).reduce(0, +)
    let summary = PAData(pCode: key.pCode, aCode: key.aCode, count: sum)
    resultArray.append(summary)
}
Written out step by step, that is:
let unique = Set<PAData>(paData)
for key in unique {
    let filtered = paData.filter({$0.pCode == key.pCode && $0.aCode == key.aCode})
    let countMap = filtered.map({$0.count})
    let sum = countMap.reduce(0, +)
    let summary = PAData(pCode: key.pCode, aCode: key.aCode, count: sum)
    resultArray.append(summary)
}
So if you have 10k elements, with 1k elements sharing each pCode & aCode pair, every pass of the loop iterates over:
10k elements (paData, for the filter)
+ 1k elements (filtered, for the map)
+ 1k elements (countMap, for the reduce)
-> Repeat for each unique key.
Do you see the wasted work?
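To make the cost concrete, here is a rough back-of-the-envelope count in Swift using the 10k/1k figures from the example. It assumes evenly sized groups (so 10 unique keys) and ignores the pass that builds the Set; the variable names are illustrative only.

```swift
let totalElements = 10_000
let elementsPerGroup = 1_000
let uniqueKeys = totalElements / elementsPerGroup  // 10 groups, assuming even sizes

// Loop approach: every unique key triggers a filter over the whole array,
// then a map and a reduce over that key's elements.
let visitsPerKey = totalElements + elementsPerGroup + elementsPerGroup
let loopVisits = uniqueKeys * visitsPerKey

// Grouping approach: one pass to build the groups,
// then one pass in total over the grouped values to sum them.
let groupingVisits = totalElements + totalElements

print(loopVisits, groupingVisits)  // prints "120000 20000"
```

The loop cost is roughly the number of unique keys times the array size, so it gets much worse when there are many small groups, while the grouping cost stays proportional to the array size.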
But if you group as in my suggested solution, the grouping is done in a single pass, so you don't need the filter anymore. I also summed and mapped in the same iteration, which in your case might be:
let sum = filtered.reduce(into: 0, { $0 += $1.count })
With the counts in my example, that saves 1k iterations per unique key.
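As a small self-contained check that the two forms agree, the sketch below compares the map-then-reduce chain with the single reduce(into:). The trimmed-down PAData and the sample values are assumptions for illustration.

```swift
// Trimmed-down record type, for illustration only.
struct PAData {
    let pCode: String
    let aCode: String
    let count: Int
}

let filtered = [
    PAData(pCode: "a", aCode: "b", count: 2),
    PAData(pCode: "a", aCode: "b", count: 3),
]

// Two passes: build an intermediate [Int], then sum it.
let twoPass = filtered.map { $0.count }.reduce(0, +)

// One pass: accumulate directly, with no intermediate array.
let onePass = filtered.reduce(into: 0) { $0 += $1.count }

print(twoPass, onePass)  // prints "5 5"
```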
Now, one issue I ran into is that unique isn't deduplicated per aCode & pCode in my sample code; it doesn't seem to work as it should. I don't know why yet (I might edit later to explain why), but even then it would still be less optimized, because you'd still filter each time. In theory, you should use $0 == key in your filter, since you overrode the equality operator so that it doesn't take count into account.
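A minimal sketch of that filter, assuming the three-property PAData from the question with its custom == and hash(into:); the sample data and the key value are made up for illustration.

```swift
struct PAData: Hashable {
    let pCode: String
    let aCode: String
    let count: Int

    // Equality and hashing deliberately ignore count.
    static func == (lhs: PAData, rhs: PAData) -> Bool {
        return lhs.pCode == rhs.pCode && lhs.aCode == rhs.aCode
    }
    func hash(into hasher: inout Hasher) {
        hasher.combine(pCode)
        hasher.combine(aCode)
    }
}

let paData = [
    PAData(pCode: "a", aCode: "b", count: 2),
    PAData(pCode: "a", aCode: "b", count: 3),
    PAData(pCode: "d", aCode: "c", count: 2),
]

// The count of the key is irrelevant, because == does not look at it.
let key = PAData(pCode: "a", aCode: "b", count: 0)
let matches = paData.filter { $0 == key }
let sum = matches.reduce(into: 0) { $0 += $1.count }

print(matches.count, sum)  // prints "2 5"
```

This only tidies the predicate; it still scans the whole array once per key, so the grouping approach remains the faster one.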
Sample test on my side:
let testArray = [PAData(pCode: "a", aCode: "b", count: 2),
                 PAData(pCode: "a", aCode: "b", count: 3),
                 PAData(pCode: "d", aCode: "c", count: 2)]
let setTest = Set<PAData>(testArray)
print("setTest: \(setTest)")
let groupTest = Dictionary(grouping: testArray) { aPaData in
    return PAData(pCode: aPaData.pCode, aCode: aPaData.aCode, count: -2)
}
print("groupTest: \(groupTest)")
With a strange result:
Dictionary:
{
"PAData(pCode: \"a\", aCode: \"b\", count: -2)" = (
"PAData(pCode: \"a\", aCode: \"b\", count: 2)",
"PAData(pCode: \"a\", aCode: \"b\", count: 3)"
);
"PAData(pCode: \"d\", aCode: \"c\", count: -2)" = (
"PAData(pCode: \"d\", aCode: \"c\", count: 2)"
);
}
Set:
{(
PAData(pCode: "a", aCode: "b", count: 3),
PAData(pCode: "d", aCode: "c", count: 2),
PAData(pCode: "a", aCode: "b", count: 2)
)}
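For comparison, here is a minimal self-contained check of what Set does when == and hash(into:) only look at pCode and aCode, as in the question's struct. With that conformance in place the first two sample elements compare equal, so the set should hold two elements rather than the three printed above. Why the output above shows three is not established here; one guess (an assumption, not confirmed in the thread) is that the struct used in that test lacked the custom conformance.

```swift
struct PAData: Hashable {
    let pCode: String
    let aCode: String
    let count: Int

    // Equality and hashing deliberately ignore count.
    static func == (lhs: PAData, rhs: PAData) -> Bool {
        return lhs.pCode == rhs.pCode && lhs.aCode == rhs.aCode
    }
    func hash(into hasher: inout Hasher) {
        hasher.combine(pCode)
        hasher.combine(aCode)
    }
}

let testArray = [PAData(pCode: "a", aCode: "b", count: 2),
                 PAData(pCode: "a", aCode: "b", count: 3),
                 PAData(pCode: "d", aCode: "c", count: 2)]

let setTest = Set<PAData>(testArray)
let groupTest = Dictionary(grouping: testArray) { aPaData in
    return PAData(pCode: aPaData.pCode, aCode: aPaData.aCode, count: -2)
}

print(setTest.count, groupTest.count)  // prints "2 2"
```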