Most efficient way to build a serializer in C#
I am currently building my own serializer, and I've reached the point where I no longer want to misuse the System.Xml.Linq classes, so I'm building my own.
Let's say I have these classes:
And let's assume these only have a Name property which returns string and a Value property which returns an IReadOnlyList<IXmlNode>.
The question I have is: would it be more efficient to make the classes themselves responsible for writing out their own serialized value, or would it be more efficient to have a separate class that uses pattern matching?
So for example, Option A:
    public class XmlElement : IXmlNode {
        // Each node is responsible for writing its own serialized form.
        public void Write(StringBuilder stringBuilder) {
            stringBuilder.AppendLine($"<{Name}>");
            foreach (var child in Children) {
                child.Write(stringBuilder);
            }
            stringBuilder.AppendLine($"</{Name}>");
        }
    }
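A minimal self-contained sketch of Option A might look like the following. The IXmlNode contract, the XmlText class, and the constructors are assumptions made for illustration, since the original type definitions are not shown; text is written without any XML escaping.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical node contract; the original interface is not shown.
public interface IXmlNode
{
    void Write(StringBuilder stringBuilder);
}

// Hypothetical text node: writes its text on its own line, unescaped.
public sealed class XmlText : IXmlNode
{
    public XmlText(string text) => Text = text;

    public string Text { get; }

    public void Write(StringBuilder stringBuilder) => stringBuilder.AppendLine(Text);
}

public sealed class XmlElement : IXmlNode
{
    public XmlElement(string name, params IXmlNode[] children)
    {
        Name = name;
        Children = children; // arrays implement IReadOnlyList<T>
    }

    public string Name { get; }

    public IReadOnlyList<IXmlNode> Children { get; }

    // The element writes its opening tag, delegates to each child, then closes.
    public void Write(StringBuilder stringBuilder)
    {
        stringBuilder.Append('<').Append(Name).AppendLine(">");
        foreach (var child in Children)
        {
            child.Write(stringBuilder);
        }
        stringBuilder.Append("</").Append(Name).AppendLine(">");
    }
}
```

With these types, `new XmlElement("a", new XmlText("hi")).Write(sb)` appends the opening tag, the text, and the closing tag to `sb`, each followed by `Environment.NewLine`.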
Or Option B:
    public class XmlWriter {
        // Dispatches on the runtime type of the node via pattern matching.
        public string Write(IXmlNode node, StringBuilder passedStringBuilder) {
            var stringBuilder = passedStringBuilder ?? new StringBuilder();
            if (node is XmlElement xmlElement) WriteElement(xmlElement, stringBuilder);
            if (node is XmlAttribute xmlAttribute) WriteAttribute(xmlAttribute, stringBuilder);
            if (node is XmlText xmlText) WriteText(xmlText, stringBuilder);
            // Note: this also runs for every recursive call made from WriteElement,
            // so the accumulated string is materialized once per node.
            return stringBuilder.ToString();
        }

        public void WriteElement(XmlElement element, StringBuilder stringBuilder) {
            stringBuilder.AppendLine($"<{element.Name}>");
            foreach (var child in element.Children) {
                Write(child, stringBuilder);
            }
            stringBuilder.AppendLine($"</{element.Name}>");
        }

        // WriteAttribute and WriteText are omitted here.
    }
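One possible variation on Option B, sketched below, keeps the recursion in a private method that returns nothing, so that ToString runs only once at the public entry point. The data-only node types and the XmlNodeWriter name are hypothetical, and whether this change would close the gap in any measurement is not something the question establishes.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical data-only node types; the original definitions are not shown.
public interface IXmlNode { }

public sealed class XmlText : IXmlNode
{
    public XmlText(string text) => Text = text;

    public string Text { get; }
}

public sealed class XmlElement : IXmlNode
{
    public XmlElement(string name, params IXmlNode[] children)
    {
        Name = name;
        Children = children;
    }

    public string Name { get; }

    public IReadOnlyList<IXmlNode> Children { get; }
}

public static class XmlNodeWriter
{
    // Public entry point: the final string is built exactly once.
    public static string Write(IXmlNode node)
    {
        var stringBuilder = new StringBuilder();
        WriteNode(node, stringBuilder);
        return stringBuilder.ToString();
    }

    // Recursive worker: dispatches on the node type with a type-pattern switch.
    private static void WriteNode(IXmlNode node, StringBuilder stringBuilder)
    {
        switch (node)
        {
            case XmlElement element:
                stringBuilder.Append('<').Append(element.Name).AppendLine(">");
                foreach (var child in element.Children)
                {
                    WriteNode(child, stringBuilder);
                }
                stringBuilder.Append("</").Append(element.Name).AppendLine(">");
                break;
            case XmlText text:
                stringBuilder.AppendLine(text.Text); // no XML escaping in this sketch
                break;
            default:
                throw new NotSupportedException(
                    $"Unknown node type: {node?.GetType().Name ?? "null"}");
        }
    }
}
```

The trade-off is the usual one for this style: adding a new node type means touching the switch, whereas in Option A it means adding a class.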
Obviously there is more to writing an XML serializer; this isn't great code, and I left out some parts here and there.
I'm mostly concerned with which concept would be the most efficient.
Additional information
Since there have been some requests to define what I mean by efficiency, here are the factors I'd like to score the implementation on:
Speed
Garbage collection
Memory allocation
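For a rough first look at these three factors without a benchmarking framework, a helper along the following lines could be used. QuickMeasure is a hypothetical name, and a single cold run like this is far noisier than a proper benchmark, so treat the numbers as indicative only.

```csharp
using System;
using System.Diagnostics;

public static class QuickMeasure
{
    // Runs the action once and reports elapsed time, bytes allocated on the
    // current thread, and the number of gen-0 collections that occurred.
    public static (TimeSpan Elapsed, long AllocatedBytes, int Gen0Collections) Run(Action action)
    {
        // Start from a settled heap so earlier garbage does not skew the counts.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        long bytesBefore = GC.GetAllocatedBytesForCurrentThread();
        int gen0Before = GC.CollectionCount(0);
        var stopwatch = Stopwatch.StartNew();

        action();

        stopwatch.Stop();
        return (stopwatch.Elapsed,
                GC.GetAllocatedBytesForCurrentThread() - bytesBefore,
                GC.CollectionCount(0) - gen0Before);
    }
}
```

A call such as `var result = QuickMeasure.Run(() => serializer.Write(root));` (with `serializer` and `root` standing in for whatever is being tested) returns the three values as a tuple.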
Of course, code readability is also a factor. However, at this point I've written both implementations, and in terms of readability Option A has my preference. Option B resulted in quite a lot of code in a single file, and arguably it's doing too much for one class.
So in short:
Unless Option B greatly outperforms Option A, my preference will go towards Option A.
In this case I'd suppose the answer would be Option A.
What I did to test this:
... wrote out their own serialized value when ToString was called
This is the result of the benchmark test:
// * Summary *
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19043.1288 (21H1/May2021Update)
Intel Core i7-8750H CPU 2.20GHz (Coffee Lake), 1 CPU, 12 logical and 6 physical cores
.NET SDK=5.0.402
[Host] : .NET Core 3.1.20 (CoreCLR 4.700.21.47003, CoreFX 4.700.21.47101), X64 RyuJIT
BenchmarkDotNet, Version=0.13.1.0, Culture=neutral, PublicKeyToken=aa0ca2f9092cefc4 : .NET Core 3.1.20 (CoreCLR 4.700.21.47003, CoreFX 4.700.21.47101), X64 RyuJIT
Job=BenchmarkDotNet, Version=0.13.1.0, Culture=neutral, PublicKeyToken=aa0ca2f9092cefc4 MaxRelativeError=0.01 IterationCount=1
LaunchCount=5 RunStrategy=ColdStart UnrollFactor=1
WarmupCount=1
| Method | Mean | Error | StdDev | Gen 0 | Completed Work Items | Lock Contentions | Gen 1 | Allocated |
|----------------- |-----------:|----------:|---------:|------------:|---------------------:|-----------------:|-----------:|----------:|
| JsonDataToSTring | 420.3 ms | 42.31 ms | 10.99 ms | 19000.0000 | 4.0000 | - | 6000.0000 | 232 MB |
| SerialJsonWriter | 420.7 ms | 58.03 ms | 15.07 ms | 19000.0000 | 4.0000 | - | 6000.0000 | 232 MB |
| XmlDataToSTring | 1,012.1 ms | 342.95 ms | 89.06 ms | 145000.0000 | 4.0000 | - | 35000.0000 | 1,036 MB |
| SerialXmlWriter | 1,128.5 ms | 70.71 ms | 18.36 ms | 203000.0000 | 4.0000 | - | 40000.0000 | 1,384 MB |
The benchmark test was not extensive by any means, and I did not do a very in-depth evaluation because these results are pretty clear to me.
For both JSON and XML, Option A scores at least as well as Option B on time, garbage collection, and memory allocation on a pretty small dataset. For JSON the two are practically tied; for XML, Option A allocated 1,036 MB against 1,384 MB for Option B.
In retrospect, I could've seen this coming because C# doesn't guarantee tail-call optimization.
In fact, when I increased the data size, I managed to crash my computer by profiling code that overflowed the stack.
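Note that the recursive call inside the foreach loop is not in tail position in either option, so the call depth grows with the depth of the tree regardless of tail-call support. One way to sidestep that, sketched here with hypothetical type names, is to replace the recursion with an explicit stack that lives on the heap:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical data-only node types; the original definitions are not shown.
public interface IXmlNode { }

public sealed class XmlText : IXmlNode
{
    public XmlText(string text) => Text = text;

    public string Text { get; }
}

public sealed class XmlElement : IXmlNode
{
    public XmlElement(string name, params IXmlNode[] children)
    {
        Name = name;
        Children = children;
    }

    public string Name { get; }

    public IReadOnlyList<IXmlNode> Children { get; }
}

public static class IterativeXmlWriter
{
    public static string Write(IXmlNode root)
    {
        var stringBuilder = new StringBuilder();
        // Each entry is either a node still to be written (ClosingTag == null)
        // or the name of an element whose closing tag is due (Node == null).
        var pending = new Stack<(IXmlNode Node, string ClosingTag)>();
        pending.Push((root, null));

        while (pending.Count > 0)
        {
            var (node, closingTag) = pending.Pop();
            if (closingTag != null)
            {
                stringBuilder.Append("</").Append(closingTag).AppendLine(">");
                continue;
            }

            switch (node)
            {
                case XmlElement element:
                    stringBuilder.Append('<').Append(element.Name).AppendLine(">");
                    // The closing tag goes on first so it pops after all children.
                    pending.Push((null, element.Name));
                    // Children are pushed in reverse so they pop in document order.
                    for (int i = element.Children.Count - 1; i >= 0; i--)
                    {
                        pending.Push((element.Children[i], null));
                    }
                    break;
                case XmlText text:
                    stringBuilder.AppendLine(text.Text); // no XML escaping in this sketch
                    break;
                default:
                    throw new NotSupportedException("Unknown node type.");
            }
        }

        return stringBuilder.ToString();
    }
}
```

The cost is readability: the traversal order is now encoded in push order rather than in the shape of the code, which is worth weighing against how deep the real documents are expected to be.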
PS: I don't think it's valuable to go into too much detail about how the data is generated, because such a small dataset already made such a difference.