Designing a string class in C++
I need to design (and, at some point, code) a "customized" string class in C++. I was wondering if you could point me to any documentation and design issues, primarily, and to potential pitfalls I should be aware of.
Links are very welcome, as is the identification of problems (if any) with current string libraries (QString, std::string, and the others).
Thank you.
Despite the critics, I think this is a valid question.
std::string is not a panacea. It looks like someone took the class from a pure-OO language and dumped it into C++, which is probably the case.
Advice 1: Prefer non-member non-friend methods 建议1:首选非会员非朋友方法
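A minimal sketch of what this advice could look like in practice. The class name MyString and the free function starts_with are hypothetical, chosen only for illustration: the class exposes a small set of members that need the representation, and everything else is written against that public interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical minimal string class: only operations that need access
// to the representation are members.
class MyString {
public:
    MyString(const char* s) : data_(s) {}
    std::size_t size() const { return data_.size(); }
    char operator[](std::size_t i) const { return data_[i]; }

private:
    std::string data_;  // stand-in for whatever storage a real class would use
};

// Non-member, non-friend: implemented purely through the public interface,
// so it keeps working if the private representation changes.
bool starts_with(const MyString& s, const MyString& prefix) {
    if (prefix.size() > s.size()) return false;
    for (std::size_t i = 0; i < prefix.size(); ++i) {
        if (s[i] != prefix[i]) return false;
    }
    return true;
}
```

Because starts_with never touches data_, swapping the storage for something else would not require changing it, which is the main argument for keeping the member list short.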
That said, in this age of internationalization, I would certainly advise you to design a class that supports Unicode. And I do say Unicode, not UTF-8 or UTF-16. It's ill-fitting (I think) to devise a class that would contain the data in a given encoding. You can provide methods to then output the information in various formats.
Advice 2: Support Unicode
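One way to read this advice, sketched under assumptions: the class stores code points and treats encodings as output formats. UnicodeString and to_utf8 are hypothetical names, and the encoder assumes every stored value is a valid Unicode scalar value (no validation is done).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical class: holds Unicode code points, not an encoded byte stream.
class UnicodeString {
public:
    explicit UnicodeString(std::u32string cps) : cps_(std::move(cps)) {}
    std::size_t length() const { return cps_.size(); }  // counted in code points
    const std::u32string& code_points() const { return cps_; }

private:
    std::u32string cps_;
};

// Encoding is an output step. Assumes valid scalar values
// (<= 0x10FFFF, no surrogates); invalid input is not detected.
std::string to_utf8(const UnicodeString& s) {
    std::string out;
    for (char32_t cp : s.code_points()) {
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

A to_utf16 function could be added the same way without touching the class. Storing one 32-bit value per code point is the simplest representation for a sketch, not necessarily the most compact one.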
Then, there are a number of points on memory allocation schemes:
"New" languages such as Java, C# or Python use immutable strings. Think of it as a pool of strings: all strings containing "Fooo" will point to the same buffer.
I would personally pick the "Small String Optimization" here (though it's not exclusive with the other two), simply because it's simple to implement and should actually benefit you (heap allocation cost, locality of reference issues).
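To make the "Small String Optimization" concrete, here is a rough sketch, not a production design. SmallString and the 15-character threshold are assumptions made for the example; it handles only NUL-terminated input, and copy assignment is left out to keep it short.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical class: short strings live inside the object itself,
// longer ones fall back to a heap allocation.
class SmallString {
public:
    static constexpr std::size_t kInlineCapacity = 15;  // arbitrary for this sketch

    SmallString(const char* s) : size_(std::strlen(s)) {
        if (!is_inline()) heap_ = new char[size_ + 1];
        std::memcpy(data(), s, size_ + 1);  // also copies the terminating '\0'
    }
    SmallString(const SmallString& other) : SmallString(other.c_str()) {}
    SmallString& operator=(const SmallString&) = delete;  // omitted for brevity
    ~SmallString() { delete[] heap_; }  // deleting nullptr is a no-op

    std::size_t size() const { return size_; }
    bool is_inline() const { return size_ <= kInlineCapacity; }
    const char* c_str() const { return is_inline() ? buf_ : heap_; }

private:
    char* data() { return is_inline() ? buf_ : heap_; }

    std::size_t size_;
    char buf_[kInlineCapacity + 1];  // inline storage for short strings
    char* heap_ = nullptr;           // a real implementation would typically
                                     // overlap this with buf_ in a union
};
```

Constructing SmallString("example.com") performs no heap allocation, while a string longer than 15 characters takes the new[] path. Keeping buf_ and heap_ as separate members wastes a few bytes but avoids union bookkeeping in the sketch.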
The other two techniques are somewhat complex in the face of multi-threading, and as such are likely error-prone and unlikely to yield any real benefit unless carefully crafted.
And that brings my last advice:
Advice 3: Don't implement internal locking in an attempt at multi-threading support
It will slow down the class when used in a single-threaded context and will not yield as much benefit as you'd think when used in a multi-threaded one.
Finally, you could perhaps find something suiting your tastes (or get some pointers) by browsing existing code. I don't promise it will exhibit "smooth" interfaces, though.
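Scott Meyers's Effective STL has some interesting discussion of possible std::string implementation techniques, though it covers fairly advanced issues such as copy-on-write and reference counting.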
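Depending on what "customized" means (a custom allocator, for example), you may be able to accomplish it through the template parameters of the std::basic_string class.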
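As an illustration of the template-parameter route, the sketch below plugs a custom allocator into std::basic_string. CountingAllocator, g_allocations and CountedString are hypothetical names; the allocator simply forwards to std::allocator and counts calls, and the global counter is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical global counter, used only to make allocations observable.
static std::size_t g_allocations = 0;

// Minimal allocator: forwards to std::allocator and counts allocate() calls.
template <class T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++g_allocations;
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T* p, std::size_t n) noexcept {
        std::allocator<T>().deallocate(p, n);
    }
};

// All instances are interchangeable, so they always compare equal.
template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

// Same interface as std::string, different memory policy.
using CountedString =
    std::basic_string<char, std::char_traits<char>, CountingAllocator<char>>;
```

CountedString has the whole std::string interface, but it is a distinct type: it does not convert implicitly to or from std::string, which is one of the practical costs of this approach.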
From a general-purpose point of view, a "new" string class would ideally combine the good points of std::string, CString, QString and others. A few points in random order:
The world doesn't need another string class. Is this homework? If not, use std::string.
The problem with std::string is... that you can't change it. Sometimes you need the basics of a std::string, but disagree with the implementation in your C++ library.
As an example, the thread-safe reference counting employed means lots of locking (or at least locked operations). Also, if most of your strings are short (because you know this will be the case), you might want a string class that is optimized for that use case.
So even if you like the std::string API, or at least have learned to live with it, there is room for "competing implementations" that are more or less workalikes.
PowerDNS would love to have one, as we currently pass many DNS host names around, and a large majority of them would fit in, say, a 25-byte fixed buffer, which would relieve a lot of new/delete pressure.