
How best to implement BCD as an exercise?

I'm a beginner (self-taught) programmer learning C++, and I recently decided to implement a binary-coded decimal (BCD) class as an exercise, and so that I could handle very large numbers on Project Euler. I'd like to do it as basically as possible, starting properly from scratch.

I started off using an array of ints, where every digit of the input number was saved as a separate int. I know that each BCD digit can be encoded with only 4 bits, so I thought using a whole int for this was a bit overkill. I'm now using an array of bitset<4>s.

  1. Is using a library class like this overkill as well?
  2. Would you consider it cheating?
  3. Is there a better way to do this?

EDIT: The primary reason for this is as an exercise. I wouldn't want to use a library like GMP, because the whole point is making the class myself. Is there a way of making sure that I only use 4 bits for each decimal digit?

Just one note: an array of bitset<4>s is going to require the same amount of space as an array of longs. bitset is usually implemented with an array of word-sized integers as the backing store for the bits, so that bitwise operations can work a word at a time rather than a byte at a time, and more gets done per operation.
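How much space a bitset<4> really takes depends on the standard library, so it is worth measuring on your own platform. A minimal sketch (the two helper names are made up for illustration) that compares an array of bitset<4> with a layout that packs two digits per byte:

```cpp
#include <bitset>
#include <cstddef>

// Bytes needed for n decimal digits when each digit is its own std::bitset<4>.
// sizeof(std::bitset<4>) is implementation-defined and is often a full word.
constexpr std::size_t bitset_digits_bytes(std::size_t n) {
    return n * sizeof(std::bitset<4>);
}

// Bytes needed for the same n digits when two 4-bit digits share one byte.
constexpr std::size_t packed_digits_bytes(std::size_t n) {
    return (n + 1) / 2;
}
```

Printing bitset_digits_bytes(1000) next to packed_digits_bytes(1000) shows the gap on a given compiler; the exact ratio is not guaranteed by the standard.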

Also, I question your motivation. BCD is usually used as a packed representation of a string of digits when sending them between systems. It usually has little to do with arithmetic. What you really want is an arbitrary-sized integer arithmetic library like GMP.

Is using a library class like this overkill as well?

I would benchmark it against an array of ints to see which one performs better. If an array of bitset<4> is faster, then no, it's not overkill. Every little bit helps on some of the Project Euler problems.

Would you consider it cheating?

No, not at all.

Is there a better way to do this?

Like Greg Rogers suggested, an arbitrary-precision library is probably a better choice, unless you just want to learn from rolling your own. There's something to learn from both methods (using a library vs. writing a library). I'm lazy, so I usually use Python.

Like Greg Rogers said, using a bitset probably won't save any space over ints, and doesn't really provide any other benefits. I would probably use a vector instead. It's twice as big as it needs to be, but you get simpler and faster indexing for each digit.

If you want to use packed BCD, you could write a custom indexing function and store two digits in each byte.
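One way such a custom indexing function could look is sketched below. PackedDigits and its member names are hypothetical, and the choice of putting even-indexed digits in the low nibble is an assumption of this sketch, not something the thread prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: stores decimal digits two per byte (packed BCD).
// Even indices live in the low nibble, odd indices in the high nibble.
class PackedDigits {
public:
    explicit PackedDigits(std::size_t count)
        : count_(count), bytes_((count + 1) / 2, 0) {}

    std::size_t size() const { return count_; }
    std::size_t bytes_used() const { return bytes_.size(); }

    // Read the digit at position i.
    unsigned get(std::size_t i) const {
        assert(i < count_);
        const std::uint8_t byte = bytes_[i / 2];
        return (i % 2 == 0) ? static_cast<unsigned>(byte & 0x0F)
                            : static_cast<unsigned>(byte >> 4);
    }

    // Overwrite the digit at position i, leaving its neighbour untouched.
    void set(std::size_t i, unsigned digit) {
        assert(i < count_ && digit <= 9);
        std::uint8_t& byte = bytes_[i / 2];
        if (i % 2 == 0) {
            byte = static_cast<std::uint8_t>((byte & 0xF0) | digit);
        } else {
            byte = static_cast<std::uint8_t>((byte & 0x0F) | (digit << 4));
        }
    }

private:
    std::size_t count_;
    std::vector<std::uint8_t> bytes_;
};
```

Every read and write pays for a shift and a mask, which is the packing overhead to weigh against the halved storage.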

  1. Is using a library class like this overkill as well?
  2. Would you consider it cheating?
  3. Is there a better way to do this?

1 & 2: not really.

3: each byte has 8 bits, so you could store 2 BCD digits in each unsigned char.

In general, bit operations are applied in the context of an integer, so from a performance standpoint there is no real reason to go down to individual bits.

If you want to take the bit-level approach to gain experience, then this may be of help:

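#include <stdio.h>

int main(void)
{
    /* A struct holding a single 4-bit wide unsigned bit-field. */
    typedef struct
    {
        unsigned int value:4;
    } Nibble;

    Nibble nibble;

    /* Deliberately never terminates: value wraps from 15 back to 0,
       so the condition value < 20 always holds. */
    for (nibble.value = 0; nibble.value < 20; nibble.value++)
    {
        printf("nibble.value is %d\n", nibble.value);
    }

    return 0;
}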

The gist of the matter is that inside that struct you are creating a short integer, one that is 4 bits wide. Under the hood it is still really an integer, but for your intended use it looks and acts like a 4-bit integer.

This is shown clearly by the for loop, which is actually an infinite loop. When the nibble value reaches 16, the stored value is really zero, as there are only 4 bits to work with. As a result, nibble.value < 20 never becomes false.
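A terminating variant makes the wrap-around easier to observe. This is only a sketch: step_nibble is a name invented here, and it relies on the fact that an unsigned bit-field wraps modulo 2^4.

```cpp
#include <cassert>

struct Nibble {
    unsigned int value : 4;  // 4-bit unsigned bit-field, holds 0..15
};

// Increment a nibble `steps` times, starting from `start`, and return
// whatever ends up stored in the 4-bit field.
unsigned step_nibble(unsigned start, unsigned steps) {
    assert(start < 16);
    Nibble n;
    n.value = start;
    for (unsigned k = 0; k < steps; ++k) {
        n.value++;  // 15 + 1 is stored as 0 because only 4 bits are kept
    }
    return n.value;
}
```

Calling step_nibble(0, 20) returns 4 rather than 20, which is why a loop condition such as value < 20 can never fail.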

If you look in the K&R white book, one of the notes there is that bit-fields like this are not portable (their layout is implementation-defined), so if you port your program to another platform, it may or may not work.

Have fun.

You are trying to get a base-10 representation (i.e. one decimal digit in each cell of the array). This way either space (one int per digit) or time (4 bits per digit, with the overhead of packing and unpacking) is wasted.

Why not try base-256, for example, and use an array of bytes? Or even base-2^32 with an array of ints? The operations are implemented the same way as in base-10. The only thing that will be different is converting the number to a human-readable string.

It may work like this: assuming base-256, each "digit" has 256 possible values, so the numbers 0-255 are all single-digit values. Then 256 is written as 1:0 (I'll use a colon to separate the "digits"; we cannot use letters as in base-16), analogous to how in base-10 there is 10 after 9. Likewise, 1030 (base-10) = 4 * 256 + 6 = 4:6 (base-256). Also, 1020 (base-10) = 3 * 256 + 252 = 3:252 (base-256) is a two-digit number in base-256.
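The decomposition above can be reproduced mechanically with repeated division by 256. A small sketch, assuming the value still fits in a 64-bit integer; to_base256 is an illustrative name, and it returns the digits least significant first, so 4:6 comes back as {6, 4}:

```cpp
#include <cstdint>
#include <vector>

// Split a value into base-256 "digits", least significant digit first.
std::vector<std::uint8_t> to_base256(std::uint64_t value) {
    std::vector<std::uint8_t> digits;
    do {
        digits.push_back(static_cast<std::uint8_t>(value % 256));
        value /= 256;
    } while (value != 0);
    return digits;
}
```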

Now let's assume we put the digits in an array, with the least significant digit first:

unsigned short digits1[] = { 212, 121 }; // 121 * 256 + 212 = 31188
int len1 = 2;
unsigned short digits2[] = { 202, 20  }; // 20 * 256 + 202 = 5322
int len2 = 2;

Then adding will go like this (warning: notepad code ahead, may be broken):

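unsigned short resultdigits[3] = { 0 }; // max(len1, len2) + 1 slots, room for a final carry
int len = len1 > len2 ? len1 : len2; // max of the lengths
int carry = 0;
int i;
for (i = 0; i < len; i++) {
    // a shorter operand contributes 0 once its digits run out
    int leftdigit = i < len1 ? digits1[i] : 0;
    int rightdigit = i < len2 ? digits2[i] : 0;
    int sum = leftdigit + rightdigit + carry;
    if (sum > 255) {
        // overflow of one base-256 digit: keep the low part, carry 1
        carry = 1;
        sum -= 256;
    } else {
        carry = 0;
    }
    resultdigits[i] = sum;
}
if (carry > 0) {
    // the result is one digit longer than the longer operand
    resultdigits[i] = carry;
    len++;
}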

On the first iteration it should go like this:

  1. sum = 212 + 202 + 0 = 414
  2. 414 > 255, so carry = 1 and sum = 414 - 256 = 158
  3. resultdigits[0] = 158

On the second iteration:

  1. sum = 121 + 20 + 1 = 142
  2. 142 < 256, so carry = 0
  3. resultdigits[1] = 142

So at the end resultdigits[] = { 158, 142 }, that is 142:158 (base-256) = 142 * 256 + 158 = 36510 (base-10), which is exactly 31188 + 5322.

Note that converting this number to or from a human-readable form is by no means a trivial task: it requires multiplication and division by 10 or 256, and I cannot present sample code without proper research. The advantage is that the operations 'add', 'subtract' and 'multiply' can be made really efficient, and the heavy conversion to and from base-10 is done only once at the beginning and once after the end of the calculation.
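For the output direction, one simple (not especially fast) approach is schoolbook long division of the whole number by 10, collecting the remainders as decimal digits. The following is only a sketch under that assumption; base256_to_decimal is a made-up name and the input is taken to be least significant digit first.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Convert base-256 digits (least significant first) to a decimal string
// by repeatedly dividing the whole number by 10 and keeping the remainders.
std::string base256_to_decimal(std::vector<std::uint8_t> digits) {
    if (digits.empty()) return "0";
    std::string out;
    do {
        unsigned remainder = 0;
        // Long division by 10, from the most significant digit down.
        for (std::size_t i = digits.size(); i-- > 0;) {
            const unsigned current = remainder * 256 + digits[i];
            digits[i] = static_cast<std::uint8_t>(current / 10);
            remainder = current % 10;
        }
        out.push_back(static_cast<char>('0' + remainder));
        // Drop leading zero digits, which sit at the back of the vector.
        while (digits.size() > 1 && digits.back() == 0) digits.pop_back();
    } while (!(digits.size() == 1 && digits[0] == 0));
    std::reverse(out.begin(), out.end());  // remainders come out lowest first
    return out;
}
```

Each output digit costs one pass over the whole number, so the conversion is quadratic in the number's length; that is acceptable when it happens once per calculation.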

Having said all that, personally, I'd use base 10 in an array of bytes and not care about the wasted memory. This will require adjusting the constants 255 and 256 above to 9 and 10 respectively.
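Since only the constants change between the two layouts, the addition can take the base as a parameter. A possible sketch, assuming one digit per byte, least significant digit first, and every input digit already smaller than the base; add_digits is an illustrative name:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Add two numbers stored as digit vectors, least significant digit first.
// Use base = 10 for one decimal digit per byte, or base = 256 for base-256.
std::vector<std::uint8_t> add_digits(const std::vector<std::uint8_t>& a,
                                     const std::vector<std::uint8_t>& b,
                                     unsigned base) {
    assert(base >= 2 && base <= 256);
    const std::size_t len = std::max(a.size(), b.size());
    std::vector<std::uint8_t> result;
    result.reserve(len + 1);  // room for a possible final carry
    unsigned carry = 0;
    for (std::size_t i = 0; i < len; ++i) {
        const unsigned left = i < a.size() ? a[i] : 0u;
        const unsigned right = i < b.size() ? b[i] : 0u;
        unsigned sum = left + right + carry;
        carry = sum >= base ? 1u : 0u;
        if (carry) sum -= base;
        result.push_back(static_cast<std::uint8_t>(sum));
    }
    if (carry) result.push_back(1);
    return result;
}
```

With base 10, 31188 is stored as {8, 8, 1, 1, 3} and 5322 as {2, 2, 3, 5}; their sum comes back as {0, 1, 5, 6, 3}, which reads as 36510 from the most significant end.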
