
Having trouble unpacking COMP-3 in .NET: the COMP-3 values contain hex letters other than the sign nibble

I am trying to import a mainframe EDI file into SQL Server using .NET, and I am having problems unpacking some COMP-3 fields.

The file came from one of our clients, and I have the copybook layout for the following fields:

05  EH-GROSS-INVOICE-AMT            PIC S9(07)V9999  COMP-3.         
05  EH-CASH-DISCOUNT-AMT            PIC S9(07)V9999  COMP-3.         
05  EH-CASH-DISCOUNT-PCT            PIC S9(03)V9999  COMP-3.

I will focus on just these 3 fields, as all other fields are PIC(X) and are already Unicode values. I loaded everything with the help of the Ebcdic2Ascii tool created by Max Vagner. I only modified its "Unpack" function, changing it to:

private string Unpack(byte[] packedBytes, int decimalPlaces, out bool isParsedSuccessfully)
{
    isParsedSuccessfully = true;
    return BitConverter.ToString(packedBytes);
}

in order for me to get the following sample data:

EH-GROSS-INVOICE-AMT     EH-CASH-DISCOUNT-AMT     EH-CASH-DISCOUNT-PCT
----------------------------------------------------------------------
00-1A-1A-03-26-0C        00-00-00-00-00-0C        00-00-00-0C
00-0A-1A-1A-00-0C        00-00-1A-1A-2D-0C        00-1A-00-0C
00-09-10-20-00-0C        00-00-10-1A-1A-0C        00-1A-00-0C

Here is the sample code I wrote for unpacking these values, based on my understanding of COMP-3:

using System;

namespace ConsoleApp1
{
    class Program
    {
        static void Main(string[] args)
        {
            var result1 = UnpackMod("00-1A-1A-03-26-0C", 4);
            var result2 = UnpackMod("00-00-00-00-00-0C", 4);
            var result3 = UnpackMod("00-00-00-0C", 4);

            Console.WriteLine($"{result1}\n{result2}\n{result3}\n");

            var result4 = UnpackMod("00-0A-1A-1A-00-0C", 4);
            var result5 = UnpackMod("00-00-1A-1A-2D-0C", 4);
            var result6 = UnpackMod("00-1A-00-0C", 4);

            Console.WriteLine($"{result4}\n{result5}\n{result6}\n");

            var result7 = UnpackMod("00-09-10-20-00-0C", 4);
            var result8 = UnpackMod("00-00-10-1A-1A-0C", 4);
            var result9 = UnpackMod("00-1A-00-0C", 4);

            Console.WriteLine($"{result7}\n{result8}\n{result9}");

            Console.ReadLine();
        }

        /// <summary>
        /// Method for unpacking Comp-3 fields.
        /// </summary>
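        /// <param name="inputString">Hex string as produced by BitConverter.ToString, e.g. "00-09-10-20-00-0C".</param>
        /// <param name="decimalPlaces">Number of implied decimal places (the digits after V in the PIC clause).</param>
        /// <returns>Returns the numeric string if the parse was successful; otherwise returns "NULL".</returns>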
        private static string UnpackMod(string inputString, int decimalPlaces)
        {
            var outputString = inputString;

            // Remove "-".
            outputString = outputString.Replace("-", "");

            // Check last character for sign.
            string lastChar = outputString.Substring(outputString.Length - 1, 1);
            bool isNegative = (lastChar == "D" || lastChar == "B");

            // Remove sign character.
            if (lastChar == "C" || lastChar == "A" || lastChar == "E" || lastChar == "F" || lastChar == "D" || lastChar == "B")
            {
                outputString = outputString.Substring(0, outputString.Length - 1);
            }

            // Place decimal point.
            outputString = outputString.Insert(outputString.Length - decimalPlaces, ".");

            // Check if parsed value is numeric. This will also eliminate all leading 0.
            var isParsedSuccessfully = decimal.TryParse(outputString, out decimal decimalValue);

            // If isParsedSuccessfully is true, return the numeric string; otherwise return "NULL".
            string result = "NULL";
            if (isParsedSuccessfully)
            {
                // Convert value to negative.
                if (isNegative)
                {
                    decimalValue = decimalValue * -1;
                }

                result = decimalValue.ToString();
            }

            return result;
        }
    }
}

After running the sample code I was able to get the following results:

EH-GROSS-INVOICE-AMT     EH-CASH-DISCOUNT-AMT     EH-CASH-DISCOUNT-PCT
----------------------------------------------------------------------
NULL                     0.0000                   0.0000
NULL                     NULL                     NULL
9102.0000                NULL                     NULL        

As you can see, I was only able to get the following 3 values correctly:

00-09-10-20-00-0C -> 9102.0000
00-00-00-00-00-0C -> 0.0000
00-00-00-0C       -> 0.0000

Based on this source: http://www.3480-3590-data-conversion.com/article-packed-fields.html , I have the following understanding of COMP-3:

COBOL Comp-3 is a binary field type that puts ("packs") two digits into each byte, using a notation called Binary Coded Decimal, or BCD.

The Binary Coded Decimal (BCD) data type is just as its name suggests -- it is a value stored in decimal (base ten) notation, and each digit is binary coded. Since a digit only has ten possible values (0-9), it fits in four bits (one nibble), so two digits fit in one byte.

The low nibble of the least significant byte is used to store the sign for the number. This nibble stores only the sign, not a digit. "C" hex is positive, "D" hex is negative, and "F" hex is unsigned.
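Going by that description, an unpack routine can work directly on the raw bytes and reject any field whose digit nibbles fall outside 0-9. The following is only a minimal sketch, not the Ebcdic2Ascii implementation: `PackedDecimal` and `TryUnpack` are hypothetical names, and it assumes the sign nibbles A, C, E and F mean positive and B and D mean negative. It also assumes the field is at most 14 bytes (27 digits) so the result fits in a `decimal`.

```csharp
using System;

public static class PackedDecimal
{
    // Tries to decode a COMP-3 (packed decimal) field from its raw bytes.
    // scale is the number of implied decimal places (digits after V in the PIC clause).
    // Assumption: sign nibbles A, C, E, F are positive and B, D are negative.
    public static bool TryUnpack(byte[] packed, int scale, out decimal value)
    {
        value = 0m;

        // 14 bytes hold 27 digits plus the sign, which always fits in a System.Decimal.
        if (packed == null || packed.Length == 0 || packed.Length > 14) return false;
        if (scale < 0 || scale > 27) return false;

        decimal digits = 0m;
        int last = packed.Length - 1;

        for (int i = 0; i <= last; i++)
        {
            int high = packed[i] >> 4;
            int low = packed[i] & 0x0F;

            // The high nibble of every byte must be a decimal digit.
            if (high > 9) return false;
            digits = digits * 10 + high;

            // The low nibble is a digit too, except in the last byte (the sign).
            if (i < last)
            {
                if (low > 9) return false;
                digits = digits * 10 + low;
            }
        }

        // The low nibble of the last byte must be a sign code (A-F), not a digit.
        int sign = packed[last] & 0x0F;
        if (sign < 0x0A) return false;

        // Apply the implied decimal point.
        for (int s = 0; s < scale; s++) digits /= 10m;

        value = (sign == 0x0B || sign == 0x0D) ? -digits : digits;
        return true;
    }
}
```

Called with the bytes 00-09-10-20-00-0C and a scale of 4, this sketch yields 9102. For 00-1A-1A-03-26-0C it returns false, because the nibble A appears in a digit position. Note that PIC S9(07)V9999 is 11 digits plus a sign, which is 12 nibbles or 6 bytes, and that matches the field widths in the sample data.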

I know that BCD digits should only be 0-9 and that the only letter should be the sign nibble at the end, which can be "C", "D" or "F". So I don't know how to unpack the following values:

00-1A-1A-03-26-0C
00-0A-1A-1A-00-0C        
00-00-1A-1A-2D-0C
00-1A-00-0C
00-00-10-1A-1A-0C
00-1A-00-0C

These values have other letters besides the sign character. I have a feeling that the data has already been converted, because if it had not been, there should be no readable values there unless you apply an encoding. I am still not sure about this and would love any insights. Thanks.

First, PIC X is not Unicode in COBOL.

Quoting myself from here ...

It is common for mainframe data to include both text and binary data in a single record, for example a name, a currency amount, and a quantity:

Hopper Grace ar%.

...which would be...

x'C8969797859940404040C799818385404040404081996C004B'

...in hex. This is code page 37, commonly referred to as EBCDIC.

[...]Converting to code page 1250, commonly in use on Microsoft Windows, you would end up with...

x'486F707065722020202047726163652020202020617225002E'

...where the text data is translated but the packed data is destroyed. The packed data no longer has a valid sign in the last nibble (the lower half of the last byte), the currency amount itself has been changed as has the quantity (from decimal 75 to decimal 11,776 due to both code page conversion and mangling of a big endian number as a little endian number).
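The effect described above can be reproduced in .NET. This is a rough sketch under some assumptions: it targets .NET Core / .NET 5 or later, where `CodePagesEncodingProvider` has to be registered before code pages 37 and 1250 are available, and `CodePageDemo` and its method names are made up for illustration. It treats every byte of the record as text, the way a text-mode transfer would.

```csharp
using System;
using System.Text;

public static class CodePageDemo
{
    // Converts a whole record from EBCDIC code page 37 to Windows code page 1250,
    // treating every byte as text, including the packed and binary fields.
    public static byte[] ConvertAsText(byte[] ebcdicRecord)
    {
        // Assumption: .NET Core / .NET 5+, where these code pages are not built in.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Encoding ebcdic = Encoding.GetEncoding(37);
        Encoding windows = Encoding.GetEncoding(1250);
        return Encoding.Convert(ebcdic, windows, ebcdicRecord);
    }

    // Reads two bytes as a big endian number, as the mainframe stored them.
    public static int ReadBigEndian16(byte[] data, int offset)
    {
        return (data[offset] << 8) | data[offset + 1];
    }

    // Reads two bytes as a little endian number, as x86 code typically would.
    public static int ReadLittleEndian16(byte[] data, int offset)
    {
        return data[offset] | (data[offset + 1] << 8);
    }
}
```

For the record shown above, the last five bytes x'81996C004B' should come out as x'617225002E'. `ReadBigEndian16` on the original bytes x'004B' gives 75, while `ReadLittleEndian16` on the converted bytes x'002E' gives 11,776.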

Likely your data was code page converted on transfer from the mainframe. If you know the original code page and the code page it was converted to, then you might be able to unscramble the packed data.

I say might because, if you're lucky, the hex values you have will have been mapped one-to-one with hex values in the original code page. Note that it is common for both EBCDIC x'15' and x'0D' to be mapped to ASCII x'0D'.
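Whether the packed bytes can be unscrambled comes down to whether the conversion table can be inverted. As a hedged illustration, the sketch below takes a forward table (original byte to converted byte) and builds the inverse, refusing to restore any byte that has more than one possible original. The forward table must come from whatever tool did the transfer; `ReverseMapping`, `BuildInverse` and `TryRestore` are hypothetical names, not part of any library.

```csharp
using System;
using System.Collections.Generic;

public static class ReverseMapping
{
    // forward[i] is the byte that original byte i was converted to.
    // Assumption: the caller supplies the real 256-entry table used in the transfer.
    public static Dictionary<byte, List<byte>> BuildInverse(byte[] forward)
    {
        if (forward == null || forward.Length != 256)
            throw new ArgumentException("The forward table must have 256 entries.", nameof(forward));

        var inverse = new Dictionary<byte, List<byte>>();
        for (int source = 0; source < 256; source++)
        {
            byte target = forward[source];
            if (!inverse.TryGetValue(target, out List<byte> sources))
            {
                sources = new List<byte>();
                inverse[target] = sources;
            }
            sources.Add((byte)source);
        }
        return inverse;
    }

    // Restores the original bytes only if every converted byte has exactly one candidate.
    public static bool TryRestore(byte[] converted, Dictionary<byte, List<byte>> inverse, out byte[] original)
    {
        original = new byte[converted.Length];
        for (int i = 0; i < converted.Length; i++)
        {
            if (!inverse.TryGetValue(converted[i], out List<byte> candidates) || candidates.Count != 1)
                return false; // ambiguous or unmapped: the original byte cannot be recovered

            original[i] = candidates[0];
        }
        return true;
    }
}
```

With a table in which both x'15' and x'0D' map to x'0D', the inverse entry for x'0D' has two candidates, so any packed field containing that byte cannot be restored with certainty.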
