How to extract a specific number of bits from a hexadecimal number in a given text file

This is the input file, input.txt:

PS name         above bit      below bit      original            1_info           2_info            new      
PS_AS_0         PS_00[31]      PS_00[00]      0x00000000          0x156A17[00]     0x156A17[31]      0x0003F4a1 
PS_RST_D2       PS_03[05]      PS_03[00]      0x00000003          0x1678A1[00]     0x1678A1[05]      0x0a56F001
PS_N_YD_C       PS_03[06]      PS_03[06]      0x00000000          0x1678A1[06]     0x1678A1[06]      0x0a56F001
PS_1_FG         PS_03[31]      PS_03[07]      0x000000FF          0x1678A1[07]     0x1678A1[31]      0x0a56F001
PS_F_23_ASD     PS_04[07]      PS_03[00]      0x00000000          0x18C550[00]     0x18C550[07]      0x00000000
PS_A_0_STR      PS_04[15]      PS_04[08]      0x00000FFF          0x18C550[08]     0x18C550[15]      0x00000000
PS_AD_0         PS_04[31]      PS_04[16]      0x00000000          0x18C550[16]     0x18C550[31]      0x00000000

Here I need to extract the bits in this way:

If the value of new is 0x0a56F001, I first need to convert it to binary: 0000 1010 0101 0110 1111 0000 0000 0001.

Then check the above bit and below bit columns.

For example, with PS_03[05] and PS_03[00], take bits 0 to 5 of the binary value of new, which is 000001, i.e. 0x1, convert this to a 32-bit value, i.e. 0x00000001, and replace the new column of that row with this value.

PS_RST_D2       PS_03[05]      PS_03[00]      0x00000003          0x1678A1[00]     0x1678A1[05]      0x00000001

Do the same for all rows, so that the final output file looks like this:

PS name         above bit      below bit      original            1_info           2_info            new      
PS_AS_0         PS_00[31]      PS_00[00]      0x00000000          0x156A17[00]     0x156A17[31]      0x0003F4a1 
PS_RST_D2       PS_03[05]      PS_03[00]      0x00000003          0x1678A1[00]     0x1678A1[05]      0x00000001
PS_N_YD_C       PS_03[06]      PS_03[06]      0x00000000          0x1678A1[06]     0x1678A1[06]      0x00000000
PS_1_FG         PS_03[31]      PS_03[07]      0x000000FF          0x1678A1[07]     0x1678A1[31]      0x0014ADE0
PS_F_23_ASD     PS_04[07]      PS_03[00]      0x00000000          0x18C550[00]     0x18C550[07]      0x00000000
PS_A_0_STR      PS_04[15]      PS_04[08]      0x00000FFF          0x18C550[08]     0x18C550[15]      0x00000000
PS_AD_0         PS_04[31]      PS_04[16]      0x00000000          0x18C550[16]     0x18C550[31]      0x00000000

Is this possible in Python? This is my current attempt:

with open("input.txt") as fin:
    with open("output.txt", "w") as fout:
         for line in fin:
             if line.strip():
                 line = line.strip("\n' '")
                 cols = l.split(" ")
                 cols[6] = int(cols[6],16)

I tried selecting the specific column, but it is not working.

For reading input data like this I like to use pandas (see the update at the end of the answer).

To get the number of the above and the below bit, you can index the string like this:

sAboveBit ="PS_03[05]"
iAboveBit = int(sAboveBit[-3:-1])

Or, much safer:

iAboveBit = int(sAboveBit.split("[")[-1].split("]")[0])
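A regular expression is another way to pull out the index, and it fails loudly when a field does not look as expected. This is only a sketch; the helper name `bit_index` is made up for illustration and is not part of the original answer.

```python
import re

def bit_index(field):
    """Return the number inside the trailing [..] of a field like 'PS_03[05]'."""
    m = re.search(r"\[(\d+)\]$", field)
    if m is None:
        raise ValueError(f"no bit index in {field!r}")
    return int(m.group(1))

print(bit_index("PS_03[05]"))  # 5
print(bit_index("PS_04[31]"))  # 31
```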

To create the new value, you can use a bitwise AND with an integer mask that you calculate from your aboveBit and belowBit.

The first way I can think of is a for loop:

iSumUp = 0
for i in range(iBelowBit,iAboveBit+1):
    iSumUp+=2**i
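The same mask can also be built without a loop, using shifts. The sketch below assumes bit 0 is the least significant bit and that both bounds are inclusive; `bit_mask` is a hypothetical helper name.

```python
def bit_mask(below, above):
    """Mask with ones in bit positions below..above (both inclusive)."""
    low_ones = (1 << (above + 1)) - 1   # ones in bits 0..above
    drop = (1 << below) - 1             # ones in bits 0..below-1
    return low_ones & ~drop

# Compare with the loop-based sum for bits 7..31
loop_sum = sum(2**i for i in range(7, 31 + 1))
print(hex(bit_mask(7, 31)), hex(loop_sum))  # 0xffffff80 0xffffff80
```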

To convert your hex string into an integer you can use the bitstring package:

import bitstring as bs
sOldNew = "0x0a56F001"
iOldNew = bs.BitArray(sOldNew).uint

Now you can use a bitwise AND:

iNewNew = iOldNew & iSumUp

And finally create your new hex string with a formatted string:

sNewNew = f"0x{iNewNew:08x}"
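Note that a mask alone leaves the selected bits at their original position. The expected output in the question (0x0014ADE0 for bits 7 to 31 of 0x0a56F001) suggests the field should also be shifted down to bit 0. A minimal sketch of that variant follows, using the built-in `int(text, 16)` instead of bitstring; `extract_field` is a hypothetical name, not something from the question.

```python
def extract_field(value, below, above):
    """Return bits below..above (inclusive) of value, moved down to bit 0."""
    width = above - below + 1
    return (value >> below) & ((1 << width) - 1)

new = int("0x0a56F001", 16)                   # int() accepts the 0x prefix with base 16
print(f"0x{extract_field(new, 0, 5):08x}")    # 0x00000001
print(f"0x{extract_field(new, 6, 6):08x}")    # 0x00000000
print(f"0x{extract_field(new, 7, 31):08x}")   # 0x0014ade0
```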

Finally, save your data to your (new) file, for which I also prefer using pandas.

Update:

For reading your data with pandas:

import pandas as pd
df =pd.read_csv(r'input.txt',delimiter="\t")
print(df)

You can use split to split the lines, then a regex to extract the above and below values.

To compute the new value, keep only the (above_bit + 1) least significant bits with a bitwise AND with 2**n - 1, where n = above_bit + 1, and then right shift the result by below_bit.

Possible code:

import re

# compile the regex
bit_re = re.compile(r'.*\[(\d{2})\]')

with open("input.txt") as fin, open("output.txt", "w") as fout:
    line = next(fin)          # skip header line
    fout.write(line)
    for line in fin:
        row = line.split()    # extract fields
        # print(row)          # uncomment for traces
        # extract above and below values
        above = int(bit_re.match(row[1]).group(1))
        below = int(bit_re.match(row[2]).group(1))
        # keep bits 0..above, then shift out the bits under `below`
        val = int(row[6], 16) & (2**(above + 1) - 1)
        val >>= below
        row[6] = format(val, '#010x')    # format the result as a 32 bits hex number
        print(*row, file=fout)

With the sample data it gives the expected result:

PS name         above bit      below bit      original            1_info           2_info            new      
PS_AS_0 PS_00[31] PS_00[00] 0x00000000 0x156A17[00] 0x156A17[31] 0x0003f4a1
PS_RST_D2 PS_03[05] PS_03[00] 0x00000003 0x1678A1[00] 0x1678A1[05] 0x00000001
PS_N_YD_C PS_03[06] PS_03[06] 0x00000000 0x1678A1[06] 0x1678A1[06] 0x00000000
PS_1_FG PS_03[31] PS_03[07] 0x000000FF 0x1678A1[07] 0x1678A1[31] 0x0014ade0
PS_F_23_ASD PS_04[07] PS_03[00] 0x00000000 0x18C550[00] 0x18C550[07] 0x00000000
PS_A_0_STR PS_04[15] PS_04[08] 0x00000FFF 0x18C550[08] 0x18C550[15] 0x00000000
PS_AD_0 PS_04[31] PS_04[16] 0x00000000 0x18C550[16] 0x18C550[31] 0x00000000

You could get better formatting by replacing only the end of the line with the new value...
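One way this could be done is to cut the last field off the original line and append the new value, so the column spacing of the input is kept. This is only a sketch; `replace_last_field` is a hypothetical helper and it assumes the value to replace is always the last field on the line.

```python
def replace_last_field(line, new_value):
    """Swap the last whitespace-separated field of line, keeping the spacing before it."""
    stripped = line.rstrip()            # drop trailing spaces and the newline
    last = stripped.split()[-1]         # the field to be replaced
    return stripped[: len(stripped) - len(last)] + new_value

line = "PS_RST_D2       PS_03[05]      PS_03[00]      0x0a56F001\n"
print(replace_last_field(line, "0x00000001"))
```

In the loop above, the final `print(*row, file=fout)` would then become something like `print(replace_last_field(line, row[6]), file=fout)`.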

The first problem is that you have many spaces. When splitting at a single space, you get a lot of empty columns. Replace runs of spaces with a single one first:

import re
line = re.sub(' +', ' ', line)
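As a small illustration of the problem, the snippet below compares splitting on a single space with calling `split()` without arguments, which treats any run of whitespace as one separator. The sample string is a shortened row from the question.

```python
line = "PS_RST_D2       PS_03[05]      PS_03[00]"

# Splitting on one space yields an empty string for every extra space
print(line.split(" ")[:4])  # ['PS_RST_D2', '', '', '']

# split() with no argument collapses runs of whitespace
print(line.split())         # ['PS_RST_D2', 'PS_03[05]', 'PS_03[00]']
```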

Then, 0x0a56F001 is a hexadecimal number. To read it from the text file, use int(cols[6], 16), not int(cols[6], 2), which attempts to read it as binary.

You can then get a 32-digit binary string like this:

number = int(cols[6],16)
binary_string = f"{number:032b}"

Now do the slicing, then convert it back with:

sliced_number = int( ..., 2)
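The slice itself is left open above. One possible way to write it, assuming bit 0 is the least significant bit (the last character of the string) and both bounds are inclusive, is sketched below; `slice_bits` is a hypothetical name.

```python
def slice_bits(hex_text, below, above):
    """Return bits below..above (inclusive) of a 32-bit hex string as an int."""
    number = int(hex_text, 16)
    binary_string = f"{number:032b}"
    # index 0 of the string is bit 31, index 31 is bit 0
    sliced = binary_string[31 - above : 32 - below]
    return int(sliced, 2)

print(f"0x{slice_bits('0x0a56F001', 0, 5):08x}")   # 0x00000001
print(f"0x{slice_bits('0x0a56F001', 7, 31):08x}")  # 0x0014ade0
```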
