Could anyone please give me some broad guidance for a Python project that my 10 year-old son is trying? I'm looking less for specific coding solutions, but I hope this is a good place to ask the question. What I'd like is to see if my son is on to something that's realistic with his coding project and if there's a relatively straightforward way for him to learn the right steps. Or is this something that's just out of the league for a 10 year-old who loves to read about and try various coding projects just for fun? As you can see I'm not a coder and know very little about this sort of project, so some patience and good will would be appreciated!
My son is into cryptography and he's telling me he tried the Python code below. He hopes to build a sponge-like function to encrypt a message so that it can't be decrypted. This is inspired by a section in his book "Serious Cryptography" (by J. Aumasson) with the title "Permutation-based Hash Functions: Sponge Functions". When he runs the code that he wrote, he gets the error message "TypeError: unsupported operand type(s) for <<: 'str' and 'int'" (see his interaction in the terminal below the code).
Thanks a lot! Alexander
Here's his code:
import math
import textwrap
plaintext = raw_input("The value to be hashed: ") # Get the user to input the data to be hashed
nonce = raw_input("The nonce to be used: ") # Get the user to input the nonce to be used
key = raw_input("The key to be used: ") # Get the user to input the key to be used
blocks = textwrap.wrap(plaintext, 16) # Split the string into 128-bit blocks
if len(blocks[len(blocks)-1]) < 16: # Check if the last block is less than 128 bits
while len(blocks[len(blocks)-1]) < 16: # Keep iterating the following code
blocks[len(blocks)-1] += "." # Add padding to the end of the block to make it 128-bit
sponge = nonce # Set the sponge's initial state to that of the nonce
for j in blocks: # Absorb all of the blocks
sponge = (sponge << 128) + j # Concatenate the current sponge value and the block
sponge = textwrap.wrap(sponge, 128) # Convert the sponge into 128-bit blocks
for z in sponge: # Keep iterating the following code
z = z^j # XOR the sponge block with the message block
sponge = join(sponge) # Convert the blocks back into a string
sponge = textwrap.wrap(sponge, len(key)*8) # Convert the sponge into blocks with the same length of the key
output = sponge # Create a new variable to save space
del nonce, blocks # Delete variables to save space
while len(output) > 1: # Keep iterating the following code
output[1] = output[1]^output[0] >> output[0] # XOR the second element with the first, then shift forward
del output[0] # Delete the first element, so it can repeat again
tag = ((output^plaintext) <<< sponge) + output # Generate an authentication tag. That's not overkill, is it?
print output # Oh yeah, just print it in hexadecimal, I dunno how to
When he runs the script in the terminal, this is the interaction:
The exception:
Traceback (most recent call last):
File "DarkKnight-Sponge.py", line 13, in <module>
sponge = (sponge << 128) + j # Concatenate the current sponge value and the block
TypeError: unsupported operand type(s) for <<: 'str' and 'int'
Congratulations to your son! The project looks realistic to me. The only ambitious thing I can think of was delving directly into bitwise operators like << and ^ instead of trying to implement the corresponding operations on sequences of characters. Bitwise operators sometimes look like arithmetic dark magic because they manipulate the internal binary representation of numbers, which we are not as familiar with as a number's decimal representation or a piece of text.
TypeError: unsupported operand type(s) for <<: 'str' and 'int'
This error is pretty straightforward: it says the operation sponge << 128 cannot be performed, because sponge is a str, i.e. a (character) string, i.e. text, whereas 128 is an int, i.e. an integer.
Imagine if you asked the computer to calculate "three" + 2. It would raise an error, because + expects two numbers, but "three" is a string, not a number. Similarly, if you asked the computer to calculate "327" + 173, it would raise an error, because "327" is text, not a number.
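Here is a minimal sketch of that mismatch, assuming the text really contains only digits. The variable names are just for illustration. Calling int() turns the string into a number, after which + behaves as arithmetic.

```python
# Mixing text and a number raises a TypeError, just like sponge << 128 did.
try:
    result = "327" + 173
except TypeError as error:
    result = None                 # the addition never happened
    message = str(error)          # the explanation Python gives

# Converting the text to an integer first makes the addition meaningful.
total = int("327") + 173
```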
The operator << is the left bit-shift operator. It shifts a number to the left by a given number of bits. Computers store numbers in binary representation; we humans are more used to decimal representation, so let's make an analogy with a "leftwise digit-shift" operation. "Shifting a number to the left" would mean multiplying it by a power of 10. For instance, 138 shifted to the left twice would be 13800. We padded with zeroes on the right. In binary representation, bit-shifting works the same way, but multiplies by a power of 2 instead. 138 in binary representation is 10001010; shifting it to the left twice gives 1000101000, which is the same as multiplying it by binary 100 (which is 4).
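This can be checked directly in Python. The short sketch below uses the built-in bin() function, which returns the binary digits of an integer as text prefixed with "0b"; the variable names are only illustrative.

```python
number = 138
shifted = number << 2            # shift left by two bits

binary_before = bin(number)      # binary digits of 138, as text
binary_after = bin(shifted)      # same digits with two zeroes appended
same_as_times_four = (shifted == number * 4)
```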
If sponge and j are both numbers, and j is less than 2^128, then the line:
sponge = (sponge << 128) + j # Concatenate the current sponge value and the block
shifts sponge to the left by 128 bits, then adds a number that fits in 128 bits to the result. Effectively, this concatenates the bits of sponge with the bits of j. To come back to our decimal analogy: if x is a number, and y is a number less than 100, then x * 100 + y is the number obtained by concatenating the digits of x and the two digits of y. For instance, 1374 * 100 + 56 = 137456.
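As a small sketch of this idea, the two helpers below (concat_bits and concat_digits are names I made up for illustration) perform the same concatenation in binary and in decimal. The check on right reflects the condition that the second number must fit in the given width.

```python
def concat_bits(left, right, width):
    """Append the width-bit number `right` to the bits of `left`."""
    if right < 0 or right >= (1 << width):
        raise ValueError("right does not fit in the given number of bits")
    return (left << width) + right


def concat_digits(x, y, digits):
    """Decimal analogy: append the `digits`-digit number y to the digits of x."""
    return x * 10 ** digits + y


joined_bits = concat_bits(0b101, 0b0011, 4)    # bits 101 followed by 0011
joined_digits = concat_digits(1374, 56, 2)     # digits 1374 followed by 56
```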
I haven't read the cryptography book that inspired this code, so I am only guessing from here on.
My understanding is that the book expects plaintext, nonce and key to be numbers. However, in your son's code, they are all text. The distinction between those two types of objects is not irreconcilable. Inside a computer's memory, everything is stored as sequences of bits anyway. A number is a sequence of bits; a string is a sequence of characters, and each character is itself a short sequence of bits.
I see three possibilities: (1) convert all the text to numbers before performing the operations; (2) adapt the operations so they can be applied to strings instead of ints; (3) convert all the text to strings containing only the characters 0 and 1 and adapt the operations so they can be applied to such sequences. Real-world efficient implementations of cryptography algorithms most certainly all choose the second option. The third option is obviously the least efficient of the three, but for learning purposes it's a possible option.
Looking at your code, I notice all operations used are about manipulation of sequences, rather than about arithmetic operations. As I mentioned, (sponge << 128) + j is the concatenation of two sequences of bits. The bitwise xor operation ^, which is used later in the code, expects two sequences of bits of the same length, and returns a sequence of the same length with 1 at every position where the two sequences had distinct bits and 0 at every position where the two sequences had equal bits. For instance, 00010110 ^ 00110111 = 00100001 because the third and eighth bits are distinct, but all the other bits are equal.
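The xor example can be reproduced in Python using binary literals (the 0b prefix) and format() to print the result padded to eight bits. This is only an illustration; the variable names are mine. The last line shows a property that makes xor popular in cryptography: applying the same value twice gives back the original.

```python
a = 0b00010110
b = 0b00110111

x = a ^ b                        # bitwise xor of the two numbers
as_bits = format(x, "08b")       # eight binary digits, zero-padded

undone = x ^ b                   # xor with b again restores a
```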
To convert text into numbers (which I called option 1), you could replace the three raw_input lines at the start of the code with these lines:
plaintext_string = raw_input("The value to be hashed: ") # Get the user to input the data to be hashed
nonce_string = raw_input("The nonce to be used: ") # Get the user to input the nonce to be used
key_string = raw_input("The key to be used: ") # Get the user to input the key to be used
def string_to_int(txt):
    number = 0
    for c in txt:
        # Make room for 8 more bits, then append the character's code
        number = (number << 8) + ord(c)
    return number

plaintext = string_to_int(plaintext_string)
nonce = string_to_int(nonce_string)
key = string_to_int(key_string)
How this works: every ASCII character c is mapped to an 8-bit number by the Python function ord. The 8-bit blocks are concatenated using the formula number = (number << 8) + ord(c), which you can recognize from the above discussion.
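To see the conversion at work, here is a small self-contained sketch. It repeats string_to_int and adds a hypothetical inverse, int_to_string, which I wrote only for illustration (it assumes the text does not start with a NUL character, since leading zero bits are lost). It also shows format(value, "x"), one way to print a number in hexadecimal.

```python
def string_to_int(txt):
    number = 0
    for c in txt:
        number = (number << 8) + ord(c)
    return number


def int_to_string(number):
    """Hypothetical inverse: peel off 8 bits at a time, last character first."""
    chars = []
    while number > 0:
        chars.append(chr(number & 0xFF))   # lowest 8 bits = last character
        number >>= 8
    return "".join(reversed(chars))


value = string_to_int("AB")      # 'A' is 65 (0x41), 'B' is 66 (0x42)
as_hex = format(value, "x")      # hexadecimal digits, without the 0x prefix
round_trip = int_to_string(string_to_int("hello"))
```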
This is not sufficient to make your code work properly, as the textwrap.wrap() function used directly afterwards expects a string, not an int. A possibility is to replace the textwrap.wrap() function by a custom function string_to_intblocks():
def string_to_intblocks(txt, blocksize):
    blocks = []
    block_number = 0
    for i, c in enumerate(txt):
        block_number = (block_number << 8) + ord(c)
        if (i + 1) % blocksize == 0:  # a full block has been read
            blocks.append(block_number)
            block_number = 0
    # Note: a trailing partial block is dropped, so pad txt beforehand
    return blocks
And then replace blocks = textwrap.wrap(plaintext, 16) with blocks = string_to_intblocks(plaintext_string, 16).
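Because the blocks are now numbers, the padding has to happen on the text before it is split. The sketch below is one way this might look: pad_text is a hypothetical helper that reuses the "." padding from your son's code, and this version of string_to_intblocks tests (i + 1) % blocksize so that a block is emitted after every blocksize characters. A block size of 4 is used only to keep the example readable.

```python
def pad_text(txt, blocksize, pad_char="."):
    """Hypothetical helper: pad txt so its length is a multiple of blocksize."""
    remainder = len(txt) % blocksize
    if remainder:
        txt += pad_char * (blocksize - remainder)
    return txt


def string_to_intblocks(txt, blocksize):
    blocks = []
    block_number = 0
    for i, c in enumerate(txt):
        block_number = (block_number << 8) + ord(c)
        if (i + 1) % blocksize == 0:      # a full block has been read
            blocks.append(block_number)
            block_number = 0
    return blocks


padded = pad_text("abcde", 4)             # five characters padded to eight
blocks = string_to_intblocks(padded, 4)   # two blocks of four characters
```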
This is still not sufficient to fix your son's code. I am convinced there is a logic error in the following seven lines, although fixing it would require a better understanding of the algorithm than I currently have:
sponge = nonce # Set the sponge's initial state to that of the nonce
for j in blocks: # Absorb all of the blocks
sponge = (sponge << 128) + j # Concatenate the current sponge value and the block
sponge = textwrap.wrap(sponge, 128) # Convert the sponge into 128-bit blocks
for z in sponge: # Keep iterating the following code
z = z^j # XOR the sponge block with the message block
sponge = join(sponge)
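For what it's worth, the general sponge idea, as I understand it, is to keep a fixed-size state, xor each message block into it, and scramble the state with a permutation after every block. The sketch below is only a guess at what such an absorbing loop might look like with integers. STATE_BITS, toy_permutation and absorb are names I made up, the 256-bit state size is an assumption, and the mixing step is a placeholder with no security value; it is not the permutation from the book.

```python
STATE_BITS = 256                          # assumed state size (hypothetical)
MASK = (1 << STATE_BITS) - 1              # keeps the state within STATE_BITS


def toy_permutation(state):
    """Placeholder mixing step: invertible, but NOT cryptographically secure."""
    # Rotate the state left by 61 bits
    state = ((state << 61) | (state >> (STATE_BITS - 61))) & MASK
    # Multiply by an odd constant and add 1, modulo 2**STATE_BITS
    state = (state * 0x9E3779B97F4A7C15 + 1) & MASK
    # Fold the high bits into the low bits
    state ^= state >> 127
    return state


def absorb(initial_state, blocks):
    """Xor each 128-bit block into the state, scrambling after each one."""
    state = initial_state & MASK
    for block in blocks:
        state ^= block                    # block lands in the low 128 bits
        state = toy_permutation(state)
    return state


digest = absorb(1, [2, 3])
digest_hex = format(digest, "064x")       # 64 hexadecimal digits
```

Note that the state never grows here: each block is mixed into the same fixed number of bits, rather than being concatenated onto an ever longer value.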