
Migrate C# Hash Code to PHP

I know there are similar questions already on SO, but none of them seem to address this problem. I have inherited the following C# code, which has been used to create password hashes in a legacy .NET app. For various reasons, the C# implementation is now being migrated to PHP:

string input = "fred";
SHA256CryptoServiceProvider provider = new SHA256CryptoServiceProvider();
byte[] hashedValue = provider.ComputeHash(Encoding.ASCII.GetBytes(input));
string output = "";
string asciiString = ASCIIEncoding.ASCII.GetString(hashedValue);
foreach ( char c in asciiString ) {
   int tmp = c;
   output += String.Format("{0:x2}", 
             (uint)System.Convert.ToUInt32(tmp.ToString()));
}
return output;

My PHP code is very simple, but for the same input "fred" it doesn't produce the same result:

$output = hash('sha256', "fred");

I've traced the problem down to an encoding issue. If I change this line in the C# code:

string asciiString = ASCIIEncoding.ASCII.GetString(hashedValue);

to

string asciiString = ASCIIEncoding.UTF7.GetString(hashedValue);

then the PHP and C# outputs match (both yield d0cfc2e5319b82cdc71a33873e826c93d7ee11363f8ac91c4fa3a2cfcd2286e5).

Since I'm not able to change the .NET code, I need to work out how to replicate its results in PHP.

Thanks in advance for any help.

I don't know PHP well enough to answer your question; however, I must point out that your C# code is broken. Try generating the hash of these two inputs: "âèí" and "çñÿ". You will find that their hashes collide:

3f3b221c6c6e3f71223f51695d456d52223f243f3f363949443f3f763b483615

The first bug lies in this operation:

Encoding.ASCII.GetBytes(input)

This assumes that all characters within your input are US-ASCII. Any non-ASCII character causes the encoder to fall back to the byte value for the ? character, thereby giving (unwanted) hash collisions, as demonstrated above. That said, this will not be an issue if your input is constrained to US-ASCII characters only.
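The fallback can be reproduced in PHP to see why the two inputs above collide. This is a minimal sketch, assuming the input is valid UTF-8 and contains only characters from the Basic Multilingual Plane; ascii_get_bytes() is a hypothetical helper name, not a PHP or .NET function.

```php
<?php
// Hypothetical helper: approximates what C#'s Encoding.ASCII.GetBytes() does
// to a string. Assumes valid UTF-8 input (preg_replace() returns null otherwise).
function ascii_get_bytes(string $utf8): string
{
    // The /u flag makes the pattern match whole code points, so each
    // non-ASCII character becomes exactly one '?'.
    return preg_replace('/[^\x00-\x7F]/u', '?', $utf8);
}

$a = ascii_get_bytes("\u{E2}\u{E8}\u{ED}"); // "âèí"
$b = ascii_get_bytes("\u{E7}\u{F1}\u{FF}"); // "çñÿ"

var_dump($a === $b); // bool(true): both inputs are reduced to "???"
```

Because both inputs are reduced to the same three bytes before hashing, any hash function applied afterwards returns the same digest for both.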

The other (more severe) bug lies in the following operation:

ASCIIEncoding.ASCII.GetString(hashedValue)

ASCII only defines mappings for values 0–127. Since the elements of your hashedValue byte array may contain any byte value (0–255), decoding them as ASCII causes data to be lost whenever a value greater than 127 is encountered. This may lead to further "unwanted" (read: potentially maliciously generated) hash collisions, even when your original input was US-ASCII.

Given that, statistically, half of the bytes constituting your hashes would be greater than 127, you are losing at least half the strength of your hash algorithm. If a hacker gains access to your stored hashes, it is quite likely that they will manage to devise an attack to generate hash collisions by exploiting this cryptographic weakness.

Edit: Notwithstanding the considerations mentioned in my post and Jon's, here is PHP code that succumbs to the same weakness, so to speak, as your C# code, and thereby gives the same hash:

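// Raw (binary) SHA-256 digest, 32 bytes.
$output = hash('sha256', $input, true);

// Mimic the lossy ASCII decoding: every byte above 127 becomes '?'.
for ($i = 0; $i < strlen($output); $i++) {
    if (ord($output[$i]) > 127) {
        $output[$i] = '?';
    }
}

$output = bin2hex($output);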

Could you use mb_convert_encoding (see http://php.net/manual/zh-CN/function.mb-convert-encoding.php; that page also links to the list of supported encodings) to convert the PHP string from UTF7 to ASCII?

I've traced the problem down to an encoding issue

Yes. You're trying to treat arbitrary binary data as if it's valid text-encoded data. It's not. You should not be using any Encoding here.

If you want the results in hex, the simplest approach is to use BitConverter.ToString:

string text = BitConverter.ToString(hashedValue).Replace("-", "").ToLower();

And yes, as pointed out elsewhere, you probably shouldn't be using ASCII to convert the text to binary at the start of the hashing process. I'd probably use UTF-8.

It's really important that you understand the problem here though, as otherwise you'll run into it in other places too. You should only use encodings such as ASCII, UTF-8 etc. (on any platform) when you've genuinely got encoded text data. You shouldn't use them for images, the results of cryptography, the results of hashing, etc.

EDIT: Okay, you say you can't change the C# code... it's not clear whether that just means you've got legacy data, or whether you need to keep using the C# code regardless. You should absolutely not run this code for a second longer than you have to.

But in PHP, you may find you can get away with just replacing every byte in the hash that has a value >= 0x80 with 0x3F, which is the ASCII code for "question mark". If you look through your data you'll probably find there are a lot of 3F bytes in there.
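The byte replacement described above can be sketched as a pair of small functions. This is an illustration under stated assumptions, not a verified reimplementation of the .NET code: legacy_mangle() and legacy_hash() are hypothetical names, and the password is assumed to contain only US-ASCII characters (anything else would also need the '?' substitution applied before hashing).

```php
<?php
// Hypothetical helpers; only hash(), preg_replace(), bin2hex() and hex2bin()
// are PHP built-ins.

/** Replace every byte >= 0x80 in a raw digest with '?' (0x3F), then hex-encode. */
function legacy_mangle(string $rawDigest): string
{
    // No /u flag: the pattern operates on raw bytes, not UTF-8 code points.
    return bin2hex(preg_replace('/[\x80-\xFF]/', '?', $rawDigest));
}

/** Assumes $password is pure US-ASCII. */
function legacy_hash(string $password): string
{
    return legacy_mangle(hash('sha256', $password, true));
}

// Worked example using the SHA-256 of "fred" quoted in the question.
$fred = hex2bin('d0cfc2e5319b82cdc71a33873e826c93d7ee11363f8ac91c4fa3a2cfcd2286e5');
echo legacy_mangle($fred), "\n";
// 3f3f3f3f313f3f3f3f1a333f3e3f6c3f3f3f11363f3f3f1c4f3f3f3f3f223f3f
```

In this example 23 of the 32 digest bytes collapse to 3f, which shows how much of the digest the legacy scheme throws away.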

If you can get this to work, I would strongly suggest that you migrate over to the true SHA-256 hash, without losing information like this. Wherever you're storing the hashes, store two: the legacy one (which is all you have now) and the rehashed one. Whenever you're asked to validate that a password is correct, you should:

  • Check whether you have a "new" one; if so, only use that - ignore the legacy one.
  • If you only have a legacy one:
    • Hash the password in the broken way to check whether it's correct.
    • If it is, hash it again properly and store the result in the "new" place.

Then when everyone's logged in correctly once, you'll be able to wipe out the legacy hashes.
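The validation steps above might look roughly like this in PHP. It is only a sketch: $record stands in for one row of a hypothetical user table with a 'legacy' column and a nullable 'new' column, all function names are invented for illustration, and passwords are assumed to be US-ASCII.

```php
<?php
// Broken legacy scheme: bytes >= 0x80 of the raw digest become '?' (0x3F).
function legacy_hash(string $password): string
{
    $raw = hash('sha256', $password, true);
    return bin2hex(preg_replace('/[\x80-\xFF]/', '?', $raw));
}

// "Proper" hash as discussed in the thread: plain hex SHA-256.
function proper_hash(string $password): string
{
    return hash('sha256', $password);
}

// Returns true if the password is valid; upgrades the record on first success.
function verify_and_upgrade(array &$record, string $password): bool
{
    // A "new" hash exists: use only that and ignore the legacy one.
    if ($record['new'] !== null) {
        return hash_equals($record['new'], proper_hash($password));
    }
    // Only a legacy hash exists: check the password the broken way.
    if (!hash_equals($record['legacy'], legacy_hash($password))) {
        return false;
    }
    // Correct password: rehash properly and store it in the "new" place.
    $record['new'] = proper_hash($password);
    return true;
}

$record = ['legacy' => legacy_hash('fred'), 'new' => null];
var_dump(verify_and_upgrade($record, 'fred')); // bool(true)
var_dump($record['new'] !== null);             // bool(true)
```

Keeping the choice of replacement hash inside proper_hash() means a different scheme could be swapped in later by changing that one function.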

