
Secure credential storage in python

Original post: 2013-01-31 22:26:57. Tags: python, security, reflection, storage, credentials

The attack

One possible threat model, in the context of credential storage, is an attacker who has the ability to:

- inspect any (user) process memory
- read local (user) files

AFAIK, the consensus on this type of attack is that it's impossible to prevent (since the credentials must be stored in memory for the program to actually use them), but there are a couple of techniques to mitigate it:

- minimize the amount of time the sensitive data is stored in memory
- overwrite the memory as soon as the data is not needed anymore
- mangle the data in memory, keep moving it, and apply other security-through-obscurity measures

Python in particular

The first technique is easy enough to implement, possibly through a keyring (hopefully kernel space storage).

The second one is not achievable at all without writing a C module, to the best of my knowledge (but I'd love to be proved wrong here, or to have a list of existing modules).
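
For what it's worth, a partial approximation of the second technique seems possible in pure Python under one strong assumption: the secret only ever lives in a mutable bytearray, never in an immutable str or bytes object. The sketch below (the name wipe is hypothetical) overwrites such a buffer in place. It does nothing about copies made by the interpreter, by I/O layers, or by libraries that convert the buffer to bytes, so it narrows the window rather than closing it.

```python
def wipe(buf: bytearray) -> None:
    """Overwrite a mutable buffer in place with zero bytes."""
    for i in range(len(buf)):
        buf[i] = 0


# Assumption: in real use the secret arrives as a bytearray (e.g. read
# from a socket or file into a preallocated buffer). The literal below is
# only for the demo; a bytes literal is itself an immutable copy.
secret = bytearray(b"correct horse")
try:
    length = len(secret)  # stand-in for the real use of the secret
finally:
    wipe(secret)  # best effort: only this one buffer is cleared
```

After the finally block runs, secret still has the same length but contains only zero bytes.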

The third one is tricky.

In particular, Python being a language with very powerful introspection and reflection capabilities, it's difficult to prevent anyone who can execute Python code in the interpreter process from accessing the credentials.

There seems to be a consensus that there's no way to enforce private attributes and that attempts at it will at best annoy other programmers who are using your code.
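
To make the point concrete, here is a small sketch of two common "hiding" attempts and how ordinary introspection defeats them. The class and function names (Account, make_checker) are made up for the illustration.

```python
class Account:
    def __init__(self, password):
        # Double underscore triggers name mangling, not access control:
        # the attribute is stored as _Account__password.
        self.__password = password


def make_checker(password):
    # Hiding the secret in a closure instead of an attribute.
    def check(candidate):
        return candidate == password
    return check


account = Account("s3cret")
# Any code in the same interpreter can spell out the mangled name.
leaked_attr = account._Account__password

checker = make_checker("s3cret")
# Closure cells are reachable through the function object.
leaked_closure = checker.__closure__[0].cell_contents
```

Both leaked_attr and leaked_closure end up holding the original password, without using anything beyond documented language features.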

The question

Taking all this into consideration, how does one securely store authentication credentials using python? What are the best practices? Can something be done about the language's "everything is public" philosophy? I know "we're all consenting adults here", but should we be forced to choose between sharing our passwords with an attacker and using another language?

2 answers

There are two very different reasons why you might store authentication credentials:

- To authenticate your user: For example, you only allow the user access to the services after the user authenticates to your program.
- To authenticate the program with another program or service: For example, the user starts your program which then accesses the user's email over the Internet using IMAP.

In the first case, you should never store the password (or an encrypted version of the password). Instead, you should hash the password with a high-quality salt and ensure that the hashing algorithm you use is computationally expensive (to prevent dictionary attacks), such as PBKDF2 or bcrypt. See Salted Password Hashing - Doing it Right for many more details. If you follow this approach, even if the hacker retrieves the salted, slow-hashed token, they can't do very much with it.
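
A minimal sketch of this approach using only the standard library's hashlib.pbkdf2_hmac. The function names and the iteration count are assumptions for the example; the count in particular should be tuned to your hardware and to current guidance.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumption: pick a value that is slow enough on your hardware


def hash_password(password: str, iterations: int = ITERATIONS):
    """Return (salt, iterations, digest); store all three, never the password."""
    salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest


def verify_password(password: str, salt: bytes, iterations: int, digest: bytes) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, digest)
```

Storing the iteration count next to the salt and digest lets you raise it later without invalidating existing records.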

In the second case, there are a number of things you can do to make discovering the secret harder (as you outline in your question), such as:

- Keeping secrets encrypted until needed, decrypting on demand, then re-encrypting immediately after
- Using address space randomization so each time the application runs, the keys are stored at a different address
- Using the OS keystores
- Using a "hard" language such as C/C++ rather than a VM-based, introspective language such as Java or Python

Such approaches are certainly better than nothing, but a skilled hacker will break them sooner or later.
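
As a rough illustration of the "decrypt on demand, then hide again" pattern, here is a sketch with a hypothetical MaskedSecret class. Note the substitution: it uses XOR with a random pad, which is obfuscation and not real encryption, because the pad sits in the same process memory as the masked data. A real implementation would use a vetted cryptography library and keep the key somewhere else (for example an OS keystore).

```python
import os
from contextlib import contextmanager


class MaskedSecret:
    """Keep a secret XOR-masked at rest; unmask only inside a with-block."""

    def __init__(self, secret: bytearray):
        self._pad = bytearray(os.urandom(len(secret)))
        self._masked = bytearray(a ^ b for a, b in zip(secret, self._pad))
        # Clear the caller's buffer so only the masked copy remains.
        for i in range(len(secret)):
            secret[i] = 0

    @contextmanager
    def revealed(self):
        plain = bytearray(a ^ b for a, b in zip(self._masked, self._pad))
        try:
            yield plain
        finally:
            # Overwrite the temporary plaintext as soon as the block exits.
            for i in range(len(plain)):
                plain[i] = 0
```

Usage would look like `with masked.revealed() as pw: connect(pw)`, where connect is whatever call needs the credential. Anyone able to inspect the process can still recover the pad, which is exactly the limitation described above.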

Tokens

From a theoretical perspective, authentication is the act of proving that the person challenged is who they say they are. Traditionally, this is achieved with a shared secret (the password), but there are other ways to prove yourself, including:

- Out-of-band authentication. For example, where I live, when I try to log into my internet bank, I receive a one-time password (OTP) as an SMS on my phone. In this method, I prove who I am by virtue of owning a specific telephone number.
- Security token: To log in to a service, I have to press a button on my token to get an OTP which I then use as my password.
- Other devices:
  - SmartCard, in particular as used by the US DoD, where it is called the CAC. Python has a module called pyscard to interface with it.
  - NFC device

And a more complete list here

The commonality between all these approaches is that the end-user controls these devices and the secrets never actually leave the token/card/phone, and certainly are never stored in your program. This makes them much more secure.
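
To show how a button-press token can produce one-time passwords while its key stays on the device, here is a sketch of HOTP (RFC 4226), one widely used OTP algorithm. That any particular token uses HOTP is an assumption; the function name hotp is chosen for the example. The verifying service holds its own copy of the key and computes the same code, but your program only ever sees the short-lived digits.

```python
import hashlib
import hmac
import struct


def hotp(key: bytes, counter: int, digits: int = 6) -> str:
    """Compute an HOTP code (RFC 4226) for the given key and counter."""
    # HMAC-SHA1 over the counter encoded as an 8-byte big-endian integer.
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low nibble of the last byte selects an offset.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

With the RFC 4226 test key b"12345678901234567890", counters 0 and 1 give "755224" and "287082", matching the test vectors in the RFC's appendix.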

Session stealing

However (there is always a however):

Let us suppose you manage to secure the login so the hacker cannot access the security tokens. Now your application is happily interacting with the secured service. Unfortunately, if the hacker can run arbitrary executables on your computer, the hacker can hijack your session, for example by injecting additional commands into your valid use of the service. In other words, while you have protected the password, it's entirely irrelevant because the hacker still gains access to the 'secured' resource.

This is a very real threat, as the multiple cross-site scripting attacks have shown (one example is "US Bank and Bank of America Websites Vulnerable", but there are countless more).

Secure proxy

As discussed above, there is a fundamental issue in keeping the credentials of an account on a third-party service or system so that the application can log onto it, especially if the only log-on approach is a username and password.

One way to partially mitigate this is to delegate the communication with the service to a secure proxy, and to develop a secure sign-on approach between the application and the proxy. In this approach:

- The application uses a PKI scheme or two-factor authentication to sign onto the secure proxy.
- The user adds the security credentials for the third-party system to the secure proxy. The credentials are never stored in the application.
- Later, when the application needs to access the third-party system, it sends a request to the proxy. The proxy logs on using the security credentials and makes the request, returning results to the application.
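
The flow above can be modelled in a few lines. This is an in-process sketch only: SecureProxy, sign, and the backend callable are all hypothetical names, a real proxy would run as a separate service, and an HMAC over the request stands in for the PKI or two-factor sign-on. There is no replay protection, and the signing key held by the application is itself a secret, so this shows the division of responsibilities rather than a hardened design.

```python
import hashlib
import hmac
import secrets


def sign(key: bytes, service: str, command: str) -> bytes:
    """Sign a request; stand-in for a real PKI or two-factor sign-on."""
    message = f"{service}\n{command}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).digest()


class SecureProxy:
    def __init__(self):
        self._app_keys = {}     # app_id -> signing key
        self._credentials = {}  # (app_id, service) -> (username, password)

    def register_app(self, app_id: str) -> bytes:
        key = secrets.token_bytes(32)
        self._app_keys[app_id] = key
        return key

    def add_credentials(self, app_id, service, username, password):
        # The user hands the third-party credentials to the proxy only.
        self._credentials[(app_id, service)] = (username, password)

    def handle(self, app_id, service, command, signature, backend):
        # Check the application's sign-on before touching any credentials.
        key = self._app_keys.get(app_id)
        if key is None or not hmac.compare_digest(sign(key, service, command), signature):
            raise PermissionError("application failed to authenticate")
        # Log on with the stored credentials and relay the result.
        username, password = self._credentials[(app_id, service)]
        return backend(username, password, command)
```

The application only ever holds its own sign-on key and the results returned by handle; the username and password for the third-party system stay inside the proxy.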

The disadvantages of this approach are:

- The user may not want to trust the secure proxy with the storage of the credentials
- The user may not trust the secure proxy with the data flowing through it to the third-party application
- The application owner has additional infrastructure and hosting costs for running the proxy

Some answers

So, on to specific answers:

How does one securely store authentication credentials using python?

- If storing a password for the application to authenticate the user, use a PBKDF2 implementation, such as https://www.dlitz.net/software/python-pbkdf2/
- If storing a password/security token to access another service, then there is no absolutely secure way.
  However, consider switching authentication strategies to, for example, a smartcard, using, e.g., pyscard. You can use smartcards both to authenticate a user to the application and to securely authenticate the application to another service with X.509 certs.

Can something be done about the language's "everything is public" philosophy? I know "we're all consenting adults here", but should we be forced to choose between sharing our passwords with an attacker and using another language?

IMHO there is nothing wrong with writing a specific module in Python that does its damnedest to hide the secret information, making it a right bugger for others to reuse (annoying other programmers is its purpose). You could even code large portions in C and link to it. However, don't do this for other modules, for obvious reasons.

Ultimately, though, if the hacker has control over the computer, there is no privacy on the computer at all. The theoretical worst case is that your program is running in a VM, and the hacker has complete access to all memory on the computer, including the BIOS and graphics card, and can step your application through authentication to discover its secrets.

Given no absolute privacy, the rest is just obfuscation, and the level of protection is simply how heavily it is obfuscated vs. how much a skilled hacker wants the information. And we all know how that ends, even for custom hardware and billion-dollar products.

Using Python keyring

While this will quite securely manage the key with respect to other applications, all Python applications share access to the tokens. This is not in the slightest bit secure against the type of attack you are worried about.

I'm no expert in this field and am really just looking to solve the same problem that you are, but it looks like something like HashiCorp's Vault might be able to help out quite nicely.

In particular WRT the problem of storing credentials for 3rd-party services, e.g.:

In the modern world of API-driven everything, many systems also support programmatic creation of access credentials. Vault takes advantage of this support through a feature called dynamic secrets: secrets that are generated on-demand, and also support automatic revocation.

For Vault 0.1, Vault supports dynamically generating AWS, SQL, and Consul credentials.

More links: