How does one encode and decode a string with Python for use in a URL?

I have a string like this:

String A: [ 12234_1_Hello'World_34433_22acb_4554344_accCC44 ]

I would like to encrypt String A to be used in a clean URL, something like this:

String B: [ cYdfkeYss4543423sdfHsaaZ ]

Is there an encode API in Python that, given String A, returns String B? Is there a decode API in Python that, given String B, returns String A?

Note that there's a huge difference between encoding and encryption.

If you want to send sensitive data, then don't use the encoding mentioned above ;)

One way of doing the encode/decode is to use the base64 package, for example:

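import base64
import sys

# base64 works on bytes in Python 3, so convert the input text first
data = sys.stdin.read().encode("utf-8")

encoded = base64.b64encode(data)
print(encoded.decode("ascii"))

decoded = base64.b64decode(encoded)
print(decoded.decode("utf-8"))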

Is that what you were looking for? For your particular case you get:

input: 12234_1_Hello'World_34433_22acb_4554344_accCC44

encoded: MTIyMzRfMV9IZWxsbydXb3JsZF8zNDQzM18yMmFjYl80NTU0MzQ0X2FjY0NDNDQ=

decoded: 12234_1_Hello'World_34433_22acb_4554344_accCC44
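
One caveat with output like the above: the standard Base64 alphabet can produce "+", "/" and "=", which have special meaning in URLs. Here is a minimal sketch, assuming Python 3, that uses urlsafe_b64encode and urlsafe_b64decode from the same module instead. The helper names encode_for_url and decode_from_url are made up for illustration, and stripping the padding is an optional choice of mine, not something the answer above does.

```python
import base64

def encode_for_url(text: str) -> str:
    """Return a URL-safe Base64 token for text (hypothetical helper)."""
    raw = text.encode("utf-8")
    token = base64.urlsafe_b64encode(raw).decode("ascii")
    return token.rstrip("=")  # drop padding; it is restored when decoding

def decode_from_url(token: str) -> str:
    """Reverse encode_for_url (hypothetical helper)."""
    padding = "=" * (-len(token) % 4)  # Base64 length must be a multiple of 4
    return base64.urlsafe_b64decode(token + padding).decode("utf-8")

string_a = "12234_1_Hello'World_34433_22acb_4554344_accCC44"
string_b = encode_for_url(string_a)
print(string_b)
print(decode_from_url(string_b))
```

This is still only an encoding: anyone who sees the token can decode it.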

Are you after encryption, compression, or just URL encoding? The string can be passed after URL encoding, but that will not make it smaller as in your example. Compression might shrink it, but you would still need to URL-encode the result.
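
To make the compression point concrete, here is a rough sketch, assuming Python 3, that compresses with zlib and then applies URL-safe Base64 so the result can go into a URL. The helper names are hypothetical. For a short, low-redundancy string like the one in the question, the token may well come out longer than the input, so compare the two lengths before relying on this.

```python
import base64
import zlib

def compress_for_url(text: str) -> str:
    """Compress text, then make the bytes URL-safe (hypothetical helper)."""
    packed = zlib.compress(text.encode("utf-8"), 9)
    return base64.urlsafe_b64encode(packed).decode("ascii")

def decompress_from_url(token: str) -> str:
    """Reverse compress_for_url (hypothetical helper)."""
    packed = base64.urlsafe_b64decode(token)
    return zlib.decompress(packed).decode("utf-8")

string_a = "12234_1_Hello'World_34433_22acb_4554344_accCC44"
token = compress_for_url(string_a)
print(len(string_a), len(token))  # compare lengths before and after
print(decompress_from_url(token) == string_a)
```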

Do you actually need to hide the string data from the viewer (e.g. sensitive data that should not be viewable by someone reading the URL over your shoulder)?

To make it really short, just insert a row into the database. Store something like a list of (id auto_increment, url) tuples. Then you can Base64-encode the id to get a "proxy URL". Resolve it by decoding the id and looking up the proper URL in the database. Or, if you don't mind the identifiers looking sequential, just use the numbers.
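
A small sketch of that idea, assuming Python 3 and an in-memory SQLite database as a stand-in for whatever database you actually use. The table name, column names, and the shorten/expand helpers are all assumptions for illustration.

```python
import base64
import sqlite3

# In-memory database for illustration; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE urls (id INTEGER PRIMARY KEY AUTOINCREMENT, url TEXT)")

def shorten(url: str) -> str:
    """Store url and return a short token derived from its row id."""
    cur = conn.execute("INSERT INTO urls (url) VALUES (?)", (url,))
    conn.commit()
    row_id = str(cur.lastrowid).encode("ascii")
    return base64.urlsafe_b64encode(row_id).decode("ascii").rstrip("=")

def expand(token: str) -> str:
    """Decode the token back to a row id and look up the stored url."""
    padded = token + "=" * (-len(token) % 4)
    row_id = int(base64.urlsafe_b64decode(padded).decode("ascii"))
    row = conn.execute("SELECT url FROM urls WHERE id = ?", (row_id,)).fetchone()
    if row is None:
        raise KeyError(token)
    return row[0]

token = shorten("12234_1_Hello'World_34433_22acb_4554344_accCC44")
print(token)
print(expand(token))
```

The token only hides the sequential id superficially; anyone can decode it and guess neighbouring ids.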

Are you looking to encrypt the string or encode it to remove illegal characters for URLs? If the latter, you can use urllib.parse.quote (urllib.quote in Python 2):

>>> import urllib.parse
>>> quoted = urllib.parse.quote("12234_1_Hello'World_34433_22acb_4554344_accCC44")
>>> quoted
'12234_1_Hello%27World_34433_22acb_4554344_accCC44'

>>> urllib.parse.unquote(quoted)
"12234_1_Hello'World_34433_22acb_4554344_accCC44"

The base64 module has provided encoding and decoding of strings to and from different bases (Base16, Base32, Base64) since Python 2.4.

In your example, you would do the following:

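import base64
string_a = "12234_1_Hello'World_34433_22acb_4554344_accCC44"
# b64encode/b64decode work on bytes in Python 3, so convert to and from text
string_b = base64.b64encode(string_a.encode("utf-8")).decode("ascii")
string_a = base64.b64decode(string_b).decode("utf-8")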

For the full API, see: http://docs.python.org/library/base64.html

It's hard to reduce the size of a string and preserve arbitrary content.

You have to restrict the data to something you can usefully compress.

Your alternative is to do the following.

  1. Save "all the arguments in the URL" in a database row.

  2. Assign a GUID key to this collection of arguments.

  3. Then provide that shortened GUID key.
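
The three steps above could look roughly like this, assuming Python 3. A plain dict stands in for the database table, and the function names and the /item/ path are hypothetical.

```python
import uuid

# A dict stands in for the database table here (assumption for illustration).
saved_arguments = {}

def save_arguments(arguments: dict) -> str:
    """Store the arguments and return the GUID key that identifies them."""
    key = uuid.uuid4().hex  # 32 hex characters, safe to put in a URL
    saved_arguments[key] = arguments
    return key

def load_arguments(key: str) -> dict:
    """Look up the arguments saved under key."""
    return saved_arguments[key]

key = save_arguments({"value": "12234_1_Hello'World_34433_22acb_4554344_accCC44"})
print("/item/" + key)  # hypothetical URL path
print(load_arguments(key))
```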

Another method that would also shorten the string is to calculate the MD5/SHA-1 hash of the string (concatenated with a seed if you wish):

>>> import hashlib
>>> # hashlib works on bytes in Python 3, hence the b"..." literal
>>> hashlib.sha1(b"12234_1_Hello'World_34433_22acb_4554344_accCC44").hexdigest()
'e1153227558aadc00a2e90b5013fdd6b0804fdfb'

In theory you should get a set of fixed-length strings with very few collisions. The hashlib library offers a range of hash functions with different output sizes that you can use in this manner.

Edit: You also said that you need a reversible string, so this wouldn't work for that. AFAIK, however, many web platforms that use clean URLs like the ones you seem to want to implement use hash functions to calculate a shortened URL, and then store that URL along with the page's other data to provide the reverse lookup capability.
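
A minimal sketch of that hash-plus-lookup approach, assuming Python 3. The dict stands in for wherever the page data is stored, the helper names are hypothetical, and truncating the digest to 10 characters is my own assumption to keep the key short (it raises the collision risk, so the sketch checks for it).

```python
import hashlib

# A dict stands in for the page's stored data (assumption for illustration).
lookup = {}

def short_key(text: str, length: int = 10) -> str:
    """Hash text with SHA-1, keep a prefix as the key, and remember the original."""
    digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
    key = digest[:length]
    existing = lookup.setdefault(key, text)
    if existing != text:
        raise ValueError("key collision; use a longer prefix")
    return key

def original_for(key: str) -> str:
    """Reverse lookup: the hash cannot be inverted, so consult the store."""
    return lookup[key]

string_a = "12234_1_Hello'World_34433_22acb_4554344_accCC44"
key = short_key(string_a)
print(key)
print(original_for(key))
```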
