
Unicode Python string '' not returning empty

Not sure if this is a rookie mistake or plain stupid, but I am facing this strange issue. I have a unicode string declared as classifier = u"''" which I am checking for emptiness. The following code block:

if classifier:
    pass  # do something
else:
    pass  # else do something else

never reaches the else block, since the string has '' embedded in it and so is not empty. I don't have control over the source that generates the classifier string.
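To see why the check does not behave like an empty-string test, it may help to look at what the value actually contains. The following is a minimal sketch in Python 3 (where the u prefix is accepted but optional); the name `empty` is introduced only for comparison and is not from the question.

```python
# The value from the question: two apostrophe characters, not an empty string.
classifier = u"''"
# A genuinely empty string, for comparison (illustrative name).
empty = u""

print(len(classifier))   # 2
print(bool(classifier))  # True: any non-empty string is truthy
print(bool(empty))       # False: only the zero-length string is falsy
```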

If classifier could somehow be reduced to what is inside the embedded '', I could check that for emptiness, but I am not sure how. In case it helps, classifier is read from the HttpRequest object: classifier = request.GET.get('c', '').

EDIT:

classifier[1:-1] returns u'', which can now be checked for emptiness. Is there a built-in method one can use instead?

I will go ahead with this approach for now, but I am leaving the post open for any other, more advanced pointers.

Thanks,

You could do this:

if classifier.strip("'"):
    pass  # do something
else:
    pass  # else do something else

Alternatively, check the length:

if len(classifier) > 2:
    pass  # do something
else:
    pass  # do something else
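The suggestions so far (stripping apostrophes, slicing off the first and last character, and testing the length) agree on the value '' but diverge on other inputs. The sketch below compares them on a few made-up sample values; the helper names `by_strip`, `by_slice`, and `by_length` are hypothetical and exist only for this comparison.

```python
def by_strip(value):
    # Removes every leading and trailing apostrophe, however many there are.
    return value.strip("'")

def by_slice(value):
    # Drops the first and last character unconditionally.
    return value[1:-1]

def by_length(value):
    # Only reports whether the string is longer than two characters.
    return len(value) > 2

# Illustrative inputs: quoted empty, quoted value, unquoted values, truly empty.
samples = ["''", "'pom'", "pom", "ab", ""]
for s in samples:
    print(repr(s), repr(by_strip(s)), repr(by_slice(s)), by_length(s))
```

Note that slicing turns the unquoted value pom into o, and the length test treats any one- or two-character value such as ab as if it were empty, so neither is safe unless every value is known to arrive wrapped in quotes.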

You have to actually know what the data means before you can decide how to parse it. Randomly hacking at it until it works for one example isn't going to help.

So, you're getting the string out of a URL, and it looks like this:

http:///a=maven&v=1.1.0&classifier=''&ttype=pom

Normally, when given a URL, the right thing to do is call urlparse.urlparse and then call urlparse.parse_qs on the query. But that won't actually help here, because this is not actually a valid URL.

Well, it is a valid URL, but it is one with the path <someurl>/a=maven&v=1.1.0&classifier=''&ttype=pom, not one with the path <someurl>/ and the query a=maven&v=1.1.0&classifier=''&ttype=pom. You need a ? to set off the query.
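The path-versus-query distinction can be checked directly. This is a sketch in Python 3, where the functions live in urllib.parse rather than in the Python 2 urlparse module named above. The host example.com is a placeholder assumption, since the URL in the question omits the host.

```python
from urllib.parse import urlparse, parse_qs

# "example.com" is a placeholder host; the URL in the question omits it.
without_qmark = urlparse("http://example.com/a=maven&v=1.1.0&classifier=''&ttype=pom")
with_qmark = urlparse("http://example.com/?a=maven&v=1.1.0&classifier=''&ttype=pom")

print(without_qmark.path)   # everything after the host ends up in the path
print(without_qmark.query)  # empty string: no "?" means no query

print(with_qmark.path)      # "/"
params = parse_qs(with_qmark.query)
print(params["classifier"]) # ["''"]: the quotes arrive as part of the data
```

Even with the ? in place, the parser hands back the two apostrophes as the value, because nothing in URL syntax treats them as quoting.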

And, on top of that, the query is clearly not generated correctly. You don't quote empty strings in a query. You don't quote anything (you entity-escape ampersands and percent-escape any other special characters). So, unless the URL literally means that the classifier is '' rather than the empty string, it's wrong.
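For comparison, here is roughly what a conventionally generated query looks like, sketched with Python 3's urllib.parse.urlencode. The parameter names are taken from the URL in the question; whether the real generator could be changed to work this way is an assumption.

```python
from urllib.parse import urlencode, parse_qs

# An empty classifier is sent as a bare "classifier=", with no quotes.
well_formed = urlencode({"a": "maven", "v": "1.1.0", "classifier": "", "ttype": "pom"})
print(well_formed)  # a=maven&v=1.1.0&classifier=&ttype=pom

# If the value really were two apostrophes, they would be percent-escaped.
literal_quotes = urlencode({"classifier": "''"})
print(literal_quotes)  # classifier=%27%27

# parse_qs drops blank values unless asked to keep them.
parsed = parse_qs(well_formed, keep_blank_values=True)
print(parsed["classifier"])  # ['']
```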

And, if it weren't wrong, you wouldn't be asking these questions.

If you have any control over how these URLs are getting generated, obviously you want to get that fixed. If you can't control it, but at least know how they're being generated, you can write code to reverse that to get the original values. But if you don't even know that, you have to guess.

Ideally, you need more than one example to guess from. Are they quoting just empty strings, or are they also quoting, e.g., strings with " characters, spaces, or ampersands in them? If it's the latter, you can probably just strip("'"), but if it's the former, that will be incorrect in any case where the original data actually has quotes.
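If the guess is that the generator wraps a value in exactly one pair of apostrophes, a more conservative reversal than strip("'") is to remove a single matched pair and leave everything else alone. This is only a sketch under that guess; the function name `unwrap_single_quotes` is hypothetical, and the sample values are made up.

```python
def unwrap_single_quotes(value):
    """Remove one pair of surrounding apostrophes, only if both are present."""
    if len(value) >= 2 and value[0] == "'" and value[-1] == "'":
        return value[1:-1]
    return value

# Illustrative inputs, including data that legitimately contains apostrophes.
for s in ["''", "'pom'", "pom", "it's", "'", "''x''"]:
    print(repr(s), "->", repr(unwrap_single_quotes(s)), "| strip:", repr(s.strip("'")))
```

The last sample shows the difference: removing one pair keeps the inner apostrophes of ''x'' as 'x', while strip("'") removes them all and returns x.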

