Bioformats-Python error: 'ascii' codec can't encode character u'\xb5' when using OMEXML()

I am trying to use bioformats in Python to read in a microscopy image (.lsm, .czi, .lif, you name it), print out the metadata, and display the image. ome = bf.OMEXML(md) gives me an error (below). I think it is complaining about the information stored in md: it doesn't like that the information in md isn't all ASCII. But how do I overcome this problem? This is what I wrote:

import Tkinter as Tk, tkFileDialog
import os
import javabridge as jv
import bioformats as bf
import matplotlib.pyplot as plt
import numpy as np

jv.start_vm(class_path=bf.JARS, max_heap_size='12G')

User selects file to work with

# hiding root allows the file dialog GUI to be shown without any other GUI elements
root = Tk.Tk()
root.withdraw()
file_full_path = tkFileDialog.askopenfilename()
filepath, filename = os.path.split(file_full_path)
os.chdir(os.path.dirname(file_full_path))

print('opening:  %s' %filename)
reader = bf.ImageReader(file_full_path)
md = bf.get_omexml_metadata(file_full_path)
ome = bf.OMEXML(md)

Put image in numpy array

raw_data = []
# read every z-plane of the first series (ome.image(0) is the first image)
for z in range(ome.image(0).Pixels.get_SizeZ()):
    raw_image = reader.read(z=z, series=0, rescale=False)
    raw_data.append(raw_image)
raw_data = np.array(raw_data)

Show wanted metadata

iome = ome.image(0) # e.g. first image
print(iome.get_Name())
print(iome.Pixels.get_SizeX())
print(iome.Pixels.get_SizeY())

Here's the error I get:

---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)
<ipython-input-22-a22c1dbbdd1e> in <module>()
     11 reader = bf.ImageReader(file_full_path)
     12 md = bf.get_omexml_metadata(file_full_path)
---> 13 ome = bf.OMEXML(md)

/anaconda/envs/env2_bioformats/lib/python2.7/site-packages/bioformats/omexml.pyc in __init__(self, xml)
    318         if isinstance(xml, str):
    319             xml = xml.encode("utf-8")
--> 320         self.dom = ElementTree.ElementTree(ElementTree.fromstring(xml))
    321 
    322         # determine OME namespaces

<string> in XML(text)

UnicodeEncodeError: 'ascii' codec can't encode character u'\xb5' in position 1623: ordinal not in range(128)

Here's a representative test image in a proprietary microscopy format.

Thank you for adding the sample image. That helped tremendously!

Let's first remove all the unnecessary Tkinter code until we get to a Minimal, Complete and Verifiable Example that allows us to reproduce your error message.

import javabridge as jv
import bioformats as bf

jv.start_vm(class_path=bf.JARS, max_heap_size='12G')

file_full_path = '/path/to/Cell1.lsm'

md = bf.get_omexml_metadata(file_full_path)

ome = bf.OMEXML(md)

jv.kill_vm()

We first get some warning messages about 3i SlideBook SlideBook6Reader library not found, but we can apparently ignore those.

Your error message reads UnicodeEncodeError: 'ascii' codec can't encode character u'\xb5' in position 1623: ordinal not in range(128), so let's look at what we can find around position 1623.

If you add print md after md = bf.get_omexml_metadata(file_full_path), the whole XML with metadata is printed out. Let's zoom in:

>>> print md[1604:1627]
PhysicalSizeXUnit="µm"

So the µ character is the culprit: it can't be encoded with the 'ascii' codec.

Looking back at the traceback:
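As a side note, the slicing above can be automated. The following is a minimal sketch, assuming only standard Python behaviour: a UnicodeEncodeError carries start and end attributes that mark the offending characters. The helper name find_unencodable and the sample string are made up for illustration and are not taken from the real metadata.

```python
# -*- coding: utf-8 -*-

def find_unencodable(text, encoding="ascii", width=20):
    """Hypothetical helper: return (position, surrounding text) for the
    first character that cannot be encoded, or None if everything encodes."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError as exc:
        # exc.start / exc.end delimit the characters the codec rejected
        lo = max(exc.start - width, 0)
        return exc.start, text[lo:exc.end + width]
    return None

# Illustrative stand-in for the OME-XML string, not the real file contents
sample = u'<Pixels SizeX="512" PhysicalSizeXUnit="\u00b5m"/>'

hit = find_unencodable(sample)            # the micro sign is not ASCII
utf8_hit = find_unencodable(sample, "utf-8")  # UTF-8 can encode it
```

Here hit holds the index of the µ character together with its surrounding text, while utf8_hit is None, which is why an explicit utf-8 encoding would avoid the error.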

/anaconda/envs/env2_bioformats/lib/python2.7/site-packages/bioformats/omexml.pyc in __init__(self, xml)
    318         if isinstance(xml, str):
    319             xml = xml.encode("utf-8")
--> 320         self.dom = ElementTree.ElementTree(ElementTree.fromstring(xml))
    321 
    322         # determine OME namespaces

We see that in the lines just before the error occurs, the xml is encoded to utf-8, which should solve our problem. So why doesn't that happen?

If we add print type(md), we get back <type 'unicode'> and not <type 'str'> as the code expected. Because md is a unicode object, the isinstance(xml, str) check is False, the explicit utf-8 encoding is skipped, and the parser falls back to an implicit 'ascii' encoding. So this is a bug in omexml.py!

To solve this, do the following (you might need to be root):

  • Go to /anaconda/envs/env2_bioformats/lib/python2.7/site-packages/bioformats/
  • Remove omexml.pyc
  • In omexml.py, change line 318 from if isinstance(xml, str): to if isinstance(xml, basestring):

basestring is the superclass of str and unicode in Python 2. It is used to test whether an object is an instance of str or unicode.

I wanted to file a bug for this, but it seems there is already an open issue.
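To see the effect of the corrected check outside of the library, here is a rough standalone sketch. It is not the code from omexml.py: parse_omexml_text is a hypothetical helper, and instead of basestring it tests for the text type only (unicode on Python 2, str on Python 3, where basestring no longer exists), so byte strings that are already encoded are passed through untouched. The sample XML is invented for illustration.

```python
# -*- coding: utf-8 -*-
import xml.etree.ElementTree as ElementTree

try:
    text_type = unicode  # Python 2: the type returned for md in the question
except NameError:
    text_type = str      # Python 3: str is the text type

def parse_omexml_text(xml):
    """Hypothetical helper, not part of python-bioformats.

    Encodes text to UTF-8 bytes before parsing, so the parser never has
    to fall back to an implicit ASCII encoding."""
    if isinstance(xml, text_type):
        xml = xml.encode("utf-8")
    return ElementTree.ElementTree(ElementTree.fromstring(xml))

# Illustrative stand-in for the metadata string, containing the micro sign
sample = u'<Pixels PhysicalSizeXUnit="\u00b5m"/>'
tree = parse_omexml_text(sample)
unit = tree.getroot().get("PhysicalSizeXUnit")
```

With this check in place, the µ in the unit attribute survives the round trip through the parser instead of raising a UnicodeEncodeError.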

