Bioformats-Python error: 'ascii' codec can't encode character u'\xb5' when using OMEXML()

Question

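I am trying to use bioformats in Python to read microscopy images (.lsm, .czi, .lif, you name it), print the metadata, and display the image. ome = bf.OMEXML(md) gives me an error (below). I think it is complaining about the information stored in md: it doesn't like that the information in md is not all ASCII. But how do I overcome this problem? This is what I wrote: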

import Tkinter as Tk, tkFileDialog
import os
import javabridge as jv
import bioformats as bf
import matplotlib.pyplot as plt
import numpy as np

jv.start_vm(class_path=bf.JARS, max_heap_size='12G')

User selects the file to work with

# hiding root allows the file dialog GUI to be shown without any other GUI elements
root = Tk.Tk()
root.withdraw()
file_full_path = tkFileDialog.askopenfilename()
filepath, filename = os.path.split(file_full_path)
os.chdir(os.path.dirname(file_full_path))

print('opening:  %s' %filename)
reader = bf.ImageReader(file_full_path)
md = bf.get_omexml_metadata(file_full_path)
ome = bf.OMEXML(md)

Put the image in a numpy array

raw_data = []
for z in range(iome.Pixels.get_SizeZ()):
    raw_image = reader.read(z=z, series=0, rescale=False)
    raw_data.append(raw_image)
raw_data = np.array(raw_data)

Display the desired metadata

iome = ome.image(0) # e.g. first image
print(iome.get_Name())
print(iome.Pixels.get_SizeX())
print(iome.Pixels.get_SizeY())

This is the error I get:

---------------------------------------------------------------------------
UnicodeEncodeError                        Traceback (most recent call last)
<ipython-input-22-a22c1dbbdd1e> in <module>()
     11 reader = bf.ImageReader(file_full_path)
     12 md = bf.get_omexml_metadata(file_full_path)
---> 13 ome = bf.OMEXML(md)

/anaconda/envs/env2_bioformats/lib/python2.7/site-packages/bioformats/omexml.pyc in __init__(self, xml)
    318         if isinstance(xml, str):
    319             xml = xml.encode("utf-8")
--> 320         self.dom = ElementTree.ElementTree(ElementTree.fromstring(xml))
    321 
    322         # determine OME namespaces

<string> in XML(text)

UnicodeEncodeError: 'ascii' codec can't encode character u'\xb5' in position 1623: ordinal not in range(128)

Here is a representative test image in a proprietary microscopy format.

Answer 1

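Thanks for adding the example image. That helped tremendously!

Let's first remove all the unnecessary Tkinter code until we arrive at a Minimal, Complete, and Verifiable example that reproduces your error message.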

import javabridge as jv
import bioformats as bf

jv.start_vm(class_path=bf.JARS, max_heap_size='12G')

file_full_path = '/path/to/Cell1.lsm'

md = bf.get_omexml_metadata(file_full_path)

ome = bf.OMEXML(md)

jv.kill_vm()

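We first get some warning messages about a 3i SlideBook SlideBook6Reader library not found, but we can apparently ignore those.

Your error message reads UnicodeEncodeError: 'ascii' codec can't encode character u'\xb5' in position 1623: ordinal not in range(128), so let's see what we can find around position 1623.

If you add print md after md = bf.get_omexml_metadata(file_full_path), the entire XML containing the metadata is printed. Let's zoom in: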

>>> print md[1604:1627]
PhysicalSizeXUnit="µm"

So the µ character is the culprit: it cannot be encoded with the 'ascii' codec.
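Slicing around the reported position can also be automated. The following is a minimal sketch, not part of bioformats: the helper name find_non_ascii and the sample string are made up for illustration, and the sample merely stands in for the real metadata. It relies only on the start and end attributes that UnicodeEncodeError carries.

```python
# -*- coding: utf-8 -*-


def find_non_ascii(text, context=10):
    """Return (position, surrounding snippet) for the first non-ASCII
    character in text, or None if the text is pure ASCII."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError as exc:
        # exc.start and exc.end delimit the characters that failed to encode
        start = max(exc.start - context, 0)
        return exc.start, text[start:exc.end + context]
    return None


# Hypothetical stand-in for the string returned by get_omexml_metadata
sample = u'<Pixels PhysicalSizeX="0.1" PhysicalSizeXUnit="\xb5m"/>'
print(find_non_ascii(sample))
```

Applied to the real md string, this should point at the same position the traceback reports.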

Looking back at the traceback:

/anaconda/envs/env2_bioformats/lib/python2.7/site-packages/bioformats/omexml.pyc in __init__(self, xml)
    318         if isinstance(xml, str):
    319             xml = xml.encode("utf-8")
--> 320         self.dom = ElementTree.ElementTree(ElementTree.fromstring(xml))
    321 
    322         # determine OME namespaces

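We see that in the lines just before the error occurs, xml is encoded as utf-8, which should solve our problem. So why doesn't that happen?

If we add print type(md), we get back <type 'unicode'> and not <type 'str'>, which is what the code in omexml.py expects. Because md is a unicode object, the isinstance(xml, str) check on line 318 is False, so the explicit utf-8 encoding on line 319 is skipped and the parser falls back to an implicit ASCII encode, which fails on µ. So this is a bug in omexml.py!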

To solve this, do the following (you may need to be root):

Go to /anaconda/envs/env2_bioformats/lib/python2.7/site-packages/bioformats/
Remove omexml.pyc
In omexml.py, change line 318 from if isinstance(xml, str): to if isinstance(xml, basestring):

basestring is the superclass of str and unicode. It is used to test whether an object is an instance of str or unicode.
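The same idea can be sketched outside the library, in a form that runs on both Python 2 and Python 3 (where basestring no longer exists). This is an illustration under stated assumptions, not the code in omexml.py: the names text_type and to_utf8_bytes are made up, the one-element XML document is a hypothetical stand-in for real OME-XML, and unlike the basestring check it passes byte strings through unchanged instead of re-encoding them.

```python
# -*- coding: utf-8 -*-
import xml.etree.ElementTree as ElementTree

try:
    text_type = unicode  # Python 2: text strings are `unicode`
except NameError:
    text_type = str  # Python 3: text strings are `str`


def to_utf8_bytes(xml):
    """Encode text to UTF-8 bytes; pass byte strings through unchanged."""
    if isinstance(xml, text_type):
        return xml.encode("utf-8")
    return xml


# Hypothetical one-element document standing in for the real OME-XML
sample = u'<Pixels PhysicalSizeXUnit="\xb5m"/>'

# With explicit UTF-8 bytes, the parser never needs an implicit ASCII encode
root = ElementTree.fromstring(to_utf8_bytes(sample))
print(repr(root.get("PhysicalSizeXUnit")))
```

Testing for the text type rather than for str is what makes sure the explicit encode actually runs when the metadata arrives as unicode.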

I wanted to file a bug for this, but it seems there is already an open issue.

Answer 1: 1 accepted, 2017-04-26 09:59:33
