Can't get rid of non-unicode characters in numpy array

I have a table, vals = table.iloc[sum(which(pa_i),[1]),1:], with this content:

Out[249]: 
                      1         2       3
2  NA [1] (16.0 to N/A)  12.0 [2]  NA [1]

When I write vals.values I get:

Out[250]: array([['NA\xa0[1] (16.0\xa0to\xa0N/A)', '12.0\xa0[2]', 'NA\xa0[1]']], dtype=object)

I simply want to turn this table into an array or list, but first I want to replace the \xa0 characters. When I apply v = print(map(lambda s: s.replace('\xa0' , ' '), vals)) I don't know how to read the resulting object, and when I try np.char.replace(vals[0],'\xa0' , ' ') I get the error "TypeError: string operation on non-string array".

What is the easiest way to convert the content to an array or to replace the unwanted characters?
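A minimal sketch of why the map attempt is hard to read, assuming the one-row frame shown above (rebuilt by hand here, so the constructor call is an assumption): iterating a DataFrame yields its column labels rather than its cells, map is lazy in Python 3, and print always returns None, so v ends up as None.

```python
import pandas as pd

# Hand-built stand-in for the frame in the question (an assumption)
vals = pd.DataFrame(
    [['NA\xa0[1] (16.0\xa0to\xa0N/A)', '12.0\xa0[2]', 'NA\xa0[1]']],
    index=[2], columns=[1, 2, 3], dtype=object,
)

# Iterating the frame gives the column labels, not the cell strings
labels = list(vals)  # [1, 2, 3]

# Map over the cells instead, and wrap in list() to materialise the result
cells = vals.values.ravel()
cleaned = list(map(lambda s: s.replace('\xa0', ' '), cells))
print(cleaned)  # ['NA [1] (16.0 to N/A)', '12.0 [2]', 'NA [1]']
```

Note that the pattern is written '\xa0' with a single backslash, so it matches the non-breaking space itself rather than the four literal characters \, x, a, 0.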

EDIT

I've got a solution: v = vals.astype('str') and then v = np.char.replace(v,'\xa0' , ' ').

Out[306]: array([['NA [1] (16.0 to N/A)', '12.0 [2]', 'NA [1]']], dtype='<U20') 
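The two steps can be sketched end to end as follows. This assumes vals is the one-row frame from the question (rebuilt by hand here) and takes vals.values first so that the conversion runs on a NumPy array rather than on the DataFrame itself.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the frame in the question (an assumption)
vals = pd.DataFrame(
    [['NA\xa0[1] (16.0\xa0to\xa0N/A)', '12.0\xa0[2]', 'NA\xa0[1]']],
    index=[2], columns=[1, 2, 3], dtype=object,
)

# object dtype -> fixed-width unicode dtype, which np.char functions accept
v = vals.values.astype(str)

# Replace every non-breaking space with a plain space, element by element
v = np.char.replace(v, '\xa0', ' ')
print(v.tolist())  # [['NA [1] (16.0 to N/A)', '12.0 [2]', 'NA [1]']]
```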

But I'm not fully satisfied with this answer. I need something that works directly on the vals variable, for example something like a = vals[1:].toarray(?), with the expected result:

a
Out[318]: ['(16.0 to N/A)', '12.0 [2]', 'NA [1]]']
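One way to stay on the DataFrame is pandas' own replace with regex=True, which substitutes inside each string cell. This is only a sketch on a hand-built copy of vals (an assumption); it swaps the non-breaking spaces and does not trim the first cell the way the expected result above does.

```python
import pandas as pd

# Hand-built stand-in for the frame in the question (an assumption)
vals = pd.DataFrame(
    [['NA\xa0[1] (16.0\xa0to\xa0N/A)', '12.0\xa0[2]', 'NA\xa0[1]']],
    index=[2], columns=[1, 2, 3], dtype=object,
)

# regex=True makes replace act on substrings instead of whole-cell matches
clean = vals.replace('\xa0', ' ', regex=True)

# First (and only) row as a plain Python list
a = clean.iloc[0].tolist()
print(a)  # ['NA [1] (16.0 to N/A)', '12.0 [2]', 'NA [1]']
```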

Well, it's because it contains int elements and not str.

Do this:

# DataFrame.values cannot be assigned to, so convert the cells with astype
table = table.astype(str)
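Whether the cells are really int can be checked before converting anything. A minimal sketch, assuming the one-row frame from the question (rebuilt by hand here): with that data the cells turn out to be str values held in an object-dtype array, which is consistent with the TypeError quoted in the question.

```python
import pandas as pd

# Hand-built stand-in for the frame in the question (an assumption)
vals = pd.DataFrame(
    [['NA\xa0[1] (16.0\xa0to\xa0N/A)', '12.0\xa0[2]', 'NA\xa0[1]']],
    index=[2], columns=[1, 2, 3], dtype=object,
)

arr = vals.values

# The array dtype is object even though every cell is a str
cell_types = {type(x).__name__ for x in arr.ravel()}
print(arr.dtype, cell_types)

# Converting to a unicode dtype is what makes np.char functions usable
as_text = arr.astype(str)
print(as_text.dtype.kind)  # U
```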

I have found the best way:

vals
Out[357]: 
                      1         2       3
2  NA [1] (16.0 to N/A)  12.0 [2]  NA [1]

a = vals.values.tolist()

a
Out[358]: [['NA\xa0[1] (16.0\xa0to\xa0N/A)', '12.0\xa0[2]', 'NA\xa0[1]']]

a1 = [w.replace('\xa0', ' ') for w in a[0]]

a1
Out[359]: ['NA [1] (16.0 to N/A)', '12.0 [2]', 'NA [1]']
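If the table ever has more than one row, the same idea extends to a nested comprehension. A small sketch; clean_rows is a hypothetical helper name, and the second row is made-up sample data added only to show the multi-row case.

```python
NBSP = '\xa0'  # non-breaking space, single backslash in the literal

def clean_rows(rows):
    """Return a copy of a list of lists with NBSP replaced by a space."""
    return [[cell.replace(NBSP, ' ') for cell in row] for row in rows]

a = [
    ['NA\xa0[1] (16.0\xa0to\xa0N/A)', '12.0\xa0[2]', 'NA\xa0[1]'],
    ['1.0\xa0[3]', 'NA\xa0[1]', '2.5\xa0[2]'],  # made-up extra row
]

a1 = clean_rows(a)

# Flatten to a single list if one is needed
flat = [cell for row in a1 for cell in row]
print(a1[0])  # ['NA [1] (16.0 to N/A)', '12.0 [2]', 'NA [1]']
```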

Pffff. Maybe not the prettiest in terms of coding habits, but what can we do.
