Reading from an Excel file with pandas in the desired type
I am reading an Excel file using pandas containing 2 columns into df:
EID List of Tuples
1 [('Physics', 90)]
2 [('Physics', 80), ('Math', 70)]
3 [('Physics', 60), ('Math', 25)]
When I check df['List of Tuples'].iat[0], it gives me u"[('Physics', 90)]" — I am getting this as unicode, not as a list of tuples. When I do df['List of Tuples'].iat[0].decode('iso-8859-1').encode('utf-8'), I get the string "[('Physics', 90)]".
I want to read/convert it as a list of tuples, [('Physics', 90)], instead of a string or unicode. In short, I want to get rid of the double quotes around each entry and read it as [('Physics', 90)], [('Physics', 80), ('Math', 70)], and so on.
You might find it useful to parse these into Python objects using ast, which can convert string representations back into Python objects. Try something like the following (I can't replicate exactly because I don't have your data):
import ast
df['transformed_tuples'] = df['List of Tuples'].apply(ast.literal_eval)
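For illustration, here is a minimal sketch with a toy DataFrame standing in for the data read from Excel (the values are assumptions based on your sample table):

```python
import ast

import pandas as pd

# Toy stand-in for the Excel read: the cells arrive as strings,
# exactly as described in the question.
df = pd.DataFrame({
    "EID": [1, 2],
    "List of Tuples": ["[('Physics', 90)]", "[('Physics', 80), ('Math', 70)]"],
})

# ast.literal_eval safely evaluates a string containing a Python literal
# (lists, tuples, numbers, strings, ...) back into the real object.
df["transformed_tuples"] = df["List of Tuples"].apply(ast.literal_eval)

print(df["transformed_tuples"].iat[0])  # [('Physics', 90)] — a real list now
print(type(df["transformed_tuples"].iat[0]))
```

Unlike eval, ast.literal_eval refuses anything that is not a plain literal, so it is safe to run on cell contents.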
To avoid this arising in the first place, you might reconsider the file format you read/write: for example, pickle will retain the original information (I'm assuming this data came from a pandas DataFrame that was saved to Excel).
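A minimal round-trip sketch, assuming the column holds real lists of tuples before saving (file name is illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "EID": [1, 2],
    "List of Tuples": [[('Physics', 90)], [('Physics', 80), ('Math', 70)]],
})

# to_pickle/read_pickle serialize the actual Python objects, so the
# list-of-tuples cells survive the round trip unchanged — no string parsing.
df.to_pickle("scores.pkl")
restored = pd.read_pickle("scores.pkl")

print(restored["List of Tuples"].iat[0])  # still a list of tuples
```

By contrast, to_excel can only write text into a cell, which is why the lists come back as strings.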
You might also consider a tabular schema that avoids this irregular data type altogether, which would probably prove more stable and effective in the long run.
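One way to regularize the schema is to flatten the column into one row per (subject, score) pair — a sketch; the Subject/Score column names are assumptions:

```python
import pandas as pd

df = pd.DataFrame({
    "EID": [1, 2],
    "List of Tuples": [[('Physics', 90)], [('Physics', 80), ('Math', 70)]],
})

# explode turns each list element into its own row, repeating the EID.
long_df = df.explode("List of Tuples").reset_index(drop=True)

# Split each (subject, score) tuple into two scalar columns.
long_df[["Subject", "Score"]] = pd.DataFrame(
    long_df["List of Tuples"].tolist(), index=long_df.index
)
long_df = long_df.drop(columns=["List of Tuples"])

print(long_df)
```

A long-format table like this writes to and reads from Excel with no type surprises, and plays better with groupby/merge operations.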