
Data extraction from .xls using xlrd in Python

I am trying to extract the data from an .xls file into a list, but the list prints as [u'elem1', u'elem2', u'elem3']. If I print the elements separately, I get:

elem1
elem2
elem3

What is that u prefix, and how do I remove it?

Here is my code...

from xlrd import open_workbook
xls=open_workbook('name.xls')
for sheets in xls.sheets():
    list1=[]
    for col in range(sheets.ncols):
        for rows in range(sheets.nrows):
            list1.append(sheets.cell(rows, col).value)
print(list1)
for i in list1:
    print(i)

You can convert each value to a string as you append it, i.e. list1.append(str(sheets.cell(rows, col).value)), which removes the u' prefix from the printed list (on Python 2 this works as long as the text is ASCII-only). The code becomes:

from xlrd import open_workbook

xls = open_workbook('name.xls')
for sheets in xls.sheets():
    list1 = []
    for col in range(sheets.ncols):
        for rows in range(sheets.nrows):
            # str() turns the unicode cell value into a plain string
            list1.append(str(sheets.cell(rows, col).value))
    print(list1)
    for i in list1:
        print(i)

Assuming you are using Python 2.x, the u prefix means that xlrd gives you unicode strings (which is what Excel strings really are). If you want to convert them to Python 2.7 byte strings (str), you have to encode them with the charset you use.

Assuming you use latin1 (also known as iso-8859-1 or, with minimal differences, windows-1252), you can transform your list of unicode strings into a list of latin1 strings this way:

strlist = [ elt.encode('latin1') for elt in list1 ]

or, if you have only ASCII characters:

strlist = [ str(elt) for elt in list1 ]
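
To make the difference between the two variants concrete, here is a minimal sketch. The list contents are hypothetical stand-ins for cell values read by xlrd, and it assumes the default ASCII encoding that str() falls back to on Python 2:

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

# Hypothetical cell values; the second one contains a non-ASCII character.
list1 = [u'elem1', u'caf\xe9']

# Encoding to latin1 works for every character latin1 can represent.
strlist = [elt.encode('latin1') for elt in list1]

# Encoding to ASCII (which str() does implicitly on Python 2) fails on the accent.
try:
    [elt.encode('ascii') for elt in list1]
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(strlist, ascii_ok)
```

So the str() variant is only safe when every cell is known to be plain ASCII; otherwise an explicit encode() with the right charset is needed.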

I solved it by doing:

str(variable_name)

For practical purposes, the u at the beginning won't affect you. You can work with these strings as they are, unless you run into issues when encoding them in different formats.
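
A small sketch of why the prefix is harmless: it only appears when Python prints the repr() of a string, which is what happens when a whole list is printed. The values below are hypothetical stand-ins for what xlrd returns:

```python
from __future__ import print_function

# Stand-in for the cell values collected from the sheet.
values = [u'elem1', u'elem2', u'elem3']

# Printing the list shows the repr() of each element (u'...' on Python 2).
print(values)

# Joining the elements uses the text itself, so no prefix or quotes appear.
joined = u', '.join(values)
print(joined)
```

The strings themselves are unchanged either way; only the way they are displayed differs.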
