Remove \t from a list of lists

I would like to remove the \t from the second element of each tuple. I tried doing this with loops but was not successful. Any help, please?

import re
regex = re.compile(r'[\t]')
for sent in train_sents:
    for tuples in sent:
        print(tuples[1])

 [('O', 'Identification\t'),
 ('O', 'of\t'),
 ('O', 'APC2,\t'),
 ('O', 'a\t'),
 ('O', 'homologue\t'),
 ('O', 'of\t'),
 ('O', 'the\t'),
 ('B-DISEASE', 'adenomatous\t'),
 ('I-DISEASE', 'polyposis\t'),
 ('I-DISEASE', 'coli\t'),
 ('I-DISEASE', 'tumour\t'),
 ('O', 'suppressor\t'),
 ('O', '.\t')],
 [('O', 'The\t'),
 ('B-DISEASE', 'adenomatous\t'),
 ('I-DISEASE', 'polyposis\t'),
 ('I-DISEASE', 'coli\t'),
 ('I-DISEASE', '(\t'),
 ('I-DISEASE', 'APC\t'),
 ('I-DISEASE', ')\t'),
 ('I-DISEASE', 'tumour\t'),
 ('O', '-suppressor\t'),
 ('O', 'protein\t'),
 ('O', 'controls\t'),
 ('O', 'the\t'),
 ('O', 'Wnt\t'),
 ('O', 'signalling\t'),
 ('O', 'pathway\t'),
 ('O', 'by\t'),
 ('O', 'forming\t'),
 ('O', 'a\t'),
 ('O', 'complex\t'),
 ('O', 'with\t'),
 ('O', 'glycogen\t'),
 ('O', 'synthase\t'),
 ('O', 'kinase\t'),
 ('O', '3beta\t'),
 ('O', '(\t'),
 ('O', 'GSK-3beta\t'),
 ('O', ')\t'),
 ('O', ',\t'),
 ('O', 'axin\t'),
 ('O', '/\t'),
 ('O', 'conductin\t'),
 ('O', 'and\t'),
 ('O', 'betacatenin\t'),
 ('O', '.\t')]

str.replace() should be useful here. See below:

 lst = [('O', 'signalling\t'), ('O', 'kinase\t'), ('try_yourself_first', 'happy_coding\t')]
 # Tuples are immutable, so build a new tuple and store it back at the same index.
 for i, tup in enumerate(lst):
     lst[i] = (tup[0], tup[1].replace('\t', ''))
 print(lst)

OUTPUT:

 [('O', 'signalling'), ('O', 'kinase'), ('try_yourself_first', 'happy_coding')]
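
The example above works on a flat list, while train_sents in the question is a list of lists. A minimal sketch of the same idea for the nested case is below. The helper name strip_tabs is hypothetical (it is not from the original post), and the sample data is a shortened copy of the question's data. It builds new lists rather than modifying train_sents in place.

```python
# Shortened sample in the same shape as the question's train_sents:
# a list of sentences, each a list of (label, token) tuples.
train_sents = [
    [('O', 'Identification\t'), ('O', 'of\t'), ('O', 'APC2,\t')],
    [('O', 'The\t'), ('B-DISEASE', 'adenomatous\t'), ('I-DISEASE', 'polyposis\t')],
]


def strip_tabs(sents):
    """Return a new list of lists with every tab removed from each token.

    Hypothetical helper for illustration; the input is left unchanged.
    """
    return [
        [(label, token.replace('\t', '')) for label, token in sent]
        for sent in sents
    ]


cleaned = strip_tabs(train_sents)
print(cleaned[0])
# [('O', 'Identification'), ('O', 'of'), ('O', 'APC2,')]
```

If the tab can only ever appear at the end of a token, token.rstrip('\t') would also work in place of replace().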


 