
text file cleaning, formatting and alignment

I'm new to Python and I'd like to clean and reformat a list in Python 3 from:

[['', '\xa0', '', ''],
 ['First Standard', 'First Flex', 'Business Standard', 'Business Flex', 'Economy standard', 'Economy Flex', 'Economy Saver', 'Economy Superdeal'],
 ['Class', 'P/F', 'A', 'C/D', ' Z/J', 'W/Y/B/M ', 'H/K/L', 'Q/G/V/E', 'S/T/U/N'],
 ['Change Fee\nIn']]

to

['First Standard', 'First Flex', 'Business Standard', 'Business Flex', 'Economy standard', 'Economy Flex', 'Economy Saver', 'Economy Superdeal'],
['Class', 'P/F', 'A', 'C/D', ' Z/J', 'W/Y/B/M ', 'H/K/L', 'Q/G/V/E', 'S/T/U/N']

Here's a not particularly clean, but self-explanatory way:

old = [['', '\xa0', '', ''], ['First Standard', 'First Flex', 'Business Standard', 'Business Flex', 'Economy standard', 'Economy Flex', 'Economy Saver', 'Economy Superdeal'], ['Class', 'P/F', 'A', 'C/D', ' Z/J', 'W/Y/B/M ', 'H/K/L', 'Q/G/V/E', 'S/T/U/N'], ['Change Fee\nIn']]

new = []
for old_element in old:
    new_element = []

    for string in old_element:
        stripped = string.strip()

        if stripped:
            new_element.append(stripped)

    if new_element:
        new.append(new_element)

print(new)
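
The same strip-and-filter loop can also be written with comprehensions. This is only a sketch: `clean_rows` is a made-up helper name, and the sample data is shortened from the question. Note that, like the loop above, it strips surrounding whitespace from every string (so ' Z/J' becomes 'Z/J') and it keeps the 'Change Fee\nIn' row, because that row is not empty.

```python
# Shortened sample data modelled on the list in the question.
old = [['', '\xa0', '', ''],
       ['First Standard', 'First Flex'],
       ['Class', 'P/F', ' Z/J', 'W/Y/B/M '],
       ['Change Fee\nIn']]


def clean_rows(rows):
    """Strip each string, drop empty strings, then drop rows left empty."""
    stripped_rows = ([s.strip() for s in row] for row in rows)
    non_empty = ([s for s in row if s] for row in stripped_rows)
    return [row for row in non_empty if row]


new = clean_rows(old)
print(new)
```

In Python 3, str.strip() also removes the non-breaking space '\xa0', which is why the first row disappears entirely.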

Edit:

Still waiting on a better explanation of the problem, but if you just want to remove the first and last elements, you can either do as @salparadise suggested or use a slice like the following, which is a bit more general:

old = [['', '\xa0', '', ''], ['First Standard', 'First Flex', 'Business Standard', 'Business Flex', 'Economy standard', 'Economy Flex', 'Economy Saver', 'Economy Superdeal'], ['Class', 'P/F', 'A', 'C/D', ' Z/J', 'W/Y/B/M ', 'H/K/L', 'Q/G/V/E', 'S/T/U/N'], ['Change Fee\nIn']]

# The start index 1 means the second element of the list, because 0 is the first.
# The stop index -1 means "stop one element before the end of the list",
# so the last element is left out.
new = old[1:-1]
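
A small sketch of what that slice does, using shortened data (the variable names `rows` and `middle` are just for illustration). It shows that slicing returns a new outer list and leaves the original untouched, although the inner lists themselves are shared rather than copied.

```python
# Shortened sample data: junk row, two useful rows, junk row.
rows = [['', '\xa0'],
        ['First Standard', 'First Flex'],
        ['Class', 'P/F'],
        ['Change Fee\nIn']]

# Drop the first and the last element; works for any list length.
middle = rows[1:-1]

print(middle)                 # the two useful rows
print(len(rows))              # the original list still has all 4 rows
print(middle[0] is rows[1])   # the slice is a shallow copy: inner lists are shared
```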

Assuming you want two separate lists and your first list is:

[['', '\xa0', '', ''],
 ['First Standard',
  'First Flex',
  'Business Standard',
  'Business Flex',
  'Economy standard',
  'Economy Flex',
  'Economy Saver',
  'Economy Superdeal'],
 ['Class',
  'P/F',
  'A',
  'C/D',
  ' Z/J',
  'W/Y/B/M ',
  'H/K/L',
  'Q/G/V/E',
  'S/T/U/N'],
 ['Change Fee\nIn']]

You can use Python's unpacking to grab just the inner lists you need and assign them to two separate names, discarding the first and last elements into the throwaway name `_`:

# bad_formatted_list is the nested list shown above
_, first, second, _ = bad_formatted_list

In [149]: first, second
Out[149]:
(['First Standard',
  'First Flex',
  'Business Standard',
  'Business Flex',
  'Economy standard',
  'Economy Flex',
  'Economy Saver',
  'Economy Superdeal'],
 ['Class',
  'P/F',
  'A',
  'C/D',
  ' Z/J',
  'W/Y/B/M ',
  'H/K/L',
  'Q/G/V/E',
  'S/T/U/N'])
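
The plain four-name unpacking raises a ValueError if the outer list does not have exactly four elements. If the number of useful rows in the middle may vary, Python 3's starred unpacking is one possible alternative. This is a sketch with shortened data; `middle` is a name chosen for illustration.

```python
# Shortened sample data modelled on the list in the question.
bad_formatted_list = [['', '\xa0', '', ''],
                      ['First Standard', 'First Flex'],
                      ['Class', 'P/F'],
                      ['Change Fee\nIn']]

# The starred name collects everything between the first and last element.
_, *middle, _ = bad_formatted_list

# Here there happen to be exactly two middle rows, so they can be unpacked again.
first, second = middle

print(first)
print(second)
```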

