简体   繁体   中英

Python - Formatting strings

I have the input file :

sun vehicle
one number
two number
reduce command
one speed
five speed
zero speed
speed command
kmh command

I used the following code:

from collections import OrderedDict
output = OrderedDict()
with open('final') as in_file:
for line in in_file:    
columns = line.split(' ')
if len(columns) >= 2:
    word,tag = line.strip().split()
    if output.has_key(tag) == False:
        output[tag] = [];
    output[tag].append(word)
else:
    print ""
    for k, v in output.items():
        print '<{}> {} </{}>'.format(k, ' '.join(v), k)
    output = OrderedDict()

I am getting the output as:

<vehicle> sun </vehicle>
<number> one two </number>
<command> reduce speed kmh </command>
<speed> one five zero </speed>

But my expected output should be:

<vehicle> sun </vehicle>
<number> one two </number>
<command> reduce 
<speed> one five zero </speed>
speed kmh </command>

Can someone help me in solving this?

It looks like the output you want to achieve is underspecified!

You presumably want the code to "know in advance" that speed is a part of command , before you get to the line speed command .

To do what you want, you will need a recursive function . How about

    for k, v in output.items():
        print  expandElements(k, v,output)

and somewhere you define

    def expandElements(k,v, dic):
        out = '<' +k + '>'
        for i in v:
          # check each item of v for matches in dic.
          # if no match, then out=out+i
          # otherwise expand using a recursive call of expandElements()
          # and out=out+expandElements
        out = out + '<' +k + '>'

It looks like you want some kind of tree structure for your output?

You are printing out with print '<{}> {} </{}>'.format(k, ' '.join(v), k) so all of your output is going to have the form of '<{}> {} </{}>' .

If you want to nest things you are going to need a nested structure to represent them.

For recursivly parsing the input file I would make a class representing the tag. Each tag can have its children . Every children is first a string added manually with tag.children.append("value") or by calling tag.add_value(tag.name, "value").

class Tag:
    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        self.has_root = True
        self.parent = parent

    def __str__(self):
        """ compose string for this tag (recursivly) """
        if not self.children:
            return self.name
        children_str = ' '.join([str(child) for child in self.children])
        if not self.parent:
            return children_str
        return '<%s>%s</%s>' % (self.name, children_str, self.name)

    @classmethod
    def from_file(cls, file):
        """ create root tag from file """
        obj = cls('root')
        columns = []
        with open(file) as in_file:
            for line in in_file:
                value, tag = line.strip().split(' ')
                obj.add_tag(tag, value)
        return obj

    def search_tag(self, tag):
        """ search for a tag in the children """
        if self.name == tag:
            return self
        for i, c in enumerate(self.children):
            if isinstance(c, Tag) and c.name == tag:
                return c
            elif isinstance(c, str):
                if c.strip() == tag.strip():
                    self.children[i] = Tag(tag, self)
                    return self.children[i]
            else:
                result = c.search_tag(tag)
                if result:
                    return result

    def add_tag(self, tag, value):
         """
         add a value, tag pair to the children

         Firstly this searches if the value is an child. If this is the
         case it moves the children to the new location

         Afterwards it searches the tag in the children. When found
         the value is added to this tag. If not a new tag object 
         is created and added to this Tag. The flag has_root
         is set to False so the element can be moved later.
         """
         value_tag = self.search_tag(value)
         if value_tag and not value_tag.has_root:
             print("Found value: %s" % value)
             if value_tag.parent:
                 i = value_tag.parent.children.index(value_tag)
                 value = value_tag.parent.children.pop(i)
                 value.has_root = True
             else:
                 print("not %s" % value)

         found = self.search_tag(tag)
         if found:
             found.children.append(value)
         else:
             # no root
             tag_obj = Tag(tag, self)
             self.children.append(tag_obj)
             tag_obj.add_tag(tag, value)
             tag_obj.has_root = False

tags = Tag.from_file('final')
print(tags)

I know in this example the speed-Tag is not added twice. I hope that's ok. Sorry for the long code.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM