
BeautifulSoup: tag enclosing formatting

When I prettify a soup, I want to get this:

<tag attr="val" />

Instead of this:

<tag attr="val"></tag>

I checked the bs4.formatter code and didn't find an option that does this. The Formatter constructor accepts:

def __init__(
        self, language=None, entity_substitution=None,
        void_element_close_prefix='/', cdata_containing_tags=None,
        empty_attributes_are_booleans=False, indent=1,
):

How can I achieve this? Thanks

I tried the new_tag options and the bs4.formatter options.

I'm not sure why you'd want to do this, since bs4 already produces valid HTML and this interferes with that, but you could use this function:

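import re

from bs4 import BeautifulSoup

def closeVoidElements(html, voidEls=None, parser=None, pFormatter='minimal'):
    if not isinstance(voidEls, list):
        voidEls = [
            'area', 'base', 'br', 'col', 'command', 'embed', 'wbr', 'img',
            'input', 'keygen', 'link', 'meta', 'param', 'source', 'track', 'hr'
        ]  # void elements from https://www.w3.org/TR/2011/WD-html-markup-20110113/syntax.html#syntax-elements

    soup = BeautifulSoup(str(html), parser)
    # Keep only the void elements that actually occur in the document.
    present = {t.name for t in soup.find_all(voidEls)} if voidEls else set()
    html = soup.prettify()

    # Match whole tag names only, so that 'col' does not also hit '<colgroup>'.
    names = '|'.join(re.escape(name) for name in present)
    if present:
        # Temporarily rename e.g. <br/> to <br_x/> so bs4 stops treating it as void.
        html = re.sub(rf'(</?)({names})(?=[\s/>])', r'\1\2_x', html)
    # 'minimal' is prettify()'s own default; formatter=None would skip entity escaping.
    html = BeautifulSoup(html, parser).prettify(formatter=pFormatter)
    if present:
        # Restore the original names; the tags now carry explicit closing tags.
        html = re.sub(rf'(</?)({names})_x(?=[\s/>])', r'\1\2', html)
    return html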

Call it as closeVoidElements(soup) instead of soup.prettify(). It renames the void (self-closing) tags so that bs4 no longer recognizes them as such, parses and prettifies the result, and then changes the names back.
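
The renaming trick relies on bs4 treating only known void tag names as self-closing. Here is a small check of that behaviour, assuming the built-in html.parser backend (other parsers may handle an unknown tag such as <br_x/> differently); the name br_x is only a placeholder:

```python
from bs4 import BeautifulSoup

# A known void element is serialized in self-closing form.
void_output = str(BeautifulSoup('<br>', 'html.parser'))

# The same tag under a made-up name is not recognized as void,
# so bs4 writes it with an explicit closing tag.
renamed_output = str(BeautifulSoup('<br_x/>', 'html.parser'))

print(void_output)
print(renamed_output)
```

With html.parser the first line should come out in the <br/> form and the second with a separate </br_x> closing tag, which is what the function exploits before renaming the tags back.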

Older versions of Beautiful Soup had a selfClosingTags argument for XML, but it is no longer supported.
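
For the opposite direction, i.e. the <tag attr="val" /> form from the question, one possible sketch is to flag empty tags through the can_be_empty_element attribute of Tag and to pass a formatter whose void_element_close_prefix is ' /'. This assumes a bs4 version whose formatter accepts the constructor arguments quoted in the question. The sample markup and the variable names are made up for illustration:

```python
from bs4 import BeautifulSoup
from bs4.dammit import EntitySubstitution
from bs4.formatter import HTMLFormatter

soup = BeautifulSoup(
    '<root><tag attr="val"></tag><p>text</p></root>', 'html.parser'
)

# Allow every tag that has no children to be written in self-closing form.
for tag in soup.find_all(True):
    if not tag.contents:
        tag.can_be_empty_element = True

# ' /' instead of the default '/' puts a space before the closing slash.
# substitute_xml keeps the usual escaping of &, < and >.
spaced = HTMLFormatter(
    entity_substitution=EntitySubstitution.substitute_xml,
    void_element_close_prefix=' /',
)

result = soup.prettify(formatter=spaced)
print(result)
```

Tags that still have content, such as the p element here, keep their normal opening and closing tags. Note that the flag is set per tag, so it has to be applied again to tags created later.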
