
How can I get the child elements of an HTML element using puppeteer in order to create an HTML tree along with their browser-computed styles?

 // page.$ returns a handle to the first element matching the selector
 const e = await page.$('body')
 const htmlTag = await page.evaluate((e) => e.outerHTML, e)
 const compStyle = await page.evaluate((e) => 
             JSON.parse(JSON.stringify(getComputedStyle(e))), e)

Using the above code I'm getting the body HTML element and its computed style. Likewise, I have to get its child elements and their styles. How can I get them?

If you don't mind the ordering of elements, you can create an array of all elements using the simple selector body * and a for loop.

First, let's abstract the style extractor because we will be using the same thing multiple times.

// get the styles for a particular element
// apply any kind of JSON filtering here
function getElementStyles(elem) {
    return JSON.parse(JSON.stringify(getComputedStyle(elem)))
}
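
The JSON round trip above relies on how the browser exposes the properties of the computed style object, so its output can be noisy or differ between browsers. As a minimal sketch of the "filtering" mentioned in the comment, the declaration can be read by index with getPropertyValue instead. The name styleToObject and the optional `only` whitelist are assumptions for illustration, not part of the original answer.

```javascript
// Hypothetical helper: copy a CSSStyleDeclaration into a plain object.
// `only` is an optional whitelist of property names (an assumption).
function styleToObject(declaration, only) {
    const result = {}
    for (let i = 0; i < declaration.length; i++) {
        const name = declaration[i] // property name, e.g. "font-size"
        if (only && !only.includes(name)) continue
        result[name] = declaration.getPropertyValue(name)
    }
    return result
}

// usage in the page:
// styleToObject(getComputedStyle(document.body), ['color', 'font-size'])
```

The result is keyed by CSS property name, which tends to be easier to compare or serialize than the raw stringified object.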

// this will generate a single array containing all elements
function getStyleArray(selector) {
    const styles = []
    const allElements = document.querySelectorAll(selector)
    for (const element of allElements) {
        const style = getElementStyles(element)
        styles.push(style)
    }
    return styles;
}

//usage
getStyleArray('body *')

If you do want a tree, there are already multiple libraries for that. That being said, here is how you can build one yourself, using recursion.

// find if element is an element :D
function isElement(element) {
    return element instanceof Element || element instanceof HTMLDocument;
}

// this will generate a tree-style array
// all child elements are accessible using the child key
function getChildStyles(elem) {
    const childStyles = []
    for (let childNode of elem.childNodes) {
        if (isElement(childNode)) {
            const singleChildStyle = getElementStyles(childNode)

            // recursion
            if (childNode.hasChildNodes()) {
                singleChildStyle.child = getChildStyles(childNode)
            }
            childStyles.push(singleChildStyle)
        }
    }
    return childStyles
}

// usage
getChildStyles(document.body)
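
The tree above contains only style objects, so it can be hard to tell which element a node came from. Here is a rough sketch of a variant that records the tag name next to the styles. The function name buildStyleTree and the tag/styles/children keys are assumptions for illustration. It walks elem.children, which only contains elements, so no isElement check is needed, and it takes the style extractor as a parameter.

```javascript
// Hypothetical variant: each node keeps its tag name, styles and children.
function buildStyleTree(elem, getStyles) {
    return {
        tag: elem.tagName.toLowerCase(),
        styles: getStyles(elem),
        // elem.children holds element nodes only (no text or comment nodes)
        children: Array.from(elem.children, (child) => buildStyleTree(child, getStyles)),
    }
}

// usage in the page:
// buildStyleTree(document.body, getElementStyles)
```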

Note:

  • this could be improved using better loops and other sorting/searching methods.
  • this will take a long time if the page has many elements.
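
As one possible take on the "better loops" point, here is a sketch that replaces the recursion with an explicit stack and stops after a fixed number of elements, so a very large page cannot run away. The name collectStylesCapped, the maxNodes parameter and its default of 1000 are all assumptions, not something from the original answer.

```javascript
// Hypothetical iterative walk: flat list in document order, with a node cap.
function collectStylesCapped(root, getStyles, maxNodes = 1000) {
    const results = []
    const stack = [{ elem: root, depth: 0 }]
    while (stack.length > 0 && results.length < maxNodes) {
        const { elem, depth } = stack.pop()
        results.push({ tag: elem.tagName.toLowerCase(), depth, styles: getStyles(elem) })
        // push children in reverse so they are popped in document order
        const children = Array.from(elem.children)
        for (let i = children.length - 1; i >= 0; i--) {
            stack.push({ elem: children[i], depth: depth + 1 })
        }
    }
    return results
}

// usage in the page:
// collectStylesCapped(document.body, getElementStyles, 500)
```

The depth field lets you rebuild the nesting later if needed, while the flat list stays cheap to serialize.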

Result: (screenshot not available)

It works!!!

Let's apply this in puppeteer. You can just copy and paste the functions or load them with addScriptTag.

await page.evaluate(() => {
        // add the scripts we created somewhere and then use them here
        return {
            arrayBased: getStyleArray('body *'),
            treeBased: getChildStyles(document.body)
        }
})
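
One thing to keep in mind is that the callback passed to page.evaluate runs in the browser, so functions defined in your Node script are not visible inside it. A minimal sketch that sidesteps this by defining the helpers inside the callback could look like the following. The wrapper name collectStyles, its rootSelector parameter and the shape of the returned nodes are assumptions for illustration.

```javascript
// Hypothetical wrapper: `page` is expected to be a Puppeteer Page.
async function collectStyles(page, rootSelector = 'body') {
    return page.evaluate((selector) => {
        // Helpers live inside the callback because only the callback
        // itself is sent to the browser.
        const toObject = (elem) => {
            const decl = getComputedStyle(elem)
            const out = {}
            for (let i = 0; i < decl.length; i++) {
                out[decl[i]] = decl.getPropertyValue(decl[i])
            }
            return out
        }
        const walk = (elem) => ({
            tag: elem.tagName.toLowerCase(),
            styles: toObject(elem),
            children: Array.from(elem.children, (child) => walk(child)),
        })
        const root = document.querySelector(selector)
        // plain objects and null are serializable, so they survive the trip back
        return root ? walk(root) : null
    }, rootSelector)
}
```

After navigating with page.goto, it could be called as `const tree = await collectStyles(page)`.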


 