
Scraping information from a script tag using JavaScript

I'm trying to scrape information that is within a script tag on a webpage. I've figured out how to get at the information, but I can't figure out how to manipulate it into a data object.

I'm able to get at the information using document.querySelector(x).innerHTML. Here is the innerHTML that appears (the first part doesn't seem to be formatting as code here).

" Y = YUI(YUI_CONFIG).use( 'squarespace-commerce-analytics',

function(Y) {
  Y.on('domready', function() {
    Y.Squarespace.CommerceAnalytics.checkoutConfirmed({'id':'12345676','orderNumber':'00065','websiteId':'12345678','purchasedCartId':'1234567','testMode':true,'grandTotal':{'currencyCode':'USD','value':3239,'decimalValue':'32.39','fractionalDigits':2},'grandTotalFormatted':'$32.39','subtotal':{'currencyCode':'USD','value':2300,'decimalValue':'23.00','fractionalDigits':2},'subtotalFormatted':'$23.00','taxTotal':{'currencyCode':'USD','value':204,'decimalValue':'2.04','fractionalDigits':2},'taxTotalFormatted':'$2.04','shippingTotal':{'currencyCode':'USD','value':735,'decimalValue':'7.35','fractionalDigits':2},'shippingTotalFormatted':'$7.35','billingDetails':{'customer':{'address':{'city':'New York','region':'NY','country':'United States'}}},'items':[{'sku':'123456','productName':'This is a Product','unitPrice':{'currencyCode':'USD','value':2300,'decimalValue':'23.00','fractionalDigits':2},'quantity':1}]});
  });
});

"

This is the innerHTML I'm getting. I want each of the data items (id, orderNumber, productName, etc.) collected into an object so that I can track ecommerce better using GTM, but I'm not sure how to get the text into that shape.

If you replace every ' with ", the argument passed to checkoutConfirmed becomes JSON that you can parse. So use a regular expression to capture everything between checkoutConfirmed( and the closing ); to extract the almost-JSON, turn it into JSON, and then parse it:

 const html = document.querySelector('script[type="dontexecute"]').innerHTML;
 const singleQuotedJSON = html.match(/checkoutConfirmed\((.+?)\);/)[1];
 const actualJSON = singleQuotedJSON.replace(/'/g, '"');
 const obj = JSON.parse(actualJSON);
 console.log(obj);
 <script type="dontexecute">Y = YUI(YUI_CONFIG).use( 'squarespace-commerce-analytics', function(Y) { Y.on('domready', function() { Y.Squarespace.CommerceAnalytics.checkoutConfirmed({'id':'12345676','orderNumber':'00065','websiteId':'12345678','purchasedCartId':'1234567','testMode':true,'grandTotal':{'currencyCode':'USD','value':3239,'decimalValue':'32.39','fractionalDigits':2},'grandTotalFormatted':'$32.39','subtotal':{'currencyCode':'USD','value':2300,'decimalValue':'23.00','fractionalDigits':2},'subtotalFormatted':'$23.00','taxTotal':{'currencyCode':'USD','value':204,'decimalValue':'2.04','fractionalDigits':2},'taxTotalFormatted':'$2.04','shippingTotal':{'currencyCode':'USD','value':735,'decimalValue':'7.35','fractionalDigits':2},'shippingTotalFormatted':'$7.35','billingDetails':{'customer':{'address':{'city':'New York','region':'NY','country':'United States'}}},'items':[{'sku':'123456','productName':'This is a Product','unitPrice':{'currencyCode':'USD','value':2300,'decimalValue':'23.00','fractionalDigits':2},'quantity':1}]}); }); });</script>
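The same steps can be wrapped in a function that takes the script text as a string, which makes it easier to reuse and to handle pages where the call is missing. This is only a minimal sketch: the name extractCheckoutPayload and the choice to return null on failure are assumptions, not part of the original answer. It keeps the naive quote swap, so it will fail (and return null) if any value contains an apostrophe or a double quote.

```javascript
// Hypothetical helper (name and null-on-failure behaviour are assumptions).
function extractCheckoutPayload(scriptText) {
  // Capture the object literal between "checkoutConfirmed(" and the first "});".
  const match = scriptText.match(/checkoutConfirmed\((\{[\s\S]+?\})\);/);
  if (!match) return null; // the call is not present in this script

  // Naive conversion: swap every single quote for a double quote.
  const json = match[1].replace(/'/g, '"');
  try {
    return JSON.parse(json);
  } catch (err) {
    return null; // the text was not valid JSON after the swap
  }
}

// Shortened sample of the script text from the question.
const sample =
  "Y.Squarespace.CommerceAnalytics.checkoutConfirmed({'id':'12345676'," +
  "'orderNumber':'00065','testMode':true," +
  "'grandTotal':{'currencyCode':'USD','value':3239}});";

const order = extractCheckoutPayload(sample);
console.log(order.orderNumber); // "00065"
```

In the browser you would pass it document.querySelector(...).innerHTML instead of the sample string.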

Now that you have a well-formed object, you can manipulate it however you want. For example, to get the order number, read obj.orderNumber:

 const html = document.querySelector('script[type="dontexecute"]').innerHTML;
 const singleQuotedJSON = html.match(/checkoutConfirmed\((.+?)\);/)[1];
 const actualJSON = singleQuotedJSON.replace(/'/g, '"');
 const obj = JSON.parse(actualJSON);
 console.log(obj.orderNumber);
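Since the goal in the question is ecommerce tracking in GTM, the parsed object could then be reshaped into a dataLayer event. The sketch below is an assumption about what that might look like: the function name toPurchaseEvent is hypothetical, and the GA4-style purchase field names (transaction_id, value, items, and so on) are one common layout rather than anything the original answer prescribes. Adjust the keys to whatever your GTM tags and variables expect.

```javascript
// Hypothetical mapper (name and GA4-style event shape are assumptions).
function toPurchaseEvent(order) {
  return {
    event: 'purchase',
    ecommerce: {
      transaction_id: order.orderNumber,
      currency: order.grandTotal.currencyCode,
      // decimalValue is a string such as '32.39', so convert it to a number.
      value: Number(order.grandTotal.decimalValue),
      tax: Number(order.taxTotal.decimalValue),
      shipping: Number(order.shippingTotal.decimalValue),
      items: order.items.map((item) => ({
        item_id: item.sku,
        item_name: item.productName,
        price: Number(item.unitPrice.decimalValue),
        quantity: item.quantity,
      })),
    },
  };
}

// Trimmed copy of the object parsed from the question's script tag.
const sampleOrder = {
  orderNumber: '00065',
  grandTotal: { currencyCode: 'USD', value: 3239, decimalValue: '32.39' },
  taxTotal: { currencyCode: 'USD', value: 204, decimalValue: '2.04' },
  shippingTotal: { currencyCode: 'USD', value: 735, decimalValue: '7.35' },
  items: [
    {
      sku: '123456',
      productName: 'This is a Product',
      unitPrice: { currencyCode: 'USD', value: 2300, decimalValue: '23.00' },
      quantity: 1,
    },
  ],
};

const purchaseEvent = toPurchaseEvent(sampleOrder);
console.log(purchaseEvent.ecommerce.value); // 32.39

// In a page with GTM installed you would then push the event:
// window.dataLayer = window.dataLayer || [];
// window.dataLayer.push(purchaseEvent);
```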

Hi, developer arriving from a Google search. CertainPerformance's answer is the best answer for general scraping, but if all you care about is getting this particular Squarespace order detail information on a given Order Confirmation page, here's the fast track to the object you want:

Y.Squarespace.CommerceAnalytics._yuievt.events["commerceTrack:commerce-checkout-confirmed"].details[0]
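Because that path goes through an underscore-prefixed (internal-looking) property, it may be missing on other pages or change without notice. A guarded lookup along these lines avoids a TypeError when any step is absent. The function name getConfirmedOrder and the fakeY stand-in are hypothetical, used here only so the sketch runs outside a Squarespace page.

```javascript
// Hypothetical wrapper around the property path quoted above.
function getConfirmedOrder(Y) {
  // Optional chaining stops at the first missing step instead of throwing.
  const events = Y?.Squarespace?.CommerceAnalytics?._yuievt?.events;
  const details =
    events?.['commerceTrack:commerce-checkout-confirmed']?.details;
  return Array.isArray(details) && details.length > 0 ? details[0] : null;
}

// Stand-in for the global Y object, mimicking only the path used above.
const fakeY = {
  Squarespace: {
    CommerceAnalytics: {
      _yuievt: {
        events: {
          'commerceTrack:commerce-checkout-confirmed': {
            details: [{ id: '12345676', orderNumber: '00065' }],
          },
        },
      },
    },
  },
};

console.log(getConfirmedOrder(fakeY).orderNumber); // "00065"
console.log(getConfirmedOrder({})); // null
```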

Have fun! =)

