
Date in JavaScript different than the page source

If I view the page source of this website there is a piece of HTML that looks like:

<tr>
    <td class="start-time text-right">2020-01-05T16:30:00Z</td>
    <td>Pre-Show</td>
    <td>Tech Crew</td>
    <td rowspan="2" class="visible-lg text-center"> <i class="fa fa-clock-o text-gdq-red" aria-hidden="true"></i>
        0:10:00 </td>
</tr>

If I get the start-time class's innerText with JavaScript, I get the string "10:30 AM".
This is not too surprising, because it is the same as what is displayed in the browser.

But how can I get the original large timestamp and turn it into a date object?

To build a Date object we need a day, a month, a year, a time in 24-hour format, and a timezone.

Looking at the link you provided, I can see that the dates are given inside .day-split, so with some simple text manipulation we can extract the needed info.

JavaScript:

// data scraped from the website:
var date = "Sunday, January 5th"; // from inside .day-split
var time = "10:30 AM"; // from inside .start-time
var timezone = "(detected as UTC+02:00)"; // from span #offset-detected

// data extraction and cleanup:
var day = parseInt(date.split(" ")[2], 10);
var month = date.split(" ")[1];
var year = new Date().getFullYear(); // assumes the event takes place in the current year
var time24 = convertTo24Hour(time);
var offset = timezone.split(" ")[2].replace(")", "");

// building the string and parsing it
// (this is not an ISO 8601 string, so parsing is implementation-dependent):
var dateString = [day, month, year, time24, offset].join(" ");
var dateObj = new Date(dateString);

// logging the output:
console.log(dateObj);

// a function used to convert the time format (12H -> 24H):
function convertTo24Hour(time) {
    var hours = parseInt(time, 10); // reads the leading digits up to the ":"
    if (time.indexOf('AM') != -1 && hours == 12) {
        time = time.replace(/^12/, '0'); // 12 AM is hour 0
    }
    if (time.indexOf('PM') != -1 && hours < 12) {
        // replace only the leading hour digits, so "01:30 PM" becomes "13:30"
        time = time.replace(/^\d+/, String(hours + 12));
    }
    return time.replace(/( AM| PM)/, '');
}

My answer assumes you already know how to scrape data from the website.
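The string assembled above mixes a month name with a UTC offset, a format that `Date` is not required to parse consistently across engines. As a hedged alternative, the same pieces can be assembled into an ISO 8601 string instead. The helper names `parse12HourTime`, `pad2`, and `buildDate` below are hypothetical, and the sketch assumes the year, the 0-based month index, and an offset of the form "+02:00" have already been extracted:

```javascript
// Hypothetical helper: turns "10:30 AM" style text into 24-hour parts.
function parse12HourTime(text) {
    var match = /^\s*(\d{1,2}):(\d{2})\s*(AM|PM)\s*$/i.exec(text);
    if (!match) {
        throw new Error("Unrecognized time: " + text);
    }
    var hours = Number(match[1]) % 12;      // 12 maps to 0 for both AM and PM
    if (match[3].toUpperCase() === "PM") {
        hours += 12;                        // 1 PM -> 13, 12 PM -> 12
    }
    return { hours: hours, minutes: Number(match[2]) };
}

// Hypothetical helper: left-pads a number to two digits.
function pad2(n) {
    return String(n).padStart(2, "0");
}

// Assumption: monthIndex is 0-based and offset looks like "+02:00" or "-06:00".
function buildDate(year, monthIndex, day, timeText, offset) {
    var t = parse12HourTime(timeText);
    var iso = year + "-" + pad2(monthIndex + 1) + "-" + pad2(day) +
        "T" + pad2(t.hours) + ":" + pad2(t.minutes) + ":00" + offset;
    return new Date(iso);
}
```

For example, `buildDate(2020, 0, 5, "10:30 AM", "+02:00")` builds the string "2020-01-05T10:30:00+02:00" before handing it to `Date`.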

You'd have to navigate the DOM and parse out the textContent of different elements in the page, like so:

var months = ["January","February","March","April","May","June","July","August","September","October","November","December"];
function getDate(timeTD){
    // Walk backwards from the row holding the time cell to the nearest .day-split row.
    var dateTR = timeTD.parentNode;
    while(dateTR && !dateTR.classList.contains("day-split")){
        dateTR = dateTR.previousElementSibling;
    }
    if(!dateTR) return null; // no date header row found

    var month = months.indexOf(dateTR.textContent.split(" ")[1]);
    var day = Number(dateTR.textContent.replace(/\D/g, ""));
    var timeText = timeTD.textContent;
    var hours = Number(timeText.split(":")[0]) % 12; // 12 maps to 0; PM is adjusted below
    if(timeText.toLowerCase().indexOf("pm") > -1) hours += 12;
    var minutes = Number(timeText.split(":")[1].replace(/\D/g, ""));

    // Build the date in one step so that setting the month cannot overflow
    // (e.g. when today is the 31st). Assumes the current year and local time.
    return new Date(new Date().getFullYear(), month, day, hours, minutes);
}

getDate(document.getElementsByClassName("start-time")[0]);
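Both approaches rebuild the date from the text the browser displays. Since the question shows that the unmodified markup already contains an ISO 8601 timestamp, another option is to read it from the raw page source, if that source is available as a string. The following is only a sketch: the function name `extractStartTimes` is hypothetical, how `rawHtml` is obtained is left as an assumption, and a regular expression is a shortcut rather than a robust HTML parser.

```javascript
// Assumption: rawHtml holds the unmodified page source as a string.
function extractStartTimes(rawHtml) {
    // Matches <td ... class="... start-time ...">TEXT</td> and captures TEXT.
    var pattern = /<td[^>]*class="[^"]*\bstart-time\b[^"]*"[^>]*>\s*([^<]+?)\s*<\/td>/g;
    var dates = [];
    var match;
    while ((match = pattern.exec(rawHtml)) !== null) {
        var parsed = new Date(match[1]); // ISO 8601 strings such as "2020-01-05T16:30:00Z"
        if (!isNaN(parsed.getTime())) {
            dates.push(parsed);          // keep only values that parsed to a valid date
        }
    }
    return dates;
}
```

In a browser, the string could presumably come from something like `fetch(location.href).then(function (r) { return r.text(); })`, assuming the server returns the same markup that "view source" shows.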


 