
Simulating JavaScript 'doPostBack()' in C#

I am writing a web scraper for my company. Our client gives us access to their website for this purpose, but their IT team does not communicate with us, so I have to build the program with no help from their side.

Their website uses JavaScript on all of its buttons and dropdown menus to post data back to the server so that the page updates to show the end user the correct information.

I am trying to get my program to simulate clicking the 'next page' button. That button has an onclick handler that reads like this:

onclick="javascript:WebForm_DoPostBackWithOptions(
new WebForm_PostBackOptions("ctl00$ContentPlaceHolder1$ucTaxQueueListView$lviewOrderQueue$DataPager2$ctl00$btnNextPage"
, "", true, "", "", false, false))"    

In my C# program, I am using the HttpWebRequest class and the HtmlAgilityPack to do my requests and scraping, respectively.

I've done all I can in my code to try to get this to work. The only thing that works is to use Fiddler to copy the post data and paste it verbatim into my WebRequest function. That is very impractical when I potentially have to walk through 1000+ 'next pages'.

I have also tried extracting the ViewState from the page and using that, but that always gives me an 'error' page.
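
Here is a stripped-down sketch of what I am attempting. The URL is a placeholder, and the control name is taken from the onclick above; my understanding is that ASP.NET's client script copies that first WebForm_PostBackOptions argument into the hidden __EVENTTARGET field before submitting the form, and that __VIEWSTATE / __EVENTVALIDATION change on every response (so they have to be re-extracted each time) while the session cookies have to be carried across requests:

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Net;
    using System.Text;
    using HtmlAgilityPack;

    class PostBackPager
    {
        // One cookie container for the whole run so the ASP.NET session persists.
        static readonly CookieContainer Cookies = new CookieContainer();

        // Placeholder address; substitute the client's actual page URL.
        const string PageUrl = "https://example.com/OrderQueue.aspx";

        // Taken from the onclick handler above; this becomes __EVENTTARGET.
        const string NextPageTarget =
            "ctl00$ContentPlaceHolder1$ucTaxQueueListView$lviewOrderQueue$DataPager2$ctl00$btnNextPage";

        static void Main()
        {
            string html = Download(PageUrl, null); // initial GET

            for (int page = 1; page <= 1000; page++)
            {
                var doc = new HtmlDocument();
                doc.LoadHtml(html);

                // ... scrape whatever is needed out of `doc` here ...

                // These hidden fields change on every postback, which is why a
                // request body captured once in Fiddler only works one time.
                var fields = new Dictionary<string, string>
                {
                    { "__EVENTTARGET",        NextPageTarget },
                    { "__EVENTARGUMENT",      "" },
                    { "__VIEWSTATE",          Hidden(doc, "__VIEWSTATE") },
                    { "__VIEWSTATEGENERATOR", Hidden(doc, "__VIEWSTATEGENERATOR") },
                    { "__EVENTVALIDATION",    Hidden(doc, "__EVENTVALIDATION") },
                };

                html = Download(PageUrl, fields); // POST simulating the click
            }
        }

        static string Hidden(HtmlDocument doc, string name)
        {
            var node = doc.DocumentNode.SelectSingleNode("//input[@name='" + name + "']");
            return node == null ? "" : node.GetAttributeValue("value", "");
        }

        // GET when fields is null, otherwise a form-encoded POST.
        static string Download(string url, Dictionary<string, string> fields)
        {
            var req = (HttpWebRequest)WebRequest.Create(url);
            req.CookieContainer = Cookies;

            if (fields != null)
            {
                var sb = new StringBuilder();
                foreach (var kv in fields)
                {
                    if (sb.Length > 0) sb.Append('&');
                    sb.Append(Uri.EscapeDataString(kv.Key))
                      .Append('=')
                      .Append(Uri.EscapeDataString(kv.Value));
                }
                byte[] body = Encoding.UTF8.GetBytes(sb.ToString());

                req.Method = "POST";
                req.ContentType = "application/x-www-form-urlencoded";
                req.ContentLength = body.Length;
                using (var stream = req.GetRequestStream())
                    stream.Write(body, 0, body.Length);
            }

            using (var resp = (HttpWebResponse)req.GetResponse())
            using (var reader = new StreamReader(resp.GetResponseStream()))
                return reader.ReadToEnd();
        }
    }

If the server still rejects a request like this, a Fiddler diff against a real browser click should show what else it expects; some pages also require the rest of the form's input fields, a Referer header, or a browser-like User-Agent.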

Any help or guidance would be appreciated and even compensated... my boss wants this project completed this weekend!!!

The last time I had to do a project similar to this, I took a very different approach.

I used GreaseMonkey (though you could also use a Windows HTA file to the same effect) and let the script run and step through the pages one by one. To handle the DoPostBack, I simply invoked the click handler on the appropriate elements.

I had several data stores going.

One data store recorded every menu item I had "clicked" on, to avoid duplicating work.

Another held the raw HTML of each page (taken from body.innerHTML).

Once I had cloned all the pages, I wrote another GreaseMonkey script to load each saved page and mine whatever info I needed from it. I built up a third data store of resources (images and CSS) and then pulled those down by piping a big text file of URLs into cURL.
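
If you would rather stay in C#, the same drive-the-real-browser idea works with the WinForms WebBrowser control: let the page's own JavaScript handle the postback by firing the real click handler, then harvest the rendered HTML. A rough, untested sketch; the URL is a placeholder, and WebForms normally derives the client-side id from the UniqueID in your onclick by replacing '$' with '_':

    using System;
    using System.IO;
    using System.Windows.Forms;

    class BrowserDrivenScraper
    {
        [STAThread]
        static void Main()
        {
            var browser = new WebBrowser { ScriptErrorsSuppressed = true };
            int page = 0;

            // Fires after the initial load and again after every postback.
            // (A real version should check e.Url, since frames also raise this.)
            browser.DocumentCompleted += (sender, e) =>
            {
                // Save the rendered page, the same body.innerHTML capture as above.
                File.WriteAllText("page" + page + ".html", browser.Document.Body.InnerHtml);
                page++;

                // Fire the real click handler so the site's own
                // WebForm_DoPostBackWithOptions call runs untouched.
                var next = browser.Document.GetElementById(
                    "ctl00_ContentPlaceHolder1_ucTaxQueueListView_lviewOrderQueue_DataPager2_ctl00_btnNextPage");

                if (next != null && page <= 1000)
                    next.InvokeMember("click");
                else
                    Application.Exit();
            };

            browser.Navigate("https://example.com/OrderQueue.aspx"); // placeholder URL
            Application.Run(); // message loop; keeps the browser alive between postbacks
        }
    }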
