
C# to automate form login and data mining

My users have provided usernames and passwords, which I store in my database. I want to use these records to log in automatically to a third-party website, navigate to a certain page, make a selection, and mine data from the HTML response of that page. In short, the process is as follows:

Go to the login page -> fill in the username and password, select a dropdown option, and submit -> select a certain dropdown -> submit the selection -> mine the HTML response -> repeat the process with the next username and password.

[UPDATE] I have learnt how to do web scraping in .NET, which solves the data mining part.

What I am still missing is the ability to automate the login (fill in the username and password, then submit). If the server uses a session ID to track the login, I will need to reuse the same session ID for both logging in and scraping.

I wrote an application that sends you an email or text message if the value in a small section of a web page changes. An example would be when a competitor's price box for a given product changes. Some of those pages were behind a login page.

As seen below, I entered the URL of the login page in the first row and clicked the "Get" button for that row. The application fetched the page and then populated the "Inputs" grid below with all of the controls on that page.

I then have to enter the values that will be posted back. A radio button selects which button is tagged as the one doing the postback. So your process has to post back the same input names that the page has, along with the correct button name.
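To make the "same input names plus the button name" point concrete, here is a minimal sketch of how the body of such a postback could be assembled. The helper name `FormPost.BuildBody` is made up for illustration, and the field names in the usage comment (`txtUsername`, `dropdown`, `button`) are placeholders; the real names have to be read from the login page itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class FormPost
{
    // Joins name/value pairs into a URL-encoded form body.
    // The pairs must use the input names from the page, and should
    // include the name/value of the button that performs the postback.
    public static string BuildBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            Uri.EscapeDataString(f.Key) + "=" + Uri.EscapeDataString(f.Value ?? "")));
    }
}

// Usage (placeholder names and values):
//   string body = FormPost.BuildBody(new[]
//   {
//       new KeyValuePair<string, string>("txtUsername", "user123"),
//       new KeyValuePair<string, string>("dropdown", "2015"),
//       new KeyValuePair<string, string>("button", "Login"),
//   });
```

The resulting string would be sent as the request body with the content type `application/x-www-form-urlencoded`. Pages that use hidden inputs generally expect those to be posted back as well.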

[Screenshot: the application, showing the URL rows and the "Inputs" grid]

For this to work, cookies have to be enabled. The login step returns cookies with a security token added, and the second step has to send back cookies with the same token. This is how the site knows you are logged in. The tokens have an expiry time.
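The cookie round trip described above can be sketched as follows. This is only an illustration under assumptions: it uses `HttpClient` with a shared `CookieContainer` (swapped in here for brevity; the same container can be assigned to `HttpWebRequest.CookieContainer`), and the class name `SessionLogin`, the field names `txtUsername`, `txtPassword`, and `button`, and the value `Login` are all placeholders for whatever the target page actually uses.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper, not part of any library.
public static class SessionLogin
{
    // Creates a client that stores cookies from responses in the given
    // container and sends them back on later requests.
    public static HttpClient CreateClient(CookieContainer cookies)
    {
        var handler = new HttpClientHandler { CookieContainer = cookies, UseCookies = true };
        return new HttpClient(handler);
    }

    // Step 1: post the login form. Step 2: request the data page with the
    // same client, so the session cookie from step 1 is sent along.
    public static async Task<string> LoginAndFetchAsync(
        HttpClient client, Uri loginUrl, Uri dataUrl, string username, string password)
    {
        var form = new FormUrlEncodedContent(new Dictionary<string, string>
        {
            ["txtUsername"] = username, // assumed input name
            ["txtPassword"] = password, // assumed input name
            ["button"] = "Login",       // assumed submit button name/value
        });

        using (var loginResponse = await client.PostAsync(loginUrl, form))
        {
            loginResponse.EnsureSuccessStatusCode();
        }

        return await client.GetStringAsync(dataUrl);
    }
}
```

Using a fresh `CookieContainer` (and client) per username keeps one user's session token from leaking into the next user's requests. A successful status code does not by itself prove the login worked, so the returned HTML should still be checked.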

To see what this looks like, do it once manually while watching the traffic with Fiddler.

I wrote this in C#. For simple sites I used the HttpWebRequest object. When Ajax was involved I used the C# WebBrowser control so I could wait for the full page to load, a complication you might have to deal with.

I found a solution. Please refer to my code below; it is enough to build on for more complex cases.

    // Counts completed page loads so that each load triggers the next step.
    private int step = 0;

    private void bttExecute_Click(object sender, EventArgs e)
    {
        // URL holds the address of the login page (defined elsewhere).
        webBrowser1.Navigate(URL);
    }

    private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // Note: DocumentCompleted also fires for each frame, so pages with
        // frames can advance the counter more than once per navigation.
        step++;

        switch (step)
        {
            case 1: // Login step
                // Set the text box through its value attribute.
                webBrowser1.Document.GetElementById("txtUsername").SetAttribute("value", "user123");
                // The password box can be filled in the same way.

                // Select the radio button by clicking it.
                HtmlElement radio = webBrowser1.Document.GetElementById("radio123");
                if (radio != null)
                {
                    radio.InvokeMember("click");
                }

                // Choose the dropdown option, then submit the form.
                webBrowser1.Document.GetElementById("dropdown").SetAttribute("value", "2015");
                webBrowser1.Document.GetElementById("button").InvokeMember("click");
                break;

            case 2:
                // Grab the source code and process it further.
                string sourceCode = webBrowser1.DocumentText;
                break;

            case 3:
                // Placeholder for further steps.
                break;
        }
    }
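The code above handles a single hard-coded user. One way to cover the "repeat with the next username and password" part of the question is to keep the step counter and the pending credentials in a small helper. The sketch below is an assumption about how that could look; the `Credential` and `LoginRun` types are invented for illustration and do not come from the answer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration only.
public sealed class Credential
{
    public string Username { get; }
    public string Password { get; }

    public Credential(string username, string password)
    {
        Username = username;
        Password = password;
    }
}

public sealed class LoginRun
{
    private readonly Queue<Credential> pending;

    public int Step { get; private set; }
    public Credential Current { get; private set; }

    public LoginRun(IEnumerable<Credential> credentials)
    {
        pending = new Queue<Credential>(credentials);
    }

    // Resets the step counter and moves to the next account.
    // Returns false when no accounts are left.
    public bool MoveToNextUser()
    {
        Step = 0;
        if (pending.Count == 0)
        {
            Current = null;
            return false;
        }
        Current = pending.Dequeue();
        return true;
    }

    // Call once per completed page load; returns the new step number.
    public int NextStep()
    {
        return ++Step;
    }
}
```

In the handler, `switch (run.NextStep())` would replace `step++`, `run.Current.Username` would replace `"user123"`, and after the last step something like `if (run.MoveToNextUser()) webBrowser1.Navigate(URL);` would restart the flow. The previous user's session cookie may still be active at that point, so a logout step may be needed before the next login.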

After user_1 has logged in successfully, you can bring back the login page for user_2 by setting window.location = "URL of your login page"; For example, in PHP: echo("window.location='http://10.224.41.131/barcode_scan/main.html';");
