Novice questions
Jul 08, 2005 11:49 AM | terryrey
Sorry if these seem too easy, but I am new to HttpHandlers.
The manual process:
1. I log in to a secure site.
2. The site redirects me to a URI similar to this: https://subDomain.Domain.com/SubDir/desktop.asp?abc=EditLayout&defe=ChangeDep&Divi=qwerty&returnURL=
3. I click a search link, which changes the query string and the content displayed.
4. I enter a search term and, if it is found, the results are displayed in a multi-tabbed box. The tabs appear to be controlled by JavaScript, and selecting them displays different data.
5. One thing I cannot explain is that if I view the source, none of the data is on the page, yet it is visible.
Now, I have got as far as passing my credentials to the site and retrieving the login page. What I can't figure out is the following:
1. How do I know if my credentials were accepted? The HttpWebResponse status code returns OK, but the login page is returned and not the redirected one.
2. Once the original POST is done, do my credentials remain valid or are they lost?
3. How do they hide the data?
My code is as follows:
' The class module
Imports System.Collections.Specialized
Imports System.IO
Imports System.Net
Imports System.Text

Class Scrape
    Private UserName As String
    Private UserPwd As String
    Private ProxyServer As String
    Private ProxyPort As Integer
    Private Request As String

    Sub New(ByVal HttpUserName As String, ByVal HttpUserPwd As String, ByVal HttpProxyServer As String, ByVal HttpProxyPort As Integer, ByVal HttpRequest As String)
        UserName = HttpUserName
        UserPwd = HttpUserPwd
        ProxyServer = HttpProxyServer
        ProxyPort = HttpProxyPort
        Request = HttpRequest
    End Sub 'New

    Public Overridable Function CreateWebRequest(ByVal uri As String, ByVal collHeader As NameValueCollection, ByVal RequestMethod As String, ByVal NwCred As Boolean) As HttpWebRequest
        ' Fully qualify WebRequest so it is not confused with the local variable of the same name
        Dim webrequest As HttpWebRequest = CType(System.Net.WebRequest.Create(uri), HttpWebRequest)
        webrequest.KeepAlive = False
        webrequest.Method = RequestMethod
        ' Copy any caller-supplied headers (for example Cookie) onto the request
        Dim iCount As Integer = collHeader.Count
        Dim key As String
        Dim keyvalue As String
        Dim i As Integer
        For i = 0 To iCount - 1
            key = collHeader.Keys(i)
            keyvalue = collHeader(i)
            webrequest.Headers.Add(key, keyvalue)
        Next i
        webrequest.ContentType = "text/html"
        If ProxyServer.Length > 0 Then
            webrequest.Proxy = New WebProxy(ProxyServer, ProxyPort)
        End If
        ' Redirects are not followed, so a 3xx response is returned to the caller as-is
        webrequest.AllowAutoRedirect = False
        If NwCred Then
            Dim wrCache As New CredentialCache
            wrCache.Add(New System.Uri(uri), "Basic", New NetworkCredential(UserName, UserPwd))
            webrequest.Credentials = wrCache
        End If
        collHeader.Clear()
        Return webrequest
    End Function 'CreateWebRequest

    Public Overridable Function GetFinalResponse(ByVal ReUri As String, ByVal Cookie As String, ByVal RequestMethod As String, ByVal NwCred As Boolean) As String
        Dim collHeader As New NameValueCollection
        If Cookie.Length > 0 Then
            collHeader.Add("Cookie", Cookie)
        End If
        Dim webrequest As HttpWebRequest = CreateWebRequest(ReUri, collHeader, RequestMethod, NwCred)
        BuildReqStream(webrequest)
        Dim webresponse As HttpWebResponse
        webresponse = CType(webrequest.GetResponse(), HttpWebResponse)
        ' Kept for inspection in the debugger
        Dim sc As HttpStatusCode = webresponse.StatusCode
        Dim st As System.Uri = webresponse.ResponseUri
        Dim enc As Encoding = System.Text.Encoding.GetEncoding(1252)
        Dim loResponseStream As New StreamReader(webresponse.GetResponseStream(), enc)
        Dim Response As String = loResponseStream.ReadToEnd()
        loResponseStream.Close()
        webresponse.Close()
        Return Response
    End Function 'GetFinalResponse

    ' Writes the request body (the Request string passed to the constructor)
    Private Sub BuildReqStream(ByRef webrequest As HttpWebRequest)
        Dim bytes As Byte() = Encoding.ASCII.GetBytes(Request)
        webrequest.ContentLength = bytes.Length
        Dim oStreamOut As Stream = webrequest.GetRequestStream()
        oStreamOut.Write(bytes, 0, bytes.Length)
        oStreamOut.Close()
    End Sub 'BuildReqStream
End Class

' The code from the calling form
Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
    Dim resp As New Scrape("username", "password", "", 0, "Submit")
    txtData.Text = resp.GetFinalResponse("https://www.SubDomain.MainDomain.com/Directory/default.asp", "", "POST", False)
    resp = Nothing
End Sub
Thanks.
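On question 1, one possible check, offered as a sketch rather than a definitive answer: because the code sets AllowAutoRedirect = False, a login that the server accepts by redirecting should come back as a 3xx status with a Location header, while a plain 200 OK carrying the login page again suggests the credentials (or the posted form fields) were not accepted. This assumes the site signals success with a redirect, as step 2 of the manual process implies. The module and function names below are hypothetical. On question 2, the server normally tracks the logged-in session through a cookie, so the login only "remains valid" if later requests send that cookie back.

```vbnet
Imports System
Imports System.Net

Module LoginCheck
    ' Hypothetical helper: True when the status is a redirect and a target is supplied.
    Public Function LooksLikeLoginRedirect(ByVal status As HttpStatusCode, ByVal location As String) As Boolean
        Dim code As Integer = CInt(status)
        Dim isRedirect As Boolean = (code = 301 OrElse code = 302 OrElse code = 303 OrElse code = 307)
        Return isRedirect AndAlso Not String.IsNullOrEmpty(location)
    End Function

    ' Convenience wrapper for a live response obtained with AllowAutoRedirect = False.
    Public Function LoginAccepted(ByVal response As HttpWebResponse) As Boolean
        Return LooksLikeLoginRedirect(response.StatusCode, response.Headers("Location"))
    End Function
End Module
```

If LoginAccepted returns True, the Location header holds the URI to request next, with the session cookie attached.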
Re: Novice questions
Jul 11, 2005 04:13 AM | terryrey
I've managed to resolve two problems: (1) I was not pointing to the correct URI after login, and (2) I didn't allow for cookie support. I can now navigate around the site.
I still need some advice though. When I view the source of the search page, this is what I see:
<td>
<Input name=btnSearch type=button onclick="
str = '<SParam>';
str += '<System>CC_01</System>';
str += '<Name>' + txtFName.value + '</Name>';
'...........a few more parameters like the one above
str += '<Seed>0</Seed>';
str += '</SParam>';
TData.loadXML(str);
pnlResult.innerHTML='';
tbResults.click();"
value="Search=" ID="SrcButton">>">
</td>
I think the line TData.loadXML(str) is calling a JavaScript function that passes an XML query string (str), so my new question is: how do I get this to execute and see the results?
I'd really appreciate some help on this if anyone can provide it as I am stumped here.
Thanks in advance
Terry.
Re: Novice questions
Jul 11, 2005 06:49 AM | ariya
Hi Terry,
The JavaScript code you saw uses a method called Remote Scripting.
You can use XMLHTTP to send data to a page from a client-side method and get a response; a server-side method is, of course, called on the other side.
There are several components that do this job for you, and most of them are free.
One of them is called Ajax; you can find 3 or 4 articles about it and how to use it at www.codeproject.com.
Also take a look at this article:
www.eggheadcafe.com/articles/20050514.asp
And now I have one question!
How do you manage cookies in your HTTP request?
Would you please explain?
Regards,
Ariya
Blog: http://arashnorouzi.wordpress.com
Re: Novice questions
Jul 11, 2005 09:32 AM | terryrey
Hi Ariya,
Thank you for your response. I shall look this up now.
In answer to your question, I added the following to the code from my first post.
In the CreateWebRequest function:
' Assumes a class-level field: Private Cookies As CookieCollection
webrequest.CookieContainer = New CookieContainer
' AndAlso short-circuits, so Count is never read when Cookies is Nothing
If Not Me.Cookies Is Nothing AndAlso Me.Cookies.Count > 0 Then
    webrequest.CookieContainer.Add(Me.Cookies)
End If
In the GetFinalResponse function, after the line
"webresponse = CType(webrequest.GetResponse(), HttpWebResponse)":
If webresponse.Cookies.Count > 0 Then
    If Me.Cookies Is Nothing Then
        Me.Cookies = webresponse.Cookies
    Else
        Dim newCookie As Cookie
        For Each newCookie In webresponse.Cookies
            Dim b As Boolean = False
            Dim currentCookie As Cookie
            ' Update the stored cookie if one with the same name already exists
            For Each currentCookie In Me.Cookies
                If currentCookie.Name = newCookie.Name Then
                    currentCookie.Value = newCookie.Value
                    b = True
                    Exit For
                End If
            Next
            ' Add only after the inner loop, so the collection is not modified while it is being enumerated
            If Not b Then
                Me.Cookies.Add(newCookie)
            End If
        Next
    End If
End If
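A possible simplification of the manual merge above, sketched under the assumption that every request in the session goes through one helper: if the same CookieContainer instance is assigned to each HttpWebRequest, the framework stores cookies from each response in that container and sends the matching ones on later requests, so no hand-written merging should be needed. The module name, function names, and the example.com address in the usage note are hypothetical.

```vbnet
Imports System
Imports System.Net

Module CookieSession
    ' One container shared by every request in the scraping session.
    Private ReadOnly SessionCookies As New CookieContainer()

    ' Creates a request that reads from and writes to the shared container.
    Public Function CreateRequest(ByVal address As String) As HttpWebRequest
        Dim req As HttpWebRequest = CType(WebRequest.Create(address), HttpWebRequest)
        req.CookieContainer = SessionCookies
        Return req
    End Function

    ' Returns the cookies currently stored for the given address (useful when debugging).
    Public Function CookiesFor(ByVal address As String) As CookieCollection
        Return SessionCookies.GetCookies(New System.Uri(address))
    End Function
End Module
```

For example, after calling GetResponse on CookieSession.CreateRequest("https://example.com/login.asp"), CookiesFor with the same address should list whatever session cookie the server set.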
Re: Novice questions
Jul 12, 2005 06:04 AM | terryrey
Hi Ariya,
I have looked into remote scripting and Ajax and now understand what's happening and why I can't see the content on the page :)
In my original post, I mentioned that I was attempting to scrape the data from a remote site. What I can't yet grasp, despite the reading, is how to use Ajax or XMLHTTP for this.
When I scrape, I get to the search page and find this:
<Input name=btnSearch type=button onclick="
str = '<SParam>';
str += '<System>CC_01</System>';
str += '<Name>' + txtFName.value + '</Name>';
str += '<Seed>0</Seed>';
str += '</SParam>';
TData.loadXML(str);
pnlResult.innerHTML='';
tbResults.click();"
value="Search=" ID="SrcButton">>">
I can rebuild the string like this (without the JavaScript statement terminators, which do not belong inside the XML):
Dim sb As New StringBuilder
sb.Append("<SearchParam>")
sb.Append("<System>CC_01</System>")
sb.Append("<Name>A.N Other</Name>")
sb.Append("<SearchSeed>0</SearchSeed>")
sb.Append("</SearchParam>")
But how do I send the equivalent of "TData.loadXML(sb);" back to the server?
Can this be done using Ajax/XMLHTTP?
Regards
Terry.
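A hedged sketch of one way to approach this from the scraper side. In MSXML, loadXML only parses a string into an XML document in the browser; it does not contact the server by itself. The actual request is therefore probably made by whatever tbResults.click() triggers, and the real target URL would have to be found by inspecting that script or watching the traffic with an HTTP debugging proxy. Assuming the server accepts the XML as a POST body, the request could be reproduced with HttpWebRequest. The module and function names, the "text/xml" content type, and the url argument are all assumptions, and the parameters elided in the page source are not reproduced.

```vbnet
Imports System
Imports System.IO
Imports System.Net
Imports System.Security
Imports System.Text

Module SearchPost
    ' Builds the XML that the page's onclick handler assembles (element names taken from the page source).
    Public Function BuildSearchXml(ByVal name As String) As String
        Dim sb As New StringBuilder()
        sb.Append("<SParam>")
        sb.Append("<System>CC_01</System>")
        ' Escape the user-supplied value so characters such as & and < stay valid XML
        sb.Append("<Name>").Append(SecurityElement.Escape(name)).Append("</Name>")
        sb.Append("<Seed>0</Seed>")
        sb.Append("</SParam>")
        Return sb.ToString()
    End Function

    ' Posts the XML to a URL (hypothetical; use the one the page really calls) and returns the response body.
    Public Function PostXml(ByVal url As String, ByVal payload As String, ByVal cookies As CookieContainer) As String
        Dim req As HttpWebRequest = CType(WebRequest.Create(url), HttpWebRequest)
        req.Method = "POST"
        req.ContentType = "text/xml" ' assumption: the server expects raw XML
        req.CookieContainer = cookies ' carries the logged-in session
        Dim body As Byte() = Encoding.UTF8.GetBytes(payload)
        req.ContentLength = body.Length
        Using outStream As Stream = req.GetRequestStream()
            outStream.Write(body, 0, body.Length)
        End Using
        Using resp As HttpWebResponse = CType(req.GetResponse(), HttpWebResponse)
            Using reader As New StreamReader(resp.GetResponseStream(), Encoding.UTF8)
                Return reader.ReadToEnd()
            End Using
        End Using
    End Function
End Module
```

The string returned by PostXml would then contain whatever the server normally hands back to the page script, which is the data that never appears in View Source.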
Re: Novice questions
Jul 14, 2005 08:38 AM | terryrey