Parsing HTML source client side.

Last post 10-13-2008 8:55 AM by NC01. 11 replies.

  • Parsing HTML source client side.

    09-13-2008, 9:15 PM
    • Member
      1 point Member
    • macmaster9600
    • Member since 09-14-2008, 1:07 AM
    • Posts 6

    OK, so here is my issue. We have a billing platform and a research tool. They are both web-based, and I am working on the research tool. I want a client-side script (Java, DHTML, whatever) that will first look for the billing site (which is open in IE6) and parse the page so I can pull the address data out. We do this all the time in VB.NET and it works great. However, as we move this tool over to ASP.NET we need the same functionality. The user will have to have the active order open in the billing system and initiate the pull from the research tool. I can provide the VB.NET script we use to pull and parse the data, if it would help, but I haven't been able to find anything that would let this happen from a web interface. Please help.

  • Re: Parsing HTML source client side.

    09-14-2008, 2:15 AM
    • Participant
      1,094 point Participant
    • ankit.sri
    • Member since 02-27-2008, 7:10 AM
    • Noida
    • Posts 235

    Let's see the VB.NET script.

    A fundamental rule in technology says whatever can be done will be done
  • Re: Parsing HTML source client side.

    09-14-2008, 7:47 AM
    • All-Star
      76,043 point All-Star
    • NC01
    • Member since 08-26-2005, 3:33 PM
    • Posts 14,169
    • TrustedFriends-MVPs

    What is the difference in using ASP.NET over classic ASP? You can still do the same things except that ASP.NET has a lot more functionality added.

    NC...

  • Re: Parsing HTML source client side.

    09-14-2008, 10:05 AM
    • Member
      1 point Member
    • macmaster9600
    • Member since 09-14-2008, 1:07 AM
    • Posts 6
    Private Sub CAPTToolStripMenuItem_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles CAPTToolStripMenuItem.Click

    On Error GoTo err_handler

    Dim FName As String = ""

    Dim LName As String = ""

    Dim SASN, City2, LV1A, LV1B, LV1C, strLECCompany, strPon As String

    'sets string so ticket form will auto grab from capt as well

    strgrab = "CAPT"

    Call CheckIE("http://capt.corp.coperate.com/Default.aspx", "CAPT")

    strHTMLCaptTable = ParseBy3variable(strHTMLSource, "this.src='Images/<>, "<TABLE class=OrderTable", "</TABLE>") ' Get Text Between the Table to find Due Date

    parseCTLnumber(objiedoc)

    ParseName(ParseBy2variable(strHTMLSource, "_lblEUName>", "</SPAN>"), FName, LName) ' Split up Name

    'FirstName.Text = StrConv(strFirstName, VbStrConv.ProperCase) ' First Name

    'LastName.Text = StrConv(strLastName, VbStrConv.ProperCase) ' Last Name

    txtHSN.Text = ParseBy3variable(strHTMLSource, "_txtSANO", "value=", " name=") ' Street Number

    'txtDir.Text = ParseBy3variable(strHTMLSource, "_txtSASD", "value=", " name=") ' Street Direction

    SASN = ParseBy3variable(strHTMLSource, "_txtSASN", "value=", " name=") ' Street Name

    txtSNM.Text = StrConv(Replace(SASN, Chr(34), ""), VbStrConv.ProperCase) ' Take "" out of Street name if space

    'txtType.Text = StrConv(ParseBy3variable(strHTMLSource, "_txtSATH", "value=", " name="), VbStrConv.ProperCase) ' Street Type

    City2 = ParseBy3variable(strHTMLSource, "_txtCITY2", "value=", " name=") ' City

    cmbCity.Text = StrConv(Replace(City2, Chr(34), ""), VbStrConv.ProperCase) 'Take "" out of City name if space

    txtStateSearch.Text = ParseBy3variable(strHTMLSource, "_txtSTATE2", "value=", " name=") ' State

    txtZIP.Text = ParseBy3variable(strHTMLSource, "_txtZipCode2", "value=", " name=") ' Zip

    LV1A = ParseBy3variable(strHTMLSource, "_txtLV1A", "value=", " name=") ' get all levels

    LV1B = ParseBy3variable(strHTMLSource, "_txtLV2A", "value=", " name=") ' get all levels

    LV1C = ParseBy3variable(strHTMLSource, "_txtLV3A", "value=", " name=") ' get all levels

    'Level.Text = LV1A + LV1B + LV1C ' Put all Levels Together

    'LRN.Text = ParseBy3variable(strHTMLSource, "_txtLRN", "value=", " name=")

    strLECCompany = ParseBy3variable(strHTMLSource, "_txtCarrier", "value=", " name=")

    'LECCompany.Text = Replace(strLECCompany, Chr(34), "")

    'If Level.Text = "0" Then ' Make sure level dont = 0 if so removes 0

    ' Level.Text = ""

    'End If

    strPon = ParseBy3variable(strHTMLCaptTable, "ponver", ",", "</A>") ' Grab Pon

    strPon = Split(strPon, "-")(0) ' split up pon and ver

    'txtODRPon.Text = strPon

    txtTN.Text = ParseBy2variable(strHTMLCaptTable, "_lblatn>", "</SPAN>") ' Grab TN from table

    If txtTN.Text = "" Then ' if pon comes back empty no order open in capt

    MsgBox("You do not have an order open.", MsgBoxStyle.SystemModal)

    Exit Sub

    End If

    If btnSearch.Enabled = True Then btnSearch_Click(sender, e)

    Exit Sub

    err_handler:

    If Err.Number = 91 Then

    MsgBox("You do not have an order open. Please open an order and try again.", MsgBoxStyle.Critical, "No order open")

    Else

    MsgBox(Err.Number & Err.Description & Err.Erl)

    End If

    End Sub

    Public Sub ParseName(ByVal strFullName As String, ByVal FName As String, ByVal LName As String)

    Dim P As Integer

    Dim strCutName As String

    Dim strReminderOfName As String

    strFirstName = ""

    strLastName = ""

    P = InStr(strFullName, ",")

    If P > 0 Then

    Dim I As Integer

    strFullName = Trim$(strFullName)

    P = 1

    For I = Len(strFullName) To 1 Step -1

    If Mid$(strFullName, I, 1) = "," Then

    P = I + 1

    Exit For

    End If

    Next I

    If P = 1 Then

    strCutName = strFullName

    strReminderOfName = ""

    Else

    strCutName = Mid$(strFullName, P)

    strReminderOfName = Trim$(Left$(strFullName, P - 1))

    End If

     

    strFirstName = strCutName '(First Name)

    strLastName = strReminderOfName '(Last Name)

    strLastName = StrConv(Replace(strLastName, ",", ""), VbStrConv.ProperCase) ' (Take out ,)

    strFirstName = Trim(strFirstName) '(Remove Spaces)

    strLastName = Trim(strLastName) ' (Remove Spaces)

    Else

    Dim I As Integer

    strFullName = Trim$(strFullName)

    P = 1

    For I = Len(strFullName) To 1 Step -1

    If Mid$(strFullName, I, 1) = " " Then

    P = I + 1

    Exit For

    End If

    Next I

    If P = 1 Then

    strCutName = strFullName

    strReminderOfName = ""

    Else

    strCutName = Mid$(strFullName, P)

    strReminderOfName = Trim$(Left$(strFullName, P - 1))

    End If

    strLastName = strCutName 'Last Name)

    strFirstName = strReminderOfName 'First Name

    End If

    End Sub

    Function ParseBy3variable(ByVal strHTML As String, ByVal DelimiterA As String, ByVal DelimiterB As String, ByVal DelimiterC As String)

    ParseBy3variable = ""

    Dim A, B, D As Integer

    Dim C As String

    A = 1

    While InStr(A, strHTML, DelimiterA) > 0

    A = InStr(A, strHTML, DelimiterA) + Len(DelimiterA)

    B = InStr(A, strHTML, DelimiterC)

    C = Microsoft.VisualBasic.Mid(strHTML, A, B - A)

    D = InStr(1, C, DelimiterB)

    If D = 0 Then

    Exit Function

    End If

    ParseBy3variable = Microsoft.VisualBasic.Mid(C, D + 6)

    End While

    End Function

    Function ParseBy2variable(ByVal strHTML As String, ByVal DelimiterA1 As String, ByVal DelimiterB1 As String)

    ParseBy2variable = ""

    Dim A As Integer, B As Integer

    A = 1

    While InStr(A, strHTML, DelimiterA1) > 0

    A = InStr(A, strHTML, DelimiterA1) + Len(DelimiterA1)

    B = InStr(A, strHTML, DelimiterB1)

    ParseBy2variable = Microsoft.VisualBasic.Mid(strHTML, A, B - A)

    End While

    End Function

    Public Sub CheckIE(ByVal URLa, ByVal URLname)

    Dim objSW As SHDocVw.ShellWindows

    Dim objIE As SHDocVw.InternetExplorer

    Dim objDoc As Object

    Dim bAppRunning As Boolean

    objSW = New SHDocVw.ShellWindows

    If objSW.Count Then ' new

    For Each objDoc In objSW

    If InStr(1, objDoc.LocationName, URLname) Then

    bAppRunning = True

    objDoc.Visible = True

    objDoc.StatusBar = 1

    Dim i As Integer

    i = 1

    While objDoc.Busy : End While : While objDoc.ReadyState <> 4 : End While

    strHTMLSource = objDoc.Document.getElementsByTagName("html")(0).outerHTML

    GetDLtable(objDoc.Document, "LTY*")

    objiedoc = objDoc

    Exit For

    End If

    Next objDoc

    End If

    If bAppRunning = False Then

    objIE = CreateObject("InternetExplorer.Application") ' new

    objIE.Visible = True

    objIE.Navigate(URLa)

    objIE.StatusBar = 1

    Dim i As Int16 = 1

    While objIE.Busy : End While : While objIE.ReadyState <> 4 : End While

    strHTMLSource = objIE.Document.getElementsByTagName("html")(0).outerHTML

    End If

     

     

    objIE = Nothing

    objSW = Nothing

    objDoc = Nothing

    End Sub

    Public Sub GetDLtable(ByVal IE As Object, ByVal TableSearch As String)

    Dim varTables As Object

    Dim varRows As Object

    Dim lngRow As Object

    Dim varCells As Object

    Dim lngColumn As Long

    varTables = IE.All.tags("TABLE")

    For Each varTable In varTables

    'Use the innerText to see if this is the table we want.

    If varTable.innerText Like TableSearch Then

    varRows = varTable.Rows

    lngRow = 2 'This will be the first output row

    For Each varRow In varRows

    varCells = varRow.Cells

    lngColumn = 1 'This will be the output column

    For Each varCell In varCells

    strTableText(lngColumn - 1) = varCell.innerText

    lngColumn = lngColumn + 1

    Next varCell

    lngRow = lngRow + 1

    Next varRow

    End If

    Next varTable

    End Sub

    Function parsebycombobox(ByVal ie As Object, ByVal controlname As String)

    parsebycombobox = ""

    Dim strby2 As String = ""

    Dim theElementCollection = ie.Document.GetElementsByTagName("SELECT")

    For Each curElement In theElementCollection

    If Microsoft.VisualBasic.Right(curElement.GetAttribute("Name"), controlname.Length) = controlname Then

    Dim strby3 As String = ie.Document.getElementByid(curElement.name).value()

    'MsgBox(ParseBy2variable(curElement.name, "gridOrder:_", ":"))

    strby2 = ParseBy2variable(strHTMLSource, strby3 & " selected>", "</OPTION>")

    End If

    Next

    parsebycombobox = strby2

    End Function

    Public Sub parseCTLnumber(ByVal ie As Object)

    strCTLnumber = ""

    Dim theElementCollection = ie.Document.GetElementsByTagName("SELECT")

    For Each curElement In theElementCollection

    'we look for the ddlcounty only because that is one control that is only visible when an order is open.

    'any other control that is visible only when a control is open will work as well.

    'the "9" value is the length of the ddlcounty string

    If Microsoft.VisualBasic.Right(curElement.GetAttribute("Name"), 9) = "ddlCounty" Then

    strCTLnumber = ParseBy2variable(curElement.name, "gridOrder:_", ":")

    'MsgBox(strCTLnumber)

    End If

    Next

    End Sub

    Public Function FindWindowByLocationURL(ByVal LocationURL As String) As SHDocVw.InternetExplorer

    FindWindowByLocationURL = New SHDocVw.InternetExplorer

    Dim objSW As New SHDocVw.ShellWindows

    Dim timer As Date

    blnwindowfound = False

    If objSW.Count Then

    For Each objIE In objSW

    If InStr(1, objIE.LocationURL, LocationURL) Then

    blnwindowfound = True

    objIE.StatusBar = 1

    objIE.Visible = True

    WaitAgain:

    timer = Now

    While objIE.Busy

    If DateDiff(DateInterval.Second, timer, Now, FirstDayOfWeek.Monday, FirstWeekOfYear.Jan1) > 10 Then ' seconds elapsed since timer was set

    objIE.Refresh()

    GoTo WaitAgain

    End If

    Application.DoEvents()

    End While

    timer = Now

    While objIE.ReadyState <> 4

    If DateDiff(DateInterval.Second, timer, Now, FirstDayOfWeek.Monday, FirstWeekOfYear.Jan1) > 10 Then ' seconds elapsed since timer was set

    objIE.Refresh()

    GoTo WaitAgain

    End If

    Application.DoEvents()

    End While

    FindWindowByLocationURL = objIE

    Exit For

    End If

    If objIE.LocationURL = "" Then objIE.Quit()

    Next objIE

    End If

    If blnwindowfound = False Then

    FindWindowByLocationURL = CreateObject("InternetExplorer.Application")

    FindWindowByLocationURL.StatusBar = 1

    FindWindowByLocationURL.Visible = True

    End If

    SetForegroundWindow(FindWindowByLocationURL.HWND)

    End Function

  • Re: Parsing HTML source client side.

    09-15-2008, 7:14 AM
    Answer
    • All-Star
      76,043 point All-Star
    • NC01
    • Member since 08-26-2005, 3:33 PM
    • Posts 14,169
    • TrustedFriends-MVPs

    It would take me a week to look through and debug all of that code, and, sorry, I just don't have that much time to allot to one post. Have you even tried running this code in VB.NET/ASP.NET?

    NC...

  • Re: Parsing HTML source client side.

    09-15-2008, 7:31 AM
    Answer
    • All-Star
      62,825 point All-Star
    • TATWORTH
    • Member since 02-04-2003, 8:34 AM
    • England
    • Posts 12,263
    • TrustedFriends-MVPs

     >We do this all the time in vb.net and it works great. However as we move this tool over to ASP.net we need to have the same functionality.

     Do you mean VBScript or VB.NET? They are not the same language!

  • Re: Parsing HTML source client side.

    10-10-2008, 7:25 PM
    • Member
      1 point Member
    • macmaster9600
    • Member since 09-14-2008, 1:07 AM
    • Posts 6
    Basically, the issue is that I need to be able to pull the HTML source from a different IE window. Once I have it, I can parse it, but I can't find a way to pull the HTML source from anything other than the active window. So I have window 1 (the research application) and window 2 (the billing window), and I need window 1 to pull the source, or the data from the fields, of window 2. If I pull the source, I can parse it out. If I populate using the fields, I won't have to parse at all. I don't care what language it is in, as long as it works.
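
    As a rough illustration of the "pull the source, then parse it" step described here, the sketch below assumes the billing page's HTML is already available as a string (getting hold of it is the open question in this thread). It also assumes the address fields are <input> elements whose id ends in a known suffix such as _txtCITY2, as in the VB.NET code posted earlier. The function names extractInputValue and readAttribute and the sample markup are made up for this example.

```javascript
// Hypothetical helper: read one attribute (quoted or unquoted) from a tag string.
// Returns null when the attribute is not present.
function readAttribute(tag, name) {
  const pattern = new RegExp(
    "\\s" + name + "\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s>]+))", "i");
  const m = pattern.exec(tag);
  if (!m) return null;
  if (m[1] !== undefined) return m[1];
  if (m[2] !== undefined) return m[2];
  return m[3];
}

// Hypothetical helper: find the first <input> whose id ends with idSuffix
// and return its value attribute, or "" if nothing matches.
function extractInputValue(html, idSuffix) {
  const inputTags = html.match(/<input\b[^>]*>/gi) || [];
  for (const tag of inputTags) {
    const id = readAttribute(tag, "id");
    if (id !== null && id.endsWith(idSuffix)) {
      const value = readAttribute(tag, "value");
      return value === null ? "" : value;
    }
  }
  return "";
}

// Made-up sample markup in the unquoted-attribute style IE tends to emit.
const sampleSource =
  '<INPUT id=grid_txtCITY2 value="SAN JOSE" name=grid:txtCITY2>' +
  '<INPUT id=grid_txtZipCode2 value=95110 name=grid:txtZipCode2>';

console.log(extractInputValue(sampleSource, "_txtCITY2"));
console.log(extractInputValue(sampleSource, "_txtZipCode2"));
```

    Matching on whole tags rather than on fixed delimiter offsets avoids depending on the order of the attributes. A regular expression is still a fragile way to read HTML, so treat this as a sketch rather than a robust parser.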
  • Re: Parsing HTML source client side.

    10-11-2008, 6:44 AM
    • All-Star
      76,043 point All-Star
    • NC01
    • Member since 08-26-2005, 3:33 PM
    • Posts 14,169
    • TrustedFriends-MVPs

    Is window #1 opening window #2, and if so, how? Using window.open, Response.Redirect, or a POST? If you use a POST, you can use Request["control-ID"] to get the posted values. This also works with a QueryString from the opener window.

    Please post more particulars as to exactly what you are doing. You can't really read the HTML from one window to another since the web is a disconnected architecture.

    NC...

     

  • Re: Parsing HTML source client side.

    10-11-2008, 10:38 AM
    • Member
      1 point Member
    • macmaster9600
    • Member since 09-14-2008, 1:07 AM
    • Posts 6

    Currently window 1 does not open window 2. However, if it will make this work, I can have them change the process and I can change the script. I usually use a Redirect, but I will look at doing a POST instead. I have not done that yet, so I will have to find out exactly how to do it. Please explain more about how to use the QueryString. I am fairly new to ASP; most of my code uses Windows Forms and VB.NET or VBScript. What about using a frame, so that window 1 physically contains window 2? I know it is old school and not best practice now. The controls that come with Visual Studio do not make it easy, but I am sure I can figure it out.

  • Re: Parsing HTML source client side.

    10-12-2008, 7:18 AM
    • All-Star
      76,043 point All-Star
    • NC01
    • Member since 08-26-2005, 3:33 PM
    • Posts 14,169
    • TrustedFriends-MVPs

    First of all, you might try a Google search on some of those topics. Google is your best friend as a developer.

    I still do not know what you are trying to do, but I will try to answer some of your questions.

    To use a QueryString and Redirect:

    Window #1 CodeBehind:
    protected void nextWindowButton_Click(object sender, System.EventArgs e)
    {
     string firstValueToSend = TextBox1.Text;
     string secondValueToSend = TextBox2.Text;
     string thirdValueToSend = TextBox3.Text;

     string windowUrl = "Page2.aspx";
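     // URL-encode each value so spaces, '&' and similar characters do not break the query string.
     string queryString = string.Format("?value1={0}&value2={1}&value3={2}", Server.UrlEncode(firstValueToSend), Server.UrlEncode(secondValueToSend), Server.UrlEncode(thirdValueToSend));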
     this.Response.Redirect(windowUrl + queryString);
    }

    Window #2 CodeBehind:
    private void Page_Load(object sender, System.EventArgs e)
    {
     string firstValueSent = Request["value1"];
     string secondValueSent = Request["value2"];
     string thirdValueSent = Request["value3"];
    }
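
    Values passed in a query string need to be percent-encoded so that spaces, & and similar characters survive the trip. As a rough client-side sketch of the same QueryString idea, the snippet below builds a query string and reads a value back in JavaScript. The helper names buildQueryString and readQueryValue and the sample values are invented for this illustration; Page2.aspx is simply the page name used above.

```javascript
// Hypothetical helper: turn { name: value } pairs into "?a=1&b=2",
// percent-encoding both names and values.
function buildQueryString(values) {
  const parts = Object.keys(values).map(function (key) {
    return encodeURIComponent(key) + "=" + encodeURIComponent(values[key]);
  });
  return parts.length ? "?" + parts.join("&") : "";
}

// Hypothetical helper: read one value back from a query string,
// or return null when the name is not present.
function readQueryValue(queryString, name) {
  const pairs = queryString.replace(/^\?/, "").split("&");
  for (const pair of pairs) {
    const eq = pair.indexOf("=");
    if (eq === -1) continue;
    if (decodeURIComponent(pair.slice(0, eq)) === name) {
      // "+" stands for a space in form-encoded query strings.
      return decodeURIComponent(pair.slice(eq + 1).replace(/\+/g, " "));
    }
  }
  return null;
}

// Made-up sample values.
const query = buildQueryString({ value1: "123 Main St", value2: "A&B", value3: "95110" });
const url = "Page2.aspx" + query;

console.log(url);
console.log(readQueryValue(query, "value2"));
```

    Note that the "&" inside the second value is encoded as %26, so it is not mistaken for a separator between parameters.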

    To use a POST:

    Window #1 aspx file:
    <script type="text/javascript">
    <!--
    function postToPayPal()
    {
     var formElementsArray = document.getElementsByTagName('FORM');
     
     if ( formElementsArray != null )
     {
      var formElement = formElementsArray[0];

      document.getElementById('__EVENTTARGET').value = ' ';
      document.getElementById('__EVENTARGUMENT').value = ' ';
      document.getElementById('__VIEWSTATE').name = 'NOVIEWSTATE';

      var formAction = 'Page2.aspx';
      formElement.action = formAction;
      formElement.submit();
     }
    }
    // -->
    </script>

    Window #2 CodeBehind:
    private void Page_Load(object sender, System.EventArgs e)
    {
     string firstValueSent = Request["TextBox1"];
     string secondValueSent = Request["TextBox2"];
     string thirdValueSent = Request["TextBox3"];
    }

    To use an iFrame:

    <iframe id="yourIFrame" frameborder="1" height="400" runat="server" src="Page1.aspx" width="100%"></iframe>

    The problem with an iFrame is that you are not easily going to be able to access the controls on the page in the iFrame, especially server-side.
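
    On the client side, access to a framed page is also limited by the browser's same-origin policy: script in the outer page can reach into the frame's document only when both pages share the same scheme, host and port. The sketch below shows that comparison in JavaScript. The function name isSameOrigin and the research-tool URL are invented for illustration, and browsers enforce the rule internally rather than through any function like this one.

```javascript
// Hypothetical helper: true when two absolute URLs share scheme, host and port.
function isSameOrigin(urlA, urlB) {
  const a = new URL(urlA);
  const b = new URL(urlB);
  return a.protocol === b.protocol &&
         a.hostname === b.hostname &&
         a.port === b.port;
}

// The billing URL comes from the code posted earlier in the thread;
// the research-tool URL is a made-up placeholder.
const billingUrl = "http://capt.corp.coperate.com/Default.aspx";
const researchUrl = "http://research.example.com/Tool.aspx";

console.log(isSameOrigin(researchUrl, billingUrl));
console.log(isSameOrigin(billingUrl, "http://capt.corp.coperate.com/Other.aspx"));
```

    If the two applications are served from different hosts, as assumed here, script in the research page would be blocked from reading the fields of the billing page inside the iFrame.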

    NC...

  • Re: Parsing HTML source client side.

    10-12-2008, 11:55 AM
    • Member
      1 point Member
    • macmaster9600
    • Member since 09-14-2008, 1:07 AM
    • Posts 6

    OK, to give you a better idea of what I am trying to do: our IT department manages the billing application, and they won't let me make any changes to it. However, my department needs a tool they can use to research some details about the customer's address. I could make the people in my department manually type the address into the research tool, but they have to have the billing system open at all times anyway. So the idea is to have a button on the research tool that pulls the details from the fields that hold the address data. This would speed up the process, cut down on errors, and reduce strain on the employees, because it will reduce the keystrokes entered every day. That is the general idea.

    I like the POST option. However, I can't add any script to the billing window, so I have to pull the data directly. I think the POST option will work with some modifications. I will make the modifications and post them here so I can get some feedback.

    Thanks for all your help.

  • Re: Parsing HTML source client side.

    10-13-2008, 8:55 AM
    • All-Star
      76,043 point All-Star
    • NC01
    • Member since 08-26-2005, 3:33 PM
    • Posts 14,169
    • TrustedFriends-MVPs

    There is no way to read data from an already-open window. The only way for the POST method to work would be for the billing window to submit to the new window, and it does not sound like you can make it do that. I would suggest just reading the information you need from the same database that the billing window uses.

    NC...

     
