<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://forums.asp.net/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Visual Basic .NET</title><link>http://forums.asp.net/36.aspx</link><description>Discussions/Questions about the Visual Basic .NET language. &lt;a href="http://aspadvice.com/SignUp/list.aspx?l=14&amp;c=23" target="_blank"&gt;Email List&lt;/a&gt;</description><dc:language>en</dc:language><generator>CommunityServer 2007 SP1 (Build: 20510.895)</generator><item><title>Re: RegEx Help Needed</title><link>http://forums.asp.net/thread/3277954.aspx</link><pubDate>Tue, 07 Jul 2009 04:10:46 GMT</pubDate><guid isPermaLink="false">4c671506-2930-414c-a40b-8bf57ded5924:3277954</guid><dc:creator>imran_ku07</dc:creator><author>imran_ku07</author><slash:comments>0</slash:comments><comments>http://forums.asp.net/thread/3277954.aspx</comments><wfw:commentRss>http://forums.asp.net/commentrss.aspx?SectionID=36&amp;PostID=3277954</wfw:commentRss><description>&lt;p&gt;This pattern is different from the one I gave you.&lt;/p&gt;&lt;p&gt;&lt;BLOCKQUOTE&gt;&lt;div&gt;&lt;img src="/Themes/fan/images/icon-quote.gif"&gt; &lt;strong&gt;ddelella:&lt;/strong&gt;&lt;/div&gt;&lt;div&gt;&amp;lt;a\\s+href=[&amp;#39;&amp;quot;](?&amp;lt;url&amp;gt;[^&amp;#39;&amp;quot;]*)[&amp;#39;&amp;quot;](?&amp;lt;All&amp;gt;[^&amp;gt;]*)&amp;gt;(?&amp;lt;title&amp;gt;[^&amp;lt;]*)&amp;lt;&lt;a&gt;[^\\(]*\\((?\\d&amp;quot; mce_href=&amp;quot;file://\\s*/\\s*a\\s*&amp;gt;[^\\(]*\\((?\\d&amp;quot;&amp;gt;\\s*/\\s*a\\s*&amp;gt;[^\\(]*\\((?&amp;lt;year&amp;gt;\\d&lt;/a&gt;+)&lt;/div&gt;&lt;/BLOCKQUOTE&gt;&lt;br /&gt;&lt;/p&gt;&lt;p&gt;It is always better to use the WebRequest class.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description></item><item><title>Re: RegEx Help Needed</title><link>http://forums.asp.net/thread/3276740.aspx</link><pubDate>Mon, 06 Jul 2009 12:32:43 GMT</pubDate><guid isPermaLink="false">4c671506-2930-414c-a40b-8bf57ded5924:3276740</guid><dc:creator>ddelella</dc:creator><author>ddelella</author><slash:comments>0</slash:comments><comments>http://forums.asp.net/thread/3276740.aspx</comments><wfw:commentRss>http://forums.asp.net/commentrss.aspx?SectionID=36&amp;PostID=3276740</wfw:commentRss><description>&lt;pre class="vb.net" name="code"&gt;        Dim myClient As New WebClient
        Dim myHTML As String = myClient.DownloadString(&amp;quot;http://www.imdb.com/find?q=matrix&amp;amp;s=tt&amp;quot;)
        Response.Write(myHTML)&lt;/pre&gt;
&lt;p&gt;&lt;br /&gt;Okay, the problem looks to be partly on IMDB&amp;#39;s side.&amp;nbsp; Running the above code in the page load gets me the same output my RegEx is producing.&amp;nbsp; However, going to imdb.com and searching on matrix now yields a totally different page than it did before.&amp;nbsp; Instead of 16 approximate matches and title_approx in the HTML, it has title_substring and shows 24 approximate matches.&amp;nbsp; The latter seems to be correct, which means the above is working.&amp;nbsp; I appreciate the suggestion on query strings, but now I just need to find out why the characters come back escaped from the WebClient object.&amp;nbsp; Thanks&lt;/p&gt;</description></item><item><title>Re: RegEx Help Needed</title><link>http://forums.asp.net/thread/3276727.aspx</link><pubDate>Mon, 06 Jul 2009 12:23:49 GMT</pubDate><guid isPermaLink="false">4c671506-2930-414c-a40b-8bf57ded5924:3276727</guid><dc:creator>ddelella</dc:creator><author>ddelella</author><slash:comments>0</slash:comments><comments>http://forums.asp.net/thread/3276727.aspx</comments><wfw:commentRss>http://forums.asp.net/commentrss.aspx?SectionID=36&amp;PostID=3276727</wfw:commentRss><description>&lt;p&gt;The string pattern produces an ArgumentException when trying to parse.&amp;nbsp; The exact message is:&lt;/p&gt;
&lt;p&gt;parsing &amp;quot;&amp;lt;a\\s+href=[&amp;#39;&amp;quot;](?&amp;lt;url&amp;gt;[^&amp;#39;&amp;quot;]*)[&amp;#39;&amp;quot;](?&amp;lt;All&amp;gt;[^&amp;gt;]*)&amp;gt;(?&amp;lt;title&amp;gt;[^&amp;lt;]*)&amp;lt;&lt;a&gt;[^\\(]*\\((?\\d&amp;quot; mce_href=&amp;quot;file://\\s*/\\s*a\\s*&amp;gt;[^\\(]*\\((?\\d&amp;quot;&amp;gt;\\s*/\\s*a\\s*&amp;gt;[^\\(]*\\((?&amp;lt;year&amp;gt;\\d&lt;/a&gt;+)&amp;quot; - Not enough )&amp;#39;s.&lt;/p&gt;</description></item><item><title>Re: RegEx Help Needed</title><link>http://forums.asp.net/thread/3275939.aspx</link><pubDate>Mon, 06 Jul 2009 04:53:11 GMT</pubDate><guid isPermaLink="false">4c671506-2930-414c-a40b-8bf57ded5924:3275939</guid><dc:creator>imran_ku07</dc:creator><author>imran_ku07</author><slash:comments>0</slash:comments><comments>http://forums.asp.net/thread/3275939.aspx</comments><wfw:commentRss>http://forums.asp.net/commentrss.aspx?SectionID=36&amp;PostID=3275939</wfw:commentRss><description>&lt;p&gt;try this&lt;/p&gt;&lt;p&gt;&lt;pre name="code" class="c-sharp"&gt;string sr = &amp;quot;&amp;lt;a href=\&amp;quot;/title/tt0133093/1\&amp;quot; onclick=\&amp;quot;(new Image()).src=&amp;#39;/rg/find-title-1/title_popular/images/b.gif?link=/title/tt0133093/&amp;#39;;\&amp;quot;&amp;gt;The Matrix1&amp;lt;/a&amp;gt; (1991) (V) &amp;quot;;
            sr += &amp;quot;&amp;lt;a href=\&amp;quot;/title/tt0133093/2\&amp;quot; onclick=\&amp;quot;(new Image()).src=&amp;#39;/rg/find-title-1/title_abc/images/b.gif?link=/title/tt0133093/&amp;#39;;\&amp;quot;&amp;gt;The Matrix2&amp;lt;/a&amp;gt; (1992) (V) &amp;quot;;
            sr += &amp;quot;&amp;lt;a href=\&amp;quot;/title/tt0133093/3\&amp;quot; onclick=\&amp;quot;(new Image()).src=&amp;#39;/rg/find-title-1/title_approx/images/b.gif?link=/title/tt0133093/&amp;#39;;\&amp;quot;&amp;gt;The Matrix3&amp;lt;/a&amp;gt; (1993) (V) &amp;quot;;
            string Pattern = &amp;quot;&amp;lt;a\\s+href=[&amp;#39;\&amp;quot;](?&amp;lt;url&amp;gt;[^&amp;#39;\&amp;quot;]*)[&amp;#39;\&amp;quot;](?&amp;lt;All&amp;gt;[^&amp;gt;]*)&amp;gt;(?&amp;lt;title&amp;gt;[^&amp;lt;]*)&amp;lt;\\s*/\\s*a\\s*&amp;gt;[^\\(]*\\((?&amp;lt;year&amp;gt;\\d+)&amp;quot;;
            MatchCollection m = Regex.Matches(sr, Pattern, RegexOptions.IgnoreCase);
            foreach(Match mm in m)
            {
                string temp = mm.Groups[&amp;quot;All&amp;quot;].Value.ToLower();
                if (temp.Contains(&amp;quot;title_popular&amp;quot;) || temp.Contains(&amp;quot;title_approx&amp;quot;))
                {
                    Response.Write(mm.Groups[&amp;quot;url&amp;quot;].Value);
                    Response.Write(mm.Groups[&amp;quot;title&amp;quot;].Value);
                    Response.Write(mm.Groups[&amp;quot;year&amp;quot;].Value);
                }
            }&lt;/pre&gt;&lt;br /&gt; &lt;br /&gt;&lt;/p&gt;&lt;p&gt;&lt;br /&gt;&lt;/p&gt;</description></item><item><title>Re: RegEx Help Needed</title><link>http://forums.asp.net/thread/3275893.aspx</link><pubDate>Mon, 06 Jul 2009 04:18:31 GMT</pubDate><guid isPermaLink="false">4c671506-2930-414c-a40b-8bf57ded5924:3275893</guid><dc:creator>ddelella</dc:creator><author>ddelella</author><slash:comments>0</slash:comments><comments>http://forums.asp.net/thread/3275893.aspx</comments><wfw:commentRss>http://forums.asp.net/commentrss.aspx?SectionID=36&amp;PostID=3275893</wfw:commentRss><description>&lt;p&gt;There are many reasons not to use the Agility Pack.&amp;nbsp; The code has not been updated in a long time.&amp;nbsp; It is slow when traversing HTML that has few or no identifiers.&amp;nbsp; If the HTML were structured so that I could easily identify the objects, it might make sense, but in this case, where the HTML offers no easy way to find the cells I need, it will be easier with the RegEx.&lt;/p&gt;
&lt;p&gt;The above RegEx runs very fast and its results are 95% accurate.&amp;nbsp; There are 2 - 3 extra items showing in the list which should not be there.&amp;nbsp; I added some extra code to the original to check for title_popular and title_approx only, and found a small issue with what was showing in the Matches collection:&lt;/p&gt;
&lt;p&gt;This is what shows when I look at myMatch.Value:&lt;/p&gt;
&lt;p&gt;&amp;lt;a href=&amp;quot;/title/tt1074193/&amp;quot; onclick=&amp;quot;(new Image()).src=&amp;#39;/rg/find-title-16/title_substring/images/b.gif?link=/title/tt1074193/&amp;#39;;&amp;quot;&amp;gt;Decoded: The Making of &amp;amp;#x27;The Matrix Reloaded&amp;amp;#x27;&amp;lt;/a&amp;gt; (2003) (TV)&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/td&amp;gt;&lt;br /&gt;&lt;/p&gt;
&lt;p&gt;This is what is showing in the view source from IE8:&lt;/p&gt;
&lt;p&gt;&lt;span&gt;&amp;lt;&lt;span&gt;a&lt;/span&gt; &lt;span&gt;href&lt;/span&gt;=&lt;span&gt;&amp;quot;/title/tt0410519/&amp;quot;&lt;/span&gt; &lt;span&gt;onclick&lt;/span&gt;=&lt;span&gt;&amp;quot;(new Image()).src=&amp;#39;/rg/find-title-16/title_approx/images/b.gif?link=/title/tt0410519/&amp;#39;;&amp;quot;&lt;/span&gt;&amp;gt;&lt;/span&gt;The Matrix Recalibrated&lt;span&gt;&amp;lt;/&lt;span&gt;a&lt;/span&gt;&amp;gt;&lt;/span&gt;&amp;nbsp;(2003) (TV)&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;lt;/td&amp;gt;&lt;/p&gt;
&lt;p&gt;There are some weird discrepancies.&amp;nbsp; 1) title_approx became title_substring?&amp;nbsp; 2) The characters in the HTML were escaped, showing as &amp;amp;#x27;?&amp;nbsp; Does anyone have any idea what in the RegEx could be causing these issues, or could it be a problem with the WebClient object?&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description></item><item><title>Re: RegEx Help Needed</title><link>http://forums.asp.net/thread/3275101.aspx</link><pubDate>Sun, 05 Jul 2009 04:46:31 GMT</pubDate><guid isPermaLink="false">4c671506-2930-414c-a40b-8bf57ded5924:3275101</guid><dc:creator>TATWORTH</dc:creator><author>TATWORTH</author><slash:comments>0</slash:comments><comments>http://forums.asp.net/thread/3275101.aspx</comments><wfw:commentRss>http://forums.asp.net/commentrss.aspx?SectionID=36&amp;PostID=3275101</wfw:commentRss><description>&lt;p&gt;Try using the HTML Agility pack from &lt;a target="_blank" href="http://www.codeplex.com/htmlagilitypack"&gt;http://www.codeplex.com/htmlagilitypack&lt;/a&gt;&lt;br /&gt;&lt;/p&gt;</description></item><item><title>RegEx Help Needed</title><link>http://forums.asp.net/thread/3274868.aspx</link><pubDate>Sat, 04 Jul 2009 17:57:23 GMT</pubDate><guid isPermaLink="false">4c671506-2930-414c-a40b-8bf57ded5924:3274868</guid><dc:creator>ddelella</dc:creator><author>ddelella</author><slash:comments>0</slash:comments><comments>http://forums.asp.net/thread/3274868.aspx</comments><wfw:commentRss>http://forums.asp.net/commentrss.aspx?SectionID=36&amp;PostID=3274868</wfw:commentRss><description>&lt;pre class="xhtml" name="code"&gt;&amp;lt;td valign=&amp;quot;top&amp;quot;&amp;gt;&lt;/pre&gt;&lt;pre class="xhtml" name="code"&gt;&amp;lt;img src=&amp;quot;/images/b.gif&amp;quot; width=&amp;quot;1&amp;quot; height=&amp;quot;6&amp;quot;&amp;gt;&lt;/pre&gt;&lt;pre class="xhtml" name="code"&gt;&amp;lt;br&amp;gt;&lt;/pre&gt;&lt;pre class="xhtml" name="code"&gt;&amp;lt;a href=&amp;quot;/title/tt0133093/&amp;quot; onclick=&amp;quot;(new Image()).src=&amp;#39;/rg/find-title-1/title_popular/images/b.gif?link=/title/tt0133093/&amp;#39;;&amp;quot;&amp;gt;The Matrix&amp;lt;/a&amp;gt; (1999) (V)     &lt;/pre&gt;&lt;pre class="xhtml" name="code"&gt;&amp;lt;/td&amp;gt;&lt;/pre&gt;
&lt;p&gt;&lt;br /&gt;I need some help creating a regular expression that finds elements matching the above criteria.&amp;nbsp; The string starts with a td valign top and ends with the closing td tag.&amp;nbsp; The image and br tags are always there but are irrelevant.&amp;nbsp; As for the anchor tag, I need the value of the href to be a group named &amp;quot;url&amp;quot;.&amp;nbsp; I need the innerHTML of the anchor tag to be a group called &amp;quot;title&amp;quot;, and I need the 4 digits in the first set of () after the anchor tag to be in a group called &amp;quot;year&amp;quot;.&amp;nbsp; I have a RegEx which is picking up the anchor tag and the value for the year, but it is too vague and is also picking up some extra values.&amp;nbsp; If possible, I would like to pick only anchors with &amp;quot;/title_popular/&amp;quot; or &amp;quot;/title_approx/&amp;quot; in the onclick statement.&amp;nbsp; Below is my current RegEx; any help would be great.&amp;nbsp; For those who recognize the value above, it&amp;#39;s for scraping the HTML of the IMDB search.&amp;nbsp; The results of this RegEx should be 20 items, 4 popular and 16 approximate, when searching &lt;a href="http://www.imdb.com/find?q=matrix&amp;amp;s=tt"&gt;http://www.imdb.com/find?q=matrix&amp;amp;s=tt&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;font color="#0000ff" size="2"&gt;&lt;font color="#0000ff" size="2"&gt;Dim&lt;/font&gt;&lt;/font&gt;&lt;font size="2"&gt; myRegex &lt;/font&gt;&lt;font color="#0000ff" size="2"&gt;&lt;font color="#0000ff" size="2"&gt;As&lt;/font&gt;&lt;/font&gt;&lt;font size="2"&gt; &lt;/font&gt;&lt;font color="#0000ff" size="2"&gt;&lt;font color="#0000ff" size="2"&gt;New&lt;/font&gt;&lt;/font&gt;&lt;font size="2"&gt; Regex(&lt;/font&gt;&lt;font color="#a31515" size="2"&gt;&lt;font color="#a31515" size="2"&gt;&amp;quot;&amp;lt;a\s+(?:(?:\w+\s*=\s*)(?:\w+|&amp;quot;&amp;quot;[^&amp;quot;&amp;quot;]*&amp;quot;&amp;quot;|&amp;#39;[^&amp;#39;]*&amp;#39;))*?\s*href\s*=\s*(?&amp;lt;url&amp;gt;\w+|&amp;quot;&amp;quot;[^&amp;quot;&amp;quot;]*&amp;quot;&amp;quot;|&amp;#39;[^&amp;#39;]*&amp;#39;)(?:(?:\s+\w+\s*=\s*)(?:\w+|&amp;quot;&amp;quot;[^&amp;quot;&amp;quot;]*&amp;quot;&amp;quot;|&amp;#39;[^&amp;#39;]*&amp;#39;))*?&amp;gt;(?&amp;lt;title&amp;gt;.+?)&amp;lt;/a&amp;gt;(?&amp;lt;year&amp;gt;.+?)&amp;lt;/td&amp;gt;&amp;quot;&lt;/font&gt;&lt;/font&gt;&lt;font size="2"&gt;)&lt;/font&gt;&lt;/p&gt;</description></item></channel></rss>