Last post May 28, 2016 03:02 AM by Yohann Lu
May 22, 2016 04:54 AM|RobertW57|LINK
I've been given the task of reading the HTML contents from one URL within my company, allowing edits to occur, and then reposting at another location. This is the C# code I'm using to retrieve the source HTML:
// importUrl holds the address of the page to read.
WebClient client = new WebClient();
string result = client.DownloadString(importUrl);
A Diff comparison of this string with what is obtained from "View Source" in any browser reveals that they're quite different.
I'm wondering if there is another approach I could use to get the same data that one would see in a browser?
In case it helps, here's a test page I'm using: http://az745613.vo.msecnd.net/mobilehomescreen/gb
Any help would be much appreciated!
May 23, 2016 04:43 AM|Yohann Lu|LINK
From your description, you can try the following code.
using System.IO;
using System.Net;

// GET api/values
public string Get()
{
    // Build a GET request for the page whose HTML we want to retrieve.
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create("http://az745613.vo.msecnd.net/mobilehomescreen/gb");
    request.Method = "GET";
    request.KeepAlive = true;
    // Kept from the original; Content-Type normally only matters for requests that carry a body.
    request.ContentType = "application/x-www-form-urlencoded";
    // Read the whole response body as a string and dispose the response when done.
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    using (Stream resStream = response.GetResponseStream())
    using (StreamReader sr = new StreamReader(resStream))
    {
        return sr.ReadToEnd();
    }
}
May 23, 2016 07:16 AM|lextm|LINK
WebClient does not identify itself as a web browser: by default it sends no User-Agent header at all. That might cause your web server to return a completely different response.
Try sending the User-Agent string of the web browser you are testing with; that should reduce the differences.
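As a rough sketch of that suggestion, the snippet below sets a browser-like User-Agent header on a WebClient before downloading the page. The class name, the helper CreateBrowserLikeClient, and the User-Agent value are assumptions made for illustration only; copy the exact string your own browser sends (visible in its developer tools) instead of the sample shown here.

```csharp
using System;
using System.Net;

public static class UserAgentExample
{
    // Hypothetical sample value; replace it with the User-Agent string of the browser you compare against.
    public const string BrowserUserAgent =
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:46.0) Gecko/20100101 Firefox/46.0";

    // Creates a WebClient that sends the browser-like User-Agent header defined above.
    public static WebClient CreateBrowserLikeClient()
    {
        var client = new WebClient();
        client.Headers[HttpRequestHeader.UserAgent] = BrowserUserAgent;
        return client;
    }

    public static void Main(string[] args)
    {
        // Test page from the original question; pass another URL as the first argument to override it.
        string importUrl = args.Length > 0
            ? args[0]
            : "http://az745613.vo.msecnd.net/mobilehomescreen/gb";

        using (WebClient client = CreateBrowserLikeClient())
        {
            string result = client.DownloadString(importUrl);
            Console.WriteLine("Downloaded {0} characters.", result.Length);
        }
    }
}
```

Even with a matching User-Agent, the server may still vary its output based on other request headers, so this may narrow the differences without removing them entirely.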
May 23, 2016 05:44 PM|RobertW57|LINK
I deeply appreciate you sharing that code with me! Wow, it works soooo much better than my original approach.
With that said, I have performed a DIFF between the source code of the page in Firefox vs. what your code is returning. While much closer, it's not identical.
This is a subject area that is really new to me so I'm not clear on why there are any differences at all. Might you be able to enlighten me?
May 28, 2016 03:02 AM|Yohann Lu|LINK
I think there may be subtle differences between browsers, so we cannot expect to get exactly the same content from each of them.