Hi guys,
I'm facing a problem with content charset encoding in an ASP.NET page (C# 2.0). I'm trying to read a remote text file (Japanese charset) and then write it out on my local webpage.
I've specified UTF-8 as the charset both for reading from the stream and for writing to the response buffer, but the content still doesn't display correctly. If I change the response charset to ISO-2022-JP, it works fine.
How can I read remote web pages or text files in a multi-byte encoding and convert them to UTF-8 so that they display correctly? Is there a generic solution that works with all double-byte or multi-byte charsets?
I'm new to content encoding so please bear with me.
Thanks!
Here's the code I have:
string strMsg = "";
// Read remote webpage ...
HttpWebRequest req = (HttpWebRequest)WebRequest.Create("http://www.jpcert.or.jp/at/2004/at040003.txt");
req.Timeout = 7000; // 7sec
StreamReader stream = new StreamReader(req.GetResponse().GetResponseStream(), System.Text.Encoding.UTF8, true);
StringBuilder sb = new StringBuilder();
sb.Append(stream.ReadToEnd());
stream.Close();
strMsg = sb.ToString();
sb = null;
stream = null;
req = null;
// Write local copy ...
Response.Clear();
Response.ClearHeaders();
Response.ContentType = "text/plain; charset=UTF-8"; // Note: ISO-2022-JP works
Response.Write(strMsg);
Response.End();
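
To make the symptom concrete, here's a minimal console sketch (not my page code) of what I suspect is going on: the bytes only turn into the right characters when they are decoded with the charset they were written in, and only after that can they be re-encoded as UTF-8. `CharsetSketch` and `ReadAllText` are hypothetical names I made up for illustration, the sample text is just three Japanese characters, and I'm assuming the full .NET Framework, where the "iso-2022-jp" encoding is available out of the box.

```csharp
using System;
using System.IO;
using System.Text;

static class CharsetSketch
{
    // Hypothetical helper: decode a byte stream using the charset the
    // source was actually written in, not the charset we want to emit.
    public static string ReadAllText(Stream input, string sourceCharset)
    {
        Encoding source = Encoding.GetEncoding(sourceCharset);
        // false = do not let a byte-order mark override the chosen encoding
        using (StreamReader reader = new StreamReader(input, source, false))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // Stand-in for the remote file: three Japanese characters
        // encoded as ISO-2022-JP bytes.
        string original = "\u65E5\u672C\u8A9E";
        byte[] remoteBytes = Encoding.GetEncoding("iso-2022-jp").GetBytes(original);

        // Decoding with the wrong charset (what my page does today).
        string wrong = ReadAllText(new MemoryStream(remoteBytes), "utf-8");
        // Decoding with the charset the bytes were written in.
        string right = ReadAllText(new MemoryStream(remoteBytes), "iso-2022-jp");

        Console.WriteLine("UTF-8 decode matches original: " + (wrong == original));
        Console.WriteLine("ISO-2022-JP decode matches original: " + (right == original));

        // Once the string is correct, re-encoding as UTF-8 is straightforward.
        byte[] utf8 = Encoding.UTF8.GetBytes(right);
        Console.WriteLine("UTF-8 byte count: " + utf8.Length);
    }
}
```

If that's right, then in the page I'd pass the response stream and the source charset to something like `ReadAllText`, and set `Response.ContentEncoding = System.Text.Encoding.UTF8` before writing. What I still don't know is how to find out the source charset generically when the remote server doesn't declare it.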