Multi-byte encoding for remote content

Last post 02-05-2007 4:49 AM by NickVic. 0 replies.

  • Multi-byte encoding for remote content

    02-05-2007, 4:49 AM
    • NickVic
    • Joined on 02-05-2007, 4:37 AM
    • Posts 1

    Hi guys,
    I'm having a problem with character encoding in an ASP.NET page (C# 2.0). I'm trying to read a remote text file (Japanese charset) and then write it out on my local web page.

    I've specified UTF-8 both for reading from the stream and for writing to the response, but the content still doesn't display correctly. If I change the charset in the response Content-Type to ISO-2022-JP, it works fine.

    How can I read remote web pages or text files that use a multi-byte encoding and convert them to UTF-8 so they display correctly? Is there a generic solution that works with all double-byte or multi-byte charsets?

    I'm new to content encoding so please bear with me.

    Thanks!

    Here's the code I have:
     
    string strMsg = "";

    // Read remote webpage ...
    HttpWebRequest req = (HttpWebRequest)WebRequest.Create("http://www.jpcert.or.jp/at/2004/at040003.txt");
    req.Timeout = 7000; // 7sec

    StreamReader stream = new StreamReader(req.GetResponse().GetResponseStream(), System.Text.Encoding.UTF8, true);
    StringBuilder sb = new StringBuilder();

    sb.Append(stream.ReadToEnd());
    stream.Close();
    strMsg = sb.ToString();

    sb = null;
    stream = null;
    req = null;


    // Write local copy ...
    Response.Clear();
    Response.ClearHeaders();
    Response.ContentType = "text/plain; charset=UTF-8"; // Note: ISO-2022-JP works
    Response.Write(strMsg);
    Response.End();
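
    For reference, a minimal sketch of one way the conversion could be approached, assuming the source charset is known or can be taken from the remote response (for example from HttpWebResponse.CharacterSet). ISO-2022-JP is a 7-bit encoding, so decoding it as UTF-8 most likely passes the bytes through unchanged, which would explain why labelling the output as ISO-2022-JP works. The class and method names (CharsetSketch, Decode) are hypothetical, and the sketch assumes the .NET Framework, where "iso-2022-jp" is available through Encoding.GetEncoding without extra setup.

```csharp
using System;
using System.Text;

static class CharsetSketch
{
    // Hypothetical helper: decode raw bytes with the charset the source declared,
    // falling back to a caller-supplied encoding when the name is missing or unknown.
    public static string Decode(byte[] raw, string declaredCharset, Encoding fallback)
    {
        Encoding source = fallback;
        if (!string.IsNullOrEmpty(declaredCharset))
        {
            try { source = Encoding.GetEncoding(declaredCharset); }
            catch (ArgumentException) { /* unknown charset name: keep the fallback */ }
        }
        return source.GetString(raw);
    }

    static void Main()
    {
        // Stand-in for the remote file: Japanese text encoded as ISO-2022-JP.
        string original = "\u65E5\u672C\u8A9E";
        byte[] remoteBytes = Encoding.GetEncoding("iso-2022-jp").GetBytes(original);

        // Decode with the source charset, then re-encode as UTF-8 for output.
        string text = Decode(remoteBytes, "iso-2022-jp", Encoding.UTF8);
        byte[] utf8 = Encoding.UTF8.GetBytes(text);

        Console.WriteLine(text == original);
        Console.WriteLine(utf8.Length);
    }
}
```

    In the page above, this would correspond to building the StreamReader with the source encoding instead of Encoding.UTF8 and setting Response.ContentEncoding to Encoding.UTF8 before writing. It is not a universal solution: the source encoding still has to be declared by the server or known in advance, because the byte-order-mark detection in StreamReader only recognises Unicode encodings.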
     
