Marked as answer by MyronCope on Oct 01, 2010 03:19 PM
MyronCope
Participant
1656 Points
1345 Posts
detect the encoding of an Excel file
Sep 30, 2010 02:42 PM
Using VB.NET/ASP.NET 2005.
I am opening and reading a file, and I need to detect the encoding of that file. Different system users will use different encodings in their files, so I have no control over this.
The default encoding normally works, but special foreign-language (non-English) characters are not read correctly: garbage characters appear in their place in "myString". The foreign characters look fine in the file, but what ends up in myString is unreadable.
I have researched this, and the suggestion is to determine the encoding of the file and then read it with that encoding, but I have not been able to find out how to do this.
So in a nutshell: how can I use VB.NET to detect the encoding of an Excel file?
thanks,
MC
TonyDong
Contributor
4777 Points
939 Posts
Re: detect the encoding of an Excel file
Sep 30, 2010 04:32 PM
MyronCope
Re: detect the encoding of an Excel file
Sep 30, 2010 04:52 PM
Hi, thanks for your feedback.
I want to determine the encoding for a file, not a thread. I was just researching this and used the following to read what I believe is the BOM (byte order mark), hoping to determine the encoding of the file:
Dim enc As Encoding = Encoding.Default
' New Byte(4) allocates 5 elements (indices 0 to 4)
Dim buffer() As Byte = New Byte(4) {}
Dim file As New System.IO.FileStream(myFileName, IO.FileMode.Open)
' Read the first 5 bytes of the file, where a BOM would be
file.Read(buffer, 0, 5)
file.Close()
I ran this and traced the code. For my test file, "buffer" contains character codes that translate into a double quote followed by the first 4 characters of the word that is visible in the first (top-most, left-most) cell in the Excel document.
I'm assuming this means that there is no encoding information at the beginning of the file. Not sure where to go from here.
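The buffer check above can be taken one step further by comparing the first bytes against the common Unicode byte order marks. The following is only a minimal sketch: `DetectBomEncoding` and the module name are hypothetical, not framework APIs. It returns Nothing when no known BOM is found, which is the case described above (the file starts directly with a double quote and text), and in that case these bytes alone do not reveal the encoding.

```vbnet
Imports System.IO
Imports System.Text

Module BomSniffer
    ' Hypothetical helper: returns the encoding implied by a byte order mark
    ' at the start of the data, or Nothing when no known BOM is present.
    Function DetectBomEncoding(ByVal head() As Byte, ByVal count As Integer) As Encoding
        ' Check the 4-byte UTF-32 marks first: the UTF-32 LE mark (FF FE 00 00)
        ' begins with the UTF-16 LE mark (FF FE).
        If count >= 4 AndAlso head(0) = &HFF AndAlso head(1) = &HFE _
           AndAlso head(2) = 0 AndAlso head(3) = 0 Then
            Return New UTF32Encoding(False, True)   ' UTF-32 little endian
        End If
        If count >= 4 AndAlso head(0) = 0 AndAlso head(1) = 0 _
           AndAlso head(2) = &HFE AndAlso head(3) = &HFF Then
            Return New UTF32Encoding(True, True)    ' UTF-32 big endian
        End If
        If count >= 3 AndAlso head(0) = &HEF AndAlso head(1) = &HBB _
           AndAlso head(2) = &HBF Then
            Return New UTF8Encoding(True)           ' UTF-8 with BOM
        End If
        If count >= 2 AndAlso head(0) = &HFF AndAlso head(1) = &HFE Then
            Return New UnicodeEncoding(False, True) ' UTF-16 little endian
        End If
        If count >= 2 AndAlso head(0) = &HFE AndAlso head(1) = &HFF Then
            Return New UnicodeEncoding(True, True)  ' UTF-16 big endian
        End If
        Return Nothing                              ' no BOM found
    End Function

    ' Convenience overload: reads up to 4 bytes from the file and checks them.
    Function DetectBomEncoding(ByVal path As String) As Encoding
        Dim head(3) As Byte                         ' 4 elements (indices 0 to 3)
        Dim count As Integer
        Using fs As New FileStream(path, FileMode.Open, FileAccess.Read)
            count = fs.Read(head, 0, head.Length)
        End Using
        Return DetectBomEncoding(head, count)
    End Function
End Module
```

A Nothing result does not mean the file is plain ASCII. It only means the encoding has to come from somewhere else, for example a known default for the users who produce the files.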
MyronCope
Re: detect the encoding of an Excel file
Sep 30, 2010 06:17 PM
I got this. The last line of the code below gives me the encoding. I might have to close the StreamReader and re-open it with the correct encoding, but it does the job: it gives me the encoding of the file I just opened:
Dim strEncryptionType As String = String.Empty
' True = detect the encoding from the byte order mark, if the file has one
Dim myStreamRdr As System.IO.StreamReader = New System.IO.StreamReader(myFileName, True)
Dim myString As String = myStreamRdr.ReadToEnd()
' CurrentEncoding reflects the detected encoding only after the first read
strEncryptionType = myStreamRdr.CurrentEncoding.EncodingName
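One caveat with the approach above: with detectEncodingFromByteOrderMarks set to True, StreamReader only recognizes a byte order mark. For a file without one, it keeps the encoding it was constructed with, which is UTF-8 for the (path, Boolean) constructor. A possible variation, sketched below under that assumption, passes an explicit fallback encoding so that BOM-less files are decoded with a code page of the caller's choosing. `ReadAllTextWithFallback` and the module name are hypothetical, and which fallback is right for a given set of users is a guess the caller has to make.

```vbnet
Imports System.IO
Imports System.Text

Module EncodingAwareReader
    ' Hypothetical helper: reads the whole file, honouring a BOM when one is
    ' present and otherwise decoding with the supplied fallback encoding.
    ' usedEncoding receives the encoding the reader actually applied.
    Function ReadAllTextWithFallback(ByVal path As String, _
                                     ByVal fallback As Encoding, _
                                     ByRef usedEncoding As Encoding) As String
        ' Third argument True = switch to the BOM's encoding if a BOM is found.
        Using reader As New StreamReader(path, fallback, True)
            Dim text As String = reader.ReadToEnd()
            ' CurrentEncoding is only meaningful after the first read.
            usedEncoding = reader.CurrentEncoding
            Return text
        End Using
    End Function
End Module
```

Because the text is decoded correctly in a single pass, there is no need to close the reader and re-open it. A call might look like `ReadAllTextWithFallback(myFileName, Encoding.GetEncoding(1252), used)`, where code page 1252 (Western European Windows) is only an example of a fallback.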