Last post Aug 21, 2015 06:57 AM by Afzaal.Ahmad.Zeeshan
Aug 20, 2015 06:18 AM | Hisanth
I am trying to save Tamil-language text using C#,
but it is stored as ???????? ??????? ??????? ????????
The column's data type is nvarchar.
If I enter the data directly in the database (right-click the table, select Edit Top 200 Rows, then type my local-language data), it is stored in the correct Tamil form. It displays and retrieves perfectly.
When I use C#, the text changes to ????????????
Help me! How do I fix it?
Aug 20, 2015 06:29 AM | Afzaal.Ahmad.Zeeshan
Hisanth, the problem is neither with SQL Server nor with ASP.NET itself. Rather, the problem is with the encoding used to map those Unicode characters to their byte representation on the machine.
Tamil is part of Unicode, and both SQL Server and ASP.NET support Unicode. The .NET Framework provides Unicode support natively through its char type. That is why Tamil, Hindi, Marathi, etc. are all supported in these frameworks and products. You need to learn how to map the characters to bytes correctly. The question marks that you see mean that the data you submitted has been lost!
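To make the data loss concrete, here is a minimal sketch of how question marks can appear. It assumes the text passes through an encoding that has no Tamil characters (ASCII is used purely as an illustration; the class and method names are hypothetical, and I do not know which encoding your code actually goes through).

```csharp
using System;
using System.Text;

public static class QuestionMarkDemo
{
    // Encodes the text to bytes and decodes it again with the same encoding.
    public static string RoundTrip(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);
        return encoding.GetString(bytes);
    }

    public static void Main()
    {
        string tamil = "தமிழ்"; // 5 UTF-16 code units

        // ASCII cannot represent Tamil, so each code unit is replaced by '?'.
        Console.WriteLine(RoundTrip(tamil, Encoding.ASCII)); // ?????

        // UTF-8 can represent every Unicode character, so nothing is lost.
        Console.WriteLine(RoundTrip(tamil, Encoding.UTF8) == tamil); // True
    }
}
```

Once the characters have been replaced by '?', the original text cannot be recovered from the stored bytes.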
For example, the following code works correctly:
var message = "தமிழ் விசைப்பலகை"; // Sample text from Google
// Save the message in the database, for example:
var command = new SqlCommand("INSERT INTO table_name (column_name) VALUES (@message)");
// Add the parameter; a .NET string value is sent to SQL Server as Unicode (nvarchar)
command.Parameters.Add(new SqlParameter("@message", message));
Once you execute the above code, it will work fine (unless other code that you execute messes up the encoding). The problem is with the encoding being used. I have written an article covering Unicode and encodings:
Reading and writing Unicode in .NET.
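If you build the parameter yourself, it may help to state the Unicode type explicitly. The sketch below is only an illustration under assumptions: the names TamilInsert and BuildInsert and the size 200 are hypothetical, while table_name and column_name are the placeholders from the code above. It builds the command without connecting, so you can inspect the parameter type.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

public static class TamilInsert
{
    // Builds the INSERT command with an explicitly typed Unicode parameter.
    public static SqlCommand BuildInsert(string message)
    {
        var command = new SqlCommand(
            "INSERT INTO table_name (column_name) VALUES (@message)");

        // NVarChar keeps the value as Unicode. With SqlDbType.VarChar the value
        // is converted to a code page, and characters it lacks become '?'.
        // The size 200 is an assumption; match it to your column definition.
        command.Parameters.Add("@message", SqlDbType.NVarChar, 200).Value = message;
        return command;
    }

    public static void Main()
    {
        using (var command = BuildInsert("தமிழ் விசைப்பலகை"))
        {
            Console.WriteLine(command.Parameters["@message"].SqlDbType); // NVarChar

            // To run it, assign an open SqlConnection to command.Connection
            // and call command.ExecuteNonQuery().
        }
    }
}
```

The same caution applies to SQL text built by string concatenation: a Unicode literal there needs the N prefix (N'...'), which is one more reason to prefer parameters.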
I hope I have provided you with enough details to configure and debug the application yourself. If you need any further support, please do reply. :-)
Aug 20, 2015 06:48 AM | Hisanth
If I assign the value directly to the parameter, it is saved correctly. The problem occurs when I use a class.
Aug 21, 2015 06:57 AM | Afzaal.Ahmad.Zeeshan
As I have already mentioned, the problem is neither with your ASP.NET web application nor with SQL Server, and also not with the network stream. The problem is the encoding being used! What encoding are you using? UTF-8, UTF-16 (both endiannesses), and all other encodings store the byte representation of the characters differently. The problem is with that encoding: it maps the characters to a byte representation that the other side does not recognize, so when you try to convert the data back, the content is no longer there.
Are you using the Unicode encoding in .NET?
var encoding = System.Text.Encoding.Unicode;
Then you must know that this is UTF-16LE! If you use BE in this case, or any other encoding, to decode the message, your data will be lost. This is likely what is happening in your case. I have no idea which encoding is being used. Can you post the code where (and how) you capture the message?
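A small sketch of that mismatch, using only the built-in System.Text encodings (the class and method names are hypothetical): the bytes are written as UTF-16LE and then read back once with the matching encoding and once as big-endian.

```csharp
using System;
using System.Text;

public static class EndianDemo
{
    // Encodes with one encoding and decodes with another.
    public static string Transcode(string text, Encoding writer, Encoding reader)
    {
        byte[] bytes = writer.GetBytes(text);
        return reader.GetString(bytes);
    }

    public static void Main()
    {
        string original = "தமிழ்";

        // Same encoding on both sides: the text survives.
        string same = Transcode(original, Encoding.Unicode, Encoding.Unicode);
        Console.WriteLine(same == original); // True

        // Written as UTF-16LE but read as UTF-16BE: the two bytes of every
        // code unit are swapped, so different characters come out.
        string swapped = Transcode(original, Encoding.Unicode, Encoding.BigEndianUnicode);
        Console.WriteLine(swapped == original); // False
    }
}
```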
If that is a third-party tool to get the data from the forms, then please consider asking this question on their forum as they may have a good answer for you related to encoding of the characters.
As discussed in that article, the only problem that will cause your application to lose Unicode data is the use of different encodings in different modules. Choose one and stick to it. Personally, I would suggest UTF-8, which uses only as many bytes per character as it needs! Please think again, choose one encoding, and rewrite the function. The data loss is evident; you cannot ignore it like this. :-)
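One way to stick to a single encoding is to keep it in one place and have every module go through it. This is only a sketch of that idea; TextCodec and its members are hypothetical names, not part of your code or of any library.

```csharp
using System;
using System.Text;

public static class TextCodec
{
    // The single encoding shared by all readers and writers:
    // UTF-8 without a byte order mark, throwing on invalid bytes.
    public static readonly Encoding Shared = new UTF8Encoding(false, true);

    public static byte[] ToBytes(string text)
    {
        return Shared.GetBytes(text);
    }

    public static string FromBytes(byte[] bytes)
    {
        return Shared.GetString(bytes);
    }

    public static void Main()
    {
        // UTF-8 is variable-length: 1 byte for ASCII, 3 bytes for a Tamil letter.
        Console.WriteLine(ToBytes("A").Length);  // 1
        Console.WriteLine(ToBytes("த").Length); // 3

        // Because both directions use the same encoding, the text survives.
        Console.WriteLine(FromBytes(ToBytes("தமிழ்")) == "தமிழ்"); // True
    }
}
```

Passing true for throwOnInvalidBytes is a design choice: a mismatch then raises an exception at the point of decoding, which is easier to debug than garbled text discovered later.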