Last post Nov 28, 2015 06:01 PM by srutzky
Dec 28, 2010 02:32 AM|ironazar|LINK
I want to serialize text in a Unicode language (Persian) to an XML file (<?xml version="1.0" encoding="utf-8"?>). I am trying to use the SEARCHAROO7 code (from www.searcharoo.net) to search in doc/docx and PDF files. It works for a limited number of files,
but for a large number of files it fails with this error: "The surrogate pair (0xD993, 0x2A0) is invalid. A high surrogate character (0xD800 - 0xDBFF) must always be paired with a low surrogate character (0xDC00 - 0xDFFF)."
I tested the files one at a time to see whether some of them have a problem with Unicode, but each file works without error on its own.
Please tell me how I can fix it.
Dec 30, 2010 03:26 AM|Qin Dian Tang - MSFT|LINK
Here are some possible solutions for this error. I hope they help:
Jan 02, 2011 03:28 AM|ironazar|LINK
I had already visited those links before posting, but they did not solve my problem.
Thanks a lot.
Jan 02, 2011 08:30 PM|Qin Dian Tang - MSFT|LINK
It is hard for me to solve this issue. I think it is an encoding issue. Try a different encoding for the Persian text.
Nov 28, 2015 06:01 PM|srutzky|LINK
The error message is very specific: those two values -- 0xD993 and 0x2A0 -- are not a valid combination. In UTF-16, every character is encoded as either one or two 16-bit (double-byte) code units. A Surrogate Pair is the sequence of two code units that encodes a Supplementary Character (a code point above U+FFFF). And, as the error message states,
a Surrogate Pair can only be a high surrogate (0xD800 - 0xDBFF) followed immediately by a low surrogate (0xDC00 - 0xDFFF).
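To make the rule concrete, here is a minimal sketch of the check that the error message describes. It is written in Python purely for illustration (the original code is .NET), and the function names `is_valid_surrogate_pair` and `combine` are hypothetical helpers, not part of any library.

```python
# Ranges quoted in the error message.
HIGH_SURROGATES = range(0xD800, 0xDC00)  # 0xD800 - 0xDBFF
LOW_SURROGATES = range(0xDC00, 0xE000)   # 0xDC00 - 0xDFFF


def is_valid_surrogate_pair(high: int, low: int) -> bool:
    """True only if `high` is a high surrogate and `low` is a low surrogate."""
    return high in HIGH_SURROGATES and low in LOW_SURROGATES


def combine(high: int, low: int) -> int:
    """Decode a valid surrogate pair into its supplementary code point."""
    if not is_valid_surrogate_pair(high, low):
        raise ValueError(f"invalid surrogate pair (0x{high:X}, 0x{low:X})")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# The pair from the error message: 0xD993 is a high surrogate,
# but 0x2A0 is not in the low-surrogate range.
print(is_valid_surrogate_pair(0xD993, 0x2A0))   # False

# A well-formed pair for comparison.
print(is_valid_surrogate_pair(0xD83D, 0xDE00))  # True
print(hex(combine(0xD83D, 0xDE00)))             # 0x1f600
```

The first value in the failing pair is fine on its own; it is the second value that falls outside the allowed range.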
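Note that 0x2A0 looks odd at first because it is shown with only three hex digits (1.5 bytes). Most likely it is simply 0x02A0 displayed without its leading zero, which is an ordinary 16-bit code unit and not a surrogate at all.
Of course, even if the error message has a display issue and the actual value really is 0x2A00, that is still an invalid Surrogate Pair sequence, because neither value falls in the low surrogate range (0xDC00 - 0xDFFF).
Either way, the real problem is the high surrogate 0xD993 appearing in the text without a low surrogate right after it.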
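One possible way to track down data like this is to scan the text for surrogates that have no valid partner before it is serialized. The sketch below is only an illustration in Python, operating on a plain list of UTF-16 code units; `find_unpaired_surrogates` and `replace_unpaired` are hypothetical names, and replacing bad units with U+FFFD is an assumption about what you want, not something Searcharoo does.

```python
def find_unpaired_surrogates(units):
    """Return the indexes of surrogate code units that lack a valid partner."""
    bad = []
    i = 0
    while i < len(units):
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:
            # High surrogate: the next unit must be a low surrogate.
            if i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2  # well-formed pair, skip both units
                continue
            bad.append(i)
        elif 0xDC00 <= u <= 0xDFFF:
            # Low surrogate that was not preceded by a high surrogate.
            bad.append(i)
        i += 1
    return bad


def replace_unpaired(units, replacement=0xFFFD):
    """Return a copy with every unpaired surrogate replaced (U+FFFD by default)."""
    bad = set(find_unpaired_surrogates(units))
    return [replacement if i in bad else u for i, u in enumerate(units)]


# A Persian letter, the bad sequence from the error message, then a valid pair.
units = [0x0633, 0xD993, 0x02A0, 0xD83D, 0xDE00]
print(find_unpaired_surrogates(units))            # [1]
print([hex(u) for u in replace_unpaired(units)])
```

Logging the index of each unpaired surrogate, rather than silently replacing it, may help identify which file or which step of the processing introduces the bad value.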