How to fix UTF encoding for whitespaces?
The byte sequence 194 160 (hex C2 A0) is the UTF-8 encoding of the NO-BREAK SPACE code point, U+00A0 (the same code point that HTML calls &nbsp;).
So it's really not a space, even though it looks like one. (You'll see it won't word-wrap, for instance.) A regular expression match for \s would match it, but a plain comparison with a space won't.
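As a quick check of the points above, here is a minimal C# sketch. The class and method names (NbspDemo, Utf8Bytes, and so on) are made up for illustration, and it assumes .NET's default regex options, under which \s covers Unicode separators such as U+00A0:

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical demo class; the names are illustrative, not from the original answer.
public static class NbspDemo
{
    // U+00A0 NO-BREAK SPACE, written as an escape so it stays visible in source.
    public const string Nbsp = "\u00A0";

    // UTF-8 bytes of the given string.
    public static byte[] Utf8Bytes(string s) => Encoding.UTF8.GetBytes(s);

    // True if the string contains anything the \s class matches (default options).
    public static bool MatchesWhitespaceClass(string s) => Regex.IsMatch(s, @"\s");

    // True only if the string is exactly one ordinary space (U+0020).
    public static bool EqualsPlainSpace(string s) => s == " ";

    public static void Main()
    {
        Console.WriteLine(string.Join(" ", Utf8Bytes(Nbsp))); // 194 160
        Console.WriteLine(MatchesWhitespaceClass(Nbsp));      // True
        Console.WriteLine(EqualsPlainSpace(Nbsp));            // False
    }
}
```

Note that with RegexOptions.ECMAScript the \s class is narrower and would not match U+00A0, so the second result depends on the options in use.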
To simply replace NO-BREAK spaces you can do the following:
src = src.Replace('\u00A0', ' ');
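To see the effect of that replacement, here is a small self-contained sketch. The helper name ReplaceNoBreakSpaces and the sample string are assumptions chosen for illustration; only the Replace call itself comes from the answer:

```csharp
using System;

// Hypothetical wrapper around the one-line fix, for illustration only.
public static class NbspReplace
{
    // Replaces every U+00A0 NO-BREAK SPACE with an ordinary space (U+0020).
    public static string ReplaceNoBreakSpaces(string src)
    {
        return src.Replace('\u00A0', ' ');
    }

    public static void Main()
    {
        string src = "hello\u00A0world"; // looks like "hello world" on screen

        // Splitting on a plain space finds nothing to split on before the fix.
        Console.WriteLine(src.Split(' ').Length);       // 1

        string cleaned = ReplaceNoBreakSpaces(src);
        Console.WriteLine(cleaned.Split(' ').Length);   // 2
        Console.WriteLine(cleaned == "hello world");    // True
    }
}
```

This only handles U+00A0. Other space-like code points would need their own handling.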
A strange character: it looks like a space, but it actually isn't one.