Tuesday, March 4, 2014

What's the difference between encoding and charset?

Every encoding has a particular charset associated with it, but there can be more than one encoding for a given charset. A charset is simply what it sounds like, a set of characters. There are a large number of charsets, including many that are intended for particular scripts or languages.
We are, however, well along in the transition to Unicode, whose character set can represent almost all of the world's scripts. Unicode itself has more than one encoding. An encoding is a way of mapping a string of characters to a string of bytes. Examples of Unicode encodings include UTF-8, UTF-16 BE, and UTF-16 LE. Each of these has advantages for particular applications or machine architectures.
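
To make the distinction concrete, here is a small Python sketch that takes one string of Unicode characters and encodes it three ways. The sample text "A€" is just an illustration I picked, not anything special; the codec names are the ones Python's standard library uses for these encodings.

```python
# One charset (Unicode), three encodings: the same characters
# become different byte sequences depending on the encoding.
text = "A\u20ac"  # 'A' (U+0041) followed by the euro sign (U+20AC); sample text chosen for illustration

# Python's standard codec names for the encodings mentioned above.
encodings = ("utf-8", "utf-16-be", "utf-16-le")

# Map each encoding name to the bytes it produces for the same text.
encoded = {name: text.encode(name) for name in encodings}

for name, data in encoded.items():
    print(f"{name:10} {len(data)} bytes  {data.hex()}")

# Decoding with the matching encoding recovers the original characters.
assert all(data.decode(name) == text for name, data in encoded.items())
```

The characters are the same in all three cases; only the bytes differ. UTF-8 uses one byte for "A" and three for the euro sign, while both UTF-16 variants use two bytes for each character and differ only in byte order.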
