EncodingStatic

Properties

Name | Description |
---|---|
ASCII (get) | Gets an encoding for the ASCII (7-bit) character set. |
BigEndianUnicode (get) | Gets an encoding for the UTF-16 format that uses the big endian byte order. |
Default (get) | Gets an encoding for the operating system's current ANSI code page. |
Unicode (get) | Gets an encoding for the UTF-16 format using the little endian byte order. |
UTF32 (get) | Gets an encoding for the UTF-32 format using the little endian byte order. |
UTF7 (get) | Gets an encoding for the UTF-7 format. |
UTF8 (get) | Gets an encoding for the UTF-8 format. |

Methods

Name | Description |
---|---|
Convert | Converts a set of bytes from one encoding to another encoding. |
GetEncoding | Returns the encoding associated with the specified code page identifier or name. Optional parameters specify an error handler for characters that cannot be encoded and byte sequences that cannot be decoded. |
GetEncodings | Returns a list of minimal information about each encoding. |
Encoding is the process of transforming a set of Unicode characters into a sequence of bytes. In contrast, decoding is the process of transforming a sequence of encoded bytes into a set of Unicode characters.
Note that Encoding is intended to operate on Unicode characters, not on arbitrary binary data such as byte arrays. If your application must encode arbitrary binary data as text, it should use a binary-to-text scheme such as Base64, which is implemented by methods such as Convert.ToBase64CharArray.
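VBCorLib provides implementations of the Encoding class for the current Unicode encodings and for other encodings. The static properties listed above return the ASCII, UTF-7, UTF-8, UTF-16 (little endian and big endian), and UTF-32 implementations.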
The Encoding class is primarily intended to convert between different encodings and Unicode. Often one of the derived Unicode classes is the correct choice for your application.
Use the GetEncoding method to obtain other encodings, and the GetEncodings method to get a list of all encodings.
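As a minimal sketch of obtaining an encoding by code page identifier, the following compares a Windows-1252 encoding with the UTF-8 encoding. It assumes code page 1252 is available on the system; the variable names are illustrative only.

```vb
Public Sub Main()
    Dim Latin1 As Encoding
    Dim Utf8 As Encoding
    Dim Text As String
    Dim Bytes() As Byte

    ' 1252 is the Windows code page identifier for Western European (Latin 1).
    ' Assumption: this code page is installed on the machine running the code.
    Set Latin1 = Encoding.GetEncoding(1252)
    Set Utf8 = Encoding.UTF8

    ' "caf" followed by U+00E9 (Latin small letter e with acute).
    Text = "caf" & ChrW$(&HE9)

    ' Windows-1252 stores U+00E9 in a single byte.
    Bytes = Latin1.GetBytes(Text)
    Console.WriteLine "Code page 1252 byte count: " & (UBound(Bytes) - LBound(Bytes) + 1)

    ' UTF-8 stores U+00E9 in two bytes.
    Bytes = Utf8.GetBytes(Text)
    Console.WriteLine "UTF-8 byte count: " & (UBound(Bytes) - LBound(Bytes) + 1)
End Sub
```

If code page 1252 is available, the two counts should differ by one, because the accented character takes one byte in Windows-1252 and two bytes in UTF-8.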
If the data to be converted is available only in sequential blocks (such as data read from a stream) or if the amount of data is so large that it needs to be divided into smaller blocks, your application should use the Decoder or the Encoder provided by the GetDecoder method or the GetEncoder method, respectively, of a derived class.
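The UTF-16 and the UTF-32 encoders can use the big endian byte order (most significant byte first) or the little endian byte order (least significant byte first). For example, the Latin Capital Letter A (U+0041) is serialized as follows (in hexadecimal): 00 41 in UTF-16 big endian, 41 00 in UTF-16 little endian, 00 00 00 41 in UTF-32 big endian, and 41 00 00 00 in UTF-32 little endian.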
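The byte orders described above can be inspected with a short sketch like the one below. ToHex is a hypothetical helper written for this example and is not part of VBCorLib. The sketch assumes GetBytes returns only the encoded characters, without a byte order mark.

```vb
' Hypothetical helper (not part of VBCorLib): formats a byte array
' as space-separated, two-digit hexadecimal values.
Private Function ToHex(ByRef Bytes() As Byte) As String
    Dim i As Long
    Dim Result As String

    For i = LBound(Bytes) To UBound(Bytes)
        Result = Result & Right$("0" & Hex$(Bytes(i)), 2) & " "
    Next i

    ToHex = RTrim$(Result)
End Function

Public Sub Main()
    Dim Enc As Encoding
    Dim Text As String
    Dim Bytes() As Byte

    Text = "A"  ' Latin Capital Letter A (U+0041)

    Set Enc = Encoding.BigEndianUnicode
    Bytes = Enc.GetBytes(Text)
    Console.WriteLine "UTF-16 big endian:    " & ToHex(Bytes)

    Set Enc = Encoding.Unicode
    Bytes = Enc.GetBytes(Text)
    Console.WriteLine "UTF-16 little endian: " & ToHex(Bytes)

    Set Enc = Encoding.UTF32
    Bytes = Enc.GetBytes(Text)
    Console.WriteLine "UTF-32 little endian: " & ToHex(Bytes)
End Sub
```

Under that assumption, the three lines should show 00 41, 41 00, and 41 00 00 00 respectively.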
The GetPreamble method retrieves an array of bytes that includes the byte order mark (BOM). If this byte array is prefixed to an encoded stream, it helps the decoder to identify the encoding format used.
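As a rough illustration, the preambles of several encodings can be printed side by side. ToHex is a hypothetical helper written for this example, not a VBCorLib function. Whether a given encoding instance returns a non-empty preamble depends on how it was constructed, so the sketch makes no claim about the exact output.

```vb
' Hypothetical helper (not part of VBCorLib): formats a byte array
' as space-separated, two-digit hexadecimal values.
Private Function ToHex(ByRef Bytes() As Byte) As String
    Dim i As Long
    Dim Result As String

    For i = LBound(Bytes) To UBound(Bytes)
        Result = Result & Right$("0" & Hex$(Bytes(i)), 2) & " "
    Next i

    ToHex = RTrim$(Result)
End Function

Public Sub Main()
    Dim Enc As Encoding
    Dim Preamble() As Byte

    Set Enc = Encoding.UTF8
    Preamble = Enc.GetPreamble
    Console.WriteLine "UTF-8 preamble:                " & ToHex(Preamble)

    Set Enc = Encoding.Unicode
    Preamble = Enc.GetPreamble
    Console.WriteLine "UTF-16 little endian preamble: " & ToHex(Preamble)

    Set Enc = Encoding.BigEndianUnicode
    Preamble = Enc.GetPreamble
    Console.WriteLine "UTF-16 big endian preamble:    " & ToHex(Preamble)
End Sub
```

For reference, the Unicode Standard defines the byte order mark as EF BB BF in UTF-8, FF FE in UTF-16 little endian, and FE FF in UTF-16 big endian.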
For more information on byte order and the byte order mark, see The Unicode Standard at the Unicode home page.
Note that the encoding classes allow errors to be handled in more than one way: a character that cannot be encoded can be replaced silently (the example below shows the ASCII encoding substituting "?"), mapped to a best fit character, or reported by throwing an exception.
Applications should throw exceptions on all data stream errors, either by using a "throwonerror" flag where one is available or by using the EncoderExceptionFallback and DecoderExceptionFallback classes. Best fit fallback is generally discouraged because it can cause data loss or confusion and is slower than simple character replacement. For ANSI encodings, best fit is the default behavior.
The following example converts a string from one encoding to another.

    Public Sub Main()
        Dim UnicodeString As String
        Dim AsciiEncoding As Encoding
        Dim UnicodeEncoding As Encoding
        Dim AsciiBytes() As Byte
        Dim UnicodeBytes() As Byte
        Dim AsciiChars() As Integer
        Dim AsciiString As String

        Set Console.OutputEncoding = Encoding.UTF8

        UnicodeString = t("This string contains the unicode character Pi (\u03a0)")

        ' Create two different encodings.
        Set AsciiEncoding = Encoding.ASCII
        Set UnicodeEncoding = Encoding.Unicode

        ' Convert the string into a byte array.
        UnicodeBytes = UnicodeEncoding.GetBytes(UnicodeString)

        ' Perform the conversion from one encoding to the other.
        AsciiBytes = Encoding.Convert(UnicodeEncoding, AsciiEncoding, UnicodeBytes)

        ' Convert the new Byte() into a Char() and then into a string.
        AsciiChars = AsciiEncoding.GetChars(AsciiBytes)
        AsciiString = NewString(AsciiChars)

        ' Display the strings created before and after the conversion.
        Console.WriteLine "Original string: " & UnicodeString
        Console.WriteLine "Ascii converted string: " & AsciiString
        Console.ReadKey
    End Sub

    ' This example code produces the following output.
    '
    ' Original string: This string contains the unicode character Pi (Π)
    ' Ascii converted string: This string contains the unicode character Pi (?)