Encoding
Name | Description |
---|---|
BodyName (get) | Returns the encoding name to be used with mail agent body tags. |
CodePage (get) | Returns the code page identifier for this encoding. |
DecoderFallback (get) | Gets the DecoderFallback object for the current Encoding object. |
DecoderFallback (set) | Sets the DecoderFallback object for the current Encoding object. |
EncoderFallback (get) | Gets the EncoderFallback object for the current Encoding object. |
EncoderFallback (set) | Sets the EncoderFallback for the current Encoding object. |
EncodingName (get) | When implemented in a derived class, gets the human-readable description of the current encoding. |
HeaderName (get) | Returns the encoding name to be used with mail agent header tags. |
IsBrowserDisplay (get) | Indicates if this encoding can be used by browsers to display text. |
IsBrowserSave (get) | Indicates if this encoding can be used by browsers to save data. |
IsMailNewsDisplay (get) | Indicates if this encoding can be used to display mail and news by mail and news clients. |
IsMailNewsSave (get) | Indicates if this encoding can be used to save data by mail and news clients. |
IsReadOnly (get) | When implemented in a derived class, gets a value indicating whether the current encoding is read-only. |
IsSingleByte (get) | Indicates whether the current encoding uses single-byte code points. |
WebName (get) | Returns the encoding name registered with the Internet Assigned Numbers Authority. |
WindowsCodePage (get) | Returns the Windows Operating Systems code page for this encoding. |
Name | Description |
---|---|
Clone | Creates a shallow copy of the current Encoding object. |
Equals | Returns a boolean indicating if the value and this object instance are the same instance. |
GetByteCount | Returns the number of bytes that would be produced from the set of characters using this encoding. |
GetBytes | Encodes a set of characters into an array of bytes. |
GetBytesEx | Encodes a set of characters into an array of bytes, returning the number of bytes produced. |
GetCharCount | When implemented in a derived class, calculates the number of characters produced by decoding a sequence of bytes from the specified byte array. |
GetChars | When implemented in a derived class, decodes all the bytes in the specified byte array into a set of characters. |
GetCharsEx | When implemented in a derived class, decodes a sequence of bytes from the specified byte array into the specified character array. |
GetDecoder | When implemented in a derived class, obtains a decoder that converts an encoded sequence of bytes into a sequence of characters. |
GetEncoder | When implemented in a derived class, obtains an encoder that converts a sequence of Unicode characters into an encoded sequence of bytes. |
GetHashCode | Returns a pseudo-unique number identifying this instance. |
GetMaxByteCount | When implemented in a derived class, calculates the maximum number of bytes produced by encoding the specified number of characters. |
GetMaxCharCount | Returns the maximum number of characters that can be decoded from the number of bytes specified. |
GetPreamble | When implemented in a derived class, returns a sequence of bytes that specifies the encoding used. |
GetString | When implemented in a derived class, decodes all the bytes in the specified byte array into a string. |
ToString | Returns a string representation of this object instance. |
Encoding is the process of transforming a set of Unicode characters into a sequence of bytes. In contrast, decoding is the process of transforming a sequence of encoded bytes into a set of Unicode characters.
Note that Encoding is intended to operate on Unicode characters, not on arbitrary binary data such as byte arrays. If your application must encode arbitrary binary data into text, it should use a binary-to-text protocol such as uuencode or Base64; Base64 is implemented by methods such as Convert.ToBase64CharArray.
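As an illustration of why a binary-to-text protocol is the right tool here, the following Python sketch (not VBCorLib code; the sample bytes are arbitrary) turns raw bytes into ASCII text with Base64 and recovers them unchanged.

```python
import base64

# Arbitrary binary data; 0xFF and 0x80 are not valid standalone UTF-8 bytes.
raw = bytes([0x00, 0xFF, 0x10, 0x80])

as_text = base64.b64encode(raw).decode("ascii")  # binary -> printable text
restored = base64.b64decode(as_text)             # printable text -> binary

print(as_text)           # AP8QgA==
print(restored == raw)   # True
```

Treating the same bytes as encoded characters would either fail or silently alter them, whereas the Base64 round trip is lossless.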
VBCorLib provides the following implementations of the Encoding class to support current Unicode encodings and other encodings:
The Encoding class is primarily intended to convert between different encodings and Unicode. Often one of the derived Unicode classes is the correct choice for your application.
Use the GetEncoding method to obtain other encodings, and the GetEncodings method to get a list of all encodings.
If the data to be converted is available only in sequential blocks (such as data read from a stream) or if the amount of data is so large that it needs to be divided into smaller blocks, your application should use the Decoder or the Encoder provided by the GetDecoder method or the GetEncoder method, respectively, of a derived class.
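The reason a stateful decoder is needed for block-wise data is that a multi-byte character can be split across two blocks. The sketch below uses Python's incremental decoder as a stand-in for the Decoder object described above; it is an analogy under that assumption, not the VBCorLib API.

```python
import codecs

data = "Pi (\u03a0)".encode("utf-8")

# Split the stream in the middle of the two-byte sequence for U+03A0.
blocks = [data[:5], data[5:]]

# The incremental decoder keeps the trailing partial sequence between calls.
decoder = codecs.getincrementaldecoder("utf-8")()
pieces = [decoder.decode(block) for block in blocks]
pieces.append(decoder.decode(b"", final=True))  # flush; errors if bytes remain

result = "".join(pieces)
print(pieces[0])   # Pi (
print(result)      # Pi (Π)
```

Decoding each block independently with a stateless call would fail on the first block, because it ends with an incomplete byte sequence.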
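The UTF-16 and the UTF-32 encoders can use the big endian byte order (most significant byte first) or the little endian byte order (least significant byte first). For example, the Latin Capital Letter A (U+0041) is serialized as follows (in hexadecimal):
UTF-16 big endian byte order: 00 41
UTF-16 little endian byte order: 41 00
UTF-32 big endian byte order: 00 00 00 41
UTF-32 little endian byte order: 41 00 00 00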
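To check the byte order of U+0041 concretely, the serializations can be reproduced with a short Python sketch. The codec names are Python's, not VBCorLib's, and are used here only to make the byte layout visible.

```python
# Serialize U+0041 under each byte order and show the bytes in hexadecimal.
codecs_to_show = ("utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le")
serialized = {name: "A".encode(name).hex(" ") for name in codecs_to_show}

for name, hex_bytes in serialized.items():
    print(f"{name}: {hex_bytes}")
# utf-16-be: 00 41
# utf-16-le: 41 00
# utf-32-be: 00 00 00 41
# utf-32-le: 41 00 00 00
```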
The GetPreamble method retrieves an array of bytes that includes the byte order mark (BOM). If this byte array is prefixed to an encoded stream, it helps the decoder to identify the encoding format used.
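The effect of a byte order mark can be sketched in Python, whose generic "utf-16" codec inspects a leading BOM to pick the byte order. This is an analogy to prefixing the GetPreamble bytes to a stream, not VBCorLib code.

```python
import codecs

# Prefix each payload with the BOM that matches its byte order.
big_endian = codecs.BOM_UTF16_BE + "A".encode("utf-16-be")     # FE FF 00 41
little_endian = codecs.BOM_UTF16_LE + "A".encode("utf-16-le")  # FF FE 41 00

# The BOM-aware codec reads the mark, selects the byte order, and drops the mark.
print(big_endian.decode("utf-16"))     # A
print(little_endian.decode("utf-16"))  # A
```

Without the BOM, the decoder would have to be told the byte order explicitly or guess it.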
For more information on byte order and the byte order mark, see The Unicode Standard at the Unicode home page.
Note that the encoding classes allow errors to:
It is recommended that applications throw exceptions on all data stream errors. An application can either use a "throwonerror" flag where one is available or use the EncoderExceptionFallback and DecoderExceptionFallback classes. Best fit fallback is often not recommended because it can cause data loss or confusion and is slower than simple character replacements. For ANSI encodings, the best fit behavior is the default.
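The difference between failing on an error and silently replacing a character can be sketched in Python, where the `errors` argument plays a role loosely comparable to the fallback classes. This is a conceptual illustration only; it does not use or describe the VBCorLib fallback API.

```python
text = "Pi (\u03a0)"  # U+03A0 has no ASCII representation

# Strict handling: the unencodable character raises an exception.
try:
    text.encode("ascii", errors="strict")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Replacement handling: the character is silently changed to "?".
replaced = text.encode("ascii", errors="replace")

print(strict_failed)  # True
print(replaced)       # b'Pi (?)'
```

The replacement path loses information without any signal to the caller, which is the data-loss risk that the recommendation to throw exceptions is meant to avoid.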
The following example converts a string from one encoding to another.
    Public Sub Main()
        Dim UnicodeString As String
        Dim AsciiEncoding As Encoding
        Dim UnicodeEncoding As Encoding
        Dim AsciiBytes() As Byte
        Dim UnicodeBytes() As Byte
        Dim AsciiChars() As Integer
        Dim AsciiString As String

        Set Console.OutputEncoding = Encoding.UTF8

        UnicodeString = t("This string contains the unicode character Pi (\u03a0)")

        ' Create two different encodings.
        Set AsciiEncoding = Encoding.ASCII
        Set UnicodeEncoding = Encoding.Unicode

        ' Convert the string into a byte array.
        UnicodeBytes = UnicodeEncoding.GetBytes(UnicodeString)

        ' Perform the conversion from one encoding to the other.
        AsciiBytes = Encoding.Convert(UnicodeEncoding, AsciiEncoding, UnicodeBytes)

        ' Convert the new Byte() into a Char() and then into a string.
        AsciiChars = AsciiEncoding.GetChars(AsciiBytes)
        AsciiString = NewString(AsciiChars)

        ' Display the strings created before and after the conversion.
        Console.WriteLine "Original string: " & UnicodeString
        Console.WriteLine "Ascii converted string: " & AsciiString
        Console.ReadKey
    End Sub

    ' This example code produces the following output.
    '
    ' Original string: This string contains the unicode character Pi (Π)
    ' Ascii converted string: This string contains the unicode character Pi (?)