UTF8Encoding
Implements: Encoding, ICloneable, IObject
Name | Description |
---|---|
BodyName (get) | Gets the encoding name to be used with mail agent body tags. |
CodePage (get) | Gets the code page identifier for this encoding. |
DecoderFallback (get) | Gets the DecoderFallback object for the current Encoding object. |
DecoderFallback (set) | Sets the DecoderFallback object for the current Encoding object. |
EncoderFallback (get) | Gets the EncoderFallback object for the current Encoding object. |
EncoderFallback (set) | Sets the EncoderFallback object for the current Encoding object. |
EncodingName (get) | Gets the human-readable description of the current encoding. |
HeaderName (get) | Gets the encoding name to be used with mail agent header tags. |
IsBrowserDisplay (get) | Gets a value indicating whether browsers can use this encoding to display text. |
IsBrowserSave (get) | Gets a value indicating whether browsers can use this encoding to save content. |
IsMailNewsDisplay (get) | Gets a value indicating whether mail and news clients can use this encoding to display content. |
IsMailNewsSave (get) | Gets a value indicating whether mail and news clients can use this encoding to save content. |
IsReadOnly (get) | When implemented in a derived class, gets a value indicating whether the current encoding is read-only. |
IsSingleByte (get) | Gets a value indicating whether the current encoding uses single-byte code points. |
WebName (get) | Gets the encoding name registered with the Internet Assigned Numbers Authority. |
WindowsCodePage (get) | Gets the Windows Operating Systems code page for this encoding. |
Name | Description |
---|---|
Clone | Creates a clone of the current Encoding instance. |
Equals | Determines whether the specified value is equal to the current UTF8Encoding object. |
GetByteCount | Calculates the number of bytes produced by encoding the characters in the specified String or Integer(). |
GetBytes | Encodes all the characters in the specified character array or string into a sequence of bytes. |
GetBytesEx | Encodes a set of characters into an array of bytes, returning the number of bytes produced. |
GetCharCount | Calculates the number of characters produced by decoding a sequence of bytes from the specified byte array. |
GetChars | Decodes a sequence of bytes from the specified byte array into a set of characters. |
GetCharsEx | Decodes a sequence of bytes from the specified byte array into the specified character array. |
GetDecoder | Obtains a decoder that converts a UTF-8 encoded sequence of bytes into a sequence of Unicode characters. |
GetEncoder | Obtains an encoder that converts a sequence of Unicode characters into a UTF-8 encoded sequence of bytes. |
GetHashCode | Returns the hash code for the current instance. |
GetMaxByteCount | Calculates the maximum number of bytes produced by encoding the specified number of characters. |
GetMaxCharCount | Calculates the maximum number of characters produced by decoding the specified number of bytes. |
GetPreamble | Returns a Unicode byte order mark encoded in UTF-8 format, if the constructor for this instance requests a byte order mark. |
GetString | Decodes a range of bytes from a byte array into a string. |
ToString | Returns a string representation of the current object. |
Encoding is the process of transforming a set of Unicode characters into a sequence of bytes. Decoding is the process of transforming a sequence of encoded bytes into a set of Unicode characters.
UTF-8 encoding represents each code point as a sequence of one to four bytes.
The GetByteCount method determines how many bytes result from encoding a set of Unicode characters, and the GetBytes method performs the actual encoding.
Likewise, the GetCharCount method determines how many characters result from decoding a sequence of bytes, and the GetChars and GetString methods perform the actual decoding.
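The pairing of counting and converting methods described above can be sketched as follows. This is a minimal, untested sketch rather than a definitive usage pattern: it relies only on the methods listed in the table above and on the t() escape helper and Console calls used in the example at the end of this page. The variable names are illustrative.

```vb
Public Sub Main()
    Dim UTF8 As New UTF8Encoding
    Dim Source As String
    Dim Bytes() As Byte

    ' Six characters; Pi (\u03a0) lies outside the ASCII range.
    Source = t("Pi (\u03a0)")

    ' Ask how many bytes encoding will produce, then encode.
    Console.WriteLine "Bytes needed: " & UTF8.GetByteCount(Source)
    Bytes = UTF8.GetBytes(Source)
    Console.WriteLine "Bytes produced: " & (UBound(Bytes) - LBound(Bytes) + 1)

    ' Ask how many characters the bytes decode to, then decode.
    Console.WriteLine "Characters expected: " & UTF8.GetCharCount(Bytes)
    Console.WriteLine "Characters decoded: " & Len(UTF8.GetString(Bytes))
End Sub
```

In UTF-8 the five ASCII characters take one byte each and Pi takes two, so both byte figures should be 7 and both character figures should be 6.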
UTF8Encoding corresponds to the Windows code page 65001.
Optionally, the UTF8Encoding object provides a preamble, which is an array of bytes that can be prefixed to the sequence of bytes resulting from the encoding process. If the preamble contains a byte order mark (BOM), it helps the decoder determine the byte order and the transformation format (UTF). The GetPreamble method retrieves an array of bytes that can include the BOM. For more information on byte order and the byte order mark, see The Unicode Standard at the Unicode home page.
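As a rough illustration of the preamble, the sketch below requests a byte order mark at construction time and prints the bytes returned by GetPreamble. It is untested and rests on an assumption: that the library exposes a constructor function named Cor.NewUTF8Encoding whose first argument requests the UTF-8 identifier. Check that name and its parameter order against the constructor documentation before relying on it.

```vb
Public Sub Main()
    Dim UTF8 As UTF8Encoding
    Dim Preamble() As Byte
    Dim b As Variant

    ' Assumption: Cor.NewUTF8Encoding(True) asks the instance to emit a BOM.
    Set UTF8 = Cor.NewUTF8Encoding(True)

    ' Retrieve the preamble and print each byte in decimal.
    Preamble = UTF8.GetPreamble

    For Each b In Preamble
        Console.WriteValue "[{0}]", b
    Next

    Console.WriteLine
End Sub
```

The byte order mark encoded in UTF-8 is the three-byte sequence EF BB BF (239, 187, 191 in decimal), so those are the values to expect if the instance was created as assumed.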
Note |
---|
To enable error detection and to make the class instance more secure, the application should use the UTF8Encoding constructor that takes a ThrowOnInvalidBytes parameter and set that parameter to True. With error detection, a method that detects an invalid sequence of characters or bytes throws an ArgumentException. Without error detection, no exception is thrown, and the invalid sequence is generally ignored. |
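The error detection described in the note might be exercised along these lines. This is an untested sketch under stated assumptions: Cor.NewUTF8Encoding is assumed to be the constructor function, and its second argument is assumed to correspond to ThrowOnInvalidBytes. The error is trapped with ordinary VB6 error handling, and the message text it reports is not specified here.

```vb
Public Sub Main()
    Dim Strict As UTF8Encoding
    Dim Bytes(0 To 1) As Byte

    ' Assumption: the second argument maps to ThrowOnInvalidBytes.
    Set Strict = Cor.NewUTF8Encoding(False, True)

    ' &HC3 opens a two-byte sequence, but &H28 is not a valid continuation byte.
    Bytes(0) = &HC3
    Bytes(1) = &H28

    On Error GoTo InvalidBytes
    Console.WriteLine Strict.GetString(Bytes)
    Exit Sub

InvalidBytes:
    ' Reached only if the decoder raises an error for the invalid sequence.
    Console.WriteLine "Decoding failed: " & Err.Description
End Sub
```

With an instance created without error detection, the same call would be expected to return a string without raising an error, as the note describes.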
The following example demonstrates how to use a UTF8Encoding object to encode a string of Unicode characters and store the result in a byte array. Notice that when EncodedBytes is decoded back to a string, no data is lost.
    Public Sub Main()
        Dim UTF8 As New UTF8Encoding
        Dim UnicodeString As String
        Dim EncodedBytes() As Byte
        Dim DecodedString As String
        Dim b As Variant

        Set Console.OutputEncoding = Encoding.UTF8

        ' A Unicode string with two characters outside an 8-bit code range.
        UnicodeString = t("This unicode string contains two characters with codes outside an 8-bit code range, Pi (\u03a0) and Sigma (\u03a3).")
        Console.WriteLine "Original string:"
        Console.WriteLine UnicodeString

        ' Encode the string.
        EncodedBytes = UTF8.GetBytes(UnicodeString)
        Console.WriteLine
        Console.WriteLine "Encoded bytes:"

        For Each b In EncodedBytes
            Console.WriteValue "[{0}]", b
        Next

        Console.WriteLine

        ' Decode bytes back to string.
        ' Notice Pi and Sigma characters are still present.
        DecodedString = UTF8.GetString(EncodedBytes)
        Console.WriteLine
        Console.WriteLine "Decoded bytes:"
        Console.WriteLine DecodedString
        Console.ReadKey
    End Sub