UTF8Encoding: GetEncoder
Obtains an encoder that converts a sequence of Unicode characters into a UTF-8 encoded sequence of bytes.
Public Function GetEncoder ( ) As Encoder
The Encoder.GetBytes method converts sequential blocks of characters into sequential blocks of bytes, in a manner similar to the GetBytesEx method. However, an Encoder maintains state information between calls so it can correctly encode character sequences that span blocks. The Encoder also preserves trailing characters at the end of data blocks and uses the trailing characters in the next encoding operation. For example, a data block might end with an unmatched high surrogate, and the matching low surrogate might be in the next data block. Therefore, GetDecoder and GetEncoder are useful for network transmission and file operations, because those operations often deal with blocks of data instead of a complete data stream.
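The carry-over of an unmatched high surrogate between blocks can be illustrated with a minimal, language-neutral sketch in Python. This is not the library's implementation: the class `Utf8BlockEncoder`, its `get_bytes` method, and the choice to emit U+FFFD for an unpaired surrogate are all assumptions made for illustration only.

```python
class Utf8BlockEncoder:
    """Hypothetical stateful encoder: UTF-16 code units in, UTF-8 bytes out."""

    REPLACEMENT = 0xFFFD  # assumption: unpaired surrogates become U+FFFD

    def __init__(self):
        # High surrogate left over from the end of the previous block, if any.
        self._pending_high = None

    def get_bytes(self, units, flush=False):
        out = bytearray()
        for unit in units:
            if self._pending_high is not None:
                high, self._pending_high = self._pending_high, None
                if 0xDC00 <= unit <= 0xDFFF:
                    # Matching low surrogate: combine the pair into one code point.
                    cp = 0x10000 + ((high - 0xD800) << 10) + (unit - 0xDC00)
                    out += chr(cp).encode("utf-8")
                    continue
                # The carried high surrogate had no partner.
                out += chr(self.REPLACEMENT).encode("utf-8")
            if 0xD800 <= unit <= 0xDBFF:
                # Hold the high surrogate until the next unit (or next block) arrives.
                self._pending_high = unit
            elif 0xDC00 <= unit <= 0xDFFF:
                # Low surrogate with no preceding high surrogate.
                out += chr(self.REPLACEMENT).encode("utf-8")
            else:
                out += chr(unit).encode("utf-8")
        if flush and self._pending_high is not None:
            # End of data: nothing more can complete the pair.
            self._pending_high = None
            out += chr(self.REPLACEMENT).encode("utf-8")
        return bytes(out)


# U+1F600 is the surrogate pair D83D DE00, split here across two blocks.
encoder = Utf8BlockEncoder()
first = encoder.get_bytes([0x61, 0xD83D])               # b"a"; D83D is held back
second = encoder.get_bytes([0xDE00, 0x62], flush=True)  # F0 9F 98 80 62
```

A stateless call on each block would see an unmatched surrogate in both blocks. The stateful sketch emits nothing for the high surrogate until its partner arrives in the next block.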
If error detection is enabled (that is, the ThrowOnInvalidCharacters parameter of the constructor is set to true), it is also enabled in the Encoder returned by this method. If an invalid sequence is encountered while error detection is enabled, the state of the encoder is undefined and processing must stop.
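As an analogy only, the same strict-versus-lenient distinction can be seen in Python's standard `codecs` incremental encoder. The helper `encode_blocks` is a hypothetical name, and Python's `errors` argument is not the ThrowOnInvalidCharacters parameter; the sketch only shows the pattern of abandoning an encoder once strict error detection has fired.

```python
import codecs


def encode_blocks(blocks, errors="strict"):
    """Encode a list of string blocks with one incremental UTF-8 encoder."""
    encoder = codecs.getincrementalencoder("utf-8")(errors=errors)
    out = bytearray()
    for i, block in enumerate(blocks):
        final = i == len(blocks) - 1
        # With errors="strict", an invalid character raises here; the exception
        # propagates and the encoder is discarded rather than reused.
        out += encoder.encode(block, final)
    return bytes(out)


# A lone surrogate is invalid input for UTF-8 encoding.
try:
    encode_blocks(["ab", "\ud800"])
    failed = False
except UnicodeEncodeError:
    failed = True

# Without strict error detection the invalid character is substituted instead.
lenient = encode_blocks(["ab", "\ud800"], errors="replace")
```

After the strict call fails, the sketch does not touch the encoder again. A new encoder is created for the next attempt, in keeping with the rule that processing must stop once an invalid sequence is detected.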
The following example demonstrates how to use the GetEncoder method to obtain an encoder to convert a sequence of characters into a UTF-8 encoded sequence of bytes.
    Public Sub Main()
        Dim Chars() As Integer
        Dim Bytes() As Byte
        Dim UTF8Encoder As Encoder
        Dim ByteCount As Long
        Dim BytesEncodedCount As Long
        Dim b As Variant

        Chars = NewChars("a", "b", "c", &H300, &HA0A0)

        Set UTF8Encoder = Encoding.UTF8.GetEncoder

        ByteCount = UTF8Encoder.GetByteCount(Chars, 2, 3, True)
        ReDim Bytes(0 To ByteCount - 1)
        BytesEncodedCount = UTF8Encoder.GetBytes(Chars, 2, 3, Bytes, 0, True)

        Console.WriteLine "{0} bytes used to encode characters.", BytesEncodedCount
        Console.WriteValue "Encoded bytes: "

        For Each b In Bytes
            Console.WriteValue "[{0}]", b
        Next

        Console.WriteLine
        Console.ReadKey
    End Sub

    ' This code produces the following output.
    '
    ' 6 bytes used to encode characters.
    ' Encoded bytes: [99][204][128][234][130][160]