EncodingStatic: UTF8 (get)
Gets an encoding for the UTF-8 format.
Public Property Get UTF8 ( ) As UTF8Encoding
This property returns a UTF8Encoding object that encodes Unicode characters into a sequence of one to four bytes per character, and that decodes a UTF-8-encoded byte array to Unicode characters.
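As a quick check of the one-to-four-byte range, the following sketch prints the byte count for one character from each UTF-8 length class. It relies only on calls that appear in the examples on this page (NewChars, GetByteCount, CorString.Format). The sample characters are chosen here for illustration and are not part of the original reference. The counts in the comments follow from the UTF-8 encoding rules.

```vb
Public Sub Main()
    Dim Enc As Encoding
    Dim Chars() As Integer

    Set Enc = Encoding.UTF8

    ' U+007A (Latin Small Letter Z) lies in the ASCII range: 1 byte in UTF-8.
    Chars = NewChars("z")
    Debug.Print CorString.Format("U+007A  : {0} byte(s)", Enc.GetByteCount(Chars))

    ' U+00E9 (Latin Small Letter E With Acute): 2 bytes in UTF-8.
    Chars = NewChars(ChrW$(&HE9))
    Debug.Print CorString.Format("U+00E9  : {0} byte(s)", Enc.GetByteCount(Chars))

    ' U+20AC (Euro Sign): 3 bytes in UTF-8.
    Chars = NewChars(ChrW$(&H20AC))
    Debug.Print CorString.Format("U+20AC  : {0} byte(s)", Enc.GetByteCount(Chars))

    ' The surrogate pair U+D83D U+DE00 forms the single character U+1F600:
    ' 4 bytes in UTF-8, even though it occupies two UTF-16 code units.
    Chars = NewChars(ChrW$(&HD83D), ChrW$(&HDE00))
    Debug.Print CorString.Format("U+1F600 : {0} byte(s)", Enc.GetByteCount(Chars))
End Sub
```

A character outside the Basic Multilingual Plane is stored as two Integer values in a VB character array, so the four-byte case always comes from a surrogate pair.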
The UTF8Encoding object that is returned by this property may not have the appropriate behavior for your application. It uses replacement fallback to replace each string that it cannot encode and each byte that it cannot decode with a question mark ("?") character. Instead, you can call the NewUTF8Encoding(Boolean, Boolean) constructor to instantiate a UTF8Encoding object whose fallback raises either an EncoderFallbackException or a DecoderFallbackException, as the following example illustrates.
Public Sub Main()
    Dim Enc As UTF8Encoding
    Dim Value As String
    Dim Value2 As String
    Dim Bytes() As Byte
    Dim Byt As Variant

    Set Enc = NewUTF8Encoding(True, True)
    Value = t("\u00C4 \uD802\u0033 \u00AE")

    On Error GoTo Catch
    Bytes = Enc.GetBytes(Value)

    For Each Byt In Bytes
        Debug.Print Object.ToString(Byt, "X2");
    Next

    Debug.Print
    Value2 = Enc.GetString(Bytes)
    Debug.Print Value2
    Exit Sub

Catch:
    Dim Ex As EncoderFallbackException

    Catch Ex, Err
    Debug.Print CorString.Format("Unable to encode {0} at index {1}", IIf(Ex.CharUnknownHigh <> 0, _
        CorString.Format("U+{0:X4} U+{1:X4}", Ex.CharUnknownHigh, Ex.CharUnknownLow), _
        CorString.Format("U+{0:X4}", Ex.CharUnknown)), Ex.Index)
End Sub

' The example displays the following output:
'    Unable to encode U+D802 at index 2
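For contrast, the sketch below feeds the same input to the object returned by Encoding.UTF8. Based on the remarks above, the replacement fallback should let the call complete without raising an error, so no error handler is set up. This is an illustrative variation on the preceding example, not part of the original reference, and it uses only calls already shown there. The exact bytes produced for the lone surrogate depend on the fallback, so no output is listed.

```vb
Public Sub Main()
    Dim Enc As UTF8Encoding
    Dim Value As String
    Dim Bytes() As Byte
    Dim Byt As Variant

    ' The shared instance uses replacement fallback rather than exception fallback.
    Set Enc = Encoding.UTF8

    ' Same input as the preceding example: U+D802 is a high surrogate
    ' that is not followed by a low surrogate.
    Value = t("\u00C4 \uD802\u0033 \u00AE")
    Bytes = Enc.GetBytes(Value)

    ' Print the encoded bytes as two-digit hexadecimal values.
    For Each Byt In Bytes
        Debug.Print Object.ToString(Byt, "X2"); " ";
    Next
    Debug.Print

    ' Decode the bytes again. The unpaired surrogate is expected to come back
    ' as a replacement character, so the result differs from Value.
    Debug.Print Enc.GetString(Bytes)
End Sub
```

Because the substitution is silent, data that must round-trip unchanged is better handled with the exception-raising instance created by NewUTF8Encoding(True, True).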
Read Only.
The following example determines the number of bytes required to encode a character array, encodes the characters, and displays the resulting bytes.
Public Sub Main()
    Dim Chars() As Integer
    Dim U7 As Encoding
    Dim U8 As Encoding
    Dim U16LE As Encoding
    Dim U16BE As Encoding
    Dim U32 As Encoding

    ' The characters to encode:
    '    Latin Small Letter Z (U+007A)
    '    Latin Small Letter A (U+0061)
    '    Combining Breve (U+0306)
    '    Latin Small Letter AE With Acute (U+01FD)
    '    Greek Small Letter Beta (U+03B2)
    '    a high-surrogate value (U+D8FF)
    '    a low-surrogate value (U+DCFF)
    Chars = NewChars("z", "a", ChrW$(&H306), ChrW$(&H1FD), ChrW$(&H3B2), ChrW$(&HD8FF), ChrW$(&HDCFF))

    Set U7 = Encoding.UTF7
    Set U8 = Encoding.UTF8
    Set U16LE = Encoding.Unicode
    Set U16BE = Encoding.BigEndianUnicode
    Set U32 = Encoding.UTF32

    PrintCountsAndBytes Chars, U7
    PrintCountsAndBytes Chars, U8
    PrintCountsAndBytes Chars, U16LE
    PrintCountsAndBytes Chars, U16BE
    PrintCountsAndBytes Chars, U32
End Sub

Private Sub PrintCountsAndBytes(ByRef Chars() As Integer, ByVal Enc As Encoding)
    Dim IBC As Long
    Dim IMBC As Long
    Dim Bytes() As Byte

    Debug.Print CorString.Format("{0,-30} :", Enc.ToString);

    ' The exact number of bytes needed to encode the characters.
    IBC = Enc.GetByteCount(Chars)
    Debug.Print CorString.Format(" {0,-3}", IBC);

    ' The maximum number of bytes that could be needed for that many characters.
    IMBC = Enc.GetMaxByteCount(CorArray.Length(Chars))
    Debug.Print CorString.Format(" {0, -3} :", IMBC);

    Bytes = Enc.GetBytes(Chars)
    PrintHexBytes Bytes
End Sub

Private Sub PrintHexBytes(ByRef Bytes() As Byte)
    Dim i As Long

    If CorArray.IsNullOrEmpty(Bytes) Then
        Debug.Print "<none>"
    Else
        For i = 0 To UBound(Bytes)
            Debug.Print CorString.Format("{0:X2} ", Bytes(i));
        Next

        Debug.Print
    End If
End Sub

' This code produces the following output.
'
'    CorLib.UTF7Encoding            : 18  23  :7A 61 2B 41 77 59 42 2F 51 4F 79 32 50 2F 63 2F 77 2D
'    CorLib.UTF8Encoding            : 12  24  :7A 61 CC 86 C7 BD CE B2 F1 8F B3 BF
'    CorLib.UnicodeEncoding         : 14  16  :7A 00 61 00 06 03 FD 01 B2 03 FF D8 FF DC
'    CorLib.UnicodeEncoding         : 14  16  :00 7A 00 61 03 06 01 FD 03 B2 D8 FF DC FF
'    CorLib.UTF32Encoding           : 24  32  :7A 00 00 00 61 00 00 00 06 03 00 00 FD 01 00 00 B2 03 00 00 FF FC 04 00