Convert UTF-8 Char Array to Raw Byte Array in Java

in #java, 5 years ago

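Given a Java char array, we can use the following function to convert it to a raw UTF-8 byte array. Each char is encoded as 1, 2 or 3 bytes depending on its value: 1 byte for values up to 0x7F, 2 bytes for values up to 0x7FF, and 3 bytes for anything above that. Note that a Java char is a UTF-16 code unit, so this function encodes every char on its own and does not combine surrogate pairs into the 4-byte form that standard UTF-8 uses for characters above U+FFFF.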

public static byte[] char2Byte(char[] a) {
    int len = 0;
    // first pass: compute the length of the byte array
    for (char c : a) {
        if (c > 0x7FF) {
            len += 3;
        } else if (c > 0x7F) {
            len += 2;
        } else {
            len++;
        }
    }
    // second pass: fill the byte array with the UTF-8 bytes of each char
    var result = new byte[len];
    int i = 0;
    for (char c : a) {
        if (c > 0x7FF) {
            // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
            result[i++] = (byte) (((c >> 12) & 0x0F) | 0xE0);
            result[i++] = (byte) (((c >> 6) & 0x3F) | 0x80);
            result[i++] = (byte) ((c & 0x3F) | 0x80);
        } else if (c > 0x7F) {
            // 2 bytes: 110xxxxx 10xxxxxx
            result[i++] = (byte) (((c >> 6) & 0x1F) | 0xC0);
            result[i++] = (byte) ((c & 0x3F) | 0x80);
        } else {
            // 1 byte: 0xxxxxxx (plain ASCII)
            result[i++] = (byte) (c & 0x7F);
        }
    }
    return result;
}

The function makes two passes over the char array. The first pass computes the total length of the result, so the byte array can be allocated once at the right size. The second pass fills the byte array with the UTF-8 bytes of each char.
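
As a quick sanity check, the two-pass approach can be compared with the JDK's own encoder, String.getBytes(StandardCharsets.UTF_8). The sketch below is only an illustration: the class name Utf8Check and the sample inputs are my own assumptions, and char2Byte is reproduced (with the length pass slightly condensed) so that the example compiles on its own. It also shows the one case where the two results should differ, namely a character stored as a surrogate pair.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, used only for this illustration.
final class Utf8Check {
    // Same two-pass logic as the function in this post.
    static byte[] char2Byte(char[] a) {
        int len = 0;
        for (char c : a) {
            len += (c > 0x7FF) ? 3 : (c > 0x7F) ? 2 : 1;
        }
        byte[] result = new byte[len];
        int i = 0;
        for (char c : a) {
            if (c > 0x7FF) {
                result[i++] = (byte) (((c >> 12) & 0x0F) | 0xE0);
                result[i++] = (byte) (((c >> 6) & 0x3F) | 0x80);
                result[i++] = (byte) ((c & 0x3F) | 0x80);
            } else if (c > 0x7F) {
                result[i++] = (byte) (((c >> 6) & 0x1F) | 0xC0);
                result[i++] = (byte) ((c & 0x3F) | 0x80);
            } else {
                result[i++] = (byte) c;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // 'A' needs 1 byte, U+00E9 needs 2 bytes, U+20AC needs 3 bytes.
        char[] bmp = {'A', '\u00E9', '\u20AC'};
        byte[] ours = char2Byte(bmp);
        byte[] jdk = new String(bmp).getBytes(StandardCharsets.UTF_8);
        System.out.println(ours.length);              // prints 6
        System.out.println(Arrays.equals(ours, jdk)); // prints true

        // U+1F600 is stored as two chars (a surrogate pair) in Java.
        char[] pair = "\uD83D\uDE00".toCharArray();
        System.out.println(char2Byte(pair).length);   // prints 6 (3 bytes per surrogate)
        System.out.println(new String(pair).getBytes(StandardCharsets.UTF_8).length); // prints 4
    }
}
```

For input that only contains characters up to U+FFFF (and no surrogates), both approaches should produce the same bytes. If the input may contain emoji or other supplementary characters, the built-in encoder is probably the safer choice.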

I am @justyy - a Steem Witness
https://steemyy.com
