Class Utf8View

java.lang.Object
express.mvp.roray.utils.memory.Utf8View

public final class Utf8View extends Object
A reusable, zero-allocation flyweight for viewing a UTF-8 encoded slice of a MemorySegment as a String.

This object does NOT copy the string data. It only holds a reference to the underlying segment, so the view always reflects whatever that segment currently contains. toString() is the only method that allocates a new String on the heap; use it only for debugging or for moving data off the critical path.
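The reuse pattern described above can be sketched as follows. This is a minimal illustration, not the library's implementation: SliceView and FlyweightDemo are hypothetical names, and a plain byte[] stands in for the MemorySegment so the example runs without the library.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a flyweight view; not the library's Utf8View. */
final class SliceView {
    private byte[] data;   // borrowed reference, never copied
    private int offset;
    private int length;

    /** Re-points this view at a new slice without allocating. */
    void wrap(byte[] data, int offset, int length) {
        this.data = data;
        this.offset = offset;
        this.length = length;
    }

    int byteSize() {
        return length;
    }

    /** Allocates: decodes the slice into a new String. */
    @Override
    public String toString() {
        return new String(data, offset, length, StandardCharsets.UTF_8);
    }
}

final class FlyweightDemo {
    /** Sums the byte lengths of comma-separated fields using one reused view. */
    static int totalFieldBytes(byte[] record) {
        SliceView view = new SliceView();   // allocated once, outside the loop
        int total = 0;
        int start = 0;
        for (int i = 0; i <= record.length; i++) {
            if (i == record.length || record[i] == ',') {
                view.wrap(record, start, i - start);   // no copy, no allocation
                total += view.byteSize();
                start = i + 1;
            }
        }
        return total;
    }
}
```

The single view is allocated before the loop and re-pointed with wrap() for every field, so the loop body itself creates no objects.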

  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Class
    Description
    static enum 
    Types of UTF-8 validation errors.
    static final class 
    Utf8View.ValidationResult
    Result of UTF-8 validation, containing error type and position if invalid.
  • Constructor Summary

    Constructors
    Constructor
    Description
    Utf8View()
     
  • Method Summary

    Modifier and Type
    Method
    Description
    long
    byteSize()
    Returns the length of the UTF-8 data in bytes.
    int
    compareTo(Utf8View other)
    Compares this view to another Utf8View lexicographically.
    boolean
    equals(Utf8View other)
    A zero-allocation method to compare this view with another Utf8View.
    boolean
    equalsString(String other)
    A zero-allocation method to compare the view's content with a Java String.
    int
    hashCode()
    Computes a hash code for this view's content.
    boolean
    isValid()
    Checks if this view has been wrapped around valid data.
    boolean
    isValidUtf8()
    Returns true if this view contains well-formed UTF-8 data.
    long
    offset()
    Returns the offset within the segment where the UTF-8 data starts.
    MemorySegment
    segment()
    Returns the underlying MemorySegment this view is wrapping.
    String
    toString()
    The ONLY method that allocates a heap object.
    Utf8View.ValidationResult
    validateUtf8()
    Validates that this view contains well-formed UTF-8 data.
    void
    wrap(MemorySegment segment, long offset, int length)
    Wraps a segment slice.

    Methods inherited from class Object

    clone, equals, finalize, getClass, notify, notifyAll, wait, wait, wait
  • Constructor Details

    • Utf8View

      public Utf8View()
  • Method Details

    • wrap

      public void wrap(MemorySegment segment, long offset, int length)
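      Wraps a segment slice. This makes the view object point to the new data without copying it, replacing any slice it previously wrapped.
      Parameters:
      segment - The MemorySegment containing the UTF-8 data.
      offset - The byte offset within the segment where the UTF-8 data starts.
      length - The length of the UTF-8 data in bytes.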
    • segment

      public MemorySegment segment()
      Returns the underlying MemorySegment this view is wrapping.
      Returns:
      The MemorySegment, or null if not wrapped.
    • offset

      public long offset()
      Returns the offset within the segment where the UTF-8 data starts.
      Returns:
      The byte offset.
    • byteSize

      public long byteSize()
      Returns the length of the UTF-8 data in bytes.
      Returns:
      The byte length.
    • isValid

      public boolean isValid()
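      Checks whether this view has been wrapped around a segment. This does not check that the bytes are well-formed UTF-8; use isValidUtf8() or validateUtf8() for that.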
      Returns:
      true if wrapped with a non-null segment, false otherwise.
    • toString

      public String toString()
      The ONLY method that allocates a heap object. Use this only when you need to convert the view to a standard Java String (e.g., for logging).
      Overrides:
      toString in class Object
      Returns:
      A new String object containing the data.
    • equalsString

      public boolean equalsString(String other)
      A zero-allocation method to compare the view's content with a Java String. This method performs byte-by-byte UTF-8 comparison without allocating any heap objects, making it suitable for high-frequency trading and other zero-GC scenarios.
      Parameters:
      other - The String to compare against.
      Returns:
      true if the content is identical, false otherwise.
      Throws:
      IllegalStateException - if this view is not valid (not wrapped).
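      One way such an allocation-free comparison can work is to encode each code point of the String on the fly and compare the resulting bytes against the view. The sketch below illustrates that idea only; it is not the library's code. Utf8StringCompare and utf8Equals are hypothetical names, and a byte[] stands in for the MemorySegment.

```java
/** Hypothetical helper; illustrates comparing UTF-8 bytes with a String without allocating. */
final class Utf8StringCompare {
    static boolean utf8Equals(byte[] data, int offset, int length, String s) {
        int pos = offset;
        int end = offset + length;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);            // joins a surrogate pair into one code point
            i += Character.charCount(cp);
            int need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
            if (end - pos < need) return false;   // the byte slice ends before the String does
            boolean same;
            switch (need) {
                case 1:
                    same = data[pos] == (byte) cp;
                    break;
                case 2:
                    same = data[pos] == (byte) (0xC0 | (cp >> 6))
                        && data[pos + 1] == (byte) (0x80 | (cp & 0x3F));
                    break;
                case 3:
                    // A lone surrogate has no UTF-8 encoding, so it can never match.
                    same = !(cp >= 0xD800 && cp <= 0xDFFF)
                        && data[pos] == (byte) (0xE0 | (cp >> 12))
                        && data[pos + 1] == (byte) (0x80 | ((cp >> 6) & 0x3F))
                        && data[pos + 2] == (byte) (0x80 | (cp & 0x3F));
                    break;
                default:
                    same = data[pos] == (byte) (0xF0 | (cp >> 18))
                        && data[pos + 1] == (byte) (0x80 | ((cp >> 12) & 0x3F))
                        && data[pos + 2] == (byte) (0x80 | ((cp >> 6) & 0x3F))
                        && data[pos + 3] == (byte) (0x80 | (cp & 0x3F));
                    break;
            }
            if (!same) return false;
            pos += need;
        }
        return pos == end;                        // false if the byte slice has trailing data
    }
}
```

      The String is never converted to a byte[] and the bytes are never decoded into a String, so the comparison uses only local primitives.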
    • equals

      public boolean equals(Utf8View other)
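      A zero-allocation method to compare this view with another Utf8View. Performs byte-by-byte comparison without allocating any heap objects. Note that this method overloads, rather than overrides, Object.equals(Object), so callers that go through equals(Object) still get the inherited identity comparison.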
      Parameters:
      other - The Utf8View to compare against.
      Returns:
      true if the content is identical, false otherwise.
    • compareTo

      public int compareTo(Utf8View other)
      Compares this view to another Utf8View lexicographically. Performs zero-allocation byte-by-byte comparison.
      Parameters:
      other - The Utf8View to compare against.
      Returns:
      negative if this < other, 0 if equal, positive if this > other.
      Throws:
      IllegalArgumentException - if other is null.
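      A lexicographic byte-by-byte comparison can be sketched as below. The documentation does not say whether bytes are compared as signed or unsigned values, nor how a shorter prefix is ordered; this sketch assumes unsigned comparison (which orders well-formed UTF-8 by code point) and sorts a prefix before any longer sequence. Utf8ByteOrder and compareUnsigned are hypothetical names, and byte[] stands in for the MemorySegment.

```java
/** Hypothetical helper; assumes unsigned byte order and prefix-first ordering. */
final class Utf8ByteOrder {
    static int compareUnsigned(byte[] a, int aOff, int aLen,
                               byte[] b, int bOff, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int x = a[aOff + i] & 0xFF;   // mask to treat the byte as 0..255
            int y = b[bOff + i] & 0xFF;
            if (x != y) {
                return x - y;             // first differing byte decides the order
            }
        }
        return aLen - bLen;               // equal prefix: the shorter slice sorts first
    }
}
```

      Masking with 0xFF matters for non-ASCII text: without it, every multi-byte sequence (whose bytes are negative as Java bytes) would sort before ASCII.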
    • hashCode

      public int hashCode()
      Computes a hash code for this view's content. Uses the same algorithm as String.hashCode() but operates on UTF-8 bytes.

      Note: This method allocates no heap objects but may perform significant computation for long strings.

      Overrides:
      hashCode in class Object
      Returns:
      The hash code.
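      String.hashCode() is specified as the polynomial s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] over the string's chars. Applied to bytes, the same recurrence might look like the sketch below. Whether the library treats bytes as signed or unsigned is not stated; this sketch assumes unsigned. Utf8Hash and hash are hypothetical names, and byte[] stands in for the MemorySegment.

```java
/** Hypothetical helper; applies the String.hashCode() recurrence to raw bytes. */
final class Utf8Hash {
    static int hash(byte[] data, int offset, int length) {
        int h = 0;
        for (int i = offset; i < offset + length; i++) {
            h = 31 * h + (data[i] & 0xFF);   // assumption: bytes are treated as unsigned
        }
        return h;
    }
}
```

      Under this assumption the result equals String.hashCode() for pure ASCII content, because each char and its single UTF-8 byte have the same value. For non-ASCII content the two differ, since one hashes UTF-16 chars and the other hashes UTF-8 bytes, so a view's hash should not be assumed interchangeable with the hash of the equivalent String.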
    • validateUtf8

      public Utf8View.ValidationResult validateUtf8()
      Validates that this view contains well-formed UTF-8 data.

      This method performs a complete validation of the UTF-8 byte sequence according to RFC 3629, checking for:

      • Valid UTF-8 byte sequence structure (correct leading and continuation bytes)
      • Overlong encodings (e.g., using 2 bytes for ASCII)
      • Surrogate code points (U+D800-U+DFFF) which are invalid in UTF-8
      • Code points beyond U+10FFFF
      • Incomplete multi-byte sequences

      Performance: This method is zero-allocation and O(n) in the byte length. For high-throughput scenarios where data is trusted, validation may be skipped.

      Returns:
      Utf8View.ValidationResult.VALID if the data is well-formed UTF-8, or an error result indicating the type and position of the first error
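      The checks listed above can be sketched in a single pass as follows. This is an illustration of the rules, not the library's implementation: Utf8Validator and firstInvalid are hypothetical names, a byte[] stands in for the MemorySegment, and the sketch reports only the position of the first error rather than an error type.

```java
/** Hypothetical helper; returns -1 if well-formed, else the index (relative to offset) of the first bad sequence. */
final class Utf8Validator {
    static int firstInvalid(byte[] data, int offset, int length) {
        int i = 0;
        while (i < length) {
            int b0 = data[offset + i] & 0xFF;
            if (b0 < 0x80) {               // ASCII: always valid on its own
                i++;
                continue;
            }
            int need;                      // number of continuation bytes expected
            int min;                       // smallest code point that needs this length
            int cp;                        // code point being assembled
            if (b0 >= 0xC0 && b0 <= 0xDF) {
                need = 1; min = 0x80; cp = b0 & 0x1F;
            } else if (b0 >= 0xE0 && b0 <= 0xEF) {
                need = 2; min = 0x800; cp = b0 & 0x0F;
            } else if (b0 >= 0xF0 && b0 <= 0xF7) {
                need = 3; min = 0x10000; cp = b0 & 0x07;
            } else {
                return i;                  // stray continuation byte or invalid leading byte
            }
            if (length - i <= need) {
                return i;                  // incomplete multi-byte sequence at the end
            }
            for (int k = 1; k <= need; k++) {
                int b = data[offset + i + k] & 0xFF;
                if ((b & 0xC0) != 0x80) {
                    return i;              // expected a continuation byte (10xxxxxx)
                }
                cp = (cp << 6) | (b & 0x3F);
            }
            if (cp < min) return i;                        // overlong encoding
            if (cp >= 0xD800 && cp <= 0xDFFF) return i;    // surrogate code point
            if (cp > 0x10FFFF) return i;                   // beyond the Unicode range
            i += need + 1;
        }
        return -1;
    }
}
```

      The loop touches each byte once and keeps its state in local ints, which is consistent with the zero-allocation, O(n) behavior described above.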
    • isValidUtf8

      public boolean isValidUtf8()
      Returns true if this view contains well-formed UTF-8 data.

      This is a convenience method equivalent to validateUtf8().isValid().

      Returns:
      true if the data is valid UTF-8, false otherwise