Struct CodePointSource
A code point source is a wrapper around a string or StringBuilder that allows retrieving UTF-32 code points at a given index using GetCodePoint(int, bool). Additionally, it allows enumerating every code point in the underlying string or StringBuilder. This struct also contains ToString(int), which converts a code point into its string representation and caches the result to avoid allocating excess memory.
public readonly struct CodePointSource : IEnumerable<int>, IEnumerable
Constructors
CodePointSource(string)
Creates a new code point source from the given string.
public CodePointSource(string strg)
Parameters
strg
string
The string whose code points to inspect.
CodePointSource(StringBuilder)
Creates a new code point source from the given StringBuilder.
public CodePointSource(StringBuilder builder)
Parameters
builder
StringBuilder
The StringBuilder whose code points to inspect.
Properties
Length
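The length of this code point source, in characters (UTF-16 char values). Note that this is not necessarily the number of code points in this source, since a code point that is encoded as a surrogate pair takes up two characters.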
public int Length { get; }
Property Value
- int
Methods
EnsureSurrogateBoundary(int, bool)
Returns an index in this code point source that is as close to index as possible, but not between the two members of a surrogate pair. If the index is not between the members of a surrogate pair, it is returned unchanged.
public int EnsureSurrogateBoundary(int index, bool increase)
Parameters
index
int
The index to ensure is not between surrogates.
increase
bool
Whether the returned index should be increased by 1 (instead of decreased by 1) when it is between surrogates.
Returns
- int
An index close to index, but not between surrogates.
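The boundary adjustment described above can be illustrated with a minimal sketch that works on a plain string and uses only the .NET base class library. The class and method names (SurrogateBoundarySketch, SnapToBoundary) are hypothetical, and this is an assumption about the behavior based on the description, not the library's actual implementation.

```csharp
using System;

public static class SurrogateBoundarySketch {

    // Hypothetical helper: returns index unchanged unless it falls between
    // the high and low member of a surrogate pair in text.
    public static int SnapToBoundary(string text, int index, bool increase) {
        // An index is "between" a pair if the char before it is a high
        // surrogate and the char at it is the matching low surrogate.
        var betweenPair = index > 0 && index < text.Length
            && char.IsSurrogatePair(text[index - 1], text[index]);
        if (!betweenPair)
            return index;
        // Move past the pair or back to its start, depending on increase.
        return increase ? index + 1 : index - 1;
    }

}
```

For a string such as "a\U0001F600b", whose char values are a, a high surrogate, a low surrogate and b, index 2 lies inside the pair, so this sketch moves it to 3 when increase is true and to 1 when it is false.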
GetCodePoint(int, bool)
Returns the code point at the given index in this code point source's underlying string, where the index is measured in characters and not code points.
The resulting code point will either be a single char cast to an int, in which case the returned length will be 1, or a UTF-32 code point made up of two char values (a surrogate pair), in which case the returned length will be 2.
public (int CodePoint, int Length) GetCodePoint(int index, bool indexLowSurrogate = false)
Parameters
index
int
The index at which to return the code point, which is measured in characters.
indexLowSurrogate
bool
Whether the index represents a low surrogate. If this is false, the index represents a high surrogate and the low surrogate will be looked for in the following character. If this is true, the index represents a low surrogate and the high surrogate will be looked for in the previous character.
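Returns
- (int CodePoint, int Length)
The code point at the given index, along with its length in characters (1 or 2).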
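The lookup described for GetCodePoint(int, bool) can be sketched on a plain string as follows. The names CodePointLookupSketch and ReadCodePoint are hypothetical, and the handling of unpaired surrogates (returning the single char with a length of 1) is an assumption made for this sketch rather than documented behavior.

```csharp
using System;

public static class CodePointLookupSketch {

    // Hypothetical helper mirroring the documented return shape:
    // the code point and how many char values it occupies (1 or 2).
    public static (int CodePoint, int Length) ReadCodePoint(string text, int index, bool indexLowSurrogate = false) {
        var current = text[index];
        if (indexLowSurrogate) {
            // index points at the low surrogate, so look one char back for the high one.
            if (index > 0 && char.IsSurrogatePair(text[index - 1], current))
                return (char.ConvertToUtf32(text[index - 1], current), 2);
        } else {
            // index points at the high surrogate, so look one char ahead for the low one.
            if (index + 1 < text.Length && char.IsSurrogatePair(current, text[index + 1]))
                return (char.ConvertToUtf32(current, text[index + 1]), 2);
        }
        // Not part of a valid pair in the requested direction: a single char cast to an int.
        return ((int) current, 1);
    }

}
```

With the string "a\U0001F600b", both ReadCodePoint(text, 1) and ReadCodePoint(text, 2, true) yield the code point 0x1F600 with a length of 2, while ReadCodePoint(text, 0) yields 97 with a length of 1.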
GetEnumerator()
Returns an enumerator that iterates through the collection.
public IEnumerator<int> GetEnumerator()
Returns
- IEnumerator<int>
An IEnumerator<T> that can be used to iterate through the collection.
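As an illustration of what enumerating code points (rather than characters) means, the following minimal sketch walks a plain string using only base class library calls. The names CodePointEnumerationSketch and EnumerateCodePoints are hypothetical, and yielding unpaired surrogates as-is is an assumption of this sketch.

```csharp
using System;
using System.Collections.Generic;

public static class CodePointEnumerationSketch {

    // Hypothetical helper: yields one int per code point in text.
    public static IEnumerable<int> EnumerateCodePoints(string text) {
        for (var i = 0; i < text.Length; i++) {
            if (char.IsSurrogatePair(text, i)) {
                // Combine the high and low surrogate into one UTF-32 value
                // and skip the low surrogate on the next iteration.
                yield return char.ConvertToUtf32(text, i);
                i++;
            } else {
                // A char outside of a valid pair is yielded as a single value.
                yield return text[i];
            }
        }
    }

}
```

Enumerating "a\U0001F600b" this way produces three values (97, 0x1F600 and 98) even though the string's length in characters is 4, which is the difference that the Length property's note refers to.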
ToString(int)
Converts the given UTF-32 codePoint into a string using ConvertFromUtf32(int), but caches the result in a Dictionary<TKey, TValue> cache to avoid allocating excess memory.
public static string ToString(int codePoint)
Parameters
codePoint
int
The UTF-32 code point to convert.
Returns
- string
The string representation of the code point.
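The caching approach described for ToString(int) can be sketched as a dictionary lookup in front of char.ConvertFromUtf32(int). The class name CodePointStringCache and the method name Get are hypothetical, and this sketch makes no attempt at thread safety; whether the library's own cache is synchronized is not stated in this page.

```csharp
using System;
using System.Collections.Generic;

public static class CodePointStringCache {

    // Maps each code point to the string that was created for it the first time it was requested.
    private static readonly Dictionary<int, string> Cache = new Dictionary<int, string>();

    // Hypothetical helper: converts a code point to a string, allocating at most once per code point.
    public static string Get(int codePoint) {
        if (!Cache.TryGetValue(codePoint, out var ret)) {
            ret = char.ConvertFromUtf32(codePoint);
            Cache.Add(codePoint, ret);
        }
        return ret;
    }

}
```

Repeated calls with the same code point return the same string instance, which is what avoids the repeated allocations when the same characters are converted many times, for example while measuring or drawing text every frame.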