ST::string

Michael Hansen edited this page Jun 7, 2022 · 16 revisions

Headers

#include <string_theory/string>

Public Types

Member Type Definition
size_type size_t
difference_type ptrdiff_t
value_type ST::char_buffer::value_type
const_pointer const value_type *
const_reference const value_type &
const_iterator ST::char_buffer::const_iterator
const_reverse_iterator std::reverse_iterator<const_iterator>

Name Summary
case_sensitivity_t Enumeration for case sensitivity selection

Public Functions

Name Summary
(constructor) ST::string constructors
set Set a string's content
set_validated Set a string's content from pre-validated UTF-8 data
operator= Set a string's content with an overloaded = operator
operator+= Append additional content to a string
c_str Get a C-style string pointer to the string's internal UTF-8 data
u8_str Get a C-style string pointer to the string's internal UTF-8 data
data Return a pointer to the string's internal UTF-8 data
at Return a reference to a specific byte in the string
operator[] Return a reference to a specific byte in the string
char_at Return a specific byte in the string
front Return a reference to the first character in the string
back Return a reference to the last character in the string
begin, cbegin Return an iterator to the front of the string
end, cend Return an iterator to the end of the string
rbegin, crbegin Return a reverse iterator to the start of the string
rend, crend Return a reverse iterator to the end of the string
to_utf8 Get a ST::char_buffer copy of the string's UTF-8 data
to_utf16 Convert a string to UTF-16
to_utf32 Convert a string to UTF-32
to_wchar Convert a string to a platform's wchar_t buffer type
to_latin_1 Convert a string to Latin-1
to_buffer Convert a string to an ST::buffer<T>
to_std_string, to_std_wstring, to_std_u16string, to_std_u32string, to_std_u8string Get a std::string copy of the string data
to_path Convert a string to a std::filesystem::path object
view Create a std::string_view of all or part of this string
operator std::string_view Implicit cast to std::string_view of this string
size Return the number of UTF-8 bytes contained in the string
empty, is_empty Return whether the string is empty
clear Reset the string to the empty state
to_short, to_int, to_long, to_long_long, to_ushort, to_uint, to_ulong, to_ulong_long Convert a string to an integer
to_float, to_double Convert a string to a floating point number
to_int64, to_uint64 Convert a string to a 64-bit integer
to_bool Convert a string to a boolean
compare, compare_i Compare string content lexicographically
compare_n, compare_ni Compare N characters of string content lexicographically
operator==, operator!=, operator< Overloaded operators for string comparison
find Find characters or substrings within this string
find_last Find the last instance of characters or substrings within this string
contains Determine whether a string contains a character or substring
trim_left Trim characters from the left side of the string
trim_right Trim characters from the right side of the string
trim Trim characters from both sides of the string
substr Extract a substring from part or all of this string
left Extract a substring from the left side of this string
right Extract a substring from the right side of this string
starts_with Determine whether a string starts with a given prefix
ends_with Determine whether a string ends with a given suffix
before_first Extract part of a string before the first match
after_first Extract part of a string after the first match
before_last Extract part of a string before the last match
after_last Extract part of a string after the last match
replace Replace instances of some search text within a string
to_upper Convert a string to upper-case
to_lower Convert a string to lower-case
split Split a string based on a given separator, preserving empty parts
tokenize Split a string into tokens separated by one or more delimiters

Static Public Members

Name Summary
from_validated Convert pre-validated UTF-8 data to an ST::string
from_utf8 Convert UTF-8 data to an ST::string
from_utf16 Convert UTF-16 data to an ST::string
from_utf32 Convert UTF-32 data to an ST::string
from_wchar Convert wchar_t data to an ST::string
from_latin_1 Convert Latin-1 data to an ST::string
from_std_string, from_std_wstring Convert std::string types to an ST::string
from_path Convert std::filesystem::path objects to an ST::string
from_int, from_uint Create a string representation of an integer
from_float, from_double Create a string representation of floating point numbers
from_int64, from_uint64 Create a string representation of a 64-bit integer
from_bool Create a string representation of a boolean value
fill Create a string whose contents are filled with a given character

Related Non-Members

Name Summary
hash Hash functor for use in hashing containers
hash_i Case-insensitive hash functor for use in hashing containers
less_i Case-insensitive less functor for use in sorted containers
equal_i Case-insensitive equal functor for use in containers
operator+ String concatenation operator
operator==, operator!= String comparison operators
operator"" _st ST::string user-defined literal operator

Macros

Name Summary
ST_LITERAL Efficient construction of ST::string literals
ST_WHITESPACE Default collection of characters to treat as whitespace for tokenize

Details

ST::string provides storage and manipulation tools for Unicode strings. The string data is stored internally as UTF-8 (in an ST::char_buffer object). This makes it easier to deal with streams and files already encoded as UTF-8, but means that many Unicode characters may take up more than one code unit (byte) in the string object.
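
As a rough illustration of the byte-versus-character distinction, the following sketch uses only the C++ standard library rather than string_theory itself. The helper name `count_code_points` is hypothetical, and the code assumes its input is already valid UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of string_theory): counts Unicode code points
// in a byte string that is assumed to be valid UTF-8.
std::size_t count_code_points(const std::string &utf8)
{
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        // Continuation bytes have the bit pattern 10xxxxxx; every other byte
        // starts a new code point.
        if ((byte & 0xC0) != 0x80)
            ++count;
    }
    return count;
}

// Usage check: "héllo" is five code points but six UTF-8 bytes, because
// U+00E9 is encoded as the two bytes C3 A9.
void check_count_code_points()
{
    const std::string text = "h" "\xC3\xA9" "llo";
    assert(text.size() == 6);
    assert(count_code_points(text) == 5);
}
```

A byte-oriented length, such as the one size() is documented to return, corresponds to the 6 here rather than the 5.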

ST::string objects can be easily converted to/from a few other encodings, including UTF-16, UTF-32 (UCS4), and wchar_t arrays (assumed to be either UTF-16 or UTF-32 depending on the platform).

With the exception of operator= and operator+= overloads and the set() method, ST::string objects are immutable. All operations which manipulate the string data (including operator= and operator+=) will create a new string buffer internally with a copy of the necessary data. This means that all ST::string members are re-entrant. Furthermore, the buffers returned by to_utf8() and c_str() are accessors to the string's internal storage, meaning they do not have to be stored externally in order to remain valid.

Faster string literals

Although it's perfectly valid to create strings with normal C string literals, ST::string provides some helpers which skip the normal validation and checks when you know the input data is a string literal already encoded as valid UTF-8 bytes.

// The following are equivalent, but the second line will generally be faster
ST::string greeting = "Hello";
ST::string greeting_2 = ST_LITERAL("Hello");

// If your compiler supports user-defined literals, the second greeting can
// also be written more concisely as
ST::string greeting_3 = "Hello"_st;

Dealing with non-Unicode data

ST::string will by default check that its data is valid UTF-8. If it finds any input which it can't decode, it will either report the error or replace it with the Unicode replacement character (U+FFFD), depending on the conversion options.

It is also possible to tell ST::string to skip its checks, if you know the input data is already valid UTF-8. However, passing invalid UTF-8 data into an ST::string with ST::assume_valid is undefined behavior, and may cause unexpected results and bugs.

Finally, it is possible to treat input as Latin-1 (ISO-8859-1) data, which always succeeds. However, passing UTF-8 data as Latin-1 may result in the individual UTF-8 bytes showing up as Latin-1 character sequences.

const char *bad_input = "...";

// This will throw ST::unicode_error with a message about what was wrong if
// the conversion encounters any sequences it can't decode.
ST::string str1(bad_input, ST_AUTO_SIZE, ST::check_validity);

// This will replace any character sequences it can't decode with the Unicode
// replacement character (U+FFFD).
ST::string str3(bad_input, ST_AUTO_SIZE, ST::substitute_invalid);

// This will assume the input is already valid UTF-8.  Passing invalid UTF-8
// data with ST::assume_valid is undefined behavior, and may have unexpected
// results and bugs.
ST::string str4(bad_input, ST_AUTO_SIZE, ST::assume_valid);

// Conversion always succeeds; treat data as Latin-1
ST::string str5 = ST::string::from_latin_1(bad_input);
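
To make the Latin-1 caveat above concrete, here is a minimal standard-library sketch of a Latin-1 to UTF-8 conversion. It is not string_theory's implementation, and `latin1_to_utf8` is a hypothetical name. It shows why the conversion cannot fail, and what happens when bytes that were already UTF-8 are treated as Latin-1.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of string_theory): converts Latin-1 bytes to
// UTF-8. Every Latin-1 byte maps directly to a code point in U+0000..U+00FF,
// so there is no invalid input.
std::string latin1_to_utf8(const std::string &latin1)
{
    std::string utf8;
    for (unsigned char byte : latin1) {
        if (byte < 0x80) {
            utf8 += static_cast<char>(byte);                  // ASCII unchanged
        } else {
            utf8 += static_cast<char>(0xC0 | (byte >> 6));    // lead byte
            utf8 += static_cast<char>(0x80 | (byte & 0x3F));  // continuation
        }
    }
    return utf8;
}

void check_latin1_to_utf8()
{
    // Latin-1 0xE9 ("é") becomes the two-byte UTF-8 sequence C3 A9.
    assert(latin1_to_utf8("\xE9") == "\xC3\xA9");
    // Feeding those UTF-8 bytes back in as Latin-1 double-encodes them:
    // C3 -> C3 83 ("Ã") and A9 -> C2 A9 ("©").
    assert(latin1_to_utf8("\xC3\xA9") == "\xC3\x83\xC2\xA9");
}
```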

Member Type Documentation

ST::case_sensitivity_t

enum case_sensitivity_t
{
    case_sensitive,
    case_insensitive
};

Indicates the case sensitivity for various find and comparison operations.

  • case_sensitive: Consider upper- and lower-case characters as different when doing string comparisons and searches.
  • case_insensitive: Consider upper- and lower-case characters as equal when doing string comparisons and searches.

Member Documentation

ST::string constructors

Signature
string() noexcept (1)
string(const ST::null_t &) noexcept (2)
string(const char *cstr, size_t size = ST_AUTO_SIZE, utf_validation_t validation = ST_DEFAULT_VALIDATION) (3)
string(const char8_t *cstr, size_t size = ST_AUTO_SIZE, utf_validation_t validation = ST_DEFAULT_VALIDATION) (4)
string(const wchar_t *wstr, size_t size = ST_AUTO_SIZE, utf_validation_t validation = ST_DEFAULT_VALIDATION) (5)
string(const char16_t *cstr, size_t size = ST_AUTO_SIZE, utf_validation_t validation = ST_DEFAULT_VALIDATION) (6)
string(const char32_t *cstr, size_t size = ST_AUTO_SIZE, utf_validation_t validation = ST_DEFAULT_VALIDATION) (7)
string(const string &copy) (8)
string(string &&move) noexcept (9)
string(const ST::char_buffer &init, utf_validation_t validation = ST_DEFAULT_VALIDATION) (10)
string(ST::char_buffer &&init, utf_validation_t validation = ST_DEFAULT_VALIDATION) (11)
string(const ST::wchar_buffer &init, utf_validation_t validation = ST_DEFAULT_VALIDATION) (12)
string(const ST::utf16_buffer &init, utf_validation_t validation = ST_DEFAULT_VALIDATION) (13)
string(const ST::utf32_buffer &init, utf_validation_t validation = ST_DEFAULT_VALIDATION) (14)
string(const std::string &init, utf_validation_t validation = ST_DEFAULT_VALIDATION) (15)
string(const std::wstring &init, utf_validation_t validation = ST_DEFAULT_VALIDATION) (16)
string(const std::u16string &init, utf_validation_t validation = ST_DEFAULT_VALIDATION) (17)
string(const std::u32string &init, utf_validation_t validation = ST_DEFAULT_VALIDATION) (18)
string(const std::u8string &init, utf_validation_t validation = ST_DEFAULT_VALIDATION) (19)
string(const std::string_view &init, utf_validation_t validation = ST_DEFAULT_VALIDATION) (20)
string(const std::wstring_view &init, utf_validation_t validation = ST_DEFAULT_VALIDATION) (21)
string(const std::u16string_view &init, utf_validation_t validation = ST_DEFAULT_VALIDATION) (22)
string(const std::u32string_view &init, utf_validation_t validation = ST_DEFAULT_VALIDATION) (23)
string(const std::u8string_view &init, utf_validation_t validation = ST_DEFAULT_VALIDATION) (24)
string(const std::filesystem::path &init) (25)
  1. Default constructor for strings. Creates an empty string.
  2. Shortcut constructor for empty strings. This is equivalent to the default constructor (1).
  3. Construct a string from the first size bytes of the string pointed to by cstr. The data is expected to be encoded as UTF-8.
  4. Construct a string from the first size bytes of the UTF-8 string pointed to by cstr.
  5. Construct a string from the first size wide characters of the string pointed to by wstr. The data is expected to be encoded as either UTF-16 or UTF-32, depending on your platform's wchar_t support.
  6. Construct a string from the first size characters of the UTF-16 string pointed to by cstr.
  7. Construct a string from the first size characters of the UTF-32 string pointed to by cstr.
  8. Construct a string whose contents are a copy of copy.
  9. Move the contents of move into this string object.
  10. Construct a string from the contents of init. The data stored in init is expected to be encoded as UTF-8.
  11. Move the contents of init into this string's internal UTF-8 buffer. The data stored in init will still be checked according to validation, and is expected to be encoded as UTF-8.
  12. Construct a string from the wide character data provided in init. The data provided in init is expected to be encoded as either UTF-16 or UTF-32, depending on your platform's wchar_t support.
  13. Construct a string from the UTF-16 data provided in init.
  14. Construct a string from the UTF-32 data provided in init.
  15. Construct a string from the contents of init. The string data is expected to be encoded as UTF-8.
  16. Construct a string from the contents of init. The string data in init is expected to be encoded as either UTF-16 or UTF-32, depending on your platform's wchar_t support.
  17. Construct a string from the contents of the UTF-16 string in init.
  18. Construct a string from the contents of the UTF-32 string in init.
  19. Construct a string from the contents of the UTF-8 string in init.
  20. Construct a string from the string view captured by init. The string data is expected to be encoded as UTF-8.
  21. Construct a string from the string view captured by init. The string data in init is expected to be encoded as either UTF-16 or UTF-32, depending on your platform's wchar_t support.
  22. Construct a string from the UTF-16 string view captured by init.
  23. Construct a string from the UTF-32 string view captured by init.
  24. Construct a string from the UTF-8 string view captured by init.
  25. Construct a string from the filesystem path in init.

For the variants which take a size, if size is ST_AUTO_SIZE, the length of the input will be determined as if with strlen() or equivalent.

Changed in 1.1: Added std::string, std::wstring, and std::filesystem::path constructors.

Changed in 1.7: Added const char16_t *, const char32_t *, std::u16string, and std::u32string constructors.

Changed in 2.0: Added std::*string_view constructors.

Changed in 2.2: Added const char8_t * and std::u8string* overloads.

Changed in 3.4: Deprecated null_t overload.

See also operator=(), set(), from_utf8(), from_utf16(), from_utf32(), from_wchar(), from_std_string(), from_path()


ST::string::after_first

Signature
string after_first(char sep, ST::case_sensitivity_t cs = case_sensitive) const (1)
string after_first(const char *sep, ST::case_sensitivity_t cs = case_sensitive) const (2)
string after_first(const char8_t *sep, ST::case_sensitivity_t cs = case_sensitive) const (3)
string after_first(const string &sep, ST::case_sensitivity_t cs = case_sensitive) const (4)

Returns the part of this string after the first instance of sep found within the string. If sep is not found in the string, an empty string is returned.

Changed in 2.2: Added const char8_t * overload.


ST::string::after_last

Signature
string after_last(char sep, ST::case_sensitivity_t cs = case_sensitive) const (1)
string after_last(const char *sep, ST::case_sensitivity_t cs = case_sensitive) const (2)
string after_last(const char8_t *sep, ST::case_sensitivity_t cs = case_sensitive) const (3)
string after_last(const string &sep, ST::case_sensitivity_t cs = case_sensitive) const (4)

Returns the part of this string after the last instance of sep found within the string. If sep is not found in the string, the whole string is returned.

Changed in 2.2: Added const char8_t * overload.
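
The different not-found results of after_first() and after_last() are easy to mix up. The sketch below mimics the documented return values with std::string; it is not the library's code. The `*_sketch` names are hypothetical, and only the single-character, case-sensitive overload is modeled.

```cpp
#include <cassert>
#include <string>

// Hypothetical std::string stand-ins; they mirror only the return values
// documented for after_first() and after_last().
std::string after_first_sketch(const std::string &s, char sep)
{
    const std::string::size_type pos = s.find(sep);
    return pos == std::string::npos ? std::string() : s.substr(pos + 1);
}

std::string after_last_sketch(const std::string &s, char sep)
{
    const std::string::size_type pos = s.rfind(sep);
    return pos == std::string::npos ? s : s.substr(pos + 1);
}

void check_after()
{
    assert(after_first_sketch("usr/local/bin", '/') == "local/bin");
    assert(after_last_sketch("usr/local/bin", '/') == "bin");
    // Separator absent: after_first yields "", after_last yields the input.
    assert(after_first_sketch("readme", '/') == "");
    assert(after_last_sketch("readme", '/') == "readme");
}
```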


ST::string::at

Signature
const char &at(size_t position) const

Returns a reference to the UTF-8 code unit (byte) at the specified position. Like ST::char_buffer::at(), this is bounds checked and may throw std::out_of_range if the provided position is outside the string's boundaries.

Since string_theory 2.0.

See also c_str(), operator[]()


ST::string::back

Signature
const char &back() const noexcept

Return a reference to the last character in the string. If the string is empty, this returns a reference to the terminating nul character.

Since string_theory 2.0.


ST::string::before_first

Signature
string before_first(char sep, ST::case_sensitivity_t cs = case_sensitive) const (1)
string before_first(const char *sep, ST::case_sensitivity_t cs = case_sensitive) const (2)
string before_first(const char8_t *sep, ST::case_sensitivity_t cs = case_sensitive) const (3)
string before_first(const string &sep, ST::case_sensitivity_t cs = case_sensitive) const (4)

Returns the part of this string before the first instance of sep found within the string. If sep is not found in the string, the whole string is returned.

Changed in 2.2: Added const char8_t * overload.


ST::string::before_last

Signature
string before_last(char sep, ST::case_sensitivity_t cs = case_sensitive) const (1)
string before_last(const char *sep, ST::case_sensitivity_t cs = case_sensitive) const (2)
string before_last(const char8_t *sep, ST::case_sensitivity_t cs = case_sensitive) const (3)
string before_last(const string &sep, ST::case_sensitivity_t cs = case_sensitive) const (4)

Returns the part of this string before the last instance of sep found within the string. If sep is not found in the string, an empty string is returned.

Changed in 2.2: Added const char8_t * overload.
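
As with the after_* functions, the two before_* functions differ in what they return when sep is missing. The following std::string sketch models only the documented return values for a single-character, case-sensitive separator; the `*_sketch` names are hypothetical and this is not the library's implementation.

```cpp
#include <cassert>
#include <string>

// Hypothetical std::string stand-ins; they mirror only the return values
// documented for before_first() and before_last().
std::string before_first_sketch(const std::string &s, char sep)
{
    const std::string::size_type pos = s.find(sep);
    return pos == std::string::npos ? s : s.substr(0, pos);
}

std::string before_last_sketch(const std::string &s, char sep)
{
    const std::string::size_type pos = s.rfind(sep);
    return pos == std::string::npos ? std::string() : s.substr(0, pos);
}

void check_before()
{
    assert(before_first_sketch("usr/local/bin", '/') == "usr");
    assert(before_last_sketch("usr/local/bin", '/') == "usr/local");
    // Separator absent: before_first yields the input, before_last yields "".
    assert(before_first_sketch("readme", '/') == "readme");
    assert(before_last_sketch("readme", '/') == "");
}
```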


ST::string::begin

Signature
const_iterator begin() const noexcept (1)
const_iterator cbegin() const noexcept (2)

Return an iterator to the beginning of the string.

Since string_theory 2.0.

See also end()


ST::string::c_str

Signature
const char *c_str() const noexcept (1)
const char *c_str(const char *substitute) const noexcept (2)

Returns a pointer to the stored UTF-8 string data. This buffer should always be nul-terminated, so it's safe to use in functions which require C-style string buffers.

For variant #2, if this string is empty, the pointer provided in substitute will be returned instead.

Changed in 1.7: Added the variant without a substitute string (#1). Previously, the second form would default its parameter to "", which was less efficient than the first variant.
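
The substitute form can be pictured with a small std::string analogue. This is only a sketch of the documented behavior, and `c_str_or` is a hypothetical name rather than a string_theory function.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in for variant #2, written against std::string: an empty
// string yields the substitute pointer instead of its own buffer.
const char *c_str_or(const std::string &s, const char *substitute)
{
    return s.empty() ? substitute : s.c_str();
}

void check_c_str_or()
{
    const std::string name = "config.ini";
    const std::string unset;
    assert(std::strcmp(c_str_or(name, "(none)"), "config.ini") == 0);
    assert(std::strcmp(c_str_or(unset, "(none)"), "(none)") == 0);
}
```

This pattern is mainly useful when passing a possibly empty string to a C API that should display a placeholder.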


ST::string::char_at [removed]

Signature
char char_at(size_t position) const noexcept

Note: This method is deprecated in string_theory 2.0. New code should switch to using either at() or operator[]().

Returns the UTF-8 code unit (byte) at the specified position. Note that this may return a byte in the middle of a UTF-8 multi-byte sequence! The position is not bounds-checked, so accessing positions outside the range [0, size()] will result in undefined behavior.

Deprecated in string_theory 2.0.

Removed in string_theory 3.0.

See also at(), operator[]()


ST::string::clear

Signature
void clear() noexcept

Clear the string data, resetting this string to the empty state.

Since string_theory 3.4.

See also empty()


ST::string::compare

Signature
int compare(const string &str, ST::case_sensitivity_t cs = case_sensitive) const noexcept (1)
int compare(const char *str, ST::case_sensitivity_t cs = case_sensitive) const noexcept (2)
int compare(const char8_t *str, ST::case_sensitivity_t cs = case_sensitive) const noexcept (3)
int compare_i(const string &str) const noexcept (4)
int compare_i(const char *str) const noexcept (5)
int compare_i(const char8_t *str) const noexcept (6)

Compare this string to str lexicographically, in a manner similar to strcmp. If this string is "less than" (or "before") str, this function returns a negative value. If this string is "greater than" str, this function returns a positive value. If this string and str are considered equal, this returns 0.

Set cs to case_insensitive in order to perform a case-insensitive comparison.

The compare_i() variants (4-6) are provided as convenience shortcuts for compare(str, case_insensitive).

Changed in 2.2: Added the const char8_t * overloads.

See also operator==(), operator<()
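
The strcmp-like sign convention described above can be sketched with std::string as follows. This is not the library's implementation: `compare_sketch` is a hypothetical name, a bool stands in for ST::case_sensitivity_t, and the case-insensitive path folds ASCII letters only, which is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// ASCII-only lower-casing used by the sketch (an assumption, see above).
static int fold_ascii(int c)
{
    return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

// Hypothetical stand-in for compare(): byte-wise lexicographic comparison that
// returns a negative value, zero, or a positive value.
int compare_sketch(const std::string &a, const std::string &b,
                   bool ignore_case = false)
{
    const std::size_t common = a.size() < b.size() ? a.size() : b.size();
    for (std::size_t i = 0; i < common; ++i) {
        int ca = static_cast<unsigned char>(a[i]);
        int cb = static_cast<unsigned char>(b[i]);
        if (ignore_case) {
            ca = fold_ascii(ca);
            cb = fold_ascii(cb);
        }
        if (ca != cb)
            return ca < cb ? -1 : 1;
    }
    // All shared bytes match, so the shorter string sorts first.
    if (a.size() == b.size())
        return 0;
    return a.size() < b.size() ? -1 : 1;
}

void check_compare()
{
    assert(compare_sketch("apple", "banana") < 0);
    assert(compare_sketch("pear", "pear") == 0);
    assert(compare_sketch("abc", "ab") > 0);
    // 'H' sorts before 'h' byte-wise, but the two match when case is ignored.
    assert(compare_sketch("Hello", "hello") < 0);
    assert(compare_sketch("Hello", "hello", true) == 0);
}
```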


ST::string::compare_n

Signature
int compare_n(const string &str, size_t count, ST::case_sensitivity_t cs = case_sensitive) const noexcept (1)
int compare_n(const char *str, size_t count, ST::case_sensitivity_t cs = case_sensitive) const noexcept (2)
int compare_n(const char8_t *str, size_t count, ST::case_sensitivity_t cs = case_sensitive) const noexcept (3)
int compare_ni(const string &str, size_t count) const noexcept (4)
int compare_ni(const char *str, size_t count) const noexcept (5)
int compare_ni(const char8_t *str, size_t count) const noexcept (6)

Compare up to the first count bytes of this string to str, in a manner similar to strncmp.

Set cs to case_insensitive in order to perform a case-insensitive comparison.

The compare_ni() variants (4-6) are provided as convenience shortcuts for compare_n(str, count, case_insensitive).

Changed in 2.2: Added the const char8_t * overloads.

See also compare()


ST::string::contains

Signature
bool contains(char ch, ST::case_sensitivity_t cs = case_sensitive) const noexcept (1)
bool contains(const char *substr, ST::case_sensitivity_t cs = case_sensitive) const noexcept (2)
bool contains(const char8_t *substr, ST::case_sensitivity_t cs = case_sensitive) const noexcept (3)
bool contains(const string &substr, ST::case_sensitivity_t cs = case_sensitive) const noexcept (4)
bool contains(const char *substr, size_t count, ST::case_sensitivity_t cs = case_sensitive) const noexcept (5)
bool contains(const char8_t *substr, size_t count, ST::case_sensitivity_t cs = case_sensitive) const noexcept (6)

Returns true if ch (1) or substr (2-4) is contained anywhere in this string.

For variants 5 and 6, no more than count bytes are matched from the substr.

Changed in 2.2: Added const char8_t * overload.

Changed in 3.5: Added overloads 5-6 with an explicit count.

See also find()


ST::string::data

Signature
const char *data() const noexcept

Returns a pointer to the stored UTF-8 string data. This buffer should always be nul-terminated, so it's safe to use in functions which require C-style string buffers.

Since string_theory 3.5.


ST::string::empty

Signature
bool empty() const noexcept (1)
bool is_empty() const noexcept (2)

Returns true if this string is empty (i.e. its size is 0). Note that even for an empty string, the first character pointed to by c_str() can be accessed, and should be the nul character ('\0').

Changed in 2.0: Added empty() and deprecated is_empty().

Changed in 3.0: Removed is_empty().

See also size(), clear()


ST::string::end

Signature
const_iterator end() const noexcept (1)
const_iterator cend() const noexcept (2)

Return an iterator to the end of the string.

Since string_theory 2.0.

See also begin()


ST::string::ends_with

Signature
bool ends_with(const string &suffix, ST::case_sensitivity_t cs = case_sensitive) const (1)
bool ends_with(const char *suffix, ST::case_sensitivity_t cs = case_sensitive) const (2)
bool ends_with(const char8_t *suffix, ST::case_sensitivity_t cs = case_sensitive) const (3)

Return true if this string ends with suffix.

Changed in 2.0: Marked these functions noexcept.

Changed in 2.2: Added const char8_t * overload.


ST::string::fill [static]

Signature
static string fill(size_t count, char c)

Create a string which is pre-populated with count copies of the ASCII character in c.


ST::string::find

Signature
ST_ssize_t find(char ch, ST::case_sensitivity_t cs = case_sensitive) const noexcept (1)
ST_ssize_t find(const char *substr, ST::case_sensitivity_t cs = case_sensitive) const noexcept (2)
ST_ssize_t find(const char8_t *substr, ST::case_sensitivity_t cs = case_sensitive) const noexcept (3)
ST_ssize_t find(const string &substr, ST::case_sensitivity_t cs = case_sensitive) const noexcept (4)
ST_ssize_t find(size_t start, char ch, ST::case_sensitivity_t cs = case_sensitive) const noexcept (5)
ST_ssize_t find(size_t start, const char *substr, ST::case_sensitivity_t cs = case_sensitive) const noexcept (6)
ST_ssize_t find(size_t start, const char8_t *substr, ST::case_sensitivity_t cs = case_sensitive) const noexcept (7)
ST_ssize_t find(size_t start, const string &substr, ST::case_sensitivity_t cs = case_sensitive) const noexcept (8)
ST_ssize_t find(const char *substr, size_t count, ST::case_sensitivity_t cs = case_sensitive) const noexcept (9)
ST_ssize_t find(const char8_t *substr, size_t count, ST::case_sensitivity_t cs = case_sensitive) const noexcept (10)
ST_ssize_t find(size_t start, const char *substr, size_t count, ST::case_sensitivity_t cs = case_sensitive) const noexcept (11)
ST_ssize_t find(size_t start, const char8_t *substr, size_t count, ST::case_sensitivity_t cs = case_sensitive) const noexcept (12)

Find the first instance of ch (1, 5) or substr (2-4, 6-8) within the string, and return its byte position. If ch or substr isn't found, returns -1.

For variants 5-8, searching starts at byte position start.

For variants 9-12, no more than count bytes are matched from the substr.

Changed in 1.6: Added overloads 5, 6 and 8 to start at a specific position.

Changed in 2.2: Added const char8_t * overloads.

Changed in 3.5: Added overloads 9-12 with an explicit count.
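
The return convention (a byte offset, or -1 when nothing matches) can be illustrated with a std::string sketch. `find_sketch` is a hypothetical name, `std::ptrdiff_t` stands in for ST_ssize_t, and unlike the overloads above the start position is placed last so that it can default to 0.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for find(): returns a signed byte offset, or -1 when
// needle does not occur at or after start.
std::ptrdiff_t find_sketch(const std::string &s, const std::string &needle,
                           std::size_t start = 0)
{
    const std::string::size_type pos = s.find(needle, start);
    return pos == std::string::npos ? -1 : static_cast<std::ptrdiff_t>(pos);
}

void check_find()
{
    assert(find_sketch("a,b,c", ",") == 1);
    assert(find_sketch("a,b,c", ",", 2) == 3);   // search begins at byte 2
    assert(find_sketch("a,b,c", ";") == -1);
    // Offsets count bytes, not characters: in "héllo" the two-byte "é"
    // pushes the first 'l' to offset 3.
    const std::string text = "h" "\xC3\xA9" "llo";
    assert(find_sketch(text, "l") == 3);
}
```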


ST::string::find_last

Signature
ST_ssize_t find_last(char ch, ST::case_sensitivity_t cs = case_sensitive) const noexcept (1)
ST_ssize_t find_last(const char *substr, ST::case_sensitivity_t cs = case_sensitive) const noexcept (2)
ST_ssize_t find_last(const char8_t *substr, ST::case_sensitivity_t cs = case_sensitive) const noexcept (3)
ST_ssize_t find_last(const string &substr, ST::case_sensitivity_t cs = case_sensitive) const noexcept (4)
ST_ssize_t find_last(size_t max, char ch, ST::case_sensitivity_t cs = case_sensitive) const noexcept (5)
ST_ssize_t find_last(size_t max, const char *substr, ST::case_sensitivity_t cs = case_sensitive) const noexcept (6)
ST_ssize_t find_last(size_t max, const char8_t *substr, ST::case_sensitivity_t cs = case_sensitive) const noexcept (7)
ST_ssize_t find_last(size_t max, const string &substr, ST::case_sensitivity_t cs = case_sensitive) const noexcept (8)
ST_ssize_t find_last(const char *substr, size_t count, ST::case_sensitivity_t cs = case_sensitive) const noexcept (9)
ST_ssize_t find_last(const char8_t *substr, size_t count, ST::case_sensitivity_t cs = case_sensitive) const noexcept (10)
ST_ssize_t find_last(size_t max, const char *substr, size_t count, ST::case_sensitivity_t cs = case_sensitive) const noexcept (11)
ST_ssize_t find_last(size_t max, const char8_t *substr, size_t count, ST::case_sensitivity_t cs = case_sensitive) const noexcept (12)

Find the last instance of ch (1, 5) or substr (2-4, 6-8) within the string, and return its byte position. If ch or substr isn't found, returns -1.

For variants 5-8, searching looks only within the first max bytes of the string.

For variants 9-12, no more than count bytes are matched from the substr.

Changed in 1.6: Added overloads 5, 6 and 8 to search within the first max bytes.

Changed in 2.2: Added const char8_t * overloads.

Changed in 3.5: Added overloads 9-12 with an explicit count.
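
The effect of max can be sketched with std::string for the single-character case. `find_last_sketch` is a hypothetical name, `std::ptrdiff_t` stands in for ST_ssize_t, and max is placed last here so that it can default to "the whole string". The sketch reads "within the first max bytes" literally, which is an assumption about edge cases the text above does not spell out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for find_last(): looks for the last occurrence of ch
// within the first max bytes of s, returning its byte offset or -1.
std::ptrdiff_t find_last_sketch(const std::string &s, char ch,
                                std::size_t max = std::string::npos)
{
    // substr() clamps max to the string length, so npos means "everything".
    const std::string head = s.substr(0, max);
    const std::string::size_type pos = head.rfind(ch);
    return pos == std::string::npos ? -1 : static_cast<std::ptrdiff_t>(pos);
}

void check_find_last()
{
    assert(find_last_sketch("a/b/c", '/') == 3);
    assert(find_last_sketch("a/b/c", '/', 3) == 1);   // only "a/b" is searched
    assert(find_last_sketch("a/b/c", '/', 1) == -1);  // only "a" is searched
    assert(find_last_sketch("abc", '/') == -1);
}
```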


ST::string::from_bool [static]

Signature
static string from_bool(bool value)

Creates the string literal "true" or "false", depending on value.


ST::string::from_float [static]

Signature
static string from_float(float value, char format = 'g') (1)
static string from_float(double value, char format = 'g') (2)
static string from_double(double value, char format = 'g') (3)

Create a string representation of the floating-point number in value. The format character has the same meaning as printf's floating point formats, and should be one of 'e', 'f' or 'g'.

Changed in 3.2: Added from_float(double, char) overload.
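
For readers unfamiliar with the printf format letters, the sketch below shows what 'e', 'f' and 'g' produce through std::snprintf. `format_double_sketch` is a hypothetical helper, not string_theory code, and the exact digits the library emits may differ from plain printf output.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats value with printf's %e, %f or %g conversion.
// Any other format character falls back to %g in this sketch.
std::string format_double_sketch(double value, char format = 'g')
{
    const char *fmt = "%g";
    if (format == 'e')
        fmt = "%e";
    else if (format == 'f')
        fmt = "%f";

    // First pass measures the output; second pass writes it.
    const int length = std::snprintf(nullptr, 0, fmt, value);
    if (length < 0)
        return std::string();
    std::string out(static_cast<std::size_t>(length) + 1, '\0');
    std::snprintf(&out[0], out.size(), fmt, value);
    out.resize(static_cast<std::size_t>(length));  // drop the terminating nul
    return out;
}

void check_format_double()
{
    assert(format_double_sketch(0.5) == "0.5");                // 'g': shortest
    assert(format_double_sketch(0.5, 'f') == "0.500000");      // fixed point
    assert(format_double_sketch(0.5, 'e') == "5.000000e-01");  // scientific
}
```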


ST::string::from_int [static]

Signature
static string from_int(short value, int base = 10, bool upper_case = false) (1)
static string from_int(int value, int base = 10, bool upper_case = false) (2)
static string from_int(long value, int base = 10, bool upper_case = false) (3)
static string from_int(long long value, int base = 10, bool upper_case = false) (4)
static string from_uint(unsigned short value, int base = 10, bool upper_case = false) (5)
static string from_uint(unsigned int value, int base = 10, bool upper_case = false) (6)
static string from_uint(unsigned long value, int base = 10, bool upper_case = false) (7)
static string from_uint(unsigned long long value, int base = 10, bool upper_case = false) (8)

Create a string representation of the integer in value.

Changed in 3.2: Added (unsigned) short, long, long long overloads.
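
The text above does not describe base and upper_case. The sketch below shows the interpretation those parameter names suggest: base selects the radix and upper_case selects upper-case letters for digits above 9. This interpretation, the 2 to 36 range, the leading '-' for negative values in every base, and the name `int_to_string_sketch` are all assumptions of this example, not documented behavior.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: renders value in the given base (2..36 in this sketch).
std::string int_to_string_sketch(long long value, int base = 10,
                                 bool upper_case = false)
{
    assert(base >= 2 && base <= 36);  // range supported by this sketch
    const char *digits = upper_case ? "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                                    : "0123456789abcdefghijklmnopqrstuvwxyz";

    // Work on the magnitude as unsigned so that the most negative value is
    // handled without signed overflow.
    unsigned long long magnitude = value < 0
        ? 0ULL - static_cast<unsigned long long>(value)
        : static_cast<unsigned long long>(value);

    std::string out;
    do {
        out.insert(out.begin(), digits[magnitude % base]);
        magnitude /= base;
    } while (magnitude != 0);
    if (value < 0)
        out.insert(out.begin(), '-');
    return out;
}

void check_int_to_string()
{
    assert(int_to_string_sketch(255) == "255");
    assert(int_to_string_sketch(255, 16) == "ff");
    assert(int_to_string_sketch(255, 16, true) == "FF");
    assert(int_to_string_sketch(-10, 2) == "-1010");
    assert(int_to_string_sketch(0) == "0");
}
```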


ST::string::from_int64 [static]

Signature
static string from_int64(int64_t value, int base = 10, bool upper_case = false) (1)
static string from_uint64(uint64_t value, int base = 10, bool upper_case = false) (2)

Create a string representation of the 64-bit integer in value.

Deprecated in string_theory 4.0.


ST::string::from_latin_1 [static]

Signature
static string from_latin_1(const char *astr, size_t size = ST_AUTO_SIZE) (1)
static string from_latin_1(const ST::char_buffer &astr) (2)
  1. Construct a string from the first size bytes of the Latin-1 / ISO-8859-1 string data in astr. If size is ST_AUTO_SIZE, the length of the input will be determined with ::strlen().
  2. Construct a string from the Latin-1 / ISO-8859-1 string data in astr.

ST::string::from_std_string [static]

Signature
static string from_std_string(const std::string &sstr, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (1)
static string from_std_string(const std::wstring &wstr, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (2)
static string from_std_wstring(const std::wstring &wstr, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (3)
static string from_std_string(const std::u16string &ustr, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (4)
static string from_std_string(const std::u32string &ustr, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (5)
static string from_std_string(const std::u8string &ustr, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (6)
static string from_std_string(const std::string_view &view, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (7)
static string from_std_string(const std::wstring_view &view, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (8)
static string from_std_wstring(const std::wstring_view &view, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (9)
static string from_std_string(const std::u16string_view &view, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (10)
static string from_std_string(const std::u32string_view &view, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (11)
static string from_std_string(const std::u8string_view &view, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (12)

Construct a string from a std::string or std::string_view. The string is expected to be encoded as the appropriate UTF encoding for the string type (see set() for details).

Since string_theory 1.1.

Changed in 1.7: Added std::u16string and std::u32string overloads.

Changed in 2.0: Added std::*string_view overloads.

Changed in 2.2: Added std::u8string* overloads.

See also (constructor), set()


ST::string::from_path [static]

Signature
static string from_path(const std::filesystem::path &path)

Construct a string from a filesystem path, using the system's default encoding.

Since string_theory 1.1.


ST::string::from_utf8 [static]

Signature
static string from_utf8(const char *utf8, size_t size = ST_AUTO_SIZE, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (1)
static string from_utf8(const char8_t *utf8, size_t size = ST_AUTO_SIZE, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (2)
static string from_utf8(const ST::char_buffer &utf8, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (3)
  1. Construct a string from the first size bytes of the UTF-8 string data in utf8. If size is ST_AUTO_SIZE, the length of the input will be determined as if with strlen().
  2. Construct a string from the first size bytes of the UTF-8 string data in utf8. If size is ST_AUTO_SIZE, the length of the input will be determined as if with strlen().
  3. Construct a string from the UTF-8 string data in utf8.

Changed in 2.2: Added const char8_t * overload.


ST::string::from_utf16 [static]

Signature
static string from_utf16(const char16_t *utf16, size_t size = ST_AUTO_SIZE, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (1)
static string from_utf16(const ST::utf16_buffer &utf16, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (2)
  1. Construct a string from the first size characters of the UTF-16 string data in utf16. If size is ST_AUTO_SIZE, the length of the input will be determined with the equivalent of strlen().
  2. Construct a string from the UTF-16 string data in utf16.

ST::string::from_utf32 [static]

Signature
static string from_utf32(const char32_t *utf32, size_t size = ST_AUTO_SIZE, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (1)
static string from_utf32(const ST::utf32_buffer &utf32, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (2)
  1. Construct a string from the first size characters of the UTF-32 string data in utf32. If size is ST_AUTO_SIZE, the length of the input will be determined with the equivalent of strlen().
  2. Construct a string from the UTF-32 string data in utf32.

ST::string::from_validated [static]

Signature
static string from_validated(const char *text, size_t size) (1)
static string from_validated(const char8_t *text, size_t size) (2)
static string from_validated(const ST::char_buffer &buffer) (3)
static string from_validated(ST::char_buffer &&buffer) (4)
  1. Construct a string from the validated UTF-8 data pointed to by text.
  2. Construct a string from the validated UTF-8 data pointed to by text.
  3. Construct a string from the validated UTF-8 buffer in buffer.
  4. Construct a string owning the validated UTF-8 buffer in buffer.

Since string_theory 2.2.

See also set_validated()


ST::string::from_wchar [static]

Signature
static string from_wchar(const wchar_t *wstr, size_t size = ST_AUTO_SIZE, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (1)
static string from_wchar(const ST::wchar_buffer &wstr, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (2)
  1. Construct a string from the first size wide characters of the wide string data in wstr. If size is ST_AUTO_SIZE, the length of the input will be determined with wcslen().
  2. Construct a string from the wide character string data in wstr.

Note that the data is expected to be either UTF-16 or UTF-32 encoded, depending on your platform's wchar_t support.


ST::string::front

Signature
const char &front() const noexcept

Return a reference to the first character in the string. If the string is empty, this returns a reference to the terminating nul character.

Since string_theory 2.0.


ST::string::left

Signature
string left(size_t size) const

Convenience function to extract a substring from the left side of the string. This is equivalent to substr(0, size).

See also substr()


ST::string::operator=

Signature
string &operator=(const ST::null_t &) noexcept (1)
string &operator=(const char *cstr) (2)
string &operator=(const char8_t *cstr) (3)
string &operator=(const wchar_t *wstr) (4)
string &operator=(const char16_t *cstr) (5)
string &operator=(const char32_t *cstr) (6)
string &operator=(const string &copy) (7)
string &operator=(string &&move) noexcept (8)
string &operator=(const ST::char_buffer &init) (9)
string &operator=(ST::char_buffer &&init) (10)
string &operator=(const ST::wchar_buffer &init) (11)
string &operator=(const ST::utf16_buffer &init) (12)
string &operator=(const ST::utf32_buffer &init) (13)
string &operator=(const std::string &init) (14)
string &operator=(const std::wstring &init) (15)
string &operator=(const std::u16string &init) (16)
string &operator=(const std::u32string &init) (17)
string &operator=(const std::u8string &init) (18)
string &operator=(const std::string_view &init) (19)
string &operator=(const std::wstring_view &init) (20)
string &operator=(const std::u16string_view &init) (21)
string &operator=(const std::u32string_view &init) (22)
string &operator=(const std::u8string_view &init) (23)
string &operator=(const std::filesystem::path &init) (24)
  1. Shortcut operator=() overload to reset the string to the empty string. Equivalent to calling clear().
  2. Set the string content to the contents of the string pointed to by cstr. This is equivalent to set(cstr).
  3. Set the string content to the contents of the string pointed to by cstr. This is equivalent to set(cstr).
  4. Set the string content from the wide string pointed to by wstr. This is equivalent to set(wstr).
  5. Set the string content from the UTF-16 string pointed to by cstr. This is equivalent to set(cstr).
  6. Set the string content from the UTF-32 string pointed to by cstr. This is equivalent to set(cstr).
  7. Set the string to the same value as copy.
  8. Move the contents of move into this string object.
  9. Set the string from the contents of init. The data stored in init is expected to be encoded as UTF-8.
  10. Move the contents of init into this string's internal UTF-8 buffer. The data stored in init is expected to be encoded as UTF-8, and will still be checked using the default validation mode.
  11. Set the string content from the wide character data provided in init. The data provided in init is expected to be encoded as either UTF-16 or UTF-32, depending on your platform's wchar_t support.
  12. Set the string content from the UTF-16 data provided in init.
  13. Set the string content from the UTF-32 data provided in init.
  14. Set the string content from the string in init. The string data is expected to be encoded as UTF-8.
  15. Set the string content from the wide string in init. The string data is expected to be encoded as either UTF-16 or UTF-32, depending on your platform's wchar_t support.
  16. Set the string content from the UTF-16 string in init.
  17. Set the string content from the UTF-32 string in init.
  18. Set the string content from the UTF-8 string in init.
  19. Set the string content from the string view contained in init. The string data is expected to be encoded as UTF-8.
  20. Set the string content from the wide string view contained in init. The string data is expected to be encoded as either UTF-16 or UTF-32, depending on your platform's wchar_t support.
  21. Set the string content from the UTF-16 string view contained in init.
  22. Set the string content from the UTF-32 string view contained in init.
  23. Set the string content from the UTF-8 string view contained in init.
  24. Set the string content from the filesystem path in init.

Changed in 1.1: Added std::string, std::wstring, and std::filesystem::path overloads.

Changed in 1.7: Added const char16_t *, const char32_t *, std::u16string, and std::u32string overloads.

Changed in 2.0: Added std::*string_view overloads.

Changed in 2.2: Added const char8_t * and std::u8string* overloads.

Changed in 3.4: Deprecated null_t overload.

See also (constructor), set()


ST::string::operator+=

Signature
string &operator+=(const char *cstr) (1)
string &operator+=(const wchar_t *wstr) (2)
string &operator+=(const char16_t *cstr) (3)
string &operator+=(const char32_t *cstr) (4)
string &operator+=(const char8_t *cstr) (5)
string &operator+=(const string &other) (6)
string &operator+=(char ch) (7)
string &operator+=(wchar_t ch) (8)
string &operator+=(char16_t ch) (9)
string &operator+=(char32_t ch) (10)
  1. Append the contents of cstr to the end of this string. The input is expected to be encoded as UTF-8.
  2. Append the contents of wstr to the end of this string. The input is converted to UTF-8 in the same manner as from_wchar().
  3. Append the contents of cstr to the end of this string. The input is converted to UTF-8 in the same manner as from_utf16().
  4. Append the contents of cstr to the end of this string. The input is converted to UTF-8 in the same manner as from_utf32().
  5. Append the contents of cstr to the end of this string.
  6. Append the contents of other to the end of this string.
  7. Append the ASCII character ch to the end of this string.
  8. Append the wide character ch to the end of this string.
  9. Append the Unicode character ch to the end of this string.
  10. Append the Unicode character ch to the end of this string.

Changed in 1.6: Added overloads 7-10 to append individual characters.

Changed in 1.7: Added const char16_t * and const char32_t * overloads.

Changed in 2.2: Added const char8_t * and std::u8string* overloads.


ST::string comparison operators

Signature
bool operator==(const ST::null_t &) const noexcept (1)
bool operator==(const string &other) const noexcept (2)
bool operator==(const char *other) const noexcept (3)
bool operator==(const char8_t *other) const noexcept (4)
bool operator!=(const ST::null_t &) const noexcept (5)
bool operator!=(const string &other) const noexcept (6)
bool operator!=(const char *other) const noexcept (7)
bool operator!=(const char8_t *other) const noexcept (8)
bool operator<(const string &other) const noexcept (9)
  1. Returns empty().
  2. Convenience operator. This is equivalent to checking compare(other, ST::case_sensitive) == 0
  3. Convenience operator. This is equivalent to checking compare(other, ST::case_sensitive) == 0
  4. Convenience operator. This is equivalent to checking compare(other, ST::case_sensitive) == 0
  5. Returns !empty().
  6. Convenience operator. This is equivalent to checking compare(other, ST::case_sensitive) != 0
  7. Convenience operator. This is equivalent to checking compare(other, ST::case_sensitive) != 0
  8. Convenience operator. This is equivalent to checking compare(other, ST::case_sensitive) != 0
  9. Convenience operator. This is provided to work with std::less for STL-style containers.

For more control over string comparisons, see the compare() family of functions.

Changed in 2.2: Added const char8_t * overloads.

Changed in 3.4: Deprecated null_t overloads.

See also compare(), struct less_i


ST::string::operator[]

Signature
const char &operator[](size_t position) const noexcept

Returns a reference to the UTF-8 code unit (byte) at the specified position. Like ST::char_buffer::operator[](), this is not bounds checked: accessing any position beyond the string's terminating nul character results in undefined behavior.

Since string_theory 2.0.

See also c_str(), at()


ST::string::operator std::string_view [removed]

Signature
operator std::string_view() const

This operator overload allows implicit conversion of an ST::string to a std::string_view into the entire contents of the string.

Since string_theory 2.0.

Removed in string_theory 3.0.


ST::string::rbegin

Signature
const_reverse_iterator rbegin() const noexcept (1)
const_reverse_iterator crbegin() const noexcept (2)

Return a reverse iterator to the reverse-start of the string.

Since string_theory 2.0.

See also rend()


ST::string::rend

Signature
const_reverse_iterator rend() const noexcept (1)
const_reverse_iterator crend() const noexcept (2)

Return a reverse iterator to the reverse-end of the string.

Since string_theory 2.0.

See also rbegin()


ST::string::replace

Signature
string replace(const char *from, const char *to, ST::case_sensitivity_t cs = case_sensitive, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) const (1)
string replace(const string &from, const char *to, ST::case_sensitivity_t cs = case_sensitive, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) const (2)
string replace(const char *from, const string &to, ST::case_sensitivity_t cs = case_sensitive, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) const (3)
string replace(const string &from, const string &to, ST::case_sensitivity_t cs = case_sensitive, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) const (4)
string replace(const string &from, const string &to, ST::case_sensitivity_t cs = case_sensitive) const (5)
string replace(const char8_t *from, const char8_t *to, ST::case_sensitivity_t cs = case_sensitive, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) const (6)
string replace(const string &from, const char8_t *to, ST::case_sensitivity_t cs = case_sensitive, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) const (7)
string replace(const char8_t *from, const string &to, ST::case_sensitivity_t cs = case_sensitive, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) const (8)

Return a string which has all instances of the string from replaced with the string in to.

Changed in 2.2: Added const char8_t * overloads.

Changed in 3.0: Added overload #5 and deprecated #4.


ST::string::right

Signature
string right(size_t size) const

Convenience function to extract a substring from the right side of the string. This is equivalent to substr(-size).

See also substr()


ST::string::set

Signature
void set(const ST::null_t &) noexcept (1)
void set(const char *cstr, size_t size = ST_AUTO_SIZE, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (2)
void set(const wchar_t *wstr, size_t size = ST_AUTO_SIZE, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (3)
void set(const char16_t *cstr, size_t size = ST_AUTO_SIZE, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (4)
void set(const char32_t *cstr, size_t size = ST_AUTO_SIZE, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (5)
void set(const char8_t *cstr, size_t size = ST_AUTO_SIZE, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (6)
void set(const string &copy) (7)
void set(string &&move) noexcept (8)
void set(const ST::char_buffer &init, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (9)
void set(ST::char_buffer &&init, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (10)
void set(const ST::wchar_buffer &init, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (11)
void set(const ST::utf16_buffer &init, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (12)
void set(const ST::utf32_buffer &init, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (13)
void set(const std::string &init, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (14)
void set(const std::wstring &init, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (15)
void set(const std::u16string &init, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (16)
void set(const std::u32string &init, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (17)
void set(const std::u8string &init, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (18)
void set(const std::string_view &init, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (19)
void set(const std::wstring_view &init, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (20)
void set(const std::u16string_view &init, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (21)
void set(const std::u32string_view &init, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (22)
void set(const std::u8string_view &init, ST::utf_validation_t validation = ST_DEFAULT_VALIDATION) (23)
void set(const std::filesystem::path &init) (24)
  1. Reset the string to the empty string. Equivalent to calling clear().
  2. Set the string content to the first size bytes of the string pointed to by cstr. The data pointed to by cstr is expected to be encoded as UTF-8.
  3. Set the string content from the first size wide characters of the string pointed to by wstr. The data pointed to by wstr is expected to be encoded as either UTF-16 or UTF-32, depending on your platform's wchar_t support.
  4. Set the string content from the first size characters of the UTF-16 string pointed to by cstr.
  5. Set the string content from the first size characters of the UTF-32 string pointed to by cstr.
  6. Set the string content from the first size bytes of the UTF-8 string pointed to by cstr.
  7. Set the string to the same value as copy.
  8. Move the contents of move into this string object.
  9. Set the string from the contents of init. The data stored in init is expected to be encoded as UTF-8.
  10. Move the contents of init into this string's internal UTF-8 buffer. The data stored in init will still be checked according to validation, and is expected to be encoded as UTF-8.
  11. Set the string content from the wide character data provided in init. The data provided in init is expected to be encoded as either UTF-16 or UTF-32, depending on your platform's wchar_t support.
  12. Set the string content from the UTF-16 data provided in init.
  13. Set the string content from the UTF-32 data provided in init.
  14. Set the string content from the string provided in init. The string is expected to be encoded as UTF-8.
  15. Set the string content from the wide string data provided in init. The string is expected to be encoded as either UTF-16 or UTF-32, depending on your platform's wchar_t support.
  16. Set the string content from the UTF-16 string provided in init.
  17. Set the string content from the UTF-32 string provided in init.
  18. Set the string content from the UTF-8 string provided in init.
  19. Set the string content from the string view provided in init. The string is expected to be encoded as UTF-8.
  20. Set the string content from the wide string view provided in init. The string is expected to be encoded as either UTF-16 or UTF-32, depending on your platform's wchar_t support.
  21. Set the string content from the UTF-16 string view provided in init.
  22. Set the string content from the UTF-32 string view provided in init.
  23. Set the string content from the UTF-8 string view provided in init.
  24. Set the string content from the filesystem path in init.

For the variants which take a size, if size is ST_AUTO_SIZE, the length of the input will be determined as if with strlen() or equivalent.

Changed in 1.1: Added std::string, std::wstring, and std::filesystem::path overloads.

Changed in 1.7: Added const char16_t *, const char32_t *, std::u16string, and std::u32string overloads.

Changed in 2.0: Added std::*string_view overloads.

Changed in 2.2: Added const char8_t * and std::u8string* overloads.

Changed in 3.4: Deprecated null_t overload.

See also (constructor), set_validated(), operator=(), from_utf8(), from_utf16(), from_utf32(), from_std_string(), from_path()


ST::string::set_validated

Signature
void set_validated(const char *text, size_t size) (1)
void set_validated(const char8_t *text, size_t size) (2)
void set_validated(const ST::char_buffer &buffer) (3)
void set_validated(ST::char_buffer &&buffer) (4)
  1. Set the string content from the validated UTF-8 data pointed to by text.
  2. Set the string content from the validated UTF-8 data pointed to by text.
  3. Set the string content from the validated UTF-8 buffer in buffer.
  4. Move the validated UTF-8 buffer in buffer into the string's internal buffer.

Since string_theory 2.2.

See also set(), from_validated()


ST::string::size

Signature
size_t size() const noexcept

Returns the size (in bytes) of the string data, not including the nul-terminator.

See also ST::buffer::size()


ST::string::split

Signature
std::vector<string> split(char split_char, size_t max_splits = ST_AUTO_SIZE, ST::case_sensitivity_t cs = case_sensitive) const (1)
std::vector<string> split(const char *splitter, size_t max_splits = ST_AUTO_SIZE, ST::case_sensitivity_t cs = case_sensitive) const (2)
std::vector<string> split(const string &splitter, size_t max_splits = ST_AUTO_SIZE, ST::case_sensitivity_t cs = case_sensitive) const (3)
std::vector<string> split(const char8_t *splitter, size_t max_splits = ST_AUTO_SIZE, ST::case_sensitivity_t cs = case_sensitive) const (4)

Split the string into pieces separated by split_char or splitter. If max_splits is not ST_AUTO_SIZE and there are more than max_splits separators in the string, the extras will be preserved in the final element of the returned vector. Specifically, the maximum size of the returned vector is max_splits + 1 elements.

Note that adjacent separators are treated individually: Two instances of split_char or splitter next to each other will result in an empty string in the result. If this string is empty, a vector with a single empty string element will be returned.
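The splitting rules above can be modeled with a short sketch. It uses std::string rather than ST::string and handles only a single-character separator; split_sketch is a hypothetical helper written for illustration, not the library's implementation, and it uses std::string::npos where the real API uses ST_AUTO_SIZE.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of the documented split() rules, using std::string.
// Assumption: "no limit" is represented here by std::string::npos.
std::vector<std::string> split_sketch(const std::string &s, char sep,
                                      std::size_t max_splits = std::string::npos)
{
    std::vector<std::string> parts;
    std::size_t pos = 0;
    while (parts.size() < max_splits) {
        const std::size_t next = s.find(sep, pos);
        if (next == std::string::npos)
            break;
        // Adjacent separators produce an empty element here.
        parts.push_back(s.substr(pos, next - pos));
        pos = next + 1;
    }
    // The remainder (including any separators past max_splits) is kept whole.
    parts.push_back(s.substr(pos));
    return parts;
}

void split_sketch_examples()
{
    assert(split_sketch("a,,b", ',').size() == 3);            // "a", "", "b"
    assert(split_sketch("a,b,c,d", ',', 2).back() == "c,d");  // extras preserved
    assert(split_sketch("", ',').size() == 1);                // one empty element
}
```

With max_splits set to 2, the result has at most three elements, matching the max_splits + 1 bound described above.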

Changed in 2.2: Added const char8_t * overload.

See also tokenize()


ST::string::tokenize

Signature
std::vector<string> tokenize(const char *delims = ST_WHITESPACE) const

Split the string into pieces separated by any of the characters in delims. Any sequence of adjacent delimiters will be treated as a single separator, meaning that no elements of the returned vector will be empty. If this string is empty, an empty vector will be returned.
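For contrast with split(), the following sketch models the tokenizing rule described above on std::string. tokenize_sketch is a hypothetical helper, and the default delimiter set shown is only an assumed stand-in for ST_WHITESPACE.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of the documented tokenize() rules, using std::string.
// Assumption: " \t\r\n" stands in for ST_WHITESPACE.
std::vector<std::string> tokenize_sketch(const std::string &s,
                                         const char *delims = " \t\r\n")
{
    std::vector<std::string> tokens;
    // Skip any leading delimiters before the first token.
    std::size_t pos = s.find_first_not_of(delims);
    while (pos != std::string::npos) {
        const std::size_t end = s.find_first_of(delims, pos);
        tokens.push_back(s.substr(pos, end - pos));
        // A run of adjacent delimiters is skipped as one separator.
        pos = s.find_first_not_of(delims, end);
    }
    return tokens;
}

void tokenize_sketch_examples()
{
    assert(tokenize_sketch("  a  b\tc ").size() == 3);  // "a", "b", "c"
    assert(tokenize_sketch("a,,b", ",").size() == 2);   // no empty element
    assert(tokenize_sketch("").empty());                // empty vector
}
```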

See also split()


ST::string::starts_with

Signature
bool starts_with(const string &prefix, ST::case_sensitivity_t cs = case_sensitive) const noexcept (1)
bool starts_with(const char *prefix, ST::case_sensitivity_t cs = case_sensitive) const noexcept (2)
bool starts_with(const char8_t *prefix, ST::case_sensitivity_t cs = case_sensitive) const noexcept (3)

Return true if this string starts with prefix. Equivalent to compare_n(prefix, prefix.size(), cs) == 0.

Changed in 2.0: Marked these functions noexcept.

Changed in 2.2: Added const char8_t * overload.

See also compare_n()


ST::string::substr

Signature
string substr(ST_ssize_t start, size_t size = ST_AUTO_SIZE) const

Return a string whose contents are a copy of at most size bytes from this string, starting at position start. If size is ST_AUTO_SIZE or start + size is greater than the size of the string, this will return the remainder of the string from the starting position. If start is negative, then the starting position is relative to the end of the string.
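The start and size rules can be sketched on std::string as follows. substr_sketch is a hypothetical helper, and the clamping of out-of-range start values is an assumption made for the sketch rather than documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical model of the documented substr() rules, using std::string.
// std::string::npos plays the role of ST_AUTO_SIZE here.
std::string substr_sketch(const std::string &s, std::ptrdiff_t start,
                          std::size_t size = std::string::npos)
{
    const std::ptrdiff_t len = static_cast<std::ptrdiff_t>(s.size());

    // A negative start counts back from the end of the string.
    if (start < 0) {
        start += len;
        if (start < 0)
            start = 0;  // assumption: clamp to the beginning
    }
    if (start > len)
        start = len;    // assumption: past-the-end yields an empty string

    // std::string::substr already limits size to the remaining length.
    return s.substr(static_cast<std::size_t>(start), size);
}

void substr_sketch_examples()
{
    assert(substr_sketch("Hello, world", 7) == "world");
    assert(substr_sketch("Hello, world", -5) == "world");   // like right(5)
    assert(substr_sketch("Hello, world", 0, 5) == "Hello"); // like left(5)
}
```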

See also left(), right()


ST::string::to_bool

Signature
bool to_bool() const noexcept (1)
bool to_bool(ST::conversion_result &result) const noexcept (2)

Convert the string to a boolean. If the string is either "true" or "false" (case insensitive), those values are converted to the respective boolean values. Otherwise, this behaves like to_int(), where a non-zero result is treated as true.

Note that variant #1 has no way of reporting errors. An empty string will return false, as will any string which cannot be converted based on the rules described above. For variant #2, the result of the conversion is stored in result.
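A minimal sketch of these rules, written against std::string and the C library's strtol. Both helper names are hypothetical, only ASCII case folding is assumed, and error reporting through ST::conversion_result is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Compare text against an all-lowercase word, folding ASCII letters only.
bool equals_ascii_nocase(const std::string &text, const std::string &lowercase_word)
{
    if (text.size() != lowercase_word.size())
        return false;
    for (std::size_t i = 0; i < text.size(); ++i) {
        char c = text[i];
        if (c >= 'A' && c <= 'Z')
            c = static_cast<char>(c - 'A' + 'a');
        if (c != lowercase_word[i])
            return false;
    }
    return true;
}

// Hypothetical model of the documented to_bool() rules (variant #1).
bool to_bool_sketch(const std::string &text)
{
    if (equals_ascii_nocase(text, "true"))
        return true;
    if (equals_ascii_nocase(text, "false"))
        return false;
    // Fallback: integer parse with base detection; unparseable input gives 0.
    return std::strtol(text.c_str(), nullptr, 0) != 0;
}

void to_bool_sketch_examples()
{
    assert(to_bool_sketch("TRUE"));
    assert(!to_bool_sketch("False"));
    assert(to_bool_sketch("2"));      // non-zero integer
    assert(!to_bool_sketch(""));      // empty string
    assert(!to_bool_sketch("maybe")); // not convertible
}
```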


ST::string::to_buffer

Signature
void to_buffer(ST::char_buffer &result, bool utf8, ST::utf_validation_t validation) const (1)
void to_buffer(ST::char_buffer &result, bool utf8 = true, bool substitute_out_of_range = true) const (2)
void to_buffer(ST::wchar_buffer &result) const (3)
void to_buffer(ST::utf16_buffer &result) const (4)
void to_buffer(ST::utf32_buffer &result) const (5)

Conversion helpers for easier use in templates:

  1. Deprecated alias for #2.
  2. Convert the string content to a char_buffer. If utf8 is true, the buffer will be a copy of the string's internal UTF-8 buffer; otherwise, it will be converted to Latin-1.
  3. Convert the string content to a wchar_buffer. The string will be encoded as either UTF-16 or UTF-32 depending on your platform's wchar_t support. This is equivalent to result = to_wchar().
  4. Convert the string content to a ST::utf16_buffer. This is equivalent to result = to_utf16().
  5. Convert the string content to a ST::utf32_buffer. This is equivalent to result = to_utf32().

Since string_theory 2.4.

Changed in 3.0: Deprecated the char_buffer overload taking a utf_validation_t and added the bool overload as a replacement.

See also to_utf8(), to_latin_1(), to_wchar(), to_utf16(), to_utf32()


ST::string::to_float

Signature
float to_float() const noexcept (1)
float to_float(ST::conversion_result &result) const noexcept (2)
double to_double() const noexcept (3)
double to_double(ST::conversion_result &result) const noexcept (4)

Convert the string to a floating-point number, in a manner similar to strtod.

Note that variants #1 and #3 have no way of reporting errors. An empty string will return 0, and a string with other characters not parseable as a number will get partially converted, up to the first invalid character. For variants #2 and #4, the result of the conversion is stored in result.


ST::string::to_int

Signature
short to_short(int base = 0) const noexcept (1)
short to_short(ST::conversion_result &result, int base = 0) const noexcept (2)
int to_int(int base = 0) const noexcept (3)
int to_int(ST::conversion_result &result, int base = 0) const noexcept (4)
long to_long(int base = 0) const noexcept (5)
long to_long(ST::conversion_result &result, int base = 0) const noexcept (6)
long long to_long_long(int base = 0) const noexcept (7)
long long to_long_long(ST::conversion_result &result, int base = 0) const noexcept (8)
unsigned short to_ushort(int base = 0) const noexcept (9)
unsigned short to_ushort(ST::conversion_result &result, int base = 0) const noexcept (10)
unsigned int to_uint(int base = 0) const noexcept (11)
unsigned int to_uint(ST::conversion_result &result, int base = 0) const noexcept (12)
unsigned long to_ulong(int base = 0) const noexcept (13)
unsigned long to_ulong(ST::conversion_result &result, int base = 0) const noexcept (14)
unsigned long long to_ulong_long(int base = 0) const noexcept (15)
unsigned long long to_ulong_long(ST::conversion_result &result, int base = 0) const noexcept (16)

Convert the string to an integer. If base is 0, this function will try to guess the base in a similar manner to strtol or strtoul.

Note that variants without an ST::conversion_result & parameter have no way of reporting errors. An empty string will return 0, and a string with other characters not in the specified base will get partially converted, up to the first invalid character. For variants with a conversion_result parameter, the result of the conversion is stored in result.
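Since the behavior is described by analogy with strtol, the following sketch shows what base detection and partial conversion look like with the C library function itself. parse_long_sketch is a hypothetical wrapper; how ST::conversion_result reports the same conditions is not modeled here.

```cpp
#include <cassert>
#include <cstdlib>

// Thin wrapper over std::strtol, used only to illustrate base detection and
// partial conversion; it is not ST::string::to_int().
long parse_long_sketch(const char *text, int base, bool &fully_consumed)
{
    char *end = nullptr;
    const long value = std::strtol(text, &end, base);
    // The parse is complete only if something was read and nothing is left.
    fully_consumed = (end != text) && (*end == '\0');
    return value;
}

void parse_long_sketch_examples()
{
    bool full = false;
    assert(parse_long_sketch("0x1F", 0, full) == 31 && full);    // hex prefix
    assert(parse_long_sketch("017", 0, full) == 15 && full);     // octal prefix
    assert(parse_long_sketch("42abc", 10, full) == 42 && !full); // partial
    assert(parse_long_sketch("", 0, full) == 0 && !full);        // empty input
}
```

The variants without a conversion_result parameter correspond to discarding the fully_consumed flag, which is why they cannot distinguish "0" from unparseable input.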

Changed in 3.2: Added (unsigned) short, long, long long conversions.


ST::string::to_int64

Signature
int64_t to_int64(int base = 0) const noexcept (1)
int64_t to_int64(ST::conversion_result &result, int base = 0) const noexcept (2)
uint64_t to_uint64(int base = 0) const noexcept (3)
uint64_t to_uint64(ST::conversion_result &result, int base = 0) const noexcept (4)

Convert the string to a 64-bit integer. If base is 0, this function will try to guess the base in a similar manner to strtoll or strtoull.

Note that variants without an ST::conversion_result & parameter have no way of reporting errors. An empty string will return 0, and a string with other characters not in the specified base will get partially converted, up to the first invalid character. For variants with a conversion_result parameter, the result of the conversion is stored in result.

Deprecated in string_theory 4.0.


ST::string::to_latin_1

Signature
ST::char_buffer to_latin_1(ST::utf_validation_t validation) const (1)
ST::char_buffer to_latin_1(bool substitute_out_of_range = true) const (2)

Convert the string content to Latin-1 / ISO-8859-1. Any characters outside of the Latin-1 range will be replaced by ? if substitute_out_of_range is true, or will throw a ST::unicode_error otherwise.
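The substitution rule can be illustrated with a sketch that works on already-decoded code points. latin1_sketch is a hypothetical helper; it takes a std::u32string to avoid UTF-8 decoding, and throws std::range_error as a stand-in for ST::unicode_error.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical model of the documented Latin-1 conversion rule.
// Assumption: input is a sequence of Unicode code points, not UTF-8 bytes.
std::string latin1_sketch(const std::u32string &code_points,
                          bool substitute_out_of_range = true)
{
    std::string out;
    out.reserve(code_points.size());
    for (char32_t cp : code_points) {
        if (cp <= 0xFF) {
            // Latin-1 maps code points 0x00-0xFF directly to single bytes.
            out.push_back(static_cast<char>(static_cast<unsigned char>(cp)));
        } else if (substitute_out_of_range) {
            out.push_back('?');
        } else {
            throw std::range_error("code point outside the Latin-1 range");
        }
    }
    return out;
}

void latin1_sketch_examples()
{
    assert(latin1_sketch(U"caf\u00E9") == "caf\xE9");  // e-acute fits in Latin-1
    assert(latin1_sketch(U"\u20AC" U"5") == "?5");     // euro sign substituted
}
```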

Changed in 3.0: Deprecated the overload taking utf_validation_t and added the bool overload as a replacement.


ST::string::to_lower

Signature
string to_lower() const

Returns a copy of this string with all 7-bit ASCII characters converted to lower-case.


ST::string::to_std_string

Signature
std::string to_std_string(bool utf8, ST::utf_validation_t validation) const (1)
std::string to_std_string(bool utf8 = true, bool substitute_out_of_range = true) const (2)
std::wstring to_std_wstring() const (3)
std::u16string to_std_u16string() const (4)
std::u32string to_std_u32string() const (5)
std::u8string to_std_u8string() const (6)
void to_std_string(std::string &result, bool utf8, ST::utf_validation_t validation) const (7)
void to_std_string(std::string &result, bool utf8 = true, bool substitute_out_of_range = true) const (8)
void to_std_string(std::wstring &result) const (9)
void to_std_string(std::u16string &result) const (10)
void to_std_string(std::u32string &result) const (11)
void to_std_string(std::u8string &result) const (12)
  1. Deprecated alias for #2.
  2. Convert the string content to a std::string. If utf8 is true, the data will be converted as UTF-8; otherwise, it will be converted as Latin-1.
  3. Convert the string content to a std::wstring. The string will be encoded as either UTF-16 or UTF-32 depending on your platform's wchar_t support.
  4. Convert the string content to a UTF-16 encoded std::u16string.
  5. Convert the string content to a UTF-32 encoded std::u32string.
  6. Convert the string content to a UTF-8 encoded std::u8string.
  7. Deprecated alias for #8.
  8. Convenience wrapper for #2 for easier use in templates.
  9. Convenience wrapper for #3 for easier use in templates.
  10. Convenience wrapper for #4 for easier use in templates.
  11. Convenience wrapper for #5 for easier use in templates.
  12. Convenience wrapper for #6 for easier use in templates.

Since string_theory 1.1.

Changed in 1.7: Added std::u16string and std::u32string variants.

Changed in 2.0: Added variants taking an output reference for easier use within templates and other generic code.

Changed in 2.2: Added std::u8string variants.

Changed in 3.0: Deprecated std::string overloads taking a utf_validation_t and added the bool overloads as a replacement.


ST::string::to_path

Signature
std::filesystem::path to_path() const

Convert the string to a filesystem path using the system's default path encoding.

Since string_theory 1.1.


ST::string::to_upper

Signature
string to_upper() const

Returns a copy of this string with all 7-bit ASCII characters converted to upper-case.


ST::string::to_utf8

Signature
ST::char_buffer to_utf8() const noexcept

Return a copy of the internal UTF-8 string data buffer.


ST::string::to_utf16

Signature
ST::utf16_buffer to_utf16() const

Convert the string content to UTF-16.


ST::string::to_utf32

Signature
ST::utf32_buffer to_utf32() const

Convert the string content to UTF-32.


ST::string::to_wchar

Signature
ST::wchar_buffer to_wchar() const

Convert the string content to a wchar_t buffer. The buffer is encoded either as UTF-16 or UTF-32, depending on your platform's wchar_t support.


ST::string::trim

Signature
string trim(const char *charset = ST_WHITESPACE) const

Return a string which has any characters found in charset removed from both the left and right sides of the string.
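A minimal sketch of the trimming rule on std::string. trim_sketch is a hypothetical helper, and the default character set is an assumed stand-in for ST_WHITESPACE.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical model of the documented trim() rule, using std::string.
// Assumption: " \t\r\n" stands in for ST_WHITESPACE.
std::string trim_sketch(const std::string &s, const char *charset = " \t\r\n")
{
    const std::size_t first = s.find_first_not_of(charset);
    if (first == std::string::npos)
        return std::string();  // every character was in charset
    const std::size_t last = s.find_last_not_of(charset);
    // Characters in charset that sit between first and last are kept.
    return s.substr(first, last - first + 1);
}

void trim_sketch_examples()
{
    assert(trim_sketch("  hi there \n") == "hi there");
    assert(trim_sketch("xxhixx", "x") == "hi");
    assert(trim_sketch("   ").empty());
}
```

trim_left() and trim_right() correspond to applying only the first or only the last of the two searches.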

See also trim_left(), trim_right()


ST::string::trim_left

Signature
string trim_left(const char *charset = ST_WHITESPACE) const

Return a string which has any characters found in charset removed from the left side of the string.

See also trim()


ST::string::trim_right

Signature
string trim_right(const char *charset = ST_WHITESPACE) const

Return a string which has any characters found in charset removed from the right side of the string.

See also trim()


ST::string::u8_str

Signature
const char8_t *u8_str() const noexcept (1)
const char8_t *u8_str(const char8_t *substitute) const noexcept (2)

Returns a pointer to the stored UTF-8 string data. This buffer should always be nul-terminated, so it's safe to use in functions which require C-style string buffers.

For variant #2, if this string is empty, the pointer provided in substitute will be returned instead.

Since string_theory 2.2.


ST::string::view

Signature
std::string_view view(size_t start = 0, size_t length = ST_AUTO_SIZE) const

Return a std::string_view into part or all of the string's content.

Since string_theory 2.0.


Related Non-Member Documentation

ST::equal_i

struct equal_i
{
    bool operator()(const string &left, const string &right) const noexcept;
};

Functor object which returns true for case-insensitive string comparisons where the left string is equal to the right string. This is designed for STL-style containers which need case-insensitive equality comparison, e.g. std::unordered_map.

See also struct hash_i, struct less_i, operator==()


ST::hash

struct hash
{
    size_t operator()(const string &str) const noexcept;
};

namespace std
{
    template <>
    struct hash<ST::string>
    {
        size_t operator()(const ST::string &str) const noexcept;
    };
}

Functor object which returns a reasonable hash for the provided string. This is designed for STL-style containers which use hashing for indexes, e.g. std::unordered_map.

The std::hash specialization is also provided for convenience for using ST::string objects in hash containers without needing to explicitly use the ST::hash object for hashing.

Changed in 1.7: Added std::hash specialization.

See also struct hash_i


ST::hash_i

struct hash_i
{
    size_t operator()(const string &str) const noexcept;
};

Functor object which returns a reasonable case-insensitive hash for the provided string. This is designed for STL-style containers which use hashing for indexes, e.g. std::unordered_map.

See also struct hash, struct less_i, struct equal_i
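
A hash container needs a hash functor and an equality functor that agree with each other: keys that compare equal must hash to the same value. The sketch below shows that pairing with std::string stand-ins so it builds without the library; hash_i_sketch, equal_i_sketch and fold_ascii are hypothetical, and ASCII-only case folding is an assumption. With string_theory itself the container would presumably be declared as std::unordered_map<ST::string, T, ST::hash_i, ST::equal_i>.

```cpp
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical helper: lower-case ASCII letters, leave other bytes alone.
inline std::string fold_ascii(const std::string &s)
{
    std::string out = s;
    for (char &ch : out)
        ch = static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
    return out;
}

// Hypothetical stand-in for hash_i: hash the case-folded text.
struct hash_i_sketch
{
    std::size_t operator()(const std::string &str) const
    {
        return std::hash<std::string>()(fold_ascii(str));
    }
};

// Hypothetical stand-in for equal_i: compare the case-folded text.
struct equal_i_sketch
{
    bool operator()(const std::string &left, const std::string &right) const
    {
        return fold_ascii(left) == fold_ascii(right);
    }
};

using header_map = std::unordered_map<std::string, int, hash_i_sketch, equal_i_sketch>;

// Inserts the same key with two spellings and reports how many entries exist.
std::size_t count_distinct_headers()
{
    header_map headers;
    headers["Content-Length"] = 10;
    headers["content-length"] = 20; // same key under case-insensitive comparison
    return headers.size();
}
```

Supplying only one of the two functors would leave the container inconsistent, for example hashing "Key" and "key" to the same bucket but still treating them as different keys.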


ST::less_i

struct less_i
{
    bool operator()(const string &left, const string &right) const noexcept;
};

Functor object which returns true for case-insensitive string comparisons where the left string is less than the right string. This is designed for STL-style containers which need case-insensitive ordering.

See also struct hash_i, struct equal_i, operator<()
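
Ordered containers such as std::set and std::map take the comparison functor as a template argument. The sketch below shows the effect of a case-insensitive ordering with a std::string stand-in so it builds without the library; less_i_sketch is hypothetical and its ASCII-only folding is an assumption. With string_theory itself the declaration would presumably be std::set<ST::string, ST::less_i>.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical stand-in for less_i: lexicographic "less than" on case-folded bytes.
struct less_i_sketch
{
    bool operator()(const std::string &left, const std::string &right) const
    {
        return std::lexicographical_compare(
            left.begin(), left.end(), right.begin(), right.end(),
            [](unsigned char a, unsigned char b) { return std::tolower(a) < std::tolower(b); });
    }
};

using name_set = std::set<std::string, less_i_sketch>;

// "alice" and "ALICE" are equivalent under less_i_sketch, so only one is stored.
std::size_t count_names()
{
    name_set names{"alice", "Bob", "ALICE"};
    return names.size();
}
```

Two keys count as duplicates in such a container when neither is less than the other, which is why spellings that differ only in case collapse into one entry.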


ST::operator+

Signature
string operator+(const string &left, const string &right) (1)
string operator+(const string &left, const char *right) (2)
string operator+(const char *left, const string &right) (3)
string operator+(const string &left, const wchar_t *right) (4)
string operator+(const wchar_t *left, const string &right) (5)
string operator+(const string &left, const char16_t *right) (6)
string operator+(const char16_t *left, const string &right) (7)
string operator+(const string &left, const char32_t *right) (8)
string operator+(const char32_t *left, const string &right) (9)
string operator+(const string &left, const char8_t *right) (10)
string operator+(const char8_t *left, const string &right) (11)
string operator+(const string &left, char right) (12)
string operator+(const string &left, wchar_t right) (13)
string operator+(const string &left, char16_t right) (14)
string operator+(const string &left, char32_t right) (15)
string operator+(char left, const string &right) (16)
string operator+(wchar_t left, const string &right) (17)
string operator+(char16_t left, const string &right) (18)
string operator+(char32_t left, const string &right) (19)

Returns a string which is the concatenation of left and right.

Changed in 1.6: Added overloads 12-19 to concatenate individual characters.

Changed in 1.7: Added const char16_t * and const char32_t * overloads.

Changed in 2.2: Added const char8_t * overloads.

See also operator+=()


ST::string non-member comparison operators

Signature
bool operator==(const ST::null_t &, const string &right) noexcept (1)
bool operator!=(const ST::null_t &, const string &right) noexcept (2)
  1. Returns right.empty().
  2. Returns !right.empty().

Changed in 3.4: Deprecated null_t overloads.

See also operator==()


ST::literals::operator"" _st

using namespace ST::literals;
Signature
ST::string operator"" _st(const char *str, size_t size) (1)
ST::string operator"" _st(const wchar_t *str, size_t size) (2)
ST::string operator"" _st(const char16_t *str, size_t size) (3)
ST::string operator"" _st(const char32_t *str, size_t size) (4)
ST::string operator"" _st(const char8_t *str, size_t size) (5)

User-defined literal operator to convert string literals to ST::string objects efficiently.

  1. The string literal should be encoded as UTF-8.

    Because this operator assumes UTF-8 data, it is as efficient as using ST_LITERAL() to construct the string.

  2. The string literal should be encoded as either UTF-16 or UTF-32, depending on your platform's wchar_t support.

  3. The string literal should be encoded as UTF-16.

  4. The string literal should be encoded as UTF-32.

  5. The string literal should be encoded as UTF-8.

Changed in 2.2: Added const char8_t * overload.

Changed in 3.0: Moved these to the ST::literals namespace to avoid polluting the global namespace.

See also ST_LITERAL(), from_utf8(), from_wchar(), from_utf16(), from_utf32()
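
The (str, size) parameter pair is what lets a string literal operator avoid a strlen() call and keep embedded nul characters. The sketch below demonstrates that mechanism with a hypothetical stand-in type and suffix (text_sketch, _sk, and the sketch_literals namespace are not part of string_theory), including the opt-in using namespace pattern.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for ST::string.
struct text_sketch
{
    std::string bytes;
};

namespace sketch_literals
{
    // The compiler passes the literal's length, so no strlen() is needed.
    inline text_sketch operator""_sk(const char *str, std::size_t size)
    {
        return text_sketch{std::string(str, size)};
    }
}

// The suffix is only visible after opting in to the literals namespace.
std::size_t literal_size_demo()
{
    using namespace sketch_literals;
    return "a\0b"_sk.bytes.size(); // embedded nul is counted: 'a', '\0', 'b'
}
```

Keeping the operator in its own namespace means the suffix is only available in scopes that explicitly ask for it.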


Macro Documentation

ST_LITERAL(str)

#define ST_LITERAL(str)  ...

Construct an ST::string from static UTF-8 data in an efficient manner. This bypasses the normal constructor size and validity checks to construct the string object directly (at compile time), making it much faster than the string("...") or from_utf8("...") constructors.

See also operator"" _st()


ST_WHITESPACE

#define ST_WHITESPACE " \t\r\n"

A default set of whitespace characters, useful for trimming and tokenizing strings.

See also trim(), tokenize()
