ministring.h revision c58f1213e628a545081c70e26c6b67a841cff880
 * IPRT - C++ string class.

 * Copyright (C) 2007-2012 Oracle Corporation

 * This file is part of VirtualBox Open Source Edition (OSE), as
 * you can redistribute it and/or modify it under the terms of the GNU
 * General Public License (GPL) as published by the Free Software
 * Foundation, in version 2 as it comes in the "COPYING" file of the
 * VirtualBox OSE distribution. VirtualBox OSE is distributed in the
 * hope that it will be useful, but WITHOUT ANY WARRANTY of any kind.

 * The contents of this file may alternatively be used under the terms
 * of the Common Development and Distribution License Version 1.0
 * (CDDL) only, as it comes in the "COPYING.CDDL" file of the
 * VirtualBox OSE distribution, in which case the provisions of the
 * CDDL are applicable instead of those of the GPL.

 * You may elect to license modified versions of this file under the
 * terms and conditions of either the GPL or the CDDL or both.

/** @defgroup grp_rt_cpp_string C++ String support

/** @brief C++ string class.
 * This is a C++ string class that does not depend on anything else except IPRT
 * memory management functions. Semantics are like in std::string, except it

 * Note that RTCString does not differentiate between NULL strings
 * and empty strings. In other words, RTCString("") and RTCString(NULL)
 * behave the same. In both cases, RTCString allocates no memory, reports
 * a zero length and zero allocated bytes for both, and returns an empty
 * C string from c_str().

 * @note RTCString ASSUMES that all strings it deals with are valid UTF-8.
 *       The caller is responsible for not breaking this assumption.

 /** @remarks Much of the code in here used to be in com::Utf8Str so that
  *          com::Utf8Str can now derive from RTCString and only contain code
  *          that is COM-specific, such as com::Bstr conversions.
 * Compared to the old Utf8Str though, RTCString always knows the length of its
 * member string and the size of the buffer so it can use memcpy()

 * Creates an empty string that has no memory allocated.

 * Creates a copy of another RTCString.
 * This allocates a_rSrc.length() + 1 bytes for the new instance, unless a_rSrc is empty.
 * @param a_rSrc The source string.

 * Creates a copy of a C string.
 * This allocates strlen(pcsz) + 1 bytes for the new instance, unless pcsz is empty.
 * @param pcsz The source string.

 * Create a partial copy of another RTCString.
 * @param a_rSrc The source string.
 * @param a_offSrc The byte offset into the source string.
 * @param a_cchSrc The max number of chars (encoded UTF-8 bytes)
 *                 to copy from the source string.

 * Create a partial copy of a C string.
 * @param a_pszSrc The source string (UTF-8).
 * @param a_cchSrc The max number of chars (encoded UTF-8 bytes)
 *                 to copy from the source string. This must not
 *                 be '0' as the compiler could easily mistake
 *                 that for the va_list constructor.

 * Create a string containing @a a_cTimes repetitions of the character @a a_ch.
 * @param a_cTimes The number of times the character is repeated.
 * @param a_ch The character to fill the string with.

 * Create a new string given the format string and its arguments.
 * @param a_pszFormat Pointer to the format string (UTF-8),
 * @param a_va Argument vector containing the arguments
 *             specified by the format string.
 * @remarks Not part of std::string.

 * String length in bytes.
 * Returns the length of the member string in bytes, which is equal to strlen(c_str()).
 * In other words, this does not count Unicode codepoints; use utf8length() for that.
 * The byte length is always cached so calling this is cheap and requires no

 * String length in Unicode codepoints.
 * As opposed to length(), which returns the length in bytes, this counts
 * the number of Unicode codepoints. This is *not* cached so calling this
 * @returns Number of codepoints in the member string.
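The difference between a byte length and a codepoint count can be sketched without IPRT. The snippet below is a standalone illustration built on std::string, not the RTCString implementation; countCodepoints is a hypothetical helper name, and it assumes valid UTF-8 input, the same assumption RTCString makes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of IPRT): counts the codepoints in a string
// that is ASSUMED to hold valid UTF-8. Every codepoint contributes exactly one
// byte that is not a continuation byte (bit pattern 10xxxxxx).
std::size_t countCodepoints(const std::string &s)
{
    std::size_t cCodepoints = 0;
    for (unsigned char b : s)
        if ((b & 0xC0) != 0x80)
            ++cCodepoints;
    // A codepoint occupies at least one byte, so the count never exceeds size().
    assert(cCodepoints <= s.size());
    return cCodepoints;
}
```

For the six-byte string "h\xC3\xA9llo" ("héllo"), size() reports 6 while countCodepoints() reports 5. The walk is linear in the byte length, which is consistent with the note that the codepoint count is not cached.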
 * The allocated buffer size (in bytes).
 * Returns the number of bytes allocated in the internal string buffer, which is
 * at least length() + 1 if length() > 0; for an empty string, this returns 0.
 * @returns m_cbAllocated.

 * Make sure that at least cb bytes of buffer space are reserved.
 * Requests that the contained memory buffer have at least cb bytes allocated.
 * This may expand or shrink the string's storage, but will never truncate the
 * contained string. In other words, cb will be ignored if it's smaller than
 * @param cb New minimum size (in bytes) of member memory buffer.
 * @throws std::bad_alloc On allocation error. The object is left unchanged.

 * Deallocates all memory.

 * Assigns a copy of pcsz to "this".
 * @param pcsz The source string.
 * @throws std::bad_alloc On allocation failure. The object is left describing
 *         a NULL string.
 * @returns Reference to the object.

 * Assigns a copy of s to "this".
 * @param s The source string.
 * @throws std::bad_alloc On allocation failure. The object is left describing
 *         a NULL string.
 * @returns Reference to the object.

 * Assigns the output of the string format operation (RTStrPrintf).
 * @param pszFormat Pointer to the format string,
 * @param ... Ellipsis containing the arguments specified by
 *            the format string.
 * @throws std::bad_alloc On allocation error. The object is left unchanged.
 * @returns Reference to the object.

 * Assigns the output of the string format operation (RTStrPrintfV).
 * @param pszFormat Pointer to the format string,
 * @param va Argument vector containing the arguments
 *           specified by the format string.
 * @throws std::bad_alloc On allocation error. The object is left unchanged.
 * @returns Reference to the object.

 * Appends the string "that" to "this".
 * @param that The string to append.
 * @throws std::bad_alloc On allocation error. The object is left unchanged.
 * @returns Reference to the object.

 * Appends the string "that" to "this".
 * @param pszThat The C string to append.
 * @throws std::bad_alloc On allocation error. The object is left unchanged.
 * @returns Reference to the object.

 * Appends the given character to "this".
 * @param ch The character to append.
 * @throws std::bad_alloc On allocation error. The object is left unchanged.
 * @returns Reference to the object.

 * Appends the given Unicode code point to "this".
 * @param uc The Unicode code point to append.
 * @throws std::bad_alloc On allocation error. The object is left unchanged.
 * @returns Reference to the object.

 * Shortcut to append(), RTCString variant.
 * @param that The string to append.
 * @returns Reference to the object.

 * Shortcut to append(), const char* variant.
 * @param pszThat The C string to append.
 * @returns Reference to the object.

 * Shortcut to append(), char variant.
 * @param pszThat The character to append.
 * @returns Reference to the object.

 * Converts the member string to upper case.
 * @returns Reference to the object.
 /* Folding a UTF-8 string may result in a shorter encoding (see testcase),
    so recalculate the length afterwards. */

 * Converts the member string to lower case.
 * @returns Reference to the object.
 /* Folding a UTF-8 string may result in a shorter encoding (see testcase),
    so recalculate the length afterwards. */

 * Returns the byte at the given index, or a null byte if the index is not
 * smaller than length(). This does _not_ count codepoints but simply points
 * into the member C string.
 * @param i The index into the string buffer.
 * @returns char at the index or null.

 * Returns the contained string as a C-style const char* pointer.
 * This never returns NULL; if the string is empty, this returns a
 * pointer to a static null byte.
 * @returns const pointer to C-style string.
inline const char *c_str() const

 * Returns a non-const raw pointer that allows modifying the string directly.
 * As opposed to c_str() and raw(), this DOES return NULL for an empty string
 * because we cannot return a non-const pointer to a static "" global.
 * -# Be sure not to modify data beyond the allocated memory! Call
 *    capacity() to find out how large that buffer is.
 * -# After any operation that modifies the length of the string,
 *    you _must_ call RTCString::jolt(), or subsequent copy operations
 *    may go nowhere. Better not use mutableRaw() at all.

 * Clean up after using mutableRaw.
 * Intended to be called after something has messed with the internal string
 * buffer (e.g. after using mutableRaw() or Utf8Str::asOutParam()). Resets the
 * internal lengths correctly. Otherwise subsequent copy operations may go
 * nowhere.

 * Returns @c true if the member string has no length.
 * This is @c true for instances created from both NULL and "" input
 * This states nothing about how much memory might be allocated.
 * @returns @c true if empty, @c false if not.

 * Returns @c false if the member string has no length.
 * This is @c false for instances created from both NULL and "" input
 * This states nothing about how much memory might be allocated.
 * @returns @c false if empty, @c true if not.

/** Case sensitivity selector. */

 * Compares the member string to a C-string.
 * @param pcszThat The string to compare with.
 * @param cs Whether comparison should be case-sensitive.
 * @returns 0 if equal, negative if this is smaller than @a pcszThat, positive
 *          if larger.
 /* This kludge is for m_cch=0 and m_psz=NULL. pcsz=NULL and psz=""
    are treated the same way so that str.compare(str2.c_str()) works. */

 * Compares the member string to another RTCString.
 * @param pcszThat The string to compare with.
 * @param cs Whether comparison should be case-sensitive.
 * @returns 0 if equal, negative if this is smaller than @a pcszThat, positive
 *          if larger.

 * Compares the two strings.
 * @returns true if equal, false if not.
 * @param that The string to compare with.
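The comparison comments state that a NULL pointer and "" are treated the same way. A minimal standalone sketch of that rule follows; compareNullSafe and equalsNullSafe are hypothetical names chosen for illustration, and the sketch uses plain strcmp() rather than the IPRT comparison routines, so it covers only the case-sensitive path.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper (not IPRT code): strcmp-style comparison in which a
// NULL pointer is normalized to the empty string before comparing.
int compareNullSafe(const char *psz1, const char *psz2)
{
    if (!psz1)
        psz1 = "";
    if (!psz2)
        psz2 = "";
    return std::strcmp(psz1, psz2);
}

// Equality built on top of the comparison above.
bool equalsNullSafe(const char *psz1, const char *psz2)
{
    return compareNullSafe(psz1, psz2) == 0;
}
```

With this normalization, comparing NULL against "" yields 0, and any non-empty string compares greater than NULL.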
 * Compares the two strings.
 * @returns true if equal, false if not.
 * @param pszThat The string to compare with.
 /* This kludge is for m_cch=0 and m_psz=NULL. pcsz=NULL and psz=""
    are treated the same way so that str.equals(str2.c_str()) works. */

 * Compares the two strings ignoring differences in case.
 * @returns true if equal, false if not.
 * @param that The string to compare with.
 /* Unfolded upper and lower case characters may require different amounts
    of encoding space, so the length optimization doesn't work. */

 * Compares the two strings ignoring differences in case.
 * @returns true if equal, false if not.
 * @param pszThat The string to compare with.
 /* This kludge is for m_cch=0 and m_psz=NULL. pcsz=NULL and psz=""
    are treated the same way so that str.equalsIgnoreCase(str2.c_str()) works. */

/** @name Comparison operators.

/** Max string offset value.
 * When returned by a method, this indicates failure. When taken as input,
 * typically a default, it means all the way to the string terminator.

 * Find the given substring.
 * Looks for pcszFind in "this" starting at "pos" and returns its position
 * as a byte (not codepoint) offset, counting from the beginning of "this" at 0.
 * @param pcszFind The substring to find.
 * @param pos The (byte) offset into the string buffer to start
 * @returns 0 based position of pcszFind. npos if not found.

 * Replaces all occurrences of chFind with chReplace in the member string.
 * In order not to produce invalid UTF-8, the characters must be ASCII
 * values less than 128; this is not verified.
 * @param chFind Character to replace. Must be ASCII < 128.
 * @param chReplace Character to replace chFind with. Must be ASCII < 128.

 * Count the occurrences of the specified character in the string.
 * @param ch What to search for. Must be ASCII < 128.
 * @remarks QString::count

 * Count the occurrences of the specified sub-string in the string.
 * @param psz What to search for.
 * @param cs Case sensitivity selector.
 * @remarks QString::count

 * Count the occurrences of the specified sub-string in the string.
 * @param pStr What to search for.
 * @param cs Case sensitivity selector.
 * @remarks QString::count

 * Returns a substring of "this" as a new Utf8Str.
 * Works exactly like its equivalent in std::string. With the default
 * parameters "0" and "npos", this always copies the entire string. The
 * "pos" and "n" arguments represent bytes; it is the caller's responsibility
 * to ensure that the offsets do not copy invalid UTF-8 sequences. When
 * used in conjunction with find() and length(), this will work.
 * @param pos Index of first byte offset to copy from "this", counting from 0.
 * @param n Number of bytes to copy, starting with the one at "pos".
 *          The copying will stop if the null terminator is encountered before
 *          n bytes have been copied.

 * Returns a substring of "this" as a new Utf8Str. As opposed to substr(),
 * this variant takes codepoint offsets instead of byte offsets.
 * @param pos Index of first Unicode codepoint to copy from
 *            "this", counting from 0.
 * @param n Number of Unicode codepoints to copy, starting with
 *          the one at "pos". The copying will stop if the null
 *          terminator is encountered before n codepoints have
 *          been copied.

 * Returns true if "this" ends with "that".
 * @param that Suffix to test for.
 * @param cs Case sensitivity selector.
 * @returns true if match, false if mismatch.

 * Returns true if "this" begins with "that".
 * @param that Prefix to test for.
 * @param cs Case sensitivity selector.
 * @returns true if match, false if mismatch.

 * Returns true if "this" contains "that" (strstr).
 * @param that Substring to look for.
 * @param cs Case sensitivity selector.
 * @returns true if match, false if mismatch.

 * Attempts to convert the member string into a 32-bit integer.
 * @returns 32-bit signed number on success.

 * Attempts to convert the member string into an unsigned 32-bit integer.
 * @returns 32-bit unsigned number on success.
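The byte-offset versus codepoint-offset distinction described for substr() and substrCP() can be sketched on top of std::string. This is an illustrative stand-in under the same valid-UTF-8 assumption, not the RTCString code; substrCodepoints, skipCodepoints and isContinuation are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: true if b is a UTF-8 continuation byte (10xxxxxx).
static bool isContinuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

// Hypothetical helper: starting at byte offset off, steps over up to
// cCodepoints codepoints and returns the resulting byte offset. Stops at the
// end of the string. Input is ASSUMED to be valid UTF-8.
static std::size_t skipCodepoints(const std::string &s, std::size_t off, std::size_t cCodepoints)
{
    while (off < s.size() && cCodepoints > 0)
    {
        ++off; // lead byte of the current codepoint
        while (off < s.size() && isContinuation(static_cast<unsigned char>(s[off])))
            ++off; // its continuation bytes, if any
        --cCodepoints;
    }
    return off;
}

// Codepoint-based substring: pos and n count codepoints, not bytes.
std::string substrCodepoints(const std::string &s, std::size_t pos,
                             std::size_t n = std::string::npos)
{
    const std::size_t offStart = skipCodepoints(s, 0, pos);
    const std::size_t offEnd   = skipCodepoints(s, offStart, n);
    assert(offStart <= offEnd && offEnd <= s.size());
    return s.substr(offStart, offEnd - offStart); // byte-based copy
}
```

The codepoint offsets are translated to byte offsets first and the copy itself is byte-based, so a multi-byte sequence is never cut in half. A byte-based substr() offers no such protection, which is why the comments leave it to the caller to pass offsets that fall on codepoint boundaries, for example offsets obtained from find().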
 * Attempts to convert the member string into a 64-bit integer.
 * @returns 64-bit signed number on success.

 * Attempts to convert the member string into an unsigned 64-bit integer.
 * @returns 64-bit unsigned number on success.

 * Attempts to convert the member string into an unsigned 64-bit integer.
 * @param i Where to return the value on success.
 * @returns IPRT error code, see RTStrToInt64.

 * Attempts to convert the member string into an unsigned 32-bit integer.
 * @param i Where to return the value on success.
 * @returns IPRT error code, see RTStrToInt32.

/** Splitting behavior regarding empty sections in the string. */
KeepEmptyParts, /**< Empty parts are added as empty strings to the result list. */

 * Splits a string separated by a_rstrSep into its parts.
 * @param a_rstrSep The separator to search for.
 * @param a_enmMode How should empty parts be handled.
 * @returns separated strings as string list.

 * Joins a list of strings together using the provided separator.
 * @param a_rList The list to join.
 * @param a_rstrSep The separator used for joining.
 * @returns joined string.

 * Hide operator bool() to force people to use isEmpty() explicitly.

 * Destructor implementation, also used to clean up in operator=() before
 * assigning a new string.

 * Protected internal helper to copy a string.
 * This ignores the previous object state, so either call this from a
 * constructor or call cleanup() first. copyFromN() unconditionally sets
 * the members to a copy of the given other string and makes no
 * assumptions about previous contents. Can therefore be used both in copy
 * constructors, when member variables have no defined value, and in
 * assignments after having called cleanup().
 * @param pcszSrc The source string.
 * @param cchSrc The number of chars (bytes) to copy from the
 *               source string. RTSTR_MAX is NOT accepted.
 * @throws std::bad_alloc On allocation failure. The object is left
 *         describing a NULL string.

char *m_psz; /**< The string buffer. */
size_t m_cch; /**< strlen(m_psz) - i.e. no terminator included. */

/** @addtogroup grp_rt_cpp_string

 * Concatenate two strings.
 * @param a_rstr1 String one.
 * @param a_rstr2 String two.
 * @returns the concatenated string.

 * Concatenate two strings.
 * @param a_rstr1 String one.
 * @param a_psz2 String two.
 * @returns the concatenated string.

 * Concatenate two strings.
 * @param a_psz1 String one.
 * @param a_rstr2 String two.
 * @returns the concatenated string.

 * Class with RTCString::printf as constructor for your convenience.
 * Constructing an RTCString object from a format string and a variable
 * number of arguments can easily be confused with the other RTCString
 * constructors, thus this child class.
 * The usage of this class is like the following:
 *     RTCStringFmt strName("program name = %s", argv[0]);

 * Constructs a new string given the format string and the list of the
 * arguments for the format string.
 * @param a_pszFormat Pointer to the format string (UTF-8),
 * @param ... Ellipsis containing the arguments specified by
 *            the format string.