Commonly Used Long String Routines in Delphi

The long string handling routines cover several functional areas. Within an area, several routines may serve the same purpose and differ only in whether they apply a particular criterion, such as case sensitivity or locale settings. The following tables list the routines by functional area:

  • Comparison
  • Case conversion
  • Modification
  • Sub-string

Where appropriate, the tables also provide columns indicating whether a routine satisfies the following criteria.

  • Uses case sensitivity: If locale settings are used, they determine the definition of case. If the routine does not use locale settings, comparisons are based on the ordinal values of the characters. If the routine is case-insensitive, uppercase and lowercase characters are logically merged according to a predefined pattern.
  • Uses locale settings: Locale settings allow you to customize your application for specific locales, in particular for Asian language environments. Most locale settings consider lowercase characters to be less than the corresponding uppercase characters. This is in contrast to ASCII order, in which lowercase characters are greater than uppercase characters. Routines that use the system locale are typically prefixed with Ansi (that is, AnsiXXX).

  • Supports the multibyte character set (MBCS): MBCSs are used when writing code for Far Eastern locales. Multibyte characters are represented by one or more character codes, so the length in bytes does not necessarily correspond to the length of the string in characters.

    The routines that support MBCS parse both single-byte and multibyte characters. ByteType and StrByteType determine whether a particular byte is the lead byte of a multibyte character. When using multibyte characters, be careful not to truncate a string by cutting a character in half.

    Do not pass characters as a parameter to a function or procedure, since the size of a character cannot be predetermined. Pass, instead, a pointer to a character or string.
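The case-sensitivity and locale criteria above can be seen by calling four SysUtils comparison routines on the same pair of strings. The following is a minimal sketch, not a definitive reference: the program name is made up for illustration, and the sign returned by AnsiCompareStr depends on the locale in effect, so the program prints it rather than assuming it.

```pascal
program CompareDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses
  SysUtils;
begin
  // CompareStr: case-sensitive, no locale. Ordinal order puts 'a' (97) after 'A' (65).
  Writeln('CompareStr(''a'', ''A'') > 0      : ', CompareStr('a', 'A') > 0);

  // CompareText: case-insensitive, no locale. The two strings compare equal.
  Writeln('CompareText(''a'', ''A'') = 0     : ', CompareText('a', 'A') = 0);

  // AnsiCompareStr: case-sensitive, uses the locale. Many locales sort 'a'
  // before 'A', but the result is locale-dependent, so it is only printed.
  Writeln('AnsiCompareStr(''a'', ''A'')      : ', AnsiCompareStr('a', 'A'));

  // AnsiCompareText: case-insensitive, uses the locale.
  Writeln('AnsiCompareText(''a'', ''A'') = 0 : ', AnsiCompareText('a', 'A') = 0);
end.
```

Choosing between the Ansi and non-Ansi variants is mainly a question of whether results should follow the user's locale or stay identical on every machine.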

When you declare a long string:

S: string;

you do not need to initialize it. Long strings are automatically initialized to empty. To test whether a string is empty, you can either compare it with the EmptyStr variable:

S = EmptyStr;

or test against an empty string literal:

S = '';
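As a small illustration of the automatic initialization and the two tests, the sketch below declares a local long string, never assigns it, and checks it three ways. The program and procedure names are hypothetical, and the Length check is an additional equivalent test rather than something the text above prescribes.

```pascal
program EmptyStringDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses
  SysUtils; // declares EmptyStr

procedure Report;
var
  S: string; // never assigned: long strings start out empty
begin
  if S = EmptyStr then
    Writeln('S equals EmptyStr');
  if S = '' then
    Writeln('S equals the empty literal');
  if Length(S) = 0 then
    Writeln('Length(S) is 0');
end;

begin
  Report; // all three messages are printed
end.
```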

An empty string has no valid data. Therefore, trying to index an empty string is like trying to access nil and will result in an access violation. Similarly, if you cast an empty string to a PChar, the result is a nil pointer. So, if you are passing such a PChar to a routine that needs to read or write to it, be sure that the routine can handle nil.

If it cannot, then you can either initialize the string:

S := 'No longer nil';
proc(PChar(S)); // proc does not need to handle nil now

or set the length, using the SetLength procedure:

SetLength(S, 100); // sets the dynamic length of S to 100
proc(PChar(S)); // proc does not need to handle nil now

When you use SetLength, existing characters in the string are preserved, but the contents of any newly allocated space are undefined. Following a call to SetLength, S is guaranteed to reference a unique string, that is, a string with a reference count of one.
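The behavior of SetLength can be sketched as follows. The first part grows a string and fills the new, undefined positions before reading them. The second part uses a string as a writable buffer for a PChar routine, then trims it to the text actually written. StrPCopy and StrLen are SysUtils routines; the program name and the buffer size of 100 are arbitrary choices for illustration.

```pascal
program SetLengthDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
uses
  SysUtils;
var
  S, Buf: string;
begin
  S := 'abc';
  SetLength(S, 5);   // 'abc' is preserved; S[4] and S[5] are undefined
  S[4] := 'd';       // assign the new positions before reading them
  S[5] := 'e';
  Writeln(S, ' (', Length(S), ')');       // abcde (5)

  SetLength(Buf, 100);                    // room for up to 100 characters
  StrPCopy(PChar(Buf), 'hello');          // writes 5 characters plus a terminator
  SetLength(Buf, StrLen(PChar(Buf)));     // shrink to the text that was written
  Writeln(Buf, ' (', Length(Buf), ')');   // hello (5)
end.
```

Trimming with a second SetLength call matters because Length reports the allocated length, not the position of the first null character.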

To obtain the length of a string, use the Length function. Remember when declaring a string that:

S: string[n];

implicitly declares a short string, not a long string of length n. To declare a long string of length n, declare a variable of type string and then call the SetLength procedure:

S: string;
SetLength(S, n);

Short, long, and wide strings can be mixed in assignments and expressions, and the compiler automatically generates code to perform the necessary string type conversions. However, when assigning a string value to a short string variable, be aware that the string value is truncated if it is longer than the declared maximum length of the short string variable.
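The truncation on assignment to a short string can be illustrated with a short sketch. It assumes a desktop compiler that supports short strings; the variable names are made up, and a Unicode-enabled Delphi compiler may emit an implicit-conversion warning on the first assignment.

```pascal
program ShortStringDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
var
  Long: string;
  Short: string[5]; // short string holding at most 5 characters
begin
  Long := 'Hello, world';
  Short := Long;    // converted automatically, truncated to 5 characters
  Writeln(Short, ' (', Length(Short), ')');   // Hello (5)

  Long := Short;    // converting back to a long string never truncates
  Writeln(Long, ' (', Length(Long), ')');     // Hello (5)

  Writeln(SizeOf(Short));                     // 6: one length byte plus 5 characters
end.
```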

Long strings are already dynamically allocated. If you use one of the built-in pointer types, such as PAnsiString, PString, or PWideString, remember that you are introducing another level of indirection. Be sure this is what you intend.
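The extra level of indirection can be seen in a minimal sketch such as the following, where P refers to the string variable itself rather than to its characters. The names are hypothetical, and the sketch assumes the default compiler setting in which the @ operator yields an untyped pointer.

```pascal
program PStringDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
var
  S: string;
  P: PString; // pointer to a string variable, which itself refers to heap data
begin
  S := 'first';
  P := @S;          // P points at the variable S, not at the characters
  P^ := 'second';   // assigning through P changes S
  Writeln(S);       // second
  Writeln(P^[1]);   // s (dereference first, then index)
end.
```

In most code a plain string parameter or a var parameter does the same job without the additional pointer.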

Additional functions (CopyQStringListToTStrings, CopyTStringsToQStringList, QStringListToTStringList) are provided for converting between the underlying Qt string types and the CLX string types. These functions are located in QTypes.pas.