CIS 150 Strings and characters

Objectives

  • Differentiate between the string and char data types in C++
  • Differentiate between C-style strings and C++ string objects
  • Use character handling functions
  • Use C++ string handling functions

The char data type

  • The char data type in C++ is used to store single characters.
  • A space, a digit, and a letter are each considered a single character.
  • Some characters require a special two-character representation. This is called "escaping." To escape a character, you put a backslash in front of it.
  • Common escape sequences are: \n for a newline, \r for a carriage return, \t for a tab, \b for a backspace, \" for a double quote, \' for a single quote, \0 for the null character, and \\ for a backslash.
  • The most common mistake beginners make is using the wrong slash to escape a character. /n is two characters, while \n is one character (a newline).
  • char literals use the apostrophe (single quote) as a delimiter. Example: 'a', '?', '8'
  • The char data type is an integer data type internally. So if you add 1 to 'a', you get the value that represents 'b'.
  • Technically, in the ASCII encoding used on nearly all systems, 'A' is stored as 65 and 'a' is stored as 97.
  • The char data type is one of the built-in data types in C++.

Character functions

  • See www.cplusplus.com/reference/cctype/ for details.
  • The cctype library provides many useful character functions. Use: #include <cctype>
  • Note: With most compilers, no using statement is needed to use the cctype library functions.
  • Assuming the variable ch has been declared as a char data type:
    • isupper(ch) returns an int other than 0 (true) if ch is an uppercase letter, otherwise it returns 0 (false)
    • islower(ch) returns an int other than 0 (true) if ch is a lowercase letter, otherwise it returns 0 (false)
    • isalpha(ch) returns an int other than 0 (true) if ch is an alphabetic character, otherwise it returns 0 (false)
    • isdigit(ch) returns an int other than 0 (true) if ch is a digit (0-9), otherwise it returns 0 (false)
    • isxdigit(ch) returns an int other than 0 (true) if ch is a hexadecimal digit, otherwise it returns 0 (false)
    • isprint(ch) returns an int other than 0 (true) if ch is a printable character, otherwise it returns 0 (false)
    • isalnum(ch) returns an int other than 0 (true) if ch is an alphanumeric character, otherwise it returns 0 (false)
    • isspace(ch) returns an int other than 0 (true) if ch is a whitespace (tab, space, etc.) character, otherwise it returns 0 (false)
  • There are also a couple of very useful character conversion functions.
  • Assuming the variable ch has been declared as a char data type:
    • tolower(ch) returns the lowercase of ch, or just ch if it was not an uppercase letter
    • toupper(ch) returns the uppercase of ch, or just ch if it was not a lowercase letter

C-style strings

  • A C-style string is an array of characters with a null character ('\0') at the end of the string.
  • C-style strings are built into the language, since arrays and chars are both part of the language.
  • You will usually only see C-style strings as string literals in this course.
  • String literals use the double quote as a delimiter. Example: "This is a C-style string"
  • If you want access to the C-style string library, use: #include <cstring>
  • No "using" statement is needed for using C-style string functions
  • The main function you are likely to want is strlen(str). It returns the number of characters in a C-style string. Example: cout << strlen("CIS 150") << endl; // will display the number 7
  • Note: In the example above, the C-style string literal could have been a C-style string variable.
  • You can access specific characters of a C-style string. For example, to display the third character of a string variable called name, use: cout << name[2];
  • Note that the positions in a string are 0-based, so the first character is at position 0, the second at position 1, etc.
  • You can also set specific positions of a C-style string. For example, to set the fourth character of a string variable called name to 'x', use: name[3] = 'x';
  • Commonly used C-style string functions (note that many of these functions may have security restrictions on some compilers):
    • Returns the number of characters in str up to, but not including the terminating null: strlen(str)
    • Copy characters from source_str to dest_str: strcpy(dest_str, source_str)
    • Copy up to numChars characters from source_str to dest_str: strncpy(dest_str, source_str, numChars)
    • Copy and append characters from source_str to the end of dest_str: strcat(dest_str, source_str)
    • Copy and append up to numChars characters from source_str to the end of dest_str: strncat(dest_str, source_str, numChars)
    • Returns pointer to first occurrence of ch in str, or null pointer if not found: strchr(str, ch)
    • Returns pointer to last occurrence of ch in str, or null pointer if not found: strrchr(str, ch)
    • Returns pointer to first occurrence of str2 in str1, or null pointer if not found: strstr(str1, str2)
    • Compares str1 and str2, returns < 0 if str1 < str2, > 0 if str1 > str2, 0 if str1 == str2: strcmp(str1, str2)
    • Compares up to numChars characters of str1 and str2, returns < 0 if str1 < str2, > 0 if str1 > str2, 0 if str1 == str2: strncmp(str1, str2, numChars)
    • Used to break a string into tokens given a list of delimiters (beyond scope of this course): strtok(str, str_delimiters)

C++ string objects

  • C++ string objects are NOT a primitive, built-in data type. They are C++ objects.
  • Before main(), include the string library: #include <string>
  • After the include, but before main(): using namespace std;
  • To be more specific, you could use: using std::string;
  • String variables are declared just like any other variable: string name;
  • You can initialize string variables when they are declared: string name = "Bob";
  • Note: In the example above, the string object was initialized using a C-style string literal.

Using C++ string objects

  • You can print string objects using cout: cout << name;
  • You can get string objects from the keyboard using cin: cin >> name;
  • Note: The above example will not be able to get names which contain a space or a tab.
  • You can get string objects which contain whitespace from the keyboard using getline: getline(cin, name);
  • There are many commonly used string functions. Assuming that str is a string object variable:
    • To get the length of the string: str.size()
    • To get the length of the string (same result as str.size()): str.length()
    • To clear all the characters from the string: str.clear()
    • To display the fourth character from the string: cout << str[3];
    • To set the second character of the string to '?': str[1] = '?';
    • To get a C-style string from a C++ string object: str.c_str()