CIS 160 - Text processing

Objectives

  • Convert String objects to numeric values
  • Convert numeric values to String objects
  • Tokenize strings using StringTokenizer
  • Tokenize strings using String's "split" method

Convert a String to a number

  • Input in Java often arrives as a String even when a number is wanted, so you need to be able to convert a String into a number.
  • Each numeric primitive type has a wrapper class with a static parse method that converts a String to that type.
  • For the following examples, assume the String str represents an appropriate numeric value.
  • byte n = Byte.parseByte(str); // converts str to a byte
  • short n = Short.parseShort(str); // converts str to a short
  • int n = Integer.parseInt(str); // converts str to an int
  • long n = Long.parseLong(str); // converts str to a long
  • float n = Float.parseFloat(str); // converts str to a float
  • double n = Double.parseDouble(str); // converts str to a double
  • All of the conversions may throw a NumberFormatException if the String is not valid for the type of number requested. A try/catch block can be used to catch that exception.
  • The integer conversions (byte, short, int, long) also accept the number base (radix) to be used in the conversion as an optional second argument.
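
The radix argument and the exception handling described above can be sketched as follows. The class name ParseDemo and the helper parseOrDefault are hypothetical names chosen for illustration; only the parse methods themselves come from the Java library.

```java
class ParseDemo {
    // Hypothetical helper: returns fallback when str is not a valid int in the given base.
    static int parseOrDefault(String str, int radix, int fallback) {
        try {
            return Integer.parseInt(str, radix);
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(Integer.parseInt("ff", 16));      // hexadecimal: prints 255
        System.out.println(Integer.parseInt("1010", 2));     // binary: prints 10
        System.out.println(Long.parseLong("777", 8));        // octal: prints 511
        System.out.println(parseOrDefault("12abc", 10, -1)); // not a number: prints -1
    }
}
```

Returning a fallback value is only one possible design; a program that must distinguish bad input from a legitimate -1 would let the exception propagate or report the error instead.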

Example of input validation for integers

The following example will catch errors such as the user typing in their name instead of a number.

    int value = 0;
    boolean valid = false;
    String strIn;
    java.util.Scanner kbd = new java.util.Scanner(System.in);
    while (!valid) {
        System.out.print("Enter an integer: ");
        strIn = kbd.nextLine();
        try {
            value = Integer.parseInt(strIn);
            // range checking code could go here
            valid = true;
        } catch (NumberFormatException e) {
            System.out.println("Error: Invalid integer");
        }
    }
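
The range check mentioned in the comment above might look something like this minimal sketch. The helper name isValidInRange and the bounds 1 to 100 are assumptions made for illustration, and trimming the input before parsing is a design choice, not a requirement.

```java
class RangeCheckDemo {
    // Hypothetical helper: true only if str parses as an int within [min, max].
    static boolean isValidInRange(String str, int min, int max) {
        try {
            // parseInt rejects surrounding spaces, so trim them first (assumed to be desirable)
            int value = Integer.parseInt(str.trim());
            return value >= min && value <= max;
        } catch (NumberFormatException e) {
            return false; // not an integer at all
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidInRange("42", 1, 100));    // prints true
        System.out.println(isValidInRange("250", 1, 100));   // prints false (out of range)
        System.out.println(isValidInRange("Smith", 1, 100)); // prints false (not a number)
    }
}
```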

Converting a number to a String

  • There are many ways to convert a number into a String.
  • You can concatenate a number to a String:
    String s = "" + 6.3; // results in the String: "6.3"
  • The String class has several conversion methods for converting a number to a String:
  • String s = String.valueOf(13); // converts an int to a String
  • String s = String.valueOf(13L); // converts a long to a String
  • String s = String.valueOf(13.2); // converts a double to a String
  • String s = String.valueOf(13.2f); // converts a float to a String
  • String.format can be used to do the conversion and apply special formatting at the same time:
    String s = String.format("%07.3f", 3.14159); // results in s being "003.142"
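
The conversions above can be tried side by side in a short sketch. The class name ToStringDemo and the helper fixed are hypothetical. The sketch passes java.util.Locale.US to String.format so that the decimal separator is a period regardless of the machine's default locale, which is an assumption about the desired output.

```java
class ToStringDemo {
    // Hypothetical helper: formats value with the given number of decimal places.
    static String fixed(double value, int decimals) {
        return String.format(java.util.Locale.US, "%." + decimals + "f", value);
    }

    public static void main(String[] args) {
        String a = "" + 6.3;                   // concatenation
        String b = String.valueOf(13);         // String class conversion method
        String c = Integer.toString(255, 16);  // wrapper class method with a number base
        String d = String.format(java.util.Locale.US, "%07.3f", 3.14159);

        System.out.println(a);                 // prints 6.3
        System.out.println(b);                 // prints 13
        System.out.println(c);                 // prints ff
        System.out.println(d);                 // prints 003.142
        System.out.println(fixed(3.14159, 2)); // prints 3.14
    }
}
```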

Using StringTokenizer

  • Strings can be broken into tokens using StringTokenizer. You supply the String to be tokenized and a list of delimiters, which are symbols that specify the breaks between tokens.
  • One common file format to tokenize is CSV (comma-separated values), where commas are the delimiters between the tokens.
  • Example of a CSV String with three tokens: Smith,1.40,-98
  • A StringTokenizer can tell you quickly how many tokens are present in a String.
  • A StringTokenizer can also let you get each token one-at-a-time.
  • The delimiters you specify in a StringTokenizer are limited to single characters, but if multiple delimiters occur with no tokens in between, they are treated as a single delimiter.
  • The .countTokens() method returns how many tokens remain to be read. Called before any call to .nextToken(), that is the total number of tokens in the String.
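
As a rough sketch of these points, the CSV String above can be tokenized and its numeric tokens converted. The helper tokens and the variable names rate and score are hypothetical; the notes do not say what the second and third fields mean.

```java
class TokenizerDemo {
    // Hypothetical helper: collects all tokens of line into an array.
    static String[] tokens(String line, String delims) {
        java.util.StringTokenizer tok = new java.util.StringTokenizer(line, delims);
        String[] result = new String[tok.countTokens()]; // count taken before reading any token
        for (int i = 0; i < result.length; i++) {
            result[i] = tok.nextToken();
        }
        return result;
    }

    public static void main(String[] args) {
        String[] fields = tokens("Smith,1.40,-98", ",");
        String name = fields[0];
        double rate = Double.parseDouble(fields[1]); // assumed meaning of the second field
        int score = Integer.parseInt(fields[2]);     // assumed meaning of the third field
        System.out.println(name + " " + rate + " " + score); // prints Smith 1.4 -98

        // Adjacent delimiters are treated as one, so the empty middle field disappears.
        System.out.println(tokens("Smith,,-98", ",").length); // prints 2
    }
}
```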

StringTokenizer example

    String strIn = "This is a test, albeit a short one.";
    java.util.StringTokenizer tok = new java.util.StringTokenizer(strIn, " .,");
    System.out.printf("%d tokens were found%n", tok.countTokens());
    while (tok.hasMoreTokens()) {
        String token = tok.nextToken();
        System.out.println(token);
    }
    // Displays:
    // 8 tokens were found
    // This
    // is
    // a
    // test
    // albeit
    // a
    // short
    // one

Using String.split(String regex)

  • The split method provides another way to tokenize strings.
  • The split method expects a regular expression as an argument. The regular expression specifies the delimiters. Since it is a regular expression, the delimiter specification can get quite sophisticated (and unreadable).
  • Since a regular expression is used, the delimiters are not limited to single characters.
  • The split method returns an array of Strings as its result.
  • You can quickly determine how many tokens were found by checking the length of the String array returned.
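
A few regular-expression delimiters, sketched under the assumption that punctuation and surrounding spaces should not be part of the tokens. The class name SplitDemo and the helpers splitWords and splitCsv are hypothetical.

```java
class SplitDemo {
    // One or more spaces, periods, or commas in a row act as a single delimiter.
    static String[] splitWords(String text) {
        return text.split("[ .,]+");
    }

    // A comma with optional whitespace on either side.
    static String[] splitCsv(String line) {
        return line.split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        String[] words = splitWords("This is a test, albeit a short one.");
        System.out.println(words.length);  // prints 8

        String[] fields = splitCsv("Smith , 1.40,  -98");
        System.out.println(fields.length); // prints 3
        System.out.println(fields[1]);     // prints 1.40

        // A delimiter longer than one character
        String[] parts = "a::b::c".split("::");
        System.out.println(parts.length);  // prints 3
    }
}
```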

String.split() example

    String strIn = "Kishwaukee College,CIS 160,A-1374,12";
    String[] fields = strIn.split(",");
    System.out.printf("%d tokens were found%n", fields.length);
    for (String s : fields) {
        System.out.println(s);
    }
    // Displays:
    // 4 tokens were found
    // Kishwaukee College
    // CIS 160
    // A-1374
    // 12
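
One practical difference between the two approaches shows up when a field is empty. The sketch below compares them on a made-up line; the class and method names are hypothetical. StringTokenizer treats adjacent delimiters as one, while split keeps empty fields in the middle and, by default, drops empty fields at the end. Passing a negative limit as the second argument to split keeps the trailing ones as well.

```java
class EmptyFieldDemo {
    static int countWithTokenizer(String line) {
        return new java.util.StringTokenizer(line, ",").countTokens();
    }

    static int countWithSplit(String line) {
        return line.split(",").length;
    }

    static int countWithSplitKeepTrailing(String line) {
        return line.split(",", -1).length; // negative limit keeps trailing empty strings
    }

    public static void main(String[] args) {
        String line = "Smith,,-98,"; // empty second field and empty last field
        System.out.println(countWithTokenizer(line));         // prints 2
        System.out.println(countWithSplit(line));             // prints 3
        System.out.println(countWithSplitKeepTrailing(line)); // prints 4
    }
}
```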