
 

Strings



Strings are very common and fundamental in most programming languages. In Python, strings are a built-in type and categorized as an immutable ordered sequence of characters.
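As a small sketch of what "immutable" means in practice: a character cannot be replaced in place, so a modified string has to be built as a new object. The variable names here are illustrative only, and results are stored in variables rather than printed so the snippet runs unchanged under Python 2 or 3.

```python
word = "cat"

try:
    word[0] = "b"          # item assignment is not supported by str
    mutated = True
except TypeError:
    mutated = False

# Build a new string instead; the original is left untouched.
changed = "b" + word[1:]   # 'bat'
```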

Creating Strings

Python strings are objects of the built-in str class. In an earlier chapter you learned that string literals are represented as a literal sequence of characters enclosed within a pair of either single (') or double quotes (").

'string'
"string's"

When a literal string appears in your program, a str object is automatically created. Thus, the statement

name = "John Smith"
 

creates an object storing the given literal string and its reference is assigned to name. You can also create a string using the str() constructor

student = str( 'Jane Green' )
 

The string constructor can also be used to create string representations of numeric and boolean values.

x = 45
intStr = str( x )          # '45'
floatStr = str( 56.89 )    # '56.89'
boolStr = str( False )     # 'False'
 

Likewise, the numeric type constructors can be used to convert numeric strings to the respective type. An exception is raised if the string does not contain a valid literal numeric value.

i = int( "85" )      # 85
f = float( '3.14' )  # 3.14
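The failure case can be sketched as follows. The exception raised by int() and float() for a non-numeric string is ValueError; the helper name to_int_or_none is hypothetical and used only for illustration.

```python
def to_int_or_none(text):
    """Return int(text), or None when text is not a valid integer literal."""
    try:
        return int(text)
    except ValueError:
        return None

good = to_int_or_none("85")    # 85
bad = to_int_or_none("8x5")    # None, because "8x5" is not numeric
ratio = float("3.14")          # 3.14
```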
 

Strings can also be created via various string operations and methods which are presented in later sections.

Escape Sequences

Escape sequences are used in Python as they are in Java to represent or encode special characters such as newlines, tabs and even the quote characters. Consider the following example

msg = "Start a newline here.\nusing the \\n character."

which, when printed, produces

Start a newline here.
using the \n character.

The common escape sequences are shown in the following table

Sequence   Description
\\         Backslash (produces \)
\'         Single quote (produces ')
\"         Double quote (produces ")
\n         Newline (produces a newline)
\t         Horizontal tab
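A brief sketch of how these sequences behave: each escape sequence is stored as a single character, which len() makes visible. The variable names are illustrative only.

```python
tabbed = "a\tb"          # three characters: a, tab, b
quoted = 'It\'s here'    # same string as "It's here"
path = "C:\\temp"        # the stored string holds one backslash

tabbed_len = len(tabbed)   # 3
path_len = len(path)       # 7
```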

Multiline Strings

In addition to literal strings using single or double quotes, Python also has a literal block string in which white space (spaces and newlines) is maintained without the need for newline characters or string concatenation. A pair of triple quotes is used to represent a block string as illustrated in the following example

"""This is a string which
can continue onto a new line. When printed, it will appear
   exactly as written between the triple quotes.
"""

When the string object is created for this literal, Python inserts a newline character (\n) at the end of each line of the text block and keeps blank spaces everywhere they appear within the literal, including at the front of each line. Either single or double quotes can be used with the triple quote representation

'''Here is another
        multiline string example 
     using triple single quotes.'''
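To make the stored contents concrete, the sketch below compares a short block string with the equivalent single-line literal written using explicit \n characters. The names block and expected are illustrative only.

```python
block = """first line
   second line
"""

# The same string written as an ordinary literal.
expected = "first line\n   second line\n"

same = (block == expected)         # True
newline_count = block.count("\n")  # 2, one per line of the block
```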

Basic Operations

Python provides a number of basic operations that can be performed on strings. Some of these are equivalent to operations in Java, but others are advanced features of Python. A complete list of the string methods and operators is given in the appendix.

String Concatenation

In Java, two strings can be concatenated using the plus operator as illustrated below

String strvar = "This is ";
String fullstr = strvar + "a string";
 

The same can be done in Python

strvar = 'This is '
fullstr = strvar + "a string"
 

A major difference, however, is that in Python string concatenation can only be done with strings and not with other data types. To append a numeric value to the end of a string, you must first create a string representation using the str() constructor.

result = "The value of x is " + str( x )
 

Two string literals can also be concatenated by placing them adjacent to each other

print "These two string literals " "will be concatenated."
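The restriction on mixed types can be sketched as follows: adding an integer to a string raises a TypeError, and the explicit str() conversion avoids it. The variable names are illustrative only.

```python
count = 3

try:
    label = "Items: " + count        # str + int is not allowed
except TypeError:
    label = "Items: " + str(count)   # convert explicitly first

# Adjacent literals are joined into one string.
joined = "adjacent " "literals"
```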
 

String Length

In Java, the length of a string was obtained using the length() method

// Java string length
System.out.println("Length of the string = " + name.length());
 

Python provides the built-in len() function that is used to get the length of a string

print "Length of the string = ", len( name )
 

Character Access

To access an individual character within a string, Java provided the charAt() method. The following example extracts and prints the first and last character of a string

// Java character access.
String msg = "This is a string";
System.out.println( "The first character is " +
                    msg.charAt( 0 ) );
System.out.println( "The last character is " +
                    msg.charAt( msg.length() - 1 ) );
 

In Python, we use an array subscript with the first character having an index of zero. The following illustrates an equivalent Python code segment for the Java code above

msg = "This is a string"
print "The first character is", msg[ 0 ]
print "The last character is", msg[ len( msg ) - 1 ]
 

Python allows you to index from the end instead of the front by using negative subscripts. The last statement in the previous code segment could be rewritten as follows

print "The last character is", msg[ -1 ]
 

The following figure illustrates the front and back index references using the name string created in an earlier example

If you attempt to access an element of the string that is outside the range [0..len(s)-1] or [-len(s)..-1], an IndexError exception will be raised.
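A short sketch of the two index ranges, using the name string from the earlier example. The other variable names are illustrative only.

```python
name = "John Smith"

first = name[0]                 # 'J'
last = name[len(name) - 1]      # 'h'
also_last = name[-1]            # 'h'
also_first = name[-len(name)]   # 'J'

try:
    name[len(name)]             # one position past the last character
    out_of_range = False
except IndexError:
    out_of_range = True
```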

Extracting Substrings

In Java, you were able to extract a substring from a string object using the substring() method of the String class

// Java substring extraction.
String name = "John Smith";
String first = name.substring( 0, 4 );  // "John"
String last = name.substring( 5 );      // "Smith"
 

Python provides the slicing operator for extracting a substring. The following example is the Python equivalent of the previous example

name = "John Smith"
first = name[0:4]
last = name[5:]
 

You can also slice a string from the end using a negative index. The following statement

end = name[:-4]
 

extracts the substring “John S” from the string name and creates a new string which is assigned to end.
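The slices discussed above can be collected into one sketch. The last line shows a difference from single-character indexing: a slice bound past the end of the string is clamped rather than raising an exception. The names tail and clipped are illustrative only.

```python
name = "John Smith"

first = name[0:4]       # 'John'
last = name[5:]         # 'Smith'
end = name[:-4]         # 'John S'
tail = name[-5:]        # 'Smith'
clipped = name[5:100]   # 'Smith'; the upper bound is clamped to len(name)
```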

String Duplication

Python provides a string operation not found in Java for duplicating or repeating a string. Printing a dashed line is a common operation in text-based applications. One way to do it is as a string literal

print "---------------------------------------------"
 

or we could do it the easy way and use the repeat operator

print "-" * 45
 

which produces the same results. When applied to a string and an integer value, the * operator is treated as the Python repeat operator.
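A small sketch of the repeat operator, including sizing an underline to match a title. The names rule, title, and underline are illustrative only.

```python
rule = "-" * 45

title = "PAY REPORT"
underline = "=" * len(title)   # as many '=' as the title has characters

empty = "ab" * 0               # repeating zero times gives an empty string
```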

Formatted Strings

Python overloads the binary modulus operator (%) to work with strings. When applied to a string, it creates a new formatted string similar to the printf() method in Java and the sprintf() function in C. Consider the following example

output = "The average grade = %5.2f" % avgGrade
print output
 

which creates a new string using the format definition string from the left side of the % operator and replacing the format specifier (%5.2f) with the value of the avgGrade variable. The resulting string is then printed to standard output. The more common style is to combine these two statements

print "The average grade = %5.2f" % avgGrade
 

which is then equivalent to Java’s printf() method

System.out.printf( "The average grade = %5.2f\n", avgGrade );
 

The formal description for the string format operator is shown below and consists of two parts: the format definition string and the values used to replace the format codes within that definition

   format-definition-string % replacement-value(s)

If more than one format specifier is used in the format definition, the replacement values must be placed within parentheses and separated with a comma

print "Origin: (%d, %d)\n" % (pointX, pointY)
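The single-value and multiple-value forms can be sketched side by side. The formatted results are stored in variables rather than printed so the snippet runs unchanged under Python 2 or 3; the sample values are made up for illustration.

```python
pointX, pointY = 3, -7

# Several specifiers: the values are supplied as a parenthesized tuple.
origin = "Origin: (%d, %d)" % (pointX, pointY)

# One specifier: a bare value is enough.
single = "Average: %5.2f" % 87.456

# Specifiers of different kinds, plus %% for a literal percent sign.
mixed = "%s scored %d%%" % ("Jane", 92)
```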
 

The general structure of a format specifier is

%[flags][width][.precision]code

where

flags
Optional characters that modify how the value is placed within the field: 0 fills the leading blank spaces of the field with zeros, - left-justifies the value within the field width (right-justification is the default), and + forces a sign to be shown on numeric values.
width
An integer value indicating the minimum number of character positions in the field used when formatting the replacement value.
precision
The number of digits to be printed after the decimal point when printing a real value.
code
One of the format specifier codes, most of which are the same as those found in Java
Code   Description
%s     String (or any object)
%c     Character (from an ASCII value)
%d     Decimal or integer value
%i     Integer value (same as %d)
%u     Unsigned integer
%o     Octal integer
%x     Hexadecimal integer
%X     Same as %x but uppercase
%e     Floating-point with exponent
%E     Same as %e but uppercase
%f     Floating-point no exponent
%g     Same as %e or %f
%G     Same as %g but uppercase
%%     Prints a literal %
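The effect of the flags, width, and precision fields can be sketched as follows. Square brackets are included in each format string only to make the field boundaries visible; the variable names are illustrative.

```python
right = "[%8.2f]" % 3.14159       # '[    3.14]'  default right-justification
left = "[%-8.2f]" % 3.14159       # '[3.14    ]'  - left-justifies
zeros = "[%08.2f]" % 3.14159      # '[00003.14]'  0 fills with zeros
signed = "[%+d]" % 42             # '[+42]'       + forces a sign
hexval = "%x / %X" % (255, 255)   # 'ff / FF'
```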

The following sample program, which is a modified version of the wages.py program, illustrates the use of formatted strings.

Program: wagesfmt.py
# wagesfmt.py
# Computes the taxes and wages for an employee given the
# number of hours worked and their pay rate. The results
# are printed using formatted strings.

# Set tax rates as constants.
STATE_TAX_RATE = 0.035
FED_TAX_RATE = 0.15

# Extract data from the user.
employee = raw_input( "Employee name: " )
hours = float( raw_input( "Hours worked: " ) )
payRate = float( raw_input( "Pay rate: " ) )

# Compute the employee's taxes and wages.
wages = hours * payRate
stateTaxes = wages * STATE_TAX_RATE
fedTaxes = wages * FED_TAX_RATE
takeHome = wages - stateTaxes - fedTaxes

# Print the results.
print "PAY REPORT"
print "Employee: %s" % employee
print "----------------------------------"
print "Wages:       %8.2f" % wages
print "State Taxes: %8.2f" % stateTaxes
print "Fed Taxes:   %8.2f" % fedTaxes
print "Pay:         %8.2f" % takeHome
 




© 2006 - 2008: Rance Necaise - Page last modified on July 31, 2008, at 10:07 AM