Python Tutorial (11) - Strings

Strings in Python

Strings are one of the most commonly used data types in Python. We can create strings using either single quotes (') or double quotes (").

Creating a string is simple; just assign a value to a variable. For example:

var1 = 'Hello World!'
var2 = "Runoob"

Accessing String Values in Python

Python does not have a distinct type for single characters; a single character is treated as a string in Python.

To access substrings, you can use square brackets ([]) to slice the string. The syntax for string slicing is as follows:

variable[start_index:end_index]

Indexes start at 0. Negative indexes count from the end of the string, so -1 refers to the last character. A slice includes the character at start_index and stops just before end_index.

Example (Python 3.0+)

#!/usr/bin/python3
 
var1 = 'Hello World!'
var2 = "Runoob"
 
print("var1[0]: ", var1[0])
print("var2[1:5]: ", var2[1:5])

Output:

var1[0]:  H
var2[1:5]:  unoo
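
Negative indexes and omitted slice bounds follow the same rules. The sketch below reuses var1 and var2 from the example above; the other variable names are illustrative only.

```python
var1 = 'Hello World!'
var2 = "Runoob"

last_char = var1[-1]      # '!'     (-1 is the last character)
tail = var2[-3:]          # 'oob'   (the last three characters)
head = var1[:5]           # 'Hello' (an omitted start defaults to 0)
every_other = var2[::2]   # 'Rno'   (a third value sets the step)

print(last_char, tail, head, every_other)
```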


Updating Strings in Python

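Strings are immutable, so an existing string cannot be changed in place. To "update" one, slice the part you want to keep and concatenate it with another string, which produces a new string, as shown in the following example: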

Example (Python 3.0+)

#!/usr/bin/python3
 
var1 = 'Hello World!'
 
print("Updated String: ", var1[:6] + 'Runoob!')

Output:

Updated String:  Hello Runoob!
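
Because strings cannot be modified in place, assigning to an index fails. The following is a minimal sketch of that behavior and of the usual workaround; error_message is an illustrative name.

```python
var1 = 'Hello World!'

try:
    var1[0] = 'J'              # strings do not support item assignment
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# Build a new string instead and rebind the name to it.
var1 = 'J' + var1[1:]
print(var1)                    # Jello World!
```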


Python Escape Characters

When special characters are needed within strings, Python uses a backslash (\) to introduce escape characters. Here’s a table of common escape sequences:

| Escape Character | Description | Example |
| --- | --- | --- |
| \ (at line end) | Line continuation character | "line1 \<line break>line2 \<line break>line3" prints line1 line2 line3 |
| \\ | Backslash | print("\\") results in \ |
| \' | Single quote | print('\'') results in ' |
| \" | Double quote | print("\"") results in " |
| \a | Bell sound | print("\a") triggers a beep sound |
| \b | Backspace | print("Hello \b World!") |
| \000 | Null character | print("\000") |
| \n | Newline | print("\n") |
| \v | Vertical tab | print("Hello \v World!") |
| \t | Horizontal tab | print("Hello \t World!") |
| \r | Carriage return | print("Hello\rWorld!") |
| \f | Form feed | print("Hello \f World!") |
| \ooo | Octal value | print("\110\145\154\154\157") |
| \xhh | Hexadecimal value | print("\x48\x65\x6c\x6c\x6f") |
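
The octal and hexadecimal examples in the table encode the same five characters. A short sketch to confirm this (the variable names are illustrative):

```python
octal_text = "\110\145\154\154\157"    # octal escapes
hex_text = "\x48\x65\x6c\x6c\x6f"      # hexadecimal escapes

print(octal_text)                      # Hello
print(octal_text == hex_text)          # True
```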

Example: Using \r to Create a Progress Bar

import time

for i in range(101):
    print("\r{:3}%".format(i), end=' ', flush=True)  # flush so each update is shown immediately
    time.sleep(0.05)

The following example demonstrates several escape characters (single quotes, newlines, tabs, backspaces, and form feeds), together with ord() and a hexadecimal escape:

Example

print('\'Hello, world!\'')  # Output: 'Hello, world!'

print("Hello, world!\nHow are you?")  # Output: Hello, world!
                                      #         How are you?

print("Hello, world!\tHow are you?")  # Output: Hello, world!    How are you?

print("Hello,\b world!")  # Output on most terminals: Hello world!

print("Hello,\f world!")  # Output (depends on the terminal):
                          # Hello,
                          #  world!

print("ASCII value of 'A' is:", ord('A'))  # Output: ASCII value of 'A' is: 65

print("\x41 is the ASCII code for A")  # Output: A is the ASCII code for A

You can also convert decimal numbers to binary, octal, and hexadecimal formats:

decimal_number = 42
binary_number = bin(decimal_number)  # Decimal to binary
print('Binary conversion:', binary_number)  # Output: Binary conversion: 0b101010

octal_number = oct(decimal_number)  # Decimal to octal
print('Octal conversion:', octal_number)  # Output: Octal conversion: 0o52

hexadecimal_number = hex(decimal_number)  # Decimal to hexadecimal
print('Hexadecimal conversion:', hexadecimal_number)  # Output: Hexadecimal conversion: 0x2a
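
The conversions also work in the other direction. As a sketch, int() with an explicit base parses these strings back into an integer, and format() gives the digits without the prefix. The variable names here are illustrative.

```python
decimal_number = 42

# int() with a matching base accepts the 0b/0o/0x prefixed strings.
from_binary = int('0b101010', 2)
from_octal = int('0o52', 8)
from_hex = int('0x2a', 16)
print(from_binary, from_octal, from_hex)     # 42 42 42

# format() produces the digits without any prefix.
plain_binary = format(decimal_number, 'b')   # '101010'
plain_hex = format(decimal_number, 'X')      # '2A'
print(plain_binary, plain_hex)
```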

Python String Operators

In the examples below, the variable a holds the value "Hello" and b holds "Python":

| Operator | Description | Example |
| --- | --- | --- |
| + | Concatenates strings | a + b results in: HelloPython |
| * | Repeats the string | a * 2 results in: HelloHello |
| [] | Accesses a character in a string by index | a[1] results in: e |
| [ : ] | Slices a part of the string (left inclusive, right exclusive) | a[1:4] results in: ell |
| in | Membership operator: returns True if the given substring occurs in the string | 'H' in a results in: True |
| not in | Membership operator: returns True if the given substring does not occur in the string | 'M' not in a results in: True |
| r/R | Raw string prefix: backslashes are kept literally instead of starting escape sequences | print(r'\n') outputs: \n |
| % | String formatting (explained in the next section) | See formatting section below |

Example (Python 3.0+)

#!/usr/bin/python3
 
a = "Hello"
b = "Python"
 
print("a + b results in:", a + b)
print("a * 2 results in:", a * 2)
print("a[1] results in:", a[1])
print("a[1:4] results in:", a[1:4])
 
if "H" in a:
    print("H is in variable a")
else:
    print("H is not in variable a")
 
if "M" not in a:
    print("M is not in variable a")
else:
    print("M is in variable a")
 
print(r'\n')
print(R'\n')

Output:

a + b results in: HelloPython
a * 2 results in: HelloHello
a[1] results in: e
a[1:4] results in: ell
H is in variable a
M is not in variable a
\n
\n
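
Two details of the operators above are easy to miss: in tests for substrings of any length, and a raw string keeps the backslash as an ordinary character. A small sketch with illustrative variable names:

```python
a = "Hello"

has_substring = "ell" in a     # membership works for substrings, not only single characters
escaped = '\n'                 # one character: a newline
raw = r'\n'                    # two characters: a backslash and the letter n

print(has_substring, len(escaped), len(raw))   # True 1 2
```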

Python String Formatting

Python supports string formatting, which allows you to insert values into a string that contains placeholders. The basic syntax uses the % operator, similar to the sprintf function in C.

Example (Python 3.0+)

#!/usr/bin/python3
 
print("My name is %s and I am %d years old!" % ('Xiao Ming', 10))

Output:

My name is Xiao Ming and I am 10 years old!

Python String Formatting Symbols:

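| Symbol | Description |
| --- | --- |
| %c | Formats a single character (given as a one-character string or an integer code point) |
| %s | Formats a string |
| %d | Formats an integer |
| %u | Obsolete; behaves the same as %d |
| %o | Formats an integer as an octal number |
| %x | Formats an integer as a hexadecimal number (lowercase) |
| %X | Formats an integer as a hexadecimal number (uppercase) |
| %f | Formats a floating-point number, allows specifying precision |
| %e | Formats a floating-point number in scientific notation |
| %E | Same as %e but uses uppercase E |
| %g | Uses %f or %e, depending on the exponent and precision |
| %G | Uses %f or %E, depending on the exponent and precision |
| %p | A C printf specifier for addresses; Python's % operator does not support it and raises ValueError |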

Formatting Operator Auxiliaries:

| Symbol | Function |
| --- | --- |
| * | Takes the width or precision from the argument tuple |
| - | Left-aligns the result |
| + | Shows a plus sign for positive numbers |
| <sp> | A space character: adds a space before positive numbers |
| # | Prefixes octal numbers with 0o and hexadecimal numbers with 0x/0X |
| 0 | Pads numbers with zeros (instead of spaces) |
| %% | Inserts a literal % symbol |
| (var) | Looks up the value by key in a dictionary argument, as in %(name)s |
| m.n | Minimum field width (m), precision (n) if applicable |
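
A few of these flags in action. This is a minimal sketch; the values and variable names are chosen only for illustration.

```python
left = '%-8s|' % 'abc'          # 'abc     |'  left-aligned in 8 columns
signed = '%+d' % 42             # '+42'
padded = '%06.2f' % 3.14159     # '003.14'     width 6, precision 2, zero-padded
hexed = '%#x' % 255             # '0xff'
star = '%*d' % (5, 42)          # '   42'      width taken from the tuple
named = '%(name)s is %(age)d' % {'name': 'Xiao Ming', 'age': 10}
percent = '%d%%' % 50           # '50%'

print(left, signed, padded, hexed, star, named, percent, sep='\n')
```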

Starting with Python 2.6, a new formatting method str.format() was introduced, which offers enhanced formatting capabilities.
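
A brief sketch of str.format(), showing positional, indexed, and named fields plus a format specification. The sample values are illustrative.

```python
by_position = "{} and {}".format("spam", "eggs")
by_index = "{1} before {0}".format("first", "second")
by_name = "{name} is {age} years old".format(name="Xiao Ming", age=10)
with_spec = "{:>8.2f}|{:<6}|".format(3.14159, "ab")   # right-align a float, left-align a string

print(by_position)   # spam and eggs
print(by_index)      # second before first
print(by_name)       # Xiao Ming is 10 years old
print(with_spec)     #     3.14|ab    |
```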

Python Triple Quotes

Python triple quotes allow for multi-line strings, including special characters like newlines, tabs, and more. Here’s an example:

Example (Python 3.0+)

#!/usr/bin/python3

para_str = """This is an example of a multi-line string.
Multi-line strings can include tabs
TAB ( \t ).
They can also include newlines [ \n ].
"""
print(para_str)

The output of this code is:

This is an example of a multi-line string.
Multi-line strings can include tabs
TAB (    ).
They can also include newlines [ 
 ].

Triple quotes free programmers from the hassle of handling special characters and string concatenation, offering a WYSIWYG (What You See Is What You Get) format.

A typical use case for triple quotes is when you need to embed large blocks of HTML or SQL. String concatenation and escaping special characters would become cumbersome in such scenarios.

errHTML = '''
<HTML><HEAD><TITLE>
Friends CGI Demo</TITLE></HEAD>
<BODY><H3>ERROR</H3>
<B>%s</B><P>
<FORM><INPUT TYPE=button VALUE=Back
ONCLICK="window.history.back()"></FORM>
</BODY></HTML>
'''
# cursor is assumed to be an existing database cursor created elsewhere
cursor.execute('''
CREATE TABLE users (  
login VARCHAR(8), 
uid INTEGER,
prid INTEGER)
''')
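
The fragment above does not show where cursor comes from. The sketch below is one way to run it end to end, assuming Python's standard sqlite3 module and an in-memory database purely for illustration. The inserted row and the shortened err_template are hypothetical.

```python
import sqlite3

connection = sqlite3.connect(':memory:')   # throwaway in-memory database
cursor = connection.cursor()

# The triple-quoted SQL is passed to execute() unchanged.
cursor.execute('''
CREATE TABLE users (
login VARCHAR(8),
uid INTEGER,
prid INTEGER)
''')
cursor.execute('INSERT INTO users VALUES (?, ?, ?)', ('runoob', 1, 100))

rows = cursor.execute('SELECT login, uid, prid FROM users').fetchall()
print(rows)   # [('runoob', 1, 100)]

# A %s placeholder inside a triple-quoted template is filled with the % operator.
err_template = '''
<B>%s</B>
'''
message = err_template % 'Invalid login'
print(message)

connection.close()
```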

f-strings

f-strings were introduced in Python 3.6 and are known as "formatted string literals." They provide a concise and easy-to-read syntax for formatting strings.

Previously, formatting was done using the % operator:

Example

>>> name = 'Runoob'
>>> 'Hello %s' % name
'Hello Runoob'

In f-strings, the string starts with an f and expressions are embedded inside curly braces {}. Python evaluates the expressions and inserts the results into the string:

Example

>>> name = 'Runoob'
>>> f'Hello {name}'  # Variable substitution
'Hello Runoob'
>>> f'{1 + 2}'       # Using an expression
'3'

>>> w = {'name': 'Runoob', 'url': 'www.runoob.com'}
>>> f'{w["name"]}: {w["url"]}'
'Runoob: www.runoob.com'

This approach is simpler and eliminates the need to decide between using %s, %d, etc.

Starting with Python 3.8, you can add = after an expression to output both the expression text and its result:

Example

>>> x = 1
>>> print(f'{x + 1}')   # Python 3.6
2

>>> x = 1
>>> print(f'{x + 1=}')   # Python 3.8
x + 1=2
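
f-strings also accept a format specification after a colon, and a conversion such as !r. A short sketch with illustrative values:

```python
price = 3.14159
name = 'Runoob'
count = 42

rounded = f'{price:.2f}'        # '3.14'
aligned = f'[{name:>10}]'       # '[    Runoob]'  right-aligned in 10 columns
zero_padded = f'{count:05d}'    # '00042'
as_hex = f'{count:#x}'          # '0x2a'
with_repr = f'{name!r}'         # "'Runoob'"      uses repr() instead of str()

print(rounded, aligned, zero_padded, as_hex, with_repr)
```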

Unicode Strings

In Python 2, regular strings are 8-bit byte strings, while Unicode strings are a separate type that can represent a much wider range of characters. A Unicode literal is written by adding the prefix u before the string, for example u'text'.

In Python 3, all strings are Unicode by default.
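
One consequence in Python 3 is that the length of a string counts characters, while the length of its encoded form counts bytes. A minimal sketch, assuming UTF-8 as the encoding; the sample text is illustrative and written with escapes so the exact characters are unambiguous.

```python
text = 'h\u00e9llo \u4e16\u754c'   # 'héllo 世界'

encoded = text.encode('utf-8')     # str -> bytes
decoded = encoded.decode('utf-8')  # bytes -> str

print(len(text))       # 8  characters
print(len(encoded))    # 13 bytes
print(decoded == text) # True
```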

Built-in String Methods in Python

Below are some common built-in string methods. A few entries (len, max, and min) are built-in functions that take the string as an argument rather than methods of the string.

| No. | Method | Description |
| --- | --- | --- |
| 1 | capitalize() | Converts the first character of the string to uppercase and the rest to lowercase. |
| 2 | center(width, fillchar) | Returns a centered string of the specified width, filled with the fillchar (default is space). |
| 3 | count(str, beg=0, end=len(string)) | Returns the number of occurrences of str in the string within the optional range [beg:end]. |
| 4 | bytes.decode(encoding="utf-8", errors="strict") | Decodes a bytes object into a string using the specified encoding. Python 3 strings have no decode() method; call it on a bytes object such as the result of encode(). |
| 5 | encode(encoding='UTF-8', errors='strict') | Encodes the string using the specified encoding format and returns a bytes object. |
| 6 | endswith(suffix, beg=0, end=len(string)) | Checks if the string ends with the given suffix within the optional range [beg:end]. Returns True or False. |
| 7 | expandtabs(tabsize=8) | Replaces tabs in the string with spaces. The default tab size is 8. |
| 8 | find(str, beg=0, end=len(string)) | Searches for str within the string. Returns the starting index if found, otherwise -1. |
| 9 | index(str, beg=0, end=len(string)) | Similar to find(), but raises ValueError if str is not found. |
| 10 | isalnum() | Returns True if the string is not empty and contains only alphanumeric characters. |
| 11 | isalpha() | Returns True if the string is not empty and all characters are alphabetic. |
| 12 | isdigit() | Returns True if the string is not empty and contains only digits. |
| 13 | islower() | Returns True if all cased characters in the string are lowercase and there is at least one cased character. |
| 14 | isnumeric() | Returns True if the string is not empty and all characters are numeric. |
| 15 | isspace() | Returns True if the string is not empty and contains only whitespace characters. |
| 16 | istitle() | Returns True if the string is title-cased. |
| 17 | isupper() | Returns True if all cased characters in the string are uppercase and there is at least one cased character. |
| 18 | join(seq) | Joins the elements of the sequence seq into a string, separated by the string invoking the method. |
| 19 | len(string) | Built-in function. Returns the length of the string. |
| 20 | ljust(width, [fillchar]) | Returns a left-justified string padded to the specified width using fillchar (default is space). |
| 21 | lower() | Converts all uppercase characters in the string to lowercase. |
| 22 | lstrip([chars]) | Removes leading whitespace or specified characters from the string. |
| 23 | maketrans() | Static method that creates a translation table for use with translate(). |
| 24 | max(str) | Built-in function. Returns the largest character in the string. |
| 25 | min(str) | Built-in function. Returns the smallest character in the string. |
| 26 | replace(old, new, [max]) | Replaces occurrences of old with new in the string, up to max times if specified. |
| 27 | rfind(str, beg=0, end=len(string)) | Searches for str from the right and returns the index if found, otherwise -1. |
| 28 | rindex(str, beg=0, end=len(string)) | Similar to rfind() but raises ValueError if str is not found. |
| 29 | rjust(width, [fillchar]) | Returns a right-justified string padded to the specified width using fillchar (default is space). |
| 30 | rstrip([chars]) | Removes trailing whitespace or specified characters from the string. |
| 31 | split(sep=None, maxsplit=-1) | Splits the string using sep as the delimiter (runs of whitespace if sep is omitted). If maxsplit is specified, at most maxsplit splits are made, giving at most maxsplit+1 parts. |
| 32 | splitlines([keepends]) | Splits the string by lines. If keepends is True, newlines are kept in the resulting list. |
| 33 | startswith(substr, beg=0, end=len(string)) | Checks if the string starts with substr within the optional range [beg:end]. |
| 34 | strip([chars]) | Removes leading and trailing whitespace or specified characters. |
| 35 | swapcase() | Converts uppercase letters to lowercase and vice versa. |
| 36 | title() | Converts the string to title case. |
| 37 | translate(table) | Transforms characters in the string according to the translation table, for example one built with maketrans(). |
| 38 | upper() | Converts all lowercase characters in the string to uppercase. |
| 39 | zfill(width) | Returns a string padded with zeros on the left to the specified width. |
| 40 | isdecimal() | Returns True if the string is not empty and contains only decimal characters. |
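
A short sketch that exercises several of the methods listed above. The sample string and variable names are illustrative.

```python
line = '  Hello, Python World!  '

stripped = line.strip()                     # 'Hello, Python World!'
words = stripped.split()                    # ['Hello,', 'Python', 'World!']
joined = '-'.join(words)                    # 'Hello,-Python-World!'
position = stripped.find('Python')          # 7
missing = stripped.find('Java')             # -1 (index() would raise ValueError here)
replaced = stripped.replace('o', '0', 2)    # 'Hell0, Pyth0n World!' (only two replacements)
limited = 'a,b,c,d'.split(',', 1)           # ['a', 'b,c,d'] (one split, two parts)

table = str.maketrans('lo', '10')           # map l -> 1 and o -> 0
translated = stripped.translate(table)      # 'He110, Pyth0n W0r1d!'
padded = '42'.zfill(5)                      # '00042'

print(stripped, words, joined, position, missing, sep='\n')
print(replaced, limited, translated, padded, sep='\n')
```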