1+ [Contents](../Contents) \| [Previous (1.3 Numbers)](03_Numbers) \| [Next (1.5 Lists)](05_Lists)
2+
13# 1.4 Strings
24
3- ### Representing Text
5+ This section introduces ways to work with text.
46
5- String are text literals written in programs with quotes.
7+ ### Representing Literal Text
8+
9+ String literals are written in programs with quotes.
610
711``` python
812# Single quote
@@ -20,12 +24,17 @@ look into my eyes, you're under.
2024'''
2125```
2226
23- Triple quotes capture all text enclosed in multiple lines.
27+ Normally strings may only span a single line. Triple quotes capture all text enclosed across multiple lines
28+ including all formatting.
29+
30+ There is no difference between using single (') versus double (")
31+ quotes. The same type of quote used to start a string must be used to
32+ terminate it.
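
As a quick check of the point about quote styles, the following sketch compares the same text written with each kind of quote. The variable names are illustrative and not part of the course text.

```python
# Illustrative names; not taken from the original material.
single = 'Hello world'
double = "Hello world"

# Both literals produce the same str value.
same = (single == double)

# Choosing one quote style lets the other appear inside without escaping.
quote = "you're under"
print(same, quote)
```

Picking double quotes for text that contains an apostrophe avoids the need for a `\'` escape.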
2433
2534### String escape codes
2635
2736Escape codes are used to represent control characters and characters that can't be easily typed
28- at the keyboard. Here are some common escape codes:
37+ directly at the keyboard. Here are some common escape codes:
2938
3039```
3140'\n' Line feed
@@ -38,8 +47,8 @@ at the keyboard. Here are some common escape codes:
3847
3948### String Representation
4049
41- The characters in a string are Unicode and represent a so-called "code-point". You can
42- specify an exact code-point using the following escape sequences:
50+ Each character in a string is stored internally as a so-called Unicode "code-point" which is
51+ an integer. You can specify an exact code-point value using the following escape sequences:
4352
4453``` python
5554a = '\xf1' # a = 'ñ'
@@ -54,6 +63,7 @@ available character codes.
5463### String Indexing
5564
5665Strings work like an array for accessing individual characters. You use an integer index, starting at 0.
66+ Negative indices specify a position relative to the end of the string.
5767
5868``` python
5969a = 'Hello world'
@@ -62,7 +72,7 @@ c = a[4] # 'o'
6272d = a[-1] # 'd' (end of string)
6373```
6474
65- You can also slice or select substrings with `:`.
75+ You can also slice or select substrings by specifying a range of indices with `:`.
6676
6777``` python
6878d = a[:5] # 'Hello'
@@ -71,6 +81,8 @@ f = a[3:8] # 'lo wo'
7181g = a[-5:] # 'world'
7282```
7383
84+ The character at the ending index is not included. Missing indices assume the beginning or ending of the string.
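
The slicing rules above can be checked with a small sketch. It reuses the string `a` from the examples; the names `head`, `tail`, and `whole` are illustrative only.

```python
a = 'Hello world'

# The ending index is excluded: only indices 3, 4, 5, 6, 7 are selected.
f = a[3:8]

# A missing start index means "from the beginning of the string".
head = a[:5]

# A missing end index means "through the end of the string".
tail = a[6:]

# Because the end is excluded, a[:n] + a[n:] rebuilds the original string.
whole = a[:5] + a[5:]

print(repr(f), repr(head), repr(tail), whole == a)
```

Excluding the ending index means the length of `a[i:j]` is `j - i` when both indices fall inside the string.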
85+
7486### String operations
7587
7688Concatenation, length, membership and replication.
@@ -161,7 +173,8 @@ TypeError: 'str' object does not support item assignment
161173
162174### String Conversions
163175
164- Use `str()` to convert any value to a string suitable for printing.
176+ Use `str()` to convert any value to a string. The result is a string holding the
177+ same text that would have been produced by the `print()` function.
165178
166179``` python
167180>>> x = 42
@@ -172,7 +185,7 @@ Use `str()` to convert any value to a string suitable for printing.
172185
173186### Byte Strings
174187
175- A string of 8-bit bytes, commonly encountered with low-level I/O.
188+ A string of 8-bit bytes, commonly encountered with low-level I/O, is written as follows:
176189
177190``` python
178191data = b'Hello World\r\n'
@@ -201,9 +214,13 @@ text = data.decode('utf-8') # bytes -> text
201214data = text.encode('utf-8') # text -> bytes
202215```
203216
217+ The `'utf-8'` argument specifies a character encoding. Other common
218+ values include `'ascii'` and `'latin1'`.
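
To see how the choice of encoding matters, here is a minimal sketch that encodes one string three ways. The sample text and variable names are assumptions made for illustration.

```python
text = 'Jalape\xf1o'   # 'Jalapeño' (sample text chosen for illustration)

utf8 = text.encode('utf-8')      # ñ is encoded as two bytes in UTF-8
latin1 = text.encode('latin1')   # ñ is encoded as one byte in Latin-1

print(len(text), len(utf8), len(latin1))

# Decoding with the matching encoding recovers the original text.
roundtrip_ok = (utf8.decode('utf-8') == text and
                latin1.decode('latin1') == text)

# ASCII has no code for ñ, so encoding raises UnicodeEncodeError.
try:
    text.encode('ascii')
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

print(roundtrip_ok, ascii_ok)
```

The same characters can therefore occupy a different number of bytes depending on the encoding, and not every encoding can represent every character.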
219+
204220### Raw Strings
205221
206- Raw strings are string literals with an uninterpreted backslash. They specified by prefixing the initial quote with a lowercase "r".
222+ Raw strings are string literals with an uninterpreted backslash. They
223+ are specified by prefixing the initial quote with a lowercase "r".
207224
208225``` python
209226>>> rs = r'c:\newdata\test' # Raw (uninterpreted backslash)
@@ -237,9 +254,9 @@ is covered later.
237254
238255## Exercises
239256
240- In these exercises, you experiment with operations on Python's string type.
241- You should do this at the Python interactive prompt where you can easily see the results.
242- Important note:
257+ In these exercises, you'll experiment with operations on Python's
258+ string type. You should do this at the Python interactive prompt
259+ where you can easily see the results. Important note:
243260
244261> In exercises where you are supposed to interact with the interpreter,
245262> `>>>` is the interpreter prompt that you get when Python wants
@@ -250,7 +267,7 @@ Important note:
250267
251268Start by defining a string containing a series of stock ticker symbols like this:
252269
253- ``` pycon
270+ ``` python
254271>>> symbols = 'AAPL,IBM,MSFT,YHOO,SCO'
255272>>>
256273```
@@ -259,7 +276,7 @@ Start by defining a string containing a series of stock ticker symbols like this
259276
260277Strings are arrays of characters. Try extracting a few characters:
261278
262- ``` pycon
279+ ``` python
263280>>> symbols[0]
264281?
265282>>> symbols[1]
@@ -273,8 +290,6 @@ Strings are arrays of characters. Try extracting a few characters:
273290>>>
274291```
275292
276- ### Exercise 1.14: Strings as read-only objects
277-
278293In Python, strings are read-only.
279294
280295Verify this by trying to change the first character of `symbols` to a lower-case 'a'.
@@ -287,22 +302,29 @@ TypeError: 'str' object does not support item assignment
287302>>>
288303```
289304
290- ### Exercise 1.15: String concatenation
305+ ### Exercise 1.14: String concatenation
291306
292307Although string data is read-only, you can always reassign a variable
293308to a newly created string.
294309
295310Try the following statement which concatenates a new symbol "GOOG" to
296311the end of `symbols`:
297312
298- ``` pycon
313+ ``` python
299314>>> symbols = symbols + 'GOOG'
300315>>> symbols
301316'AAPL,IBM,MSFT,YHOO,SCOGOOG'
302317>>>
303318```
304319
305- Oops! That's not what you wanted. Fix it so that the `symbols` variable holds the value `'HPQ,AAPL,IBM,MSFT,YHOO,SCO,GOOG'`.
320+ Oops! That's not what you wanted. Fix it so that the `symbols` variable holds the value `'AAPL,IBM,MSFT,YHOO,SCO,GOOG'`.
321+
322+ ``` python
323+ >>> symbols = ?
324+ >>> symbols
325+ 'AAPL,IBM,MSFT,YHOO,SCO,GOOG'
326+ >>>
327+ ```
306328
307329In these examples, it might look like the original string is being
308330modified, in an apparent violation of strings being read only. Not
@@ -311,12 +333,12 @@ time. When the variable name `symbols` is reassigned, it points to the
311333newly created string. Afterwards, the old string is destroyed since
312334it's not being used anymore.
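
The reassignment behavior described here can be observed with a small sketch. The second name `original` is a hypothetical addition used only to keep the old string reachable so it can be inspected.

```python
symbols = 'AAPL,IBM'
original = symbols            # a second name for the same string object

symbols = symbols + ',GOOG'   # builds a new string and rebinds the name

print(original)   # the first string is unchanged
print(symbols)    # the name now refers to the newly created string
```

Because `original` still refers to the first string, that string stays alive in this sketch; without the extra name, nothing would refer to it after the reassignment.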
313335
314- ### Exercise 1.16: Membership testing (substring testing)
336+ ### Exercise 1.15: Membership testing (substring testing)
315337
316338Experiment with the `in` operator to check for substrings. At the
317339interactive prompt, try these operations:
318340
319- ``` pycon
341+ ``` python
320342>>> 'IBM' in symbols
321343?
322344>>> 'AA' in symbols
@@ -326,13 +348,13 @@ True
326348>>>
327349```
328350
329- *Why did the check for "AA" return `True`?*
351+ *Why did the check for `'AA'` return `True`?*
330352
331- ### Exercise 1.17: String Methods
353+ ### Exercise 1.16: String Methods
332354
333355At the Python interactive prompt, try experimenting with some of the string methods.
334356
335- ``` pycon
357+ ``` python
336358>>> symbols.lower()
337359?
338360>>> symbols
@@ -342,14 +364,14 @@ At the Python interactive prompt, try experimenting with some of the string meth
342364
343365Remember, strings are always read-only. If you want to save the result of an operation, you need to place it in a variable:
344366
345- ``` pycon
367+ ``` python
346368>>> lowersyms = symbols.lower()
347369>>>
348370```
349371
350372Try some more operations:
351373
352- ``` pycon
374+ ``` python
353375>>> symbols.find('MSFT')
354376?
355377>>> symbols[13:17]
@@ -364,14 +386,14 @@ Try some more operations:
364386>>>
365387```
366388
367- ### Exercise 1.18: f-strings
389+ ### Exercise 1.17: f-strings
368390
369391Sometimes you want to create a string and embed the values of
370392variables into it.
371393
372394To do that, use an f-string. For example:
373395
374- ``` pycon
396+ ``` python
375397>>> name = 'IBM'
376398>>> shares = 100
377399>>> price = 91.1
@@ -383,6 +405,31 @@ To do that, use an f-string. For example:
383405Modify the `mortgage.py` program from [Exercise 1.10](03_Numbers) to create its output using f-strings.
384406Try to make it so that output is nicely aligned.
385407
408+
409+ ### Exercise 1.18: Regular Expressions
410+
411+ One limitation of the basic string operations is that they don't
412+ support any kind of advanced pattern matching. For that, you
413+ need to turn to Python's `re` module and regular expressions.
414+ Regular expression handling is a big topic, but here is a short
415+ example:
416+
417+ ``` python
418+ >>> text = 'Today is 3/27/2018. Tomorrow is 3/28/2018.'
419+ >>> # Find all occurrences of a date
420+ >>> import re
421+ >>> re.findall(r'\d+/\d+/\d+', text)
422+ ['3/27/2018', '3/28/2018']
423+ >>> # Replace all occurrences of a date with replacement text
424+ >>> re.sub(r'(\d+)/(\d+)/(\d+)', r'\3-\1-\2', text)
425+ 'Today is 2018-3-27. Tomorrow is 2018-3-28.'
426+ >>>
427+ ```
428+
429+ For more information about the `re` module, see the official documentation at
430+ [https://docs.python.org/3/library/re.html](https://docs.python.org/3/library/re.html).
431+
432+
386433### Commentary
387434
388435As you start to experiment with the interpreter, you often want to