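io -- Core tools for working with streams
Source code: Lib/io.py
The io module provides Python's main facilities for dealing with various types of I/O. There are three main types of I/O: text I/O, binary I/O and raw I/O. These are generic categories, and various backing stores can be used for each of them. A concrete object belonging to any of these categories is called a file object. Other common terms are stream and file-like object.
Independent of its category, each concrete stream object also has various capabilities: it can be read-only, write-only, or read-write. It can also allow arbitrary random access (seeking forwards or backwards to any location), or only sequential access (for example in the case of a socket or pipe).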
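All streams are careful about the type of data you give to them. For example, giving a str object to the write() method of a binary stream raises a TypeError. So does giving a bytes object to the write() method of a text stream.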
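The type checking described above can be observed with the in-memory stream classes. This is a minimal sketch; the helper name try_write is hypothetical and exists only for this illustration.

```python
import io


def try_write(stream, data):
    """Return the name of the exception raised by write(), or None on success."""
    try:
        stream.write(data)
    except TypeError as exc:
        return type(exc).__name__
    return None


# Mismatched types are rejected by both kinds of stream.
str_into_binary = try_write(io.BytesIO(), "text")
bytes_into_text = try_write(io.StringIO(), b"bytes")

# Matching types are accepted.
bytes_into_binary = try_write(io.BytesIO(), b"bytes")
str_into_text = try_write(io.StringIO(), "text")

print(str_into_binary, bytes_into_text, bytes_into_binary, str_into_text)
```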
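Text I/O expects and produces str objects. This means that whenever the backing store is natively made of bytes (such as in the case of a file), encoding and decoding of data is done transparently, as is the optional translation of platform-specific newline characters.
The easiest way to create a text stream is with open(), optionally specifying an encoding: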
f = open("myfile.txt", "r", encoding="utf-8")
In-memory text streams are also available as StringIO objects:
f = io.StringIO("some initial text data")
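The text stream API is described in detail in the documentation of TextIOBase.
Binary I/O (also called buffered I/O) expects bytes-like objects and produces bytes objects. No encoding, decoding, or newline translation is performed. This category of streams can be used for all kinds of non-text data, and also when manual control over the handling of text data is desired.
The easiest way to create a binary stream is with open() with 'b' in the mode string: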
f = open("myfile.jpg", "rb")
In-memory binary streams are also available as BytesIO objects:
f = io.BytesIO(b"some initial binary data: \x00\x01")
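The binary stream API is described in detail in the documentation of BufferedIOBase.
Other library modules may provide additional ways to create text or binary streams. See socket.socket.makefile() for example.
Raw I/O (also called unbuffered I/O) is generally used as a low-level building block for binary and text streams; it is rarely useful to manipulate a raw stream directly from user code. Nevertheless, you can create a raw stream by opening a file in binary mode with buffering disabled: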
f = open("myfile.jpg", "rb", buffering=0)
The raw stream API is described in detail in the documentation of RawIOBase.
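To connect the three categories to concrete classes, the sketch below opens the same temporary file in text, binary, and unbuffered binary mode and records the type that open() returns. The file name is a placeholder, and the class names shown are those observed on CPython.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.bin")  # placeholder file name
    with open(path, "wb") as f:
        f.write(b"hello\n")

    # Text mode: a text stream layered over a buffered binary stream.
    with open(path, "r", encoding="utf-8") as text_f:
        text_type = type(text_f)

    # Binary mode: a buffered binary stream.
    with open(path, "rb") as binary_f:
        binary_type = type(binary_f)

    # Binary mode with buffering disabled: a raw stream.
    with open(path, "rb", buffering=0) as raw_f:
        raw_type = type(raw_f)

print(text_type.__name__, binary_type.__name__, raw_type.__name__)
```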
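The default encoding of TextIOWrapper and open() is locale-specific (locale.getencoding()).
However, many developers forget to specify the encoding when opening text files encoded in UTF-8 (e.g. JSON, TOML, Markdown, etc.) since most Unix platforms use a UTF-8 locale by default. This causes bugs because the locale encoding is not UTF-8 for most Windows users. For example: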
# May not work on Windows when the file contains non-ASCII characters.
with open("README.md") as f:
    long_description = f.read()
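Accordingly, it is highly recommended that you specify the encoding explicitly when opening text files. If you want to use UTF-8, pass encoding="utf-8". To use the current locale encoding, encoding="locale" is supported since Python 3.10.
See also
Python UTF-8 Mode can be used to change the default encoding from the locale-specific encoding to UTF-8.
Python 3.15 will make Python UTF-8 Mode the default.
Added in version 3.10: See PEP 597 for more details.
To find where the default locale encoding is used, you can enable the -X warn_default_encoding command line option or set the PYTHONWARNDEFAULTENCODING environment variable. An EncodingWarning is then emitted whenever the default encoding is used.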
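If you are providing an API that uses open() or TextIOWrapper and passes encoding=None as a parameter, you can use text_encoding() so that callers of the API emit an EncodingWarning if they don't pass an encoding. However, please consider using UTF-8 by default (i.e. encoding="utf-8") for new APIs.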
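An int containing the default buffer size used by the module's buffered I/O classes. open() uses the file's blksize (as obtained by os.stat()) if possible.
This is an alias for the builtin open() function.
This function raises an auditing event open with arguments path, mode and flags. The mode and flags arguments may have been modified or inferred from the original call.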
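Opens the provided file with mode 'rb'. This function should be used when the intent is to treat the contents as executable code.
path should be a str and an absolute path.
The behavior of this function may be overridden by an earlier call to PyFile_SetOpenCodeHook(). However, assuming that path is a str and an absolute path, open_code(path) should always behave the same as open(path, 'rb'). Overriding the behavior is intended for additional validation or preprocessing of the file.
Added in version 3.8.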
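As an illustration of the intended use, the sketch below reads a small script through open_code() and then executes it. The file name snippet.py and its contents are made up for this example, and it assumes that no open-code hook has been installed, so the call behaves like open(path, 'rb').

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # open_code() expects an absolute str path.
    path = os.path.abspath(os.path.join(tmp, "snippet.py"))
    with open(path, "w", encoding="utf-8") as f:
        f.write("answer = 6 * 7\n")

    # The stream is binary, so read() returns bytes.
    with io.open_code(path) as code_file:
        source = code_file.read()

# Treat the bytes as executable code.
namespace = {}
exec(compile(source, path, "exec"), namespace)
print(namespace["answer"])
```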
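This is a helper function for callables that use open() or TextIOWrapper and have an encoding=None parameter.
This function returns encoding if it is not None. Otherwise, it returns "locale" or "utf-8" depending on UTF-8 Mode.
This function emits an EncodingWarning if sys.flags.warn_default_encoding is true and encoding is None. stacklevel specifies where the warning is emitted. For example: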
def read_text(path, encoding=None):
    encoding = io.text_encoding(encoding)  # stacklevel=2
    with open(path, encoding=encoding) as f:
        return f.read()
In this example, an EncodingWarning is emitted for the caller of read_text().
See Text Encoding for more information.
Added in version 3.10.
Changed in version 3.11: text_encoding() returns "utf-8" when UTF-8 mode is enabled and encoding is None.
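This is a compatibility alias for the builtin BlockingIOError exception.
An exception inheriting OSError and ValueError that is raised when an unsupported operation is called on a stream.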
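Because the exception inherits from both OSError and ValueError, either base class can be used to catch it. The sketch below triggers it with BytesIO.detach(), which has no raw stream to return.

```python
import io

inherits_oserror = issubclass(io.UnsupportedOperation, OSError)
inherits_valueerror = issubclass(io.UnsupportedOperation, ValueError)

buffer = io.BytesIO(b"data")
caught = None
try:
    buffer.detach()  # BytesIO has no underlying raw stream
except ValueError as exc:  # catching via one of the base classes
    caught = exc

print(inherits_oserror, inherits_valueerror, type(caught).__name__)
```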
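See also
sys contains the standard IO streams: sys.stdin, sys.stdout, and sys.stderr.
The implementation of I/O streams is organized as a hierarchy of classes. First come the abstract base classes (ABCs), which are used to specify the various categories of streams, then the concrete classes providing the standard stream implementations.
Note
The abstract base classes also provide default implementations of some methods in order to help the implementation of concrete stream classes. For example, BufferedIOBase provides unoptimized implementations of readinto() and readline().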
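At the top of the I/O hierarchy is the abstract base class IOBase. It defines the basic interface to a stream. Note, however, that there is no separation between reading and writing to streams; implementations are allowed to raise UnsupportedOperation if they do not support a given operation.
The RawIOBase ABC extends IOBase. It deals with the reading and writing of bytes to a stream. FileIO subclasses RawIOBase to provide an interface to files in the machine's file system.
The BufferedIOBase ABC extends IOBase. It deals with buffering on a raw binary stream (RawIOBase). Its subclasses BufferedWriter, BufferedReader, and BufferedRWPair buffer raw binary streams that are writable, readable, and both readable and writable, respectively. BufferedRandom provides a buffered interface to seekable streams. Another BufferedIOBase subclass, BytesIO, is a stream of in-memory bytes.
The TextIOBase ABC extends IOBase. It deals with streams whose bytes represent text, and handles encoding and decoding to and from strings. TextIOWrapper, which extends TextIOBase, is a buffered text interface to a buffered raw stream (BufferedIOBase). Finally, StringIO is an in-memory stream for text.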
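The relationships described in the preceding paragraphs can be verified with issubclass(). This is a sketch only; the dictionary name checks is arbitrary, and on CPython several of these relationships are established through ABC registration rather than direct inheritance.

```python
import io

checks = {
    "RawIOBase extends IOBase": issubclass(io.RawIOBase, io.IOBase),
    "BufferedIOBase extends IOBase": issubclass(io.BufferedIOBase, io.IOBase),
    "TextIOBase extends IOBase": issubclass(io.TextIOBase, io.IOBase),
    "FileIO is a RawIOBase": issubclass(io.FileIO, io.RawIOBase),
    "BufferedReader is a BufferedIOBase": issubclass(io.BufferedReader, io.BufferedIOBase),
    "BufferedWriter is a BufferedIOBase": issubclass(io.BufferedWriter, io.BufferedIOBase),
    "BytesIO is a BufferedIOBase": issubclass(io.BytesIO, io.BufferedIOBase),
    "TextIOWrapper is a TextIOBase": issubclass(io.TextIOWrapper, io.TextIOBase),
    "StringIO is a TextIOBase": issubclass(io.StringIO, io.TextIOBase),
}

for description, result in checks.items():
    print(f"{description}: {result}")
```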
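Argument names are not part of the specification, and only the arguments of open() are intended to be used as keyword arguments.
The following table summarizes the ABCs provided by the io module: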
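ABC | Inherits | Stub Methods | Mixin Methods and Properties |
---|---|---|---|
IOBase | | fileno, seek, and truncate | close, closed, __enter__, __exit__, flush, isatty, __iter__, __next__, readable, readline, readlines, seekable, tell, writable, and writelines |
RawIOBase | IOBase | readinto and write | Inherited IOBase methods, read, and readall |
BufferedIOBase | IOBase | detach, read, read1, and write | Inherited IOBase methods, readinto, and readinto1 |
TextIOBase | IOBase | detach, read, readline, and write | Inherited IOBase methods, encoding, errors, and newlines |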
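The abstract base class for all I/O classes.
This class provides empty abstract implementations for many methods that derived classes can override selectively; the default implementations represent a file that cannot be read, written or seeked.
Even though IOBase does not declare read() or write() because their signatures vary, implementations and clients should consider those methods part of the interface. Also, implementations may raise a ValueError (or UnsupportedOperation) when operations they do not support are called.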
The basic type used for binary data read from or written to a file is
bytes
. Other bytes-like objects are
accepted as method arguments too. Text I/O classes work with str
data.
Note that calling any method (even inquiries) on a closed stream is undefined. Implementations may raise ValueError in this case.
IOBase
(and its subclasses) supports the iterator protocol, meaning
that an IOBase
object can be iterated over yielding the lines in a
stream. Lines are defined slightly differently depending on whether the
stream is a binary stream (yielding bytes), or a text stream (yielding
character strings). See readline()
below.
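A short sketch of the iterator protocol on both kinds of stream, using in-memory objects so that it runs without any files:

```python
import io

# Iterating a binary stream yields bytes objects, one per line.
binary_lines = list(io.BytesIO(b"first\nsecond\n"))

# Iterating a text stream yields str objects, one per line.
text_lines = list(io.StringIO("first\nsecond\n"))

print(binary_lines)
print(text_lines)
```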
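IOBase is also a context manager and therefore supports the with statement. In this example, file is closed after the with statement's suite is finished, even if an exception occurs:
with open('spam.txt', 'w') as file:
    file.write('Spam and eggs!')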
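IOBase provides these data attributes and methods:
Flush and close this stream. This method has no effect if the file is already closed. Once the file is closed, any operation on the file (e.g. reading or writing) will raise a ValueError.
As a convenience, it is allowed to call this method more than once; only the first call, however, will have an effect.
True if the stream is closed.
Flush the write buffers of the stream if applicable. This does nothing for read-only and non-blocking streams.
Return True if the stream is interactive (i.e., connected to a terminal/tty device).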
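Read and return one line from the stream. If size is specified, at most size bytes will be read.
The line terminator is always b'\n' for binary files; for text files, the newline argument to open() can be used to select the line terminator(s) recognized.
Read and return a list of lines from the stream. hint can be specified to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint.
hint values of 0 or less, as well as None, are treated as no hint.
Note that it's already possible to iterate on file objects using for line in file: ... without calling file.readlines().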
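The size and hint arguments can be tried on an in-memory binary stream. In this sketch each line is 4 bytes long, so a hint of 5 is only exceeded once the second line has been read.

```python
import io

stream = io.BytesIO(b"aaa\nbbb\nccc\n")

# readline(size) stops after at most size bytes, even in the middle of a line.
partial = stream.readline(2)
rest_of_line = stream.readline()

# readlines(hint) stops once the lines read so far exceed the hint.
stream.seek(0)
limited = stream.readlines(5)

# A hint of 0 (or None) means "no hint": every remaining line is returned.
stream.seek(0)
everything = stream.readlines(0)

print(partial, rest_of_line, limited, everything)
```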
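Change the stream position to the given byte offset, interpreted relative to the position indicated by whence, and return the new absolute position. Values for whence are:
os.SEEK_SET or 0 -- start of the stream (the default); offset should be zero or positive
os.SEEK_CUR or 1 -- current stream position; offset may be negative
os.SEEK_END or 2 -- end of the stream; offset is usually negative
Added in version 3.1: The SEEK_* constants.
Added in version 3.3: Some operating systems could support additional values, like os.SEEK_HOLE or os.SEEK_DATA. The valid values for a file could depend on it being open in text or binary mode.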
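The three standard whence values can be illustrated on a binary in-memory stream. The sketch uses the SEEK_* constants exposed by the io module, which have the same values as those in os.

```python
import io

stream = io.BytesIO(b"0123456789")

pos_from_start = stream.seek(2)                  # relative to the start (SEEK_SET)
pos_from_current = stream.seek(3, io.SEEK_CUR)   # 3 bytes past the current position
pos_from_end = stream.seek(-1, io.SEEK_END)      # 1 byte before the end
last_byte = stream.read()

print(pos_from_start, pos_from_current, pos_from_end, last_byte)
```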
Return True if the stream supports random access. If False, seek(), tell() and truncate() will raise OSError.
Return the current stream position.
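Resize the stream to the given size in bytes (or the current position if size is not specified). The current stream position isn't changed. This resizing can extend or reduce the current file size. In case of extension, the contents of the new file area depend on the platform (on most systems, additional bytes are zero-filled). The new file size is returned.
Changed in version 3.5: Windows will now zero-fill files when extending.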
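A minimal sketch of shrinking a stream, using BytesIO as a stand-in for a file. It shows that the new size is returned and that the stream position is left where it was.

```python
import io

stream = io.BytesIO(b"hello world")

new_size = stream.truncate(5)    # keep only the first 5 bytes
position_after = stream.tell()   # the position is not moved by truncate()
contents = stream.getvalue()

print(new_size, position_after, contents)
```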
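Return True if the stream supports writing. If False, write() and truncate() will raise OSError.
Write a list of lines to the stream. Line separators are not added, so it is usual for each of the lines provided to have a line separator at the end.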
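The following sketch shows the consequence of separators not being added: the caller has to supply them.

```python
import io

# Without trailing separators, the "lines" simply run together.
joined = io.StringIO()
joined.writelines(["alpha", "beta"])

# Supplying the separator on each item produces real lines.
separated = io.StringIO()
separated.writelines(line + "\n" for line in ["alpha", "beta"])

print(repr(joined.getvalue()))
print(repr(separated.getvalue()))
```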
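Base class for raw binary streams. It inherits from IOBase.
Raw binary streams typically provide low-level access to an underlying OS device or API, and do not try to encapsulate it in high-level primitives (this functionality is done at a higher level in buffered binary streams and text streams, described later in this page).
RawIOBase provides these methods in addition to those from IOBase:
Read up to size bytes from the object and return them. As a convenience, if size is unspecified or -1, all bytes until EOF are returned. Otherwise, only one system call is ever made. Fewer than size bytes may be returned if the operating system call returns fewer than size bytes.
If 0 bytes are returned, and size was not 0, this indicates end of file. If the object is in non-blocking mode and no bytes are available, None is returned.
The default implementation defers to readall() and readinto().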
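Read and return all the bytes from the stream until EOF, using multiple calls to the stream if necessary.
Read bytes into a pre-allocated, writable bytes-like object b, and return the number of bytes read. For example, b might be a bytearray. If the object is in non-blocking mode and no bytes are available, None is returned.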
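As a sketch of the raw reading methods, the example below opens a temporary file without buffering and fills a pre-allocated bytearray. The file name is a placeholder; for a small regular file a single read is expected to fill the whole buffer, although readinto() is in general allowed to return fewer bytes.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "raw.bin")  # placeholder file name
    with open(path, "wb") as f:
        f.write(b"abcdef")

    # buffering=0 yields a raw stream (a FileIO instance on CPython).
    with open(path, "rb", buffering=0) as raw:
        target = bytearray(4)
        count = raw.readinto(target)   # number of bytes actually stored
        filled = bytes(target[:count])
        remaining = raw.readall()      # everything up to EOF

print(count, filled, remaining)
```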
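Write the given bytes-like object, b, to the underlying raw stream, and return the number of bytes written. This can be less than the length of b in bytes, depending on specifics of the underlying raw stream, and especially if it is in non-blocking mode. None is returned if the raw stream is set not to block and no single byte could be readily written to it. The caller may release or mutate b after this method returns, so the implementation should only access b during the method call.
Base class for binary streams that support some kind of buffering. It inherits from IOBase.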
The main difference with RawIOBase
is that methods read()
,
readinto()
and write()
will try (respectively) to read
as much input as requested or to emit all provided data.
In addition, if the underlying raw stream is in non-blocking mode, when the
system returns would block write()
will raise BlockingIOError
with BlockingIOError.characters_written
and read()
will return
data read so far or None
if no data is available.
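Besides, the read() method does not have a default implementation that defers to readinto().
A typical BufferedIOBase implementation should not inherit from a RawIOBase implementation, but wrap one, like BufferedWriter and BufferedReader do.
BufferedIOBase provides or overrides these data attributes and methods in addition to those from IOBase:
The underlying raw stream (a RawIOBase instance) that BufferedIOBase deals with. This is not part of the BufferedIOBase API and may not exist on some implementations.
Separate the underlying raw stream from the buffer and return it.
After the raw stream has been detached, the buffer is in an unusable state.
Some buffers, like BytesIO, do not have the concept of a single raw stream to return from this method. They raise UnsupportedOperation.
Added in version 3.1.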
Read and return up to size bytes. If the argument is omitted, None
,
or negative read as much as possible.
Fewer bytes may be returned than requested. An empty bytes
object
is returned if the stream is already at EOF. More than one read may be
made and calls may be retried if specific errors are encountered, see
os.read()
and PEP 475 for more details. Less than size bytes
being returned does not imply that EOF is imminent.
When reading as much as possible the default implementation will use
raw.readall
if available (which should implement
RawIOBase.readall()
), otherwise will read in a loop until read
returns None
, an empty bytes
, or a non-retryable error. For
most streams this is to EOF, but for non-blocking streams more data may
become available.
Note
When the underlying raw stream is non-blocking, implementations may
either raise BlockingIOError
or return None
if no data is
available. io
implementations return None
.
Read and return up to size bytes, calling readinto()
which may retry if EINTR
is encountered per
PEP 475. If size is -1
or not provided, the implementation will
choose an arbitrary value for size.
Note
When the underlying raw stream is non-blocking, implementations may
either raise BlockingIOError
or return None
if no data is
available. io
implementations return None
.
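Read bytes into a pre-allocated, writable bytes-like object b and return the number of bytes read. For example, b might be a bytearray.
Like read(), multiple reads may be issued to the underlying raw stream, unless the latter is interactive.
A BlockingIOError is raised if the underlying raw stream is in non-blocking mode and has no data available at the moment.
Read bytes into a pre-allocated, writable bytes-like object b, using at most one call to the underlying raw stream's read() (or readinto()) method. Return the number of bytes read.
A BlockingIOError is raised if the underlying raw stream is in non-blocking mode and has no data available at the moment.
Added in version 3.5.
Write the given bytes-like object, b, and return the number of bytes written (always equal to the length of b in bytes, since if the write fails an OSError will be raised). Depending on the actual implementation, these bytes may be readily written to the underlying stream, or held in a buffer for performance and latency reasons.
When in non-blocking mode, a BlockingIOError is raised if the data needed to be written to the raw stream but it couldn't accept all the data without blocking.
The caller may release or mutate b after this method returns, so the implementation should only access b during the method call.
A raw binary stream representing an OS-level file containing bytes data. It inherits from RawIOBase.
The name can be one of two things:
a character string or bytes object representing the path to the file which will be opened. In this case closefd must be True (the default), otherwise an error will be raised.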
an integer representing the number of an existing OS-level file descriptor
to which the resulting FileIO
object will give access. When the
FileIO object is closed this fd will be closed as well, unless closefd
is set to False
.
The mode can be 'r'
, 'w'
, 'x'
or 'a'
for reading
(default), writing, exclusive creation or appending. The file will be
created if it doesn't exist when opened for writing or appending; it will be
truncated when opened for writing. FileExistsError
will be raised if
it already exists when opened for creating. Opening a file for creating
implies writing, so this mode behaves in a similar way to 'w'
. Add a
'+'
to the mode to allow simultaneous reading and writing.
The read()
(when called with a positive argument),
readinto()
and write()
methods on this
class will only make one system call.
A custom opener can be used by passing a callable as opener. The underlying
file descriptor for the file object is then obtained by calling opener with
(name, flags). opener must return an open file descriptor (passing
os.open
as opener results in functionality similar to passing
None
).
The newly created file is non-inheritable.
See the open()
built-in function for examples on using the opener
parameter.
Changed in version 3.3: The opener parameter was added. The 'x' mode was added.
Changed in version 3.4: The file is now non-inheritable.
FileIO
provides these data attributes in addition to those from
RawIOBase
and IOBase
:
The mode as given in the constructor.
The file name. This is the file descriptor of the file when no name is given in the constructor.
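The file-descriptor form of the constructor and the effect of closefd=False can be sketched as follows. The temporary file name is a placeholder; the example only relies on os.open(), os.lseek(), os.read() and os.close() from the standard library.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "fd.bin")  # placeholder file name
    with open(path, "wb") as f:
        f.write(b"payload")

    fd = os.open(path, os.O_RDONLY)
    try:
        # Wrap the existing descriptor; closefd=False leaves it open on close().
        raw = io.FileIO(fd, "r", closefd=False)
        name_is_fd = raw.name == fd   # no name given, so name is the descriptor
        data = raw.read()
        raw.close()

        # The descriptor is still usable after the FileIO object was closed.
        os.lseek(fd, 0, os.SEEK_SET)
        reread = os.read(fd, 7)
    finally:
        os.close(fd)

print(name_is_fd, data, reread)
```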
Buffered I/O streams provide a higher-level interface to an I/O device than raw I/O does.
A binary stream using an in-memory bytes buffer. It inherits from
BufferedIOBase
. The buffer is discarded when the
close()
method is called.
The optional argument initial_bytes is a bytes-like object that contains initial data.
BytesIO
provides or overrides these methods in addition to those
from BufferedIOBase
and IOBase
:
Return a readable and writable view over the contents of the buffer without copying them. Also, mutating the view will transparently update the contents of the buffer:
>>> b = io.BytesIO(b"abcdef")
>>> view = b.getbuffer()
>>> view[2:4] = b"56"
>>> b.getvalue()
b'ab56ef'
Note
As long as the view exists, the BytesIO
object cannot be
resized or closed.
Added in version 3.2.
In BytesIO
, this is the same as read()
.
Changed in version 3.7: The size argument is now optional.
In BytesIO
, this is the same as readinto()
.
Added in version 3.5.
A buffered binary stream providing higher-level access to a readable, non
seekable RawIOBase
raw binary stream. It inherits from
BufferedIOBase
.
When reading data from this object, a larger amount of data may be requested from the underlying raw stream, and kept in an internal buffer. The buffered data can then be returned directly on subsequent reads.
The constructor creates a BufferedReader
for the given readable
raw stream and buffer_size. If buffer_size is omitted,
DEFAULT_BUFFER_SIZE
is used.
BufferedReader
provides or overrides these methods in addition to
those from BufferedIOBase
and IOBase
:
Return bytes from the stream without advancing the position. The number of bytes returned may be less or more than requested. If the underlying raw stream is non-blocking and the operation would block, returns empty bytes.
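A small sketch of peeking, wrapping a BytesIO object in place of a real raw stream. Since the number of bytes returned may differ from the number requested, the example only inspects the prefix of the result.

```python
import io

reader = io.BufferedReader(io.BytesIO(b"abcdef"))

peeked = reader.peek(2)   # may return more (or fewer) than 2 bytes
first = reader.read(2)    # the position was not advanced by peek()
rest = reader.read()

print(peeked[:2], first, rest)
```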
In BufferedReader
this is the same as io.BufferedIOBase.read()
In BufferedReader
this is the same as io.BufferedIOBase.read1()
Changed in version 3.7: The size argument is now optional.
A buffered binary stream providing higher-level access to a writeable, non
seekable RawIOBase
raw binary stream. It inherits from
BufferedIOBase
.
When writing to this object, data is normally placed into an internal
buffer. The buffer will be written out to the underlying RawIOBase
object under various conditions, including:
when the buffer gets too small for all pending data;
when flush()
is called;
when a seek()
is requested (for BufferedRandom
objects);
when the BufferedWriter
object is closed or destroyed.
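The buffering behaviour listed above can be observed by using a BytesIO object as the underlying stream and inspecting it before and after flush(). This is a sketch; the buffer size of 16 is an arbitrary small value chosen for the illustration.

```python
import io

raw = io.BytesIO()
writer = io.BufferedWriter(raw, buffer_size=16)

writer.write(b"abc")            # small write: held in the internal buffer
before_flush = raw.getvalue()

writer.flush()                  # forces the buffered bytes out
after_flush = raw.getvalue()

print(before_flush, after_flush)
```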
The constructor creates a BufferedWriter
for the given writeable
raw stream. If the buffer_size is not given, it defaults to
DEFAULT_BUFFER_SIZE
.
BufferedWriter
provides or overrides these methods in addition to
those from BufferedIOBase
and IOBase
:
Force bytes held in the buffer into the raw stream. A
BlockingIOError
should be raised if the raw stream blocks.
Write the bytes-like object, b, and return the
number of bytes written. When in non-blocking mode, a
BlockingIOError
with BlockingIOError.characters_written
set
is raised if the buffer needs to be written out but the raw stream blocks.
A buffered binary stream providing higher-level access to a seekable
RawIOBase
raw binary stream. It inherits from BufferedReader
and BufferedWriter
.
The constructor creates a reader and writer for a seekable raw stream, given
in the first argument. If the buffer_size is omitted it defaults to
DEFAULT_BUFFER_SIZE
.
BufferedRandom
is capable of anything BufferedReader
or
BufferedWriter
can do. In addition, seek()
and
tell()
are guaranteed to be implemented.
A buffered binary stream providing higher-level access to two non seekable
RawIOBase
raw binary streams---one readable, the other writeable.
It inherits from BufferedIOBase
.
reader and writer are RawIOBase
objects that are readable and
writeable respectively. If the buffer_size is omitted it defaults to
DEFAULT_BUFFER_SIZE
.
BufferedRWPair
implements all of BufferedIOBase
's methods
except for detach()
, which raises
UnsupportedOperation
.
Warning
BufferedRWPair
does not attempt to synchronize accesses to
its underlying raw streams. You should not pass it the same object
as reader and writer; use BufferedRandom
instead.
Base class for text streams. This class provides a character and line based
interface to stream I/O. It inherits from IOBase
.
TextIOBase
provides or overrides these data attributes and
methods in addition to those from IOBase
:
The name of the encoding used to decode the stream's bytes into strings, and to encode strings into bytes.
The error setting of the decoder or encoder.
A string, a tuple of strings, or None
, indicating the newlines
translated so far. Depending on the implementation and the initial
constructor flags, this may not be available.
The underlying binary buffer (a BufferedIOBase
or RawIOBase
instance) that TextIOBase
deals with.
This is not part of the TextIOBase
API and may not exist
in some implementations.
Separate the underlying binary buffer from the TextIOBase
and
return it.
After the underlying buffer has been detached, the TextIOBase
is
in an unusable state.
Some TextIOBase
implementations, like StringIO
, may not
have the concept of an underlying buffer and calling this method will
raise UnsupportedOperation
.
Added in version 3.1.
Read and return at most size characters from the stream as a single
str
. If size is negative or None
, reads until EOF.
Read until newline or EOF and return a single str
. If the stream is
already at EOF, an empty string is returned.
If size is specified, at most size characters will be read.
Change the stream position to the given offset. Behaviour depends on
the whence parameter. The default value for whence is
SEEK_SET
.
SEEK_SET
or 0
: seek from the start of the stream
(the default); offset must either be a number returned by
TextIOBase.tell()
, or zero. Any other offset value
produces undefined behaviour.
SEEK_CUR
or 1
: "seek" to the current position;
offset must be zero, which is a no-operation (all other values
are unsupported).
SEEK_END
or 2
: seek to the end of the stream;
offset must be zero (all other values are unsupported).
Return the new absolute position as an opaque number.
Added in version 3.1: The SEEK_* constants.
Return the current stream position as an opaque number. The number does not usually represent a number of bytes in the underlying binary storage.
Write the string s to the stream and return the number of characters written.
A buffered text stream providing higher-level access to a
BufferedIOBase
buffered binary stream. It inherits from
TextIOBase
.
encoding gives the name of the encoding that the stream will be decoded or
encoded with. In UTF-8 Mode, this defaults to UTF-8.
Otherwise, it defaults to locale.getencoding()
.
encoding="locale"
can be used to specify the current locale's encoding
explicitly. See Text Encoding for more information.
errors is an optional string that specifies how encoding and decoding
errors are to be handled. Pass 'strict'
to raise a ValueError
exception if there is an encoding error (the default of None
has the same
effect), or pass 'ignore'
to ignore errors. (Note that ignoring encoding
errors can lead to data loss.) 'replace'
causes a replacement marker
(such as '?'
) to be inserted where there is malformed data.
'backslashreplace'
causes malformed data to be replaced by a
backslashed escape sequence. When writing, 'xmlcharrefreplace'
(replace with the appropriate XML character reference) or 'namereplace'
(replace with \N{...}
escape sequences) can be used. Any other error
handling name that has been registered with
codecs.register_error()
is also valid.
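To make the error handlers concrete, the sketch below decodes a byte string that is not valid UTF-8 under several settings. The helper name decode_with and the sample bytes are made up for this illustration.

```python
import io

data = b"caf\xe9"  # Latin-1 encoded text; the last byte is invalid as UTF-8


def decode_with(errors):
    """Decode the sample bytes through a TextIOWrapper with the given handler."""
    wrapper = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8", errors=errors)
    return wrapper.read()


try:
    decode_with("strict")
    strict_failed = False
except ValueError:  # UnicodeDecodeError is a subclass of ValueError
    strict_failed = True

ignored = decode_with("ignore")             # the bad byte is dropped (data loss)
replaced = decode_with("replace")           # a replacement marker is inserted
escaped = decode_with("backslashreplace")   # a backslashed escape is inserted

print(strict_failed, repr(ignored), repr(replaced), repr(escaped))
```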
newline controls how line endings are handled. It can be None
,
''
, '\n'
, '\r'
, and '\r\n'
. It works as follows:
When reading input from the stream, if newline is None
,
universal newlines mode is enabled. Lines in the input can end in
'\n'
, '\r'
, or '\r\n'
, and these are translated into '\n'
before being returned to the caller. If newline is ''
, universal
newlines mode is enabled, but line endings are returned to the caller
untranslated. If newline has any of the other legal values, input lines
are only terminated by the given string, and the line ending is returned to
the caller untranslated.
When writing output to the stream, if newline is None
, any '\n'
characters written are translated to the system default line separator,
os.linesep
. If newline is ''
or '\n'
, no translation
takes place. If newline is any of the other legal values, any '\n'
characters written are translated to the given string.
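The newline rules for reading and writing can be sketched with TextIOWrapper over in-memory bytes. The sample data mixes all three line-ending styles.

```python
import io

raw = b"one\r\ntwo\rthree\n"

# newline=None: universal newlines, every ending is translated to "\n".
translated = io.TextIOWrapper(io.BytesIO(raw), encoding="ascii", newline=None).read()

# newline="": universal newlines, but endings are returned untranslated.
untranslated = io.TextIOWrapper(io.BytesIO(raw), encoding="ascii", newline="").read()

# On output, any other legal value replaces each "\n" that is written.
sink = io.BytesIO()
writer = io.TextIOWrapper(sink, encoding="ascii", newline="\r\n")
writer.write("x\ny\n")
writer.flush()
written = sink.getvalue()

print(repr(translated), repr(untranslated), written)
```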
If line_buffering is True
, flush()
is implied when a call to
write contains a newline character or a carriage return.
If write_through is True
, calls to write()
are guaranteed
not to be buffered: any data written on the TextIOWrapper
object is immediately handled to its underlying binary buffer.
Changed in version 3.3: The write_through argument has been added.
Changed in version 3.3: The default encoding is now locale.getpreferredencoding(False) instead of locale.getpreferredencoding(). Don't temporarily change the locale encoding using locale.setlocale(); use the current locale encoding instead of the user preferred encoding.
Changed in version 3.10: The encoding argument now supports the "locale" dummy encoding name.
TextIOWrapper
provides these data attributes and methods in
addition to those from TextIOBase
and IOBase
:
Whether line buffering is enabled.
Whether writes are passed immediately to the underlying binary buffer.
Added in version 3.7.
Reconfigure this text stream using new settings for encoding, errors, newline, line_buffering and write_through.
Parameters not specified keep current settings, except
errors='strict'
is used when encoding is specified but
errors is not specified.
It is not possible to change the encoding or newline if some data has already been read from the stream. On the other hand, changing encoding after write is possible.
This method does an implicit stream flush before setting the new parameters.
Added in version 3.7.
Changed in version 3.11: The method supports the encoding="locale" option.
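A sketch of reconfiguring a text stream after data has been written, assuming Python 3.7 or later. The stream is created with newline="\n" so that the bytes produced are the same on every platform.

```python
import io

sink = io.BytesIO()
text = io.TextIOWrapper(sink, encoding="utf-8", newline="\n")

text.write("é")  # encoded as UTF-8
# Changing the encoding after a write is allowed; pending data is flushed first.
text.reconfigure(encoding="latin-1", line_buffering=True)
text.write("é\n")  # encoded as Latin-1
text.flush()

result = sink.getvalue()
errors_after = text.errors  # 'strict', since encoding was given without errors
line_buffered = text.line_buffering

print(result, errors_after, line_buffered)
```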
Set the stream position.
Return the new stream position as an int
.
Four operations are supported, given by the following argument combinations:
seek(0, SEEK_SET)
: Rewind to the start of the stream.
seek(cookie, SEEK_SET)
: Restore a previous position;
cookie must be a number returned by tell()
.
seek(0, SEEK_END)
: Fast-forward to the end of the stream.
seek(0, SEEK_CUR)
: Leave the current stream position unchanged.
Any other argument combinations are invalid, and may raise exceptions.
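The supported combinations can be sketched as follows. The value returned by tell() is treated as an opaque cookie and is only ever passed back to seek().

```python
import io

payload = "héllo\nwörld\n".encode("utf-8")
text = io.TextIOWrapper(io.BytesIO(payload), encoding="utf-8", newline="\n")

first = text.readline()
cookie = text.tell()             # opaque position after the first line
second = text.readline()

text.seek(cookie, io.SEEK_SET)   # restore the saved position
second_again = text.readline()

text.seek(0, io.SEEK_END)        # fast-forward to the end
at_end = text.read()

text.seek(0, io.SEEK_SET)        # rewind to the start
first_again = text.readline()

print(repr(first), repr(second), repr(second_again), repr(at_end))
```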
See also
os.SEEK_SET
, os.SEEK_CUR
, and os.SEEK_END
.
A text stream using an in-memory text buffer. It inherits from
TextIOBase
.
The text buffer is discarded when the close()
method is
called.
The initial value of the buffer can be set by providing initial_value.
If newline translation is enabled, newlines will be encoded as if by
write()
. The stream is positioned at the start of the
buffer which emulates opening an existing file in a w+
mode, making it
ready for an immediate write from the beginning or for a write that
would overwrite the initial value. To emulate opening a file in an a+
mode ready for appending, use f.seek(0, io.SEEK_END)
to reposition the
stream at the end of the buffer.
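The effect of the initial stream position can be seen in a short sketch: a write immediately after construction overwrites the start of the initial value, while seeking to the end first appends.

```python
import io

buf = io.StringIO("hello")

# The stream starts at position 0, so this overwrites the first character.
buf.write("J")
overwritten = buf.getvalue()

# Move to the end of the buffer to append instead.
buf.seek(0, io.SEEK_END)
buf.write("!")
appended = buf.getvalue()

print(overwritten, appended)
```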
The newline argument works like that of TextIOWrapper
,
except that when writing output to the stream, if newline is None
,
newlines are written as \n
on all platforms.
StringIO
provides this method in addition to those from
TextIOBase
and IOBase
:
Return a str
containing the entire contents of the buffer.
Newlines are decoded as if by read()
, although
the stream position is not changed.
Example usage:
import io
output = io.StringIO()
output.write('First line.\n')
print('Second line.', file=output)
# Retrieve file contents -- this will be
# 'First line.\nSecond line.\n'
contents = output.getvalue()
# Close object and discard memory buffer --
# .getvalue() will now raise an exception.
output.close()
A helper codec that decodes newlines for universal newlines mode.
It inherits from codecs.IncrementalDecoder
.
This section discusses the performance of the provided concrete I/O implementations.
By reading and writing only large chunks of data even when the user asks for a single byte, buffered I/O hides any inefficiency in calling and executing the operating system's unbuffered I/O routines. The gain depends on the OS and the kind of I/O which is performed. For example, on some modern OSes such as Linux, unbuffered disk I/O can be as fast as buffered I/O. The bottom line, however, is that buffered I/O offers predictable performance regardless of the platform and the backing device. Therefore, it is almost always preferable to use buffered I/O rather than unbuffered I/O for binary data.
Text I/O over a binary storage (such as a file) is significantly slower than
binary I/O over the same storage, because it requires conversions between
unicode and binary data using a character codec. This can become noticeable
handling huge amounts of text data like large log files. Also,
tell()
and seek()
are both quite slow
due to the reconstruction algorithm used.
StringIO
, however, is a native in-memory unicode container and will
exhibit similar speed to BytesIO
.
FileIO
objects are thread-safe to the extent that the operating system
calls (such as read(2) under Unix) they wrap are thread-safe too.
Binary buffered objects (instances of BufferedReader
,
BufferedWriter
, BufferedRandom
and BufferedRWPair
)
protect their internal structures using a lock; it is therefore safe to call
them from multiple threads at once.
TextIOWrapper
objects are not thread-safe.
Binary buffered objects (instances of BufferedReader
,
BufferedWriter
, BufferedRandom
and BufferedRWPair
)
are not reentrant. While reentrant calls will not happen in normal situations,
they can arise from doing I/O in a signal
handler. If a thread tries to
re-enter a buffered object which it is already accessing, a RuntimeError
is raised. Note this doesn't prohibit a different thread from entering the
buffered object.
The above implicitly extends to text files, since the open()
function
will wrap a buffered object inside a TextIOWrapper
. This includes
standard streams and therefore affects the built-in print()
function as
well.