Data IO (Python functions)
Contents
- Data IO (Python Functions)
- class tf.python_io.TFRecordWriter
- tf.python_io.tf_record_iterator(path)
- TFRecords Format Details
Data IO (Python Functions)
A TFRecords file represents a sequence of (binary) strings. The format is not random access, so it is suitable for streaming large amounts of data but not suitable if fast sharding or other non-sequential access is desired.
class tf.python_io.TFRecordWriter 
A class to write records to a TFRecords file.
This class implements __enter__ and __exit__, and can be used
in with blocks like a normal file.
tf.python_io.TFRecordWriter.__init__(path) 
Opens file path and creates a TFRecordWriter writing to it.
Args:
- path: The path to the TFRecords file.
Raises:
- IOError: If path cannot be opened for writing.
tf.python_io.TFRecordWriter.write(record) 
Write a string record to the file.
Args:
- record: The string to write as a single record (str).
tf.python_io.TFRecordWriter.close() 
Close the file.
tf.python_io.tf_record_iterator(path) 
An iterator that reads the records from a TFRecords file.
Args:
- path: The path to the TFRecords file.
Yields:
Strings.
Raises:
- IOError: If path cannot be opened for reading.
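To illustrate what sequential iteration over such a file involves, here is a minimal pure-Python sketch based on the record layout given under TFRecords Format Details. It is not the library's implementation: iter_records and frame_without_crc are hypothetical helper names, the integer fields are assumed to be little-endian, and the CRC fields are skipped rather than verified.

```python
import io
import struct


def iter_records(stream):
    """Yield the payload of each record from a binary stream, in file order.

    Hypothetical helper, not part of tf.python_io. Assumes little-endian
    integer fields and does not verify either CRC field.
    """
    while True:
        # Header: uint64 length followed by uint32 masked CRC of the length.
        header = stream.read(12)
        if not header:
            return  # clean end of stream
        if len(header) < 12:
            raise IOError("truncated record header")
        (length,) = struct.unpack("<Q", header[:8])
        data = stream.read(length)
        footer = stream.read(4)  # uint32 masked CRC of the data (ignored here)
        if len(data) < length or len(footer) < 4:
            raise IOError("truncated record body")
        yield data


def frame_without_crc(data):
    """Frame a payload with zeroed CRC fields, for demonstration only."""
    return struct.pack("<QI", len(data), 0) + data + struct.pack("<I", 0)


demo = io.BytesIO(frame_without_crc(b"first") + frame_without_crc(b"second"))
records = list(iter_records(demo))
print(records)  # [b'first', b'second']
```

Because each record's length is only known after reading its header, reaching the n-th record means walking through all the records before it, which is why the format suits streaming rather than random access. Records framed with zeroed CRC fields, as in this demonstration, would not be expected to pass a reader that checks the checksums.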
TFRecords Format Details
A TFRecords file contains a sequence of strings with CRC hashes. Each record has the format
uint64 length
uint32 masked_crc32_of_length
byte   data[length]
uint32 masked_crc32_of_data
and the records are concatenated together to produce the file. The CRC32s are described here, and the mask of a CRC is
masked_crc = ((crc >> 15) | (crc << 17)) + 0xa282ead8ul
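The masking formula and record layout above can be sketched in pure Python as follows. This is an illustration under stated assumptions, not the library's code: it assumes the checksum is CRC-32C (the Castagnoli polynomial), that integer fields are little-endian, that the length CRC is computed over the 8 encoded length bytes, and that all arithmetic wraps at 32 bits. The names crc32c, mask, and encode_record are hypothetical.

```python
import struct

MASK_DELTA = 0xA282EAD8  # constant from the masking formula


def _build_table():
    # Lookup table for the reflected CRC-32C (Castagnoli) polynomial.
    # Assumption: this is the CRC variant the format uses.
    poly = 0x82F63B78
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _build_table()


def crc32c(data):
    """Return the CRC-32C checksum of a bytes object."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def mask(crc):
    """Rotate right by 15 bits and add the delta, wrapping at 32 bits."""
    rotated = ((crc >> 15) | (crc << 17)) & 0xFFFFFFFF
    return (rotated + MASK_DELTA) & 0xFFFFFFFF


def encode_record(data):
    """Frame one payload as length, masked CRC of length, data, masked CRC of data."""
    length = struct.pack("<Q", len(data))  # assumed little-endian
    return (
        length
        + struct.pack("<I", mask(crc32c(length)))
        + data
        + struct.pack("<I", mask(crc32c(data)))
    )


print(hex(crc32c(b"123456789")))  # 0xe3069283, the standard CRC-32C check value
record = encode_record(b"hello")
print(len(record))  # 21 = 8 + 4 + 5 + 4
```

Concatenating the output of encode_record for each payload would give a byte stream in the layout described above. The explicit & 0xFFFFFFFF steps stand in for the fixed-width unsigned arithmetic that the C-style formula implies, since Python integers do not overflow.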