
class ZipHeader

Maturity: 34

A dataclass representing one central directory file header of a ZIP archive, holding the metadata for a single archived file.

File: /tf/active/vicechatdev/rmcl/zipdir.py
Lines: 16 - 47
Complexity: moderate

Purpose

This class models the ZIP file format's central directory header (signature 0x02014b50), which stores metadata for each file in a ZIP archive. It provides structured access to file attributes like compression method, CRC checksum, file sizes, timestamps, and file names. The primary use case is parsing ZIP archive central directory entries to extract file metadata without decompressing the actual file data.

Source Code

@dataclass
class ZipHeader:
    version_made: int
    version_read: int
    flags: int
    compression: int
    datetime_info: bytes
    crc: int
    compressed_size: int
    uncompressed_size: int
    filename_length: int
    extra_field_length: int
    file_comment_length: int
    disk_number: int
    internal_attr: bytes
    external_attr: bytes
    header_offset: int

    filename: Optional[bytes] = None
    extra_field: Optional[bytes] = None
    file_comment: Optional[bytes] = None

    @classmethod
    def from_stream(cls, stream):
        signature = stream.read(4)
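        if signature != b'\x50\x4b\x01\x02':
            return None

        # unpack() and FIXED_HEADER_FMT come from elsewhere in the module (not shown here);
        # unlike struct.unpack, this unpack() is handed the stream itself.
        obj = cls(*unpack(FIXED_HEADER_FMT, stream))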
        obj.filename = stream.read(obj.filename_length)
        obj.extra_field = stream.read(obj.extra_field_length)
        obj.file_comment = stream.read(obj.file_comment_length)
        return obj

Parameters

Name Type Default Kind
bases - -

Parameter Details

version_made: Version of ZIP specification used to create the archive (2 bytes)

version_read: Minimum version needed to extract the file (2 bytes)

flags: General purpose bit flags indicating encryption, compression options, etc. (2 bytes)

compression: Compression method used (0=stored, 8=deflated, etc.) (2 bytes)

datetime_info: Raw MS-DOS date and time bytes (4 bytes total)

crc: CRC-32 checksum of uncompressed file data (4 bytes)

compressed_size: Size of compressed file data in bytes (4 bytes)

uncompressed_size: Size of uncompressed file data in bytes (4 bytes)

filename_length: Length of the filename field in bytes (2 bytes)

extra_field_length: Length of the extra field in bytes (2 bytes)

file_comment_length: Length of the file comment in bytes (2 bytes)

disk_number: Disk number where file starts in multi-disk archives (2 bytes)

internal_attr: Internal file attributes (2 bytes)

external_attr: External file attributes (4 bytes, OS-dependent)

header_offset: Offset of local file header from start of archive (4 bytes)

filename: Optional bytes containing the actual filename (read from stream)

extra_field: Optional bytes containing extra field data (read from stream)

file_comment: Optional bytes containing file comment (read from stream)
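
The field sizes listed above add up to 42 fixed bytes after the 4-byte signature. As a minimal sketch, assuming little-endian packing and the struct codes shown (the module's own FIXED_HEADER_FMT is not part of this excerpt, so the codes here are an assumption derived from the sizes above), the layout can be written down and checked like this:

```python
import struct

# Assumed mapping from dataclass attribute to struct code, in header order.
# H = 2-byte unsigned int, I = 4-byte unsigned int, Ns = N raw bytes.
FIELDS = [
    ("version_made", "H"),
    ("version_read", "H"),
    ("flags", "H"),
    ("compression", "H"),
    ("datetime_info", "4s"),
    ("crc", "I"),
    ("compressed_size", "I"),
    ("uncompressed_size", "I"),
    ("filename_length", "H"),
    ("extra_field_length", "H"),
    ("file_comment_length", "H"),
    ("disk_number", "H"),
    ("internal_attr", "2s"),
    ("external_attr", "4s"),
    ("header_offset", "I"),
]

# "<" selects little-endian byte order with no alignment padding.
FIXED_HEADER_FMT = "<" + "".join(code for _, code in FIELDS)

# Sanity checks: one unpacked value per attribute, 42 bytes in total.
values = struct.unpack(FIXED_HEADER_FMT, bytes(struct.calcsize(FIXED_HEADER_FMT)))
print(FIXED_HEADER_FMT, struct.calcsize(FIXED_HEADER_FMT), len(values))
```

Building the format string from a field list keeps the number of struct codes in step with the number of positional dataclass fields, which is what cls(*...) in from_stream relies on.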

Return Value

Calling the constructor directly returns a ZipHeader instance built from the supplied field values. The from_stream class method returns either a ZipHeader populated from the stream, or None if the first four bytes do not match the central directory header signature (0x02014b50, stored on disk as the bytes 50 4B 01 02).

Class Interface

Methods

from_stream(cls, stream) -> Optional[ZipHeader]

Purpose: Class method that reads and parses a ZIP central directory header from a binary stream

Parameters:

  • cls: The class itself (automatically passed for class methods)
  • stream: A file-like object opened in binary mode, positioned at the start of a central directory header

Returns: A ZipHeader instance populated with data from the stream if the signature matches (0x02014b50), or None if the signature doesn't match
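
Because a mismatch yields None rather than an exception, the return value can act as a loop terminator when reading consecutive entries. The helper below is a hypothetical sketch (iter_headers is not part of the documented module); it accepts any parser with the same contract as from_stream, so ZipHeader.from_stream could be passed as parse.

```python
import io

def iter_headers(stream, parse):
    """Yield parsed headers until `parse` reports a signature mismatch (None)."""
    while True:
        header = parse(stream)
        if header is None:
            # Either a different record follows or the stream is exhausted.
            break
        yield header

# Stand-in parser for demonstration only: 4-byte signature plus 1 payload byte.
def fake_parse(stream):
    if stream.read(4) != b"PK\x01\x02":
        return None
    return stream.read(1)

data = io.BytesIO(b"PK\x01\x02a" + b"PK\x01\x02b" + b"PK\x05\x06")
entries = list(iter_headers(data, fake_parse))
print(entries)
```

Note that a parser following the from_stream pattern has already consumed the four signature bytes when it returns None, so the stream is not left at the start of the next record.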

Attributes

Name | Type | Description | Scope
version_made | int | Version of ZIP specification used to create the archive | instance
version_read | int | Minimum version needed to extract the file | instance
flags | int | General purpose bit flags for encryption and compression options | instance
compression | int | Compression method identifier (0=stored, 8=deflated) | instance
datetime_info | bytes | Raw MS-DOS date and time information (4 bytes) | instance
crc | int | CRC-32 checksum of the uncompressed file data | instance
compressed_size | int | Size of the compressed file data in bytes | instance
uncompressed_size | int | Size of the uncompressed file data in bytes | instance
filename_length | int | Length of the filename field in bytes | instance
extra_field_length | int | Length of the extra field in bytes | instance
file_comment_length | int | Length of the file comment in bytes | instance
disk_number | int | Disk number where file starts (for multi-disk archives) | instance
internal_attr | bytes | Internal file attributes (2 bytes) | instance
external_attr | bytes | External file attributes (4 bytes, OS-dependent) | instance
header_offset | int | Offset of the local file header from the start of the archive | instance
filename | Optional[bytes] | The actual filename as bytes, populated by from_stream | instance
extra_field | Optional[bytes] | Extra field data as bytes, populated by from_stream | instance
file_comment | Optional[bytes] | File comment as bytes, populated by from_stream | instance
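
Since datetime_info is kept as four raw bytes, callers have to unpack the MS-DOS encoding themselves. The following is a minimal sketch, assuming the bytes hold the 2-byte time word followed by the 2-byte date word in little-endian order, as in the ZIP central directory layout; decode_dos_datetime is a hypothetical helper, not part of the class.

```python
import struct
from datetime import datetime

def decode_dos_datetime(raw: bytes) -> datetime:
    """Convert 4 raw MS-DOS time/date bytes into a naive datetime."""
    dos_time, dos_date = struct.unpack("<HH", raw)
    return datetime(
        year=((dos_date >> 9) & 0x7F) + 1980,  # years are counted from 1980
        month=(dos_date >> 5) & 0x0F,
        day=dos_date & 0x1F,
        hour=(dos_time >> 11) & 0x1F,
        minute=(dos_time >> 5) & 0x3F,
        second=(dos_time & 0x1F) * 2,          # stored with 2-second resolution
    )

# Build the raw bytes for 2021-03-04 05:06:08 and decode them again.
dos_date = ((2021 - 1980) << 9) | (3 << 5) | 4
dos_time = (5 << 11) | (6 << 5) | (8 // 2)
sample = struct.pack("<HH", dos_time, dos_date)
print(decode_dos_datetime(sample))
```

A zeroed or otherwise invalid field (for example month 0) makes datetime raise ValueError, so real callers may want to catch that.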

Dependencies

  • dataclasses
  • struct
  • typing

Required Imports

from dataclasses import dataclass
import struct
from typing import Optional

Usage Example

from dataclasses import dataclass
import struct
from typing import Optional
import io

# Define the format string for the fixed header portion
# 15 fields, 42 bytes, little-endian (one struct code per positional dataclass field)
FIXED_HEADER_FMT = '<HHHH4sIIIHHHH2s4sI'

@dataclass
class ZipHeader:
    version_made: int
    version_read: int
    flags: int
    compression: int
    datetime_info: bytes
    crc: int
    compressed_size: int
    uncompressed_size: int
    filename_length: int
    extra_field_length: int
    file_comment_length: int
    disk_number: int
    internal_attr: bytes
    external_attr: bytes
    header_offset: int
    filename: Optional[bytes] = None
    extra_field: Optional[bytes] = None
    file_comment: Optional[bytes] = None

    @classmethod
    def from_stream(cls, stream):
        signature = stream.read(4)
        if signature != b'\x50\x4b\x01\x02':
            return None
        obj = cls(*struct.unpack(FIXED_HEADER_FMT, stream.read(struct.calcsize(FIXED_HEADER_FMT))))
        obj.filename = stream.read(obj.filename_length)
        obj.extra_field = stream.read(obj.extra_field_length)
        obj.file_comment = stream.read(obj.file_comment_length)
        return obj

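# Usage example
with open('archive.zip', 'rb') as f:
    # Seek to the central directory first (its offset is stored in the
    # end-of-central-directory record). At offset 0 a typical archive starts
    # with a local file header, so from_stream would return None there.
    header = ZipHeader.from_stream(f)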
    if header:
        print(f"File: {header.filename.decode('utf-8')}")
        print(f"Compressed size: {header.compressed_size}")
        print(f"Uncompressed size: {header.uncompressed_size}")
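
The example above depends on the stream already being positioned at the central directory. One way to find that position is to read the end-of-central-directory record at the tail of the file. The sketch below is an assumption-laden illustration: find_central_directory is a hypothetical helper, it ignores ZIP64 archives and archives with data prepended, and it uses the standard zipfile module only to build a test archive in memory.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"
EOCD_FMT = "<HHHHIIH"                           # fields that follow the signature
EOCD_MIN_SIZE = 4 + struct.calcsize(EOCD_FMT)   # 22 bytes without a comment
MAX_COMMENT = 0xFFFF                            # comment length is a 2-byte field

def find_central_directory(stream):
    """Return (offset, size, entry_count) of the central directory, or None."""
    stream.seek(0, io.SEEK_END)
    file_size = stream.tell()
    # The record is the last structure in the file, followed only by its comment.
    tail_len = min(file_size, EOCD_MIN_SIZE + MAX_COMMENT)
    stream.seek(file_size - tail_len)
    tail = stream.read(tail_len)
    pos = tail.rfind(EOCD_SIGNATURE)
    if pos < 0 or pos + EOCD_MIN_SIZE > len(tail):
        return None
    (_disk, _cd_disk, _disk_entries, total_entries,
     cd_size, cd_offset, _comment_len) = struct.unpack_from(EOCD_FMT, tail, pos + 4)
    return cd_offset, cd_size, total_entries

# Build a one-file archive in memory and locate its central directory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("hello.txt", "hi")

offset, size, count = find_central_directory(buf)
buf.seek(offset)
first_signature = buf.read(4)
print(count, first_signature)
```

After buf.seek(offset), the stream is at the first central directory header, which is the position from_stream expects.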

Best Practices

  • Always use the from_stream class method to instantiate ZipHeader objects rather than calling the constructor directly, as it handles signature validation and variable-length field reading
  • Check whether from_stream returned None before accessing the object; it returns None when the signature doesn't match, and by then the four signature bytes have already been consumed from the stream
  • Ensure the stream is positioned at the start of a central directory header before calling from_stream
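  • The FIXED_HEADER_FMT constant must be defined in the module scope and must describe the 42 fixed bytes that follow the signature: little-endian, one struct code per field, in the same order as the dataclass attributes
  • filename, extra_field, and file_comment are bytes objects. filename and file_comment need decoding for display: UTF-8 when bit 11 (0x0800) of flags is set, otherwise traditionally CP437. extra_field is structured binary data rather than text
  • Instances are mutable: the dataclass is not declared with frozen=True, and from_stream itself assigns filename, extra_field, and file_comment after construction. Treat a parsed header as read-only by convention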
  • The stream must support read() operations and be opened in binary mode
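
The decoding advice above can be sketched as a small helper. decode_name is a hypothetical function, not part of the documented class; it assumes the convention that general purpose flag bit 11 marks UTF-8 names and that CP437 is the fallback otherwise.

```python
UTF8_FLAG = 0x0800  # general purpose bit 11: name and comment are UTF-8

def decode_name(raw: bytes, flags: int) -> str:
    """Decode a filename or comment according to the header's flags."""
    encoding = "utf-8" if flags & UTF8_FLAG else "cp437"
    return raw.decode(encoding)

# The same visible name, stored under each convention.
utf8_name = decode_name("é.txt".encode("utf-8"), flags=UTF8_FLAG)
cp437_name = decode_name(b"\x82.txt", flags=0)
print(utf8_name, cp437_name)
```

With a parsed header this would be called as decode_name(header.filename, header.flags). Some tools write names in other legacy code pages without setting the flag, so the CP437 fallback is a convention, not a guarantee.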

Similar Components

AI-powered semantic similarity - components with related functionality:

  • class CompactSection 49.8% similar

    A dataclass representing a section in compact format with an icon, title, content, and priority level.

    From: /tf/active/vicechatdev/e-ink-llm/compact_formatter.py
  • class DatabaseInfo 47.8% similar

    A dataclass that encapsulates complete database schema information including tables, columns, relationships, and metadata for a specific database instance.

    From: /tf/active/vicechatdev/full_smartstat/dynamic_schema_discovery.py
  • class TableInfo 45.0% similar

    A dataclass that encapsulates comprehensive metadata about a database table, including schema information, columns, keys, and data quality metrics.

    From: /tf/active/vicechatdev/full_smartstat/dynamic_schema_discovery.py
  • class Folder 44.7% similar

    Represents a folder item in a file system hierarchy, extending the Item base class with the ability to contain children and be uploaded as a ZIP archive.

    From: /tf/active/vicechatdev/rmcl/items.py
  • class DataSection 44.4% similar

    A dataclass representing a dedicated data analysis section that stores analysis results, plots, dataset information, and conclusions separately from text content.

    From: /tf/active/vicechatdev/vice_ai/models.py