function calculate_crc32c

Maturity: 35

Calculates a CRC32 checksum of input data and returns it as a base64-encoded string.

File: /tf/active/vicechatdev/e-ink-llm/cloudtest/simple_clean_root.py
Lines: 13-21
Complexity: simple

Purpose

This function computes a checksum for data integrity verification. Despite its name, it uses the standard CRC32 algorithm (binascii.crc32), not CRC32C. It accepts either string or bytes input, encodes strings as UTF-8, calculates the 32-bit CRC, packs it into 4 bytes in big-endian order, and base64-encodes the result. Checksums in this form are typically used in file transfers, cloud storage operations, or API calls where a checksum has to be transmitted as text. Because the algorithm is CRC32, the output will generally not match a value produced by a true CRC32C implementation.
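
As a sanity check on the steps described above, the following sketch runs them one at a time on the nine-byte ASCII string 123456789, whose standard CRC-32 check value is 0xCBF43926. The variable names are illustrative and are not taken from the source file.

```python
import base64
import binascii

data = b"123456789"

# Step 1: standard CRC-32 (not CRC32C) of the raw bytes.
crc = binascii.crc32(data) & 0xFFFFFFFF

# Step 2: pack the 32-bit value as 4 bytes, most significant byte first.
crc_bytes = crc.to_bytes(4, byteorder="big")

# Step 3: base64-encode the 4 bytes and decode the result to an ASCII str.
crc_b64 = base64.b64encode(crc_bytes).decode("ascii")

print(hex(crc))  # 0xcbf43926
print(crc_b64)   # y/Q5Jg==
```

A true CRC32C implementation would give a different value for the same input, which is a quick way to tell the two algorithms apart.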

Source Code

def calculate_crc32c(data):
    """Calculate CRC32C checksum and return as base64"""
    if isinstance(data, str):
        data = data.encode('utf-8')
    
    crc = binascii.crc32(data) & 0xffffffff
    crc_bytes = crc.to_bytes(4, byteorder='big')
    crc_b64 = base64.b64encode(crc_bytes).decode('ascii')
    return crc_b64

Parameters

Name    Type    Default    Kind
data    -       -          positional_or_keyword

Parameter Details

data: Input data to calculate the checksum for. Can be either a string (encoded as UTF-8 before the checksum is computed) or bytes. No size limit is enforced, but the whole input is processed in a single call, so very large inputs cost memory and time.
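
The string handling can be illustrated with a short sketch. It redefines the function so the block runs on its own; the sample text and variable names are assumptions chosen for illustration.

```python
import base64
import binascii

def calculate_crc32c(data):
    # Same logic as the documented function (standard CRC-32, base64 output).
    if isinstance(data, str):
        data = data.encode("utf-8")
    crc = binascii.crc32(data) & 0xFFFFFFFF
    return base64.b64encode(crc.to_bytes(4, byteorder="big")).decode("ascii")

text = "café"

# A str and its explicit UTF-8 encoding follow the same path, so they match.
from_str = calculate_crc32c(text)
from_utf8 = calculate_crc32c(text.encode("utf-8"))

# To checksum a different encoding, encode first and pass bytes.
from_latin1 = calculate_crc32c(text.encode("latin-1"))

print(from_str == from_utf8)  # True
print(from_str, from_latin1)
```

The Latin-1 bytes differ from the UTF-8 bytes for this text, so the two checksums are computed over different inputs and should not be expected to agree.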

Return Value

Returns a string containing the base64-encoded CRC32 checksum. The output is always an 8-character ASCII string: six base64 characters that carry the 4-byte CRC32 value, followed by two '=' padding characters. Example: 'iCEOYg=='
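
To inspect a returned value, the encoding can be reversed. The sketch below decodes the example string back to its four bytes and the underlying 32-bit integer; the variable names are illustrative.

```python
import base64

checksum_b64 = "iCEOYg=="  # example return value quoted in this section

# Undo the base64 step to recover the 4 big-endian CRC bytes.
raw = base64.b64decode(checksum_b64)

# Interpret the bytes as an unsigned 32-bit integer.
crc_value = int.from_bytes(raw, byteorder="big")

print(len(checksum_b64), len(raw))  # 8 4
print(f"{crc_value:#010x}")         # 0x88210e62
```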

Dependencies

  • binascii
  • base64

Required Imports

import binascii
import base64

Usage Example

import binascii
import base64

def calculate_crc32c(data):
    if isinstance(data, str):
        data = data.encode('utf-8')
    crc = binascii.crc32(data) & 0xffffffff
    crc_bytes = crc.to_bytes(4, byteorder='big')
    crc_b64 = base64.b64encode(crc_bytes).decode('ascii')
    return crc_b64

# Example usage with string
checksum1 = calculate_crc32c('Hello, World!')
print(f'Checksum: {checksum1}')

# Example usage with bytes
checksum2 = calculate_crc32c(b'Binary data')
print(f'Checksum: {checksum2}')

Best Practices

  • Note: Despite the function name 'calculate_crc32c', this uses standard CRC32 (binascii.crc32), not the CRC32C (Castagnoli) variant. For true CRC32C, use the 'crc32c' or 'google-crc32c' library.
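  • The function masks the CRC result with 0xffffffff to guarantee an unsigned 32-bit value. In Python 3, binascii.crc32 already returns an unsigned integer, so the mask has no effect there; it is a habit carried over from Python 2, where the result could be negative.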
  • Input strings are automatically converted to UTF-8 bytes. If you need a different encoding, convert to bytes before calling this function.
  • The function needs the entire input in memory at once, so it suits small to medium-sized data. For very large files, read the data in chunks and pass the running checksum as the second argument to binascii.crc32.
  • This checksum is not cryptographically secure and should not be used for security purposes. Use hashlib (SHA256, etc.) for security-related checksums.
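
For the large-file case mentioned in the list above, one possible approach is to feed the data to binascii.crc32 chunk by chunk, passing the running value as its second argument. The helper name crc32_b64_stream and the chunk size are hypothetical; this is a sketch, not code from the source file.

```python
import base64
import binascii
import io

def crc32_b64_stream(stream, chunk_size=64 * 1024):
    """Hypothetical helper: base64 CRC-32 of a binary stream, read in chunks."""
    crc = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Passing the running value continues the checksum across chunks.
        crc = binascii.crc32(chunk, crc)
    crc &= 0xFFFFFFFF
    return base64.b64encode(crc.to_bytes(4, byteorder="big")).decode("ascii")

payload = b"123456789" * 100_000

# Chunked result over an in-memory stream.
chunked = crc32_b64_stream(io.BytesIO(payload), chunk_size=4096)

# One-shot result over the same bytes, for comparison.
one_shot = base64.b64encode(
    (binascii.crc32(payload) & 0xFFFFFFFF).to_bytes(4, byteorder="big")
).decode("ascii")

print(chunked == one_shot)  # True
```

The same helper works with a file opened in binary mode, for example `with open(path, "rb") as f: crc32_b64_stream(f)`, so only one chunk is held in memory at a time.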

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function compute_crc32c_header 78.3% similar

    Computes a CRC32C checksum for binary content and returns it as a base64-encoded string formatted for Google Cloud Storage x-goog-hash headers.

    From: /tf/active/vicechatdev/e-ink-llm/cloudtest/force_web_app_refresh.py
  • function calculate_file_hash_v1 54.4% similar

    Calculates the MD5 hash of a file by reading it in chunks to handle large files efficiently.

    From: /tf/active/vicechatdev/mailsearch/enhanced_document_comparison.py
  • function _int_to_bytes 44.2% similar

    Converts a signed integer to its little-endian byte representation, automatically determining the minimum number of bytes needed based on the integer's bit length.

    From: /tf/active/vicechatdev/patches/util.py
  • function bytes_to_unicode 43.7% similar

    Converts a bytes object to a Unicode string using UTF-8 encoding, or returns the input unchanged if it's not a bytes object.

    From: /tf/active/vicechatdev/patches/util.py
  • function calculate_file_hash 41.5% similar

    Calculates the MD5 hash of a file by reading it in chunks to handle large files efficiently.

    From: /tf/active/vicechatdev/mailsearch/compare_documents.py