Assets

assets

Asset downloading and management.

Handles concurrent downloading of web assets (images, CSS, JavaScript, fonts) with progress tracking and SHA-256 content deduplication. Provides AssetDownloader for asynchronous downloads with a configurable concurrency limit.
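
As a rough illustration of SHA-256 content deduplication, the sketch below hashes each payload and maps every URL to the first URL that served identical bytes. The helper names content_digest and dedupe_by_content are hypothetical and do not come from this module; how the module actually stores or compares digests is not documented here.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Return the hex SHA-256 digest of a downloaded payload."""
    return hashlib.sha256(data).hexdigest()


def dedupe_by_content(payloads: dict[str, bytes]) -> dict[str, str]:
    """Map each URL to the first URL seen with byte-identical content.

    Hypothetical helper: the real module may deduplicate differently.
    """
    first_seen: dict[str, str] = {}  # digest -> first URL with that content
    canonical: dict[str, str] = {}   # URL -> canonical URL
    for url, data in payloads.items():
        digest = content_digest(data)
        # setdefault keeps the earliest URL registered for this digest.
        canonical[url] = first_seen.setdefault(digest, url)
    return canonical
```

Two URLs that serve the same bytes map to one canonical entry, so the content only needs to be written to disk once.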

Classes

Asset dataclass

Asset(url: str, type: Literal['image', 'css', 'js', 'font'], original_src: str)

Represents an asset to download.

Attributes:

Name          Type                                   Description
url           str                                    Asset URL (absolute)
type          Literal['image', 'css', 'js', 'font']  Asset type (image, css, js, font)
original_src  str                                    Original src attribute from HTML
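
A minimal sketch of a dataclass matching the signature above, for readers who want to see the shape in code. The AssetType alias and the example URL are assumptions added for illustration; the real class may set additional dataclass options (for example frozen or slots) that are not shown in this reference.

```python
from dataclasses import dataclass
from typing import Literal

# Hypothetical alias, introduced here only to keep the field annotation short.
AssetType = Literal["image", "css", "js", "font"]


@dataclass
class Asset:
    """Stand-in mirroring the documented Asset fields."""

    url: str            # absolute asset URL
    type: AssetType     # one of: image, css, js, font
    original_src: str   # src attribute exactly as it appeared in the HTML


# Example values are illustrative only.
logo = Asset(
    url="https://example.com/static/logo.png",
    type="image",
    original_src="/static/logo.png",
)
```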

AssetStats dataclass

AssetStats(downloaded: int = 0, failed: int = 0, skipped: int = 0, total_bytes: int = 0, errors: dict[str, int] = dict())

Statistics for asset downloads.

Tracks counts of downloaded, failed, and skipped assets, the total number of bytes downloaded, and a dictionary of error counts.
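
The `dict()` default shown in the signature is most likely the rendered form of a per-instance default factory, since a dataclass cannot use a mutable default directly. The sketch below assumes that, and also assumes the errors dictionary is keyed by an error name; neither detail is confirmed by this reference. The record_failure helper is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AssetStats:
    """Stand-in mirroring the documented AssetStats fields."""

    downloaded: int = 0
    failed: int = 0
    skipped: int = 0
    total_bytes: int = 0
    # Assumption: each instance gets its own dict via default_factory.
    errors: dict[str, int] = field(default_factory=dict)


def record_failure(stats: AssetStats, error_name: str) -> None:
    """Hypothetical helper: count one failure under the given error name."""
    stats.failed += 1
    stats.errors[error_name] = stats.errors.get(error_name, 0) + 1
```

With a default factory, two AssetStats instances never share the same errors dictionary, so counts from separate runs stay independent.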

AssetDownloader

AssetDownloader(config: AssetConfig, output_manager: OutputManager, client: AsyncClient | None = None)

Downloads assets concurrently with error handling.

Features:

- Concurrent downloads with semaphore limiting
- Skip existing files (idempotent)
- Comprehensive error tracking
- Progress tracking via stats

Initialize asset downloader.

Parameters:

Name            Type                Description                                    Default
config          AssetConfig         AssetConfig from SusConfig                     required
output_manager  OutputManager       OutputManager instance (for path resolution)   required
client          AsyncClient | None  Optional HTTP client (for testing with mocks)  None

Functions

download_all async
download_all(assets: list[str]) -> AssetStats

Download all assets concurrently.

Parameters:

Name    Type       Description                     Default
assets  list[str]  List of asset URLs to download  required

Returns:

Type        Description
AssetStats  AssetStats with download results

Logic:

1. Filter out already downloaded assets
2. Create HTTP client if not provided
3. Create async tasks for each asset
4. Gather all tasks (use asyncio.gather with return_exceptions=True)
5. Update stats based on results
6. Close client if we created it
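
The six steps above can be sketched as a standalone coroutine. This is not the library's implementation: Stats and FakeClient are hypothetical stand-ins for AssetStats and the HTTP client, the already_downloaded set replaces whatever check the real code performs against existing files, and the limit parameter is an assumed name for the concurrency setting.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Stats:  # hypothetical stand-in for AssetStats
    downloaded: int = 0
    failed: int = 0
    skipped: int = 0
    total_bytes: int = 0
    errors: dict[str, int] = field(default_factory=dict)


class FakeClient:
    """Hypothetical stand-in for an async HTTP client."""

    def __init__(self) -> None:
        self.closed = False

    async def get(self, url: str) -> bytes:
        await asyncio.sleep(0)  # yield to the event loop, as real I/O would
        if "missing" in url:
            raise LookupError(f"not found: {url}")
        return b"payload"

    async def aclose(self) -> None:
        self.closed = True


async def download_all(
    urls: list[str],
    already_downloaded: set[str],
    client: Optional[FakeClient] = None,
    limit: int = 4,
) -> Stats:
    stats = Stats()
    # 1. Filter out already downloaded assets.
    pending = [u for u in urls if u not in already_downloaded]
    stats.skipped = len(urls) - len(pending)
    # 2. Create a client only if the caller did not provide one.
    owns_client = client is None
    if client is None:
        client = FakeClient()
    semaphore = asyncio.Semaphore(limit)  # caps in-flight downloads

    async def fetch(url: str) -> bytes:
        async with semaphore:
            return await client.get(url)

    try:
        # 3-4. Schedule one coroutine per asset and gather the results;
        # exceptions are returned as values instead of being raised.
        results = await asyncio.gather(
            *(fetch(u) for u in pending), return_exceptions=True
        )
        # 5. Update stats based on results.
        for result in results:
            if isinstance(result, BaseException):
                name = type(result).__name__
                stats.failed += 1
                stats.errors[name] = stats.errors.get(name, 0) + 1
            else:
                stats.downloaded += 1
                stats.total_bytes += len(result)
    finally:
        # 6. Close the client only if this call created it.
        if owns_client:
            await client.aclose()
    return stats
```

Passing return_exceptions=True means one failed asset does not cancel the rest of the batch, and tracking owns_client leaves an injected client (for example a test mock) open for the caller to reuse.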