Assets¶
assets¶
Asset downloading and management.
Handles concurrent downloading of web assets (images, CSS, JavaScript) with progress tracking and SHA-256 content deduplication. Provides AssetDownloader for async downloads with configurable concurrency limits.
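To illustrate what SHA-256 content deduplication can look like, here is a minimal sketch. The `ContentIndex` class and its `register` method are hypothetical names invented for this example and are not part of the documented API; the sketch only assumes that identical bytes should map to a single stored file.

```python
import hashlib


class ContentIndex:
    """Hypothetical helper: maps SHA-256 digests to the first path stored."""

    def __init__(self) -> None:
        self._by_digest: dict[str, str] = {}

    def register(self, content: bytes, path: str) -> tuple[str, bool]:
        """Return (canonical_path, is_new) for the given content."""
        digest = hashlib.sha256(content).hexdigest()
        if digest in self._by_digest:
            # Same bytes were seen before: reuse the earlier path.
            return self._by_digest[digest], False
        self._by_digest[digest] = path
        return path, True
```

With an index like this, two URLs that serve byte-identical content resolve to one file on disk, and only the first copy needs to be written.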
Classes¶
Asset dataclass¶
Represents an asset to download.
Attributes:
| Name | Type | Description |
|---|---|---|
| url | str | Asset URL (absolute) |
| type | Literal['image', 'css', 'js', 'font'] | Asset type (image, css, js, font) |
| original_src | str | Original src attribute from HTML |
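A dataclass with the three attributes listed above might be declared roughly as follows. This is a sketch reconstructed from the attribute table only; the `AssetType` alias, the example URL, and the absence of options such as `frozen` are assumptions.

```python
from dataclasses import dataclass
from typing import Literal

# Alias introduced for this sketch; the table only gives the Literal itself.
AssetType = Literal["image", "css", "js", "font"]


@dataclass
class Asset:
    url: str            # absolute asset URL
    type: AssetType     # one of: image, css, js, font
    original_src: str   # src attribute as it appeared in the HTML


# Example values are made up for illustration.
logo = Asset(
    url="https://example.com/static/logo.png",
    type="image",
    original_src="/static/logo.png",
)
```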
AssetStats dataclass¶
AssetStats(downloaded: int = 0, failed: int = 0, skipped: int = 0, total_bytes: int = 0, errors: dict[str, int] = dict())
Statistics for asset downloads.
Tracks download success/failure counts and total bytes.
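The `dict()` default in the rendered signature most likely corresponds to a `default_factory`, since a dataclass cannot take a mutable default directly. The sketch below shows that pattern under that assumption. The `record_failure` helper and the choice of keying `errors` by a short reason string are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AssetStats:
    downloaded: int = 0
    failed: int = 0
    skipped: int = 0
    total_bytes: int = 0
    # default_factory gives every instance its own dict.
    errors: dict[str, int] = field(default_factory=dict)


def record_failure(stats: AssetStats, reason: str) -> None:
    """Hypothetical helper: count one failure under the given reason."""
    stats.failed += 1
    stats.errors[reason] = stats.errors.get(reason, 0) + 1
```

Because each instance gets a fresh dictionary, counting an error on one `AssetStats` object does not leak into another.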
AssetDownloader¶
AssetDownloader(config: AssetConfig, output_manager: OutputManager, client: AsyncClient | None = None)
Downloads assets concurrently with error handling.
Features:
- Concurrent downloads with semaphore limiting
- Skip existing files (idempotent)
- Comprehensive error tracking
- Progress tracking via stats
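Semaphore limiting, the first feature listed, can be sketched with `asyncio.Semaphore`. The `run_limited` function and its `jobs` and `limit` parameters are hypothetical names for this example; the real downloader presumably takes its limit from `AssetConfig`. The `peak` counter exists only to make the cap observable.

```python
import asyncio
from collections.abc import Awaitable, Callable


async def run_limited(
    jobs: list[Callable[[], Awaitable[int]]], limit: int
) -> tuple[list[int], int]:
    """Run all jobs, never more than `limit` at once; return (results, peak)."""
    semaphore = asyncio.Semaphore(limit)
    active = 0
    peak = 0

    async def worker(job: Callable[[], Awaitable[int]]) -> int:
        nonlocal active, peak
        async with semaphore:  # waits here while `limit` jobs are running
            active += 1
            peak = max(peak, active)
            try:
                return await job()
            finally:
                active -= 1

    results = await asyncio.gather(*(worker(job) for job in jobs))
    return list(results), peak
```

All tasks are created up front, but only `limit` of them hold the semaphore at any moment, so the number of in-flight requests stays bounded regardless of how many assets are queued.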
Initialize asset downloader.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | AssetConfig | AssetConfig from SusConfig | required |
| output_manager | OutputManager | OutputManager instance (for path resolution) | required |
| client | AsyncClient \| None | Optional HTTP client (for testing with mocks) | None |
Functions¶
download_all async¶
Download all assets concurrently.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| assets | list[str] | List of asset URLs to download | required |
Returns:
| Type | Description |
|---|---|
| AssetStats | AssetStats with download results |
Logic:
1. Filter out already downloaded assets
2. Create HTTP client if not provided
3. Create async tasks for each asset
4. Gather all tasks (use asyncio.gather with return_exceptions=True)
5. Update stats based on results
6. Close client if we created it
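The six steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the actual implementation: `download_all_sketch`, `Stats`, `FakeClient`, the `already_saved` set, and the `make_client` factory are all hypothetical stand-ins, and keying `errors` by exception class name is a guess.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Stats:
    downloaded: int = 0
    failed: int = 0
    skipped: int = 0
    total_bytes: int = 0
    errors: dict[str, int] = field(default_factory=dict)


class FakeClient:
    """Stand-in for an HTTP client: returns canned bytes or raises."""

    def __init__(self, responses: dict[str, bytes]) -> None:
        self.responses = responses
        self.closed = False

    async def get(self, url: str) -> bytes:
        await asyncio.sleep(0)  # yield to the event loop like real I/O would
        if url not in self.responses:
            raise LookupError(f"no response for {url}")
        return self.responses[url]

    async def aclose(self) -> None:
        self.closed = True


async def download_all_sketch(urls, already_saved, client=None, make_client=None):
    stats = Stats()
    # 1. Filter out already downloaded assets.
    pending = [u for u in urls if u not in already_saved]
    stats.skipped = len(urls) - len(pending)
    # 2. Create a client only if the caller did not supply one.
    owns_client = client is None
    if owns_client:
        client = make_client()
    try:
        # 3-4. One awaitable per asset; failures come back as values.
        results = await asyncio.gather(
            *(client.get(u) for u in pending), return_exceptions=True
        )
        # 5. Update stats based on results.
        for result in results:
            if isinstance(result, BaseException):
                stats.failed += 1
                name = type(result).__name__
                stats.errors[name] = stats.errors.get(name, 0) + 1
            else:
                stats.downloaded += 1
                stats.total_bytes += len(result)
    finally:
        # 6. Close the client only if it was created here.
        if owns_client:
            await client.aclose()
    return stats
```

Passing `return_exceptions=True` means one failed asset does not cancel the rest of the batch, and tracking ownership of the client keeps an injected mock open for the caller to inspect after the run.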