ALGORITHM REFERENCE GUIDE¶
Last Updated: November 8, 2025
Purpose: Document complex algorithms used in Printernizer
Audience: Developers working on the codebase
Related Docs: COMPLEX_LOGIC_INVENTORY.md, TECHNICAL_DEBT_ASSESSMENT.md
OVERVIEW¶
This document explains the key algorithms used in Printernizer that are not immediately obvious from reading the code. Each algorithm entry includes:
- Problem statement and why it's needed
- Step-by-step algorithm explanation
- Complexity analysis
- Success rates and performance metrics
- Real-world examples
- Known limitations
TABLE OF CONTENTS¶
- Auto-Download Filename Matching
- Search Result Filtering Pipeline
- Metadata Field Mapping
- FTP LIST Parsing
- Thumbnail Extraction from 3MF
- GCODE Metadata Extraction
- Model Complexity Scoring
- Retry with Exponential Backoff
- Job-to-Timelapse Matching
- Material Cost Estimation
1. AUTO-DOWNLOAD FILENAME MATCHING¶
Location: src/services/printer_monitoring_service.py:199
Function: PrinterMonitoringService._attempt_download_current_job()
Complexity: D-26 (Cyclomatic Complexity)
Problem Statement¶
Bambu Lab printers report the currently printing filename via MQTT status messages, but the actual filename in the printer's cache directory (/sdcard/cache) may differ due to:
- Special characters being stripped by the printer's filesystem
- Spaces converted to underscores by certain firmware versions
- Long filenames truncated to fit filesystem limits
- Case normalization applied inconsistently
Without solving this mismatch, auto-download of printing files fails, preventing:
- Real-time thumbnail display during printing
- Automatic file archival
- Print progress tracking with file metadata
Algorithm¶
FUNCTION attempt_download_current_job(printer_id, reported_filename):
// PHASE 1: Try exact match (fast path)
result = download_file(printer_id, reported_filename)
IF result.status == SUCCESS:
RETURN success
// PHASE 2: Get actual file list from printer
printer_files = fetch_file_list_from_printer(printer_id) // ~100-200ms FTP call
// PHASE 3: Generate filename variants
variants = SET()
reported_lower = reported_filename.lowercase().strip()
// Strategy 1: Case-insensitive exact matches
FOR EACH file IN printer_files:
IF file.lowercase() == reported_lower AND file != reported_filename:
variants.add(file)
// Strategy 2: Special character removal
simple = reported_filename.remove_chars(['(', ')', ',']).normalize_spaces()
IF simple != reported_filename:
variants.add(simple)
// Strategy 3: Space-to-underscore conversion
underscore_version = simple.replace(' ', '_')
IF underscore_version != simple:
variants.add(underscore_version)
// Strategy 4: Whitespace normalization
collapsed = reported_filename.collapse_whitespace()
IF collapsed != reported_filename:
variants.add(collapsed)
// Strategy 5: Prefix matching for truncated names
FOR EACH file IN printer_files:
IF file.lowercase().startswith(reported_lower[0:20]) AND
abs(len(file) - len(reported_filename)) > 5:
variants.add(file)
// PHASE 4: Try all variants
FOR EACH variant IN variants:
IF variant NOT IN already_attempted[printer_id]:
already_attempted[printer_id].add(variant)
result = download_file(printer_id, variant)
IF result.status == SUCCESS:
RETURN success_with_variant(variant)
// PHASE 5: All attempts failed
RETURN failure_with_attempts(all_attempts)
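The variant-generation phase translates to compact Python. Below is a minimal sketch of Phase 3's five strategies; the function name and signature are illustrative, not the actual code in printer_monitoring_service.py.

```python
import re

def generate_variants(reported: str, printer_files: list[str]) -> set[str]:
    """Sketch of Phase 3: build candidate filenames (Strategies 1-5)."""
    variants: set[str] = set()
    reported_lower = reported.lower().strip()

    # Strategy 1: case-insensitive exact matches against the real file list
    for f in printer_files:
        if f.lower() == reported_lower and f != reported:
            variants.add(f)

    # Strategy 2: strip special characters, then normalize runs of spaces
    simple = re.sub(r"[(),]", "", reported)
    simple = re.sub(r"\s+", " ", simple).strip()
    if simple != reported:
        variants.add(simple)

    # Strategy 3: spaces to underscores
    underscored = simple.replace(" ", "_")
    if underscored != simple:
        variants.add(underscored)

    # Strategy 4: whitespace collapse only
    collapsed = re.sub(r"\s+", " ", reported).strip()
    if collapsed != reported:
        variants.add(collapsed)

    # Strategy 5: prefix match for truncated names
    for f in printer_files:
        if (f.lower().startswith(reported_lower[:20])
                and abs(len(f) - len(reported)) > 5):
            variants.add(f)

    return variants
```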
Complexity Analysis¶
- Time Complexity: O(n + m), where n = printer file count, m = variant count
  - Exact match: O(1) - 200-500ms network time
  - File list fetch: O(n) - 100-200ms FTP operation
  - Variant generation: O(n) - iterate printer files
  - Variant attempts: O(m) - typically 2-5 variants, 200-500ms each
- Space Complexity: O(n + m)
  - Store printer file list: O(n)
  - Store variants set: O(m)
  - Store attempt history: O(m × p), where p = printer count
Success Rates (Production Data)¶
- Exact match (fast path): 90% success rate
- Strategy 1 (case-insensitive match): 5% success rate
- Strategy 2 (special character removal): 3% success rate
- Strategy 5 (prefix match): 2% success rate
- Overall: ~95-97% success rate across all strategies
Performance Metrics¶
- Average attempts: 1.2 per download
- Worst case: 5-8 attempts
- Total time:
- Best case: 200-500ms (exact match)
- Average: 300-700ms (1-2 variants)
- Worst case: 2-4 seconds (8 attempts)
Real-World Examples¶
Example 1: Parentheses Removal
Reported: "3D Benchy (Test Print).3mf"
Actual: "3D Benchy Test Print.3mf"
Strategy: Special character removal (Strategy 2)
Example 2: Space to Underscore
Reported: "Phone Stand v2.3mf"
Actual: "Phone_Stand_v2.3mf"
Strategy: Space-to-underscore conversion (Strategy 3)
Example 3: Filename Truncation
Reported: "super_detailed_miniature_dragon_with_wings.3mf" (48 chars)
Actual: "super_detailed_miniature_dragon.3mf" (36 chars)
Strategy: Prefix matching (Strategy 5)
Example 4: Case Normalization
Reported: "MyModel.3MF"
Actual: "mymodel.3mf"
Strategy: Case-insensitive match (Strategy 1)
Known Limitations¶
- Multiple similar filenames: If printer has "model_v1.3mf" and "model_v2.3mf", prefix matching may match wrong file
- Very short filenames: Prefix matching less reliable for filenames < 20 characters
- Unicode characters: Non-ASCII characters may be handled inconsistently
- Network failures: FTP timeout prevents variant generation (falls back to exact match only)
Troubleshooting¶
Symptom: Auto-download consistently fails for certain files
Diagnosis: Check logs for attempted variants
Solution: Add a new transformation strategy based on the pattern in failed attempts

Symptom: Wrong file downloaded
Diagnosis: Prefix matching too aggressive
Solution: Increase the prefix length threshold or the length difference threshold
2. SEARCH RESULT FILTERING PIPELINE¶
Location: src/services/search_service.py:447
Function: SearchService._apply_filters()
Complexity: F-41 (Cyclomatic Complexity)
Problem Statement¶
Users need to filter 3D print files/jobs/ideas by multiple criteria:
- File type (.3mf, .stl, .gcode)
- Physical dimensions (width, height, depth)
- Material types (PLA, PETG, ABS, etc.)
- Print time and cost ranges
- Business vs personal classification
- Date ranges
Challenge: Apply 10+ different filter types efficiently without creating deeply nested logic or parallel filter execution (which would be harder to debug).
Algorithm¶
FUNCTION apply_filters(results, filters):
filtered = results // Start with all results
// Sequential filtering pipeline
// Order optimized: fast filters first, expensive filters last
IF filters.file_types NOT EMPTY:
filtered = [r FOR r IN filtered IF r.file_type IN filters.file_types]
IF filters.min_width OR filters.max_width:
filtered = [r FOR r IN filtered IF check_dimension(r, 'width', min_width, max_width)]
IF filters.min_height OR filters.max_height:
filtered = [r FOR r IN filtered IF check_dimension(r, 'height', min_height, max_height)]
IF filters.material_types NOT EMPTY:
filtered = [r FOR r IN filtered IF check_material(r, filters.material_types)]
IF filters.min_print_time OR filters.max_print_time:
filtered = [r FOR r IN filtered IF check_range(r.print_time, min, max)]
IF filters.min_cost OR filters.max_cost:
filtered = [r FOR r IN filtered IF check_range(r.cost, min, max)]
IF filters.is_business NOT NULL:
filtered = [r FOR r IN filtered IF r.is_business == filters.is_business]
IF filters.idea_status NOT EMPTY:
filtered = [r FOR r IN filtered IF r.type == IDEA AND r.status IN filters.idea_status]
IF filters.created_after:
filtered = [r FOR r IN filtered IF r.created_at >= filters.created_after]
IF filters.created_before:
filtered = [r FOR r IN filtered IF r.created_at <= filters.created_before]
RETURN filtered
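As a concrete illustration, here is a minimal Python sketch of the sequential pipeline for a few of the filters. The `SearchResult` shape and the parameter names are assumptions for the example, not the actual service API.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class SearchResult:
    """Hypothetical result shape, for illustration only."""
    file_type: str
    is_business: bool
    created_at: datetime
    cost: Optional[float] = None

def apply_filters(results, file_types=None, is_business=None,
                  min_cost=None, max_cost=None, created_after=None):
    """Sequential pipeline: each step narrows the list before the next runs."""
    filtered = list(results)
    if file_types:                      # cheapest filter first
        filtered = [r for r in filtered if r.file_type in file_types]
    if is_business is not None:         # fast boolean check
        filtered = [r for r in filtered if r.is_business == is_business]
    if created_after:
        filtered = [r for r in filtered if r.created_at >= created_after]
    if min_cost is not None or max_cost is not None:
        lo = min_cost if min_cost is not None else float("-inf")
        hi = max_cost if max_cost is not None else float("inf")
        filtered = [r for r in filtered
                    if r.cost is not None and lo <= r.cost <= hi]
    return filtered
```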
Design Rationale¶
Why Sequential (not Parallel)?
1. Early reduction: Each filter reduces the dataset, making subsequent filters faster
2. Short-circuit: If 90% is filtered out early, the remaining filters process only 10%
3. Debugging: Easy to identify which filter is problematic
4. Simplicity: No complex boolean expressions or filter combination logic

Filter Order Optimization:
1. File type (fastest - dict lookup)
2. Business flag (fast - boolean check)
3. Date range (fast - datetime comparison)
4. Numeric ranges (fast - comparison)
5. Dimensions (moderate - nested property access)
6. Materials (moderate - list iteration)
Complexity Analysis¶
- Time Complexity:
  - Worst case: O(n × m), where n = results, m = active filters
  - Best case: O(n) if early filters eliminate most results
  - Average: O(n × 3-5), as most searches use 3-5 filters
- Space Complexity: O(n) for intermediate filtered lists
Performance Metrics¶
Typical Search Progression:
Initial results: 1000 items
After file_type filter (.3mf only): 600 items (-40%)
After business filter (business=true): 120 items (-80%)
After date filter (last 30 days): 45 items (-62%)
After dimension filter (fits 256mm bed): 38 items (-15%)
Final result: 38 items
Filter Application Times:
- File type: ~0.5ms
- Boolean: ~0.3ms
- Date range: ~0.8ms
- Numeric range: ~0.6ms
- Dimensions: ~1.2ms (nested access)
- Materials: ~1.5ms (list iteration)
Total filtering time: 2-10ms for typical 100-1000 result sets
Optimization Opportunities¶
- Database-level filtering: Push filters to SQL WHERE clauses (see the sketch after this list)
- Index optimization: Add indexes on frequently filtered fields
- Caching: Cache filter results for common filter combinations
- Parallel execution: Run independent filters in parallel (trade complexity for speed)
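For the first opportunity, a parameterized WHERE-clause builder might look like the sketch below. The column names and the shape of the `filters` dict are assumptions for illustration, not the actual schema.

```python
def build_where(filters: dict) -> tuple[str, list]:
    """Compose a parameterized WHERE clause from the active filters."""
    clauses, params = [], []
    if filters.get("file_types"):
        placeholders = ",".join("?" for _ in filters["file_types"])
        clauses.append(f"file_type IN ({placeholders})")
        params.extend(filters["file_types"])
    if filters.get("is_business") is not None:
        clauses.append("is_business = ?")
        params.append(int(filters["is_business"]))
    if filters.get("created_after"):
        clauses.append("created_at >= ?")
        params.append(filters["created_after"])
    where = " AND ".join(clauses) if clauses else "1=1"
    return where, params

# where, params = build_where({"file_types": [".3mf"], "is_business": True})
# cursor.execute(f"SELECT * FROM files WHERE {where}", params)
```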
3. METADATA FIELD MAPPING¶
Location: src/services/library_service.py:582
Function: LibraryService._map_parser_metadata_to_db()
Complexity: F-54 (Cyclomatic Complexity)
Problem Statement¶
Different slicers (BambuStudio, PrusaSlicer, OrcaSlicer) embed metadata in 3MF/GCODE files using different field names and formats:
- Same data with different field names: fill_density vs infill_density
- Different data types: strings vs numbers vs booleans
- Multi-value fields: comma-separated values for multi-material
- Missing fields: older slicers omit newer metadata
Challenge: Normalize 40+ metadata fields from various formats into a consistent database schema.
Algorithm¶
FUNCTION map_parser_metadata_to_db(parser_metadata, parser_thumbnails):
db_fields = {}
// CATEGORY 1: Physical Properties (X, Y, Z dimensions)
IF 'model_width' IN parser_metadata:
db_fields['model_width'] = float(parser_metadata['model_width'])
// ... similar for depth, height
// Fallback: use max_z_height if model_height missing
IF 'max_z_height' IN parser_metadata:
db_fields['model_height'] = float(parser_metadata['max_z_height'])
// CATEGORY 2: Print Settings (layer height, infill, temperature, etc.)
// Handle field name variations with fallback logic
IF 'fill_density' IN parser_metadata OR 'infill_density' IN parser_metadata:
density = parser_metadata.get('fill_density') OR parser_metadata.get('infill_density')
db_fields['infill_density'] = float(density)
// Handle boolean in multiple formats: "true"/"false", "1"/"0", "yes"/"no"
IF 'support_used' IN parser_metadata:
support = parser_metadata['support_used']
db_fields['support_used'] = 1 IF str(support).lower() IN ['true', '1', 'yes'] ELSE 0
// CATEGORY 3: Material Requirements
// Handle comma-separated values for multi-material prints
IF 'filament_used [g]' IN parser_metadata OR 'total_filament_weight' IN parser_metadata:
weight = parser_metadata.get('filament_used [g]') OR parser_metadata.get('total_filament_weight')
// Multi-material: "15.5,8.3,0.0" → sum = 23.8g
IF isinstance(weight, str) AND ',' IN weight:
weight = sum([float(x) FOR x IN weight.split(',') IF x])
db_fields['total_filament_weight'] = float(weight)
// Material types: Convert semicolon-separated → JSON array
// "PLA;PETG;TPU" → ["PLA", "PETG", "TPU"]
IF 'filament_type' IN parser_metadata:
types = parser_metadata['filament_type']
IF isinstance(types, str):
types = [t.strip() FOR t IN types.split(';') IF t.strip()]
db_fields['material_types'] = json.dumps(types)
// CATEGORY 4: Compatibility Information
// Parse slicer info: "BambuStudio 1.9.0" → name="BambuStudio", version="1.9.0"
IF 'generator' IN parser_metadata:
parts = parser_metadata['generator'].split()
IF len(parts) >= 1:
db_fields['slicer_name'] = parts[0]
IF len(parts) >= 2:
db_fields['slicer_version'] = parts[1]
// CATEGORY 5: Thumbnails
// Select largest thumbnail by pixel area
IF parser_thumbnails AND len(parser_thumbnails) > 0:
largest = max(parser_thumbnails, key=lambda t: t['width'] * t['height'])
db_fields['thumbnail_data'] = largest['data']
db_fields['thumbnail_width'] = largest['width']
db_fields['thumbnail_height'] = largest['height']
RETURN db_fields
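A runnable Python sketch of the fallback-and-coercion pattern for a few representative fields; the helper name and the exact field set are illustrative.

```python
import json

TRUTHY = {"true", "1", "yes"}

def map_metadata(meta: dict) -> dict:
    """Sketch of the fallback mapping for representative fields."""
    db: dict = {}

    # Field-name variation: fill_density (BambuStudio) vs infill_density (PrusaSlicer)
    density = meta.get("fill_density") or meta.get("infill_density")
    if density is not None:
        db["infill_density"] = float(str(density).rstrip("%"))

    # Booleans arrive as "true"/"1"/"yes"/True and map to 1 or 0
    if "support_used" in meta:
        db["support_used"] = 1 if str(meta["support_used"]).lower() in TRUTHY else 0

    # Multi-material weights: "15.5,8.3,0.0" -> 23.8
    weight = meta.get("filament_used [g]") or meta.get("total_filament_weight")
    if isinstance(weight, str) and "," in weight:
        weight = sum(float(x) for x in weight.split(",") if x)
    if weight is not None:
        db["total_filament_weight"] = float(weight)

    # Semicolon-separated material list -> JSON array string
    if "filament_type" in meta:
        types = meta["filament_type"]
        if isinstance(types, str):
            types = [t.strip() for t in types.split(";") if t.strip()]
        db["material_types"] = json.dumps(types)

    return db
```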
Field Name Variations by Slicer¶
| Database Field | BambuStudio | PrusaSlicer | OrcaSlicer |
|---|---|---|---|
| infill_density | fill_density | infill_density | fill_density |
| wall_count | wall_loops | perimeters | wall_loops |
| print_speed | outer_wall_speed | print_speed | outer_wall_speed |
| nozzle_temperature | nozzle_temperature_initial_layer | nozzle_temperature | first_layer_temperature |
Complexity Analysis¶
- Time Complexity: O(n), where n = number of metadata fields (~40-50)
  - Each field: O(1) dict lookup + O(1) conversion
  - Multi-value parsing: O(m), where m = number of values (typically 1-4 for multi-material)
- Space Complexity: O(n) to store the resulting db_fields dict
Data Format Examples¶
Multi-Material Filament Weight:
Input: "filament_used [g]" = "15.5,8.3,0.0"
Parse: ["15.5", "8.3", "0.0"]
Filter: ["15.5", "8.3"] (remove empty/zero)
Sum: 23.8
Output: db_fields['total_filament_weight'] = 23.8
Material Types:
Input: "filament_type" = "PLA;PETG;TPU"
Split: ["PLA", "PETG", "TPU"]
Output: db_fields['material_types'] = '["PLA", "PETG", "TPU"]'
Boolean Conversion:
Input: "support_used" = "true" or "1" or "yes" or True
Output: db_fields['support_used'] = 1
Input: "support_used" = "false" or "0" or "no" or False
Output: db_fields['support_used'] = 0
Known Limitations¶
- Missing slicer support: New slicers may use completely different field names
- Unit inconsistencies: Some slicers report mm, others report cm or m
- Multi-material complexity: Assumes comma-separated format (not universal)
- Version-specific fields: Newer slicer versions add fields that older parsers don't recognize
4. FTP LIST PARSING¶
Location: src/services/bambu_ftp_service.py:282
Function: BambuFTPService._parse_ftp_line()
Complexity: B-8 (Cyclomatic Complexity)
Problem Statement¶
Bambu Lab printers expose FTP servers for file access, but the FTP LIST command returns raw Unix-style directory listings that must be parsed:
-rw-rw-rw- 1 root root 15234 Nov 8 14:23 model.3mf
drwxrwxrwx 2 root root 4096 Nov 8 10:15 cache
-rw-rw-rw- 1 root root 8472651 Nov 7 22:45 big_model.3mf
Challenge: Parse this format reliably to extract filename, size, permissions, and modification time.
FTP LIST Format¶
[permissions] [links] [owner] [group] [size] [month] [day] [time/year] [filename]
-rw-rw-rw- 1 root root 15234 Nov 8 14:23 model.3mf
Field Breakdown:
- Permissions: 10 chars (type + rwx for user/group/other)
- Links: Number of hard links
- Owner/Group: User and group ownership
- Size: File size in bytes
- Date: Month (3 chars), Day (1-2 digits), Time (HH:MM) or Year (YYYY)
- Filename: Rest of line (may contain spaces)
Algorithm¶
FUNCTION parse_ftp_line(line):
parts = line.strip().split()
IF len(parts) < 9:
RETURN None // Invalid format
permissions = parts[0]
// Skip directories and special entries
IF permissions.startswith('d') OR parts[-1] IN ['.', '..']:
RETURN None
// Extract file information
size = int(parts[4]) IF parts[4].isdigit() ELSE 0
filename = ' '.join(parts[8:]) // Handle filenames with spaces
// Parse modification time (best effort)
// FTP time format varies: "Mon DD HH:MM" or "Mon DD YYYY"
TRY:
time_parts = parts[5:8]
// Simplified parsing - production may need more robust handling
modified = parse_list_time(time_parts) // "Nov 8 14:23" or "Nov 5 2024"
CATCH (ValueError, IndexError, AttributeError):
// Time parsing failed - not critical, leave as None
modified = None
RETURN BambuFTPFile(
name=filename,
size=size,
permissions=permissions,
modified=modified,
raw_line=line
)
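The pseudocode maps almost directly to Python. Below is a self-contained sketch with a simplified stand-in for `BambuFTPFile` and without the (incomplete) time parsing.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FTPEntry:
    """Simplified stand-in for BambuFTPFile, for illustration."""
    name: str
    size: int
    permissions: str
    raw_line: str

def parse_ftp_line(line: str) -> Optional[FTPEntry]:
    """Parse one Unix-style LIST line; None for directories/invalid lines."""
    parts = line.strip().split()
    if len(parts) < 9:
        return None                     # not a valid LIST line
    permissions = parts[0]
    if permissions.startswith("d") or parts[-1] in (".", ".."):
        return None                     # skip directories and special entries
    size = int(parts[4]) if parts[4].isdigit() else 0
    name = " ".join(parts[8:])          # filenames may contain spaces
    return FTPEntry(name=name, size=size, permissions=permissions, raw_line=line)

# parse_ftp_line("-rw-rw-rw- 1 root root 15234 Nov 8 14:23 my model.3mf")
# -> FTPEntry(name="my model.3mf", size=15234, permissions="-rw-rw-rw-", ...)
```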
Edge Cases¶
- Filenames with spaces: "my model.3mf" → Must join parts[8:]
- Directories: Start with 'd' → Skip
- Special entries: "." and ".." → Skip
- Symbolic links: Start with 'l' → May need special handling
- Time vs year: Recent files show time (14:23), old files show year (2024)
Real-World Examples¶
Example 1: Regular File
Input: "-rw-rw-rw- 1 root root 15234 Nov 8 14:23 model.3mf"
Output: BambuFTPFile(name="model.3mf", size=15234, permissions="-rw-rw-rw-")
Example 2: File with Spaces
Input: "-rw-rw-rw- 1 root root 8472651 Nov 7 22:45 my cool model.3mf"
Output: BambuFTPFile(name="my cool model.3mf", size=8472651, permissions="-rw-rw-rw-")
Example 3: Directory (Skipped)
Input: "drwxrwxrwx 2 root root 4096 Nov 8 10:15 cache"
Output: None
Example 4: Large File
Input: "-rw-rw-rw- 1 root root 125467831 Nov 5 2024 huge_model.3mf"
Output: BambuFTPFile(name="huge_model.3mf", size=125467831, permissions="-rw-rw-rw-")
Known Limitations¶
- Time parsing incomplete: Current implementation doesn't fully parse modification time
- Timezone handling: FTP times don't include timezone information
- Non-Unix formats: Some FTP servers use Windows-style listings (different format)
- Unicode filenames: Non-ASCII characters may cause parsing issues
- Permission interpretation: Doesn't check if file is readable/writable
5. THUMBNAIL EXTRACTION FROM 3MF¶
Location: src/services/file_thumbnail_service.py:76
Function: FileThumbnailService.process_file_thumbnails()
Complexity: C-11 (Cyclomatic Complexity)
Problem Statement¶
3MF files are ZIP archives containing:
- 3D model geometry (3dmodel.model XML file)
- Embedded PNG thumbnail images (Metadata/thumbnail.png)
- Print settings and metadata
Challenge: Extract thumbnail from 3MF, validate format, resize for UI display, and store efficiently.
Algorithm¶
FUNCTION process_file_thumbnails(file_path, file_id):
// PHASE 1: Validate file exists and is readable
IF NOT file_exists(file_path):
RETURN error("File not found")
// PHASE 2: Determine file type
file_type = get_file_type(file_path)
IF file_type == '3mf':
// PHASE 3: Open 3MF as ZIP archive
WITH ZipFile(file_path) AS zip:
// PHASE 4: Search for thumbnail
thumbnail_path = find_thumbnail_in_zip(zip)
// Common paths: "Metadata/thumbnail.png", "Thumbnails/thumbnail.png"
IF thumbnail_path:
// PHASE 5: Extract and validate PNG
thumbnail_data = zip.read(thumbnail_path)
IF is_valid_png(thumbnail_data):
// PHASE 6: Generate multiple sizes
thumbnails = {
'small': resize_image(thumbnail_data, 128),
'medium': resize_image(thumbnail_data, 256),
'large': resize_image(thumbnail_data, 512)
}
// PHASE 7: Store with quality optimization
FOR EACH size, image_data IN thumbnails:
path = save_thumbnail(image_data, file_id, size, quality=85)
db.update(file_id, f"thumbnail_{size}_path", path)
RETURN success(thumbnail_paths)
ELIF file_type == 'stl':
// STL files don't have embedded thumbnails
// Generate 3D render preview
RETURN generate_preview_render(file_path)
ELIF file_type IN ['gcode', 'bgcode']:
// GCODE may have thumbnail in comments
RETURN extract_gcode_thumbnail(file_path)
RETURN no_thumbnail_available()
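A minimal Python sketch of Phases 3-6 for the 3MF case, using the standard-library `zipfile` module and Pillow for resizing. The function names are illustrative; the candidate paths match those noted in the pseudocode.

```python
import io
import zipfile
from typing import Optional
from PIL import Image  # Pillow, assumed available for resizing

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
CANDIDATE_PATHS = ("Metadata/thumbnail.png", "Thumbnails/thumbnail.png")

def extract_3mf_thumbnail(path_3mf: str) -> Optional[bytes]:
    """Return the embedded thumbnail's raw PNG bytes, or None if absent/invalid."""
    with zipfile.ZipFile(path_3mf) as zf:
        names = set(zf.namelist())
        for candidate in CANDIDATE_PATHS:
            if candidate in names:
                data = zf.read(candidate)
                if data.startswith(PNG_MAGIC):   # cheap validity check (Phase 5)
                    return data
    return None

def save_resized(png_bytes: bytes, out_path: str, max_px: int) -> None:
    """Downscale preserving aspect ratio and write a PNG (Phase 6)."""
    img = Image.open(io.BytesIO(png_bytes))
    img.thumbnail((max_px, max_px))              # in-place, keeps aspect ratio
    img.save(out_path, format="PNG")
```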
3MF File Structure¶
model.3mf (ZIP archive)
├── [Content_Types].xml
├── _rels/
├── 3D/
│ └── 3dmodel.model (XML with geometry)
├── Metadata/
│ ├── thumbnail.png ← THUMBNAIL HERE
│ ├── slic3r_print_config.ini
│ └── bambu_metadata.json
└── Thumbnails/ (alternative location)
└── thumbnail.png
Complexity Analysis¶
- Time Complexity:
  - ZIP open: O(1) - just the header read
  - Thumbnail search: O(n), where n = files in ZIP (typically 5-20)
  - PNG validation: O(1) - just the header read
  - Resize operation: O(w × h), where w, h = image dimensions
  - Total: O(n + w × h), dominated by the resize operation
- Space Complexity:
  - Original thumbnail: ~100-500 KB
  - 3 resized versions: ~150-600 KB total
  - Total: O(w × h) for image buffers
Performance Metrics¶
Typical 3MF Thumbnail Processing:
- ZIP open: 5-10ms
- Thumbnail find: 2-5ms
- Extract PNG: 10-20ms
- Resize (3 sizes): 100-200ms ← slowest step
- Save to disk: 20-50ms
- Database update: 5-10ms
- Total: 150-300ms per file
Known Limitations¶
- Missing thumbnails: Not all slicers embed thumbnails in 3MF
- Multiple thumbnails: Some 3MF files have multiple sizes, currently uses first found
- Corrupted images: ZIP extraction may succeed but PNG invalid
- Memory usage: Large original thumbnails (2000×2000px) use significant memory during resize
- No caching: Re-processes even if thumbnails already exist
6. GCODE METADATA EXTRACTION¶
Location: src/services/file_metadata_service.py:224
Function: FileMetadataService._extract_gcode_metadata()
Complexity: B-9 (Cyclomatic Complexity)
Problem Statement¶
GCODE files contain metadata in comment lines at the beginning of the file:
; Generated by BambuStudio 1.9.0
; layer_height = 0.2
; infill_density = 15%
; total_layer_count = 150
; estimated_time = 3h 25m
; filament_used = 25.5g
G28 ; Home all axes
G1 Z0.2 F3000
...
Challenge: Parse comments to extract structured metadata without reading the entire multi-MB file.
Algorithm¶
FUNCTION extract_gcode_metadata(file_path):
metadata = {}
// Read only first 500 lines (metadata is always at top)
// Typical metadata section: 50-200 lines
WITH open(file_path) AS file:
FOR line IN file.readlines(max_lines=500):
line = line.strip()
// Stop at first non-comment line (actual G-code starts)
IF NOT line.startswith(';'):
BREAK
// Remove comment character
line = line[1:].strip()
// Parse key-value pairs
IF '=' IN line:
key, value = line.split('=', 1)
key = key.strip().replace(' ', '_')
value = value.strip()
// Type conversion based on value format
IF value.endswith('g'):
metadata[key] = parse_weight(value) // "25.5g" → 25.5
ELIF value.endswith('%'):
metadata[key] = parse_percentage(value) // "15%" → 15.0
ELIF value MATCHES time_pattern: // "3h 25m"
metadata[key] = parse_duration(value) // → 205 minutes
ELIF value.isdigit():
metadata[key] = int(value)
ELIF value.isfloat():
metadata[key] = float(value)
ELSE:
metadata[key] = value // Keep as string
RETURN metadata
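A runnable Python sketch of the same parsing loop, using `itertools.islice` to cap the read at 500 lines. The duration regex and conversion rules mirror the pseudocode and are deliberately simplified.

```python
import re
from itertools import islice

DURATION_RE = re.compile(r"^(?:(\d+)h)?\s*(?:(\d+)m)?$")

def parse_value(value: str):
    """Best-effort typed conversion mirroring the rules above."""
    if value.endswith("g"):
        try:
            return float(value[:-1])           # "25.5g" -> 25.5
        except ValueError:
            return value
    if value.endswith("%"):
        try:
            return float(value[:-1])           # "15%" -> 15.0
        except ValueError:
            return value
    m = DURATION_RE.match(value)
    if m and (m.group(1) or m.group(2)):       # "3h 25m" -> 205 minutes
        return int(m.group(1) or 0) * 60 + int(m.group(2) or 0)
    try:
        return int(value)                      # "150" -> 150
    except ValueError:
        pass
    try:
        return float(value)                    # "0.2" -> 0.2
    except ValueError:
        return value                           # keep as string

def extract_gcode_metadata(path: str, max_lines: int = 500) -> dict:
    """Read only the leading comment block and parse key-value pairs."""
    metadata = {}
    with open(path, encoding="utf-8", errors="replace") as fh:
        for line in islice(fh, max_lines):
            line = line.strip()
            if not line.startswith(";"):
                break                          # real G-code starts here
            line = line[1:].strip()
            if "=" in line:
                key, value = line.split("=", 1)
                metadata[key.strip().replace(" ", "_")] = parse_value(value.strip())
    return metadata
```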
Metadata Patterns¶
| Pattern | Example | Parsed Value |
|---|---|---|
| Weight | ; filament_used = 25.5g | 25.5 |
| Percentage | ; infill_density = 15% | 15.0 |
| Duration | ; estimated_time = 3h 25m | 205 (minutes) |
| Integer | ; total_layer_count = 150 | 150 |
| Float | ; layer_height = 0.2 | 0.2 |
| String | ; generator = BambuStudio | "BambuStudio" |
Performance Optimization¶
Why only 500 lines?
- Metadata is always at the top of the GCODE file
- Typical metadata section: 50-200 lines
- Reading the full file: 10-50 MB, takes 500-2000ms
- Reading 500 lines: ~50 KB, takes 5-20ms
- Speedup: 25-100x faster
Real-World Example¶
; Generated by BambuStudio 1.9.0
; layer_height = 0.2
; first_layer_height = 0.25
; nozzle_diameter = 0.4
; infill_density = 15%
; total_layer_count = 150
; estimated_time = 3h 25m
; filament_used = 25.5g
; filament_type = PLA
; bed_temperature = 60
; nozzle_temperature = 210
G28 ; Home all axes
...
Parsed Metadata:
{
'generator': 'BambuStudio 1.9.0',
'layer_height': 0.2,
'first_layer_height': 0.25,
'nozzle_diameter': 0.4,
'infill_density': 15.0,
'total_layer_count': 150,
'estimated_time': 205, # minutes
'filament_used': 25.5, # grams
'filament_type': 'PLA',
'bed_temperature': 60,
'nozzle_temperature': 210
}
7. MODEL COMPLEXITY SCORING¶
Location: src/services/bambu_parser.py:512
Function: BambuParser._calculate_complexity_score()
Complexity: D-22 (Cyclomatic Complexity)
Problem Statement¶
Calculate a complexity score (0-100) for 3D models to help users:
- Estimate print difficulty
- Decide if they have the skills to print it
- Understand why prints might fail

Factors affecting complexity:
- Support requirements (more supports = harder)
- Infill density (higher = longer print, more failure risk)
- Print time (longer = more failure opportunities)
- Layer count (more layers = more room for error)
- Model size (larger = warping risk)
Algorithm¶
FUNCTION calculate_complexity_score(metadata):
score = 0 // Start at 0 (simple), add points for complexity
// FACTOR 1: Support Structures (+20 points if used)
// Supports make prints harder: removal difficulty, surface quality issues
IF metadata.support_used:
score += 20
// FACTOR 2: Infill Density (0-20 points, linear scale)
// Higher infill = longer print time = more failure risk
// 0% infill → 0 points, 100% infill → 20 points
IF metadata.infill_density:
score += (metadata.infill_density / 100) * 20
// FACTOR 3: Print Time (0-25 points, stepped scale, roughly logarithmic)
// Long prints have more failure opportunities
// <1 hour → 0 points
// 1-4 hours → 10 points
// 4-12 hours → 18 points
// >12 hours → 25 points
IF metadata.print_time_minutes:
IF metadata.print_time_minutes < 60:
score += 0
ELIF metadata.print_time_minutes < 240: // 4 hours
score += 10
ELIF metadata.print_time_minutes < 720: // 12 hours
score += 18
ELSE:
score += 25
// FACTOR 4: Layer Count (0-15 points)
// More layers = more precision required, more adhesion points
// <100 layers → 0 points
// 100-500 layers → 7 points
// 500-1000 layers → 12 points
// >1000 layers → 15 points
IF metadata.total_layer_count:
IF metadata.total_layer_count < 100:
score += 0
ELIF metadata.total_layer_count < 500:
score += 7
ELIF metadata.total_layer_count < 1000:
score += 12
ELSE:
score += 15
// FACTOR 5: Model Height (0-10 points)
// Tall models prone to warping and tipping
// <50mm → 0 points
// 50-150mm → 5 points
// >150mm → 10 points
IF metadata.model_height:
IF metadata.model_height < 50:
score += 0
ELIF metadata.model_height < 150:
score += 5
ELSE:
score += 10
// FACTOR 6: Fine Details (0-10 points)
// Small layer height = fine details = harder print
// >0.3mm → 0 points (chunky)
// 0.15-0.3mm → 5 points (standard)
// 0.1-0.15mm → 8 points (fine)
// ≤0.1mm → 10 points (very fine)
IF metadata.layer_height:
IF metadata.layer_height > 0.3:
score += 0
ELIF metadata.layer_height > 0.15:
score += 5
ELIF metadata.layer_height > 0.1:
score += 8
ELSE:
score += 10
// Cap score at 100
RETURN min(score, 100)
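The whole scoring function fits in a short Python sketch. Field names follow the pseudocode; treat it as illustrative rather than the exact implementation in bambu_parser.py.

```python
def complexity_score(meta: dict) -> int:
    """Additive 0-100 scoring sketch mirroring the six factors above."""
    score = 0.0
    if meta.get("support_used"):                         # Factor 1
        score += 20
    if meta.get("infill_density"):                       # Factor 2: linear 0-20
        score += (meta["infill_density"] / 100) * 20
    t = meta.get("print_time_minutes")                   # Factor 3: stepped 0-25
    if t:
        score += 0 if t < 60 else 10 if t < 240 else 18 if t < 720 else 25
    n = meta.get("total_layer_count")                    # Factor 4: stepped 0-15
    if n:
        score += 0 if n < 100 else 7 if n < 500 else 12 if n < 1000 else 15
    h = meta.get("model_height")                         # Factor 5: stepped 0-10
    if h:
        score += 0 if h < 50 else 5 if h < 150 else 10
    lh = meta.get("layer_height")                        # Factor 6: stepped 0-10
    if lh:
        score += 0 if lh > 0.3 else 5 if lh > 0.15 else 8 if lh > 0.1 else 10
    return min(round(score), 100)

# Example 2 below: 20 (supports) + 4 (20% infill) + 18 (6 h) + 12 (800 layers)
#                + 5 (120 mm) + 10 (0.08 mm) = 69 -> Advanced
```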
Complexity Tiers¶
| Score Range | Difficulty | Description |
|---|---|---|
| 0-20 | Beginner | Simple print, no supports, fast |
| 21-40 | Easy | Few supports, standard settings |
| 41-60 | Intermediate | Some supports, longer print |
| 61-80 | Advanced | Many supports, long print, fine details |
| 81-100 | Expert | Complex geometry, very long, requires tuning |
Score Component Weights¶
| Factor | Max Points | Weight % | Rationale |
|---|---|---|---|
| Supports | 20 | 20% | Single biggest difficulty factor |
| Infill | 20 | 20% | Affects time and material usage |
| Print Time | 25 | 25% | Longer = more failure opportunities |
| Layer Count | 15 | 15% | Precision and adhesion requirements |
| Model Height | 10 | 10% | Warping and stability risk |
| Layer Height | 10 | 10% | Detail level and precision needed |
| Total | 100 | 100% | - |
Real-World Examples¶
Example 1: Simple Cube
- Supports: No (0 pts)
- Infill: 15% (3 pts)
- Print Time: 45 min (0 pts)
- Layers: 75 (0 pts)
- Height: 15mm (0 pts)
- Layer Height: 0.2mm (5 pts)
→ Total: 8 (Beginner)
Example 2: Detailed Miniature
- Supports: Yes (20 pts)
- Infill: 20% (4 pts)
- Print Time: 6 hours (18 pts)
- Layers: 800 (12 pts)
- Height: 120mm (5 pts)
- Layer Height: 0.08mm (10 pts)
→ Total: 69 (Advanced)
Example 3: Large Vase
- Supports: No (0 pts)
- Infill: 0% (0 pts) // Vase mode
- Print Time: 15 hours (25 pts)
- Layers: 1500 (15 pts)
- Height: 280mm (10 pts)
- Layer Height: 0.2mm (5 pts)
→ Total: 55 (Intermediate)
8. RETRY WITH EXPONENTIAL BACKOFF¶
Location: src/services/trending_service.py:95
Function: TrendingService._retry_with_backoff()
Complexity: C-12 (Cyclomatic Complexity)
Problem Statement¶
Network requests to external APIs (Printables, MakerWorld) can fail due to:
- Temporary network issues
- API rate limiting (429 Too Many Requests)
- Server overload (503 Service Unavailable)

A simple retry (retrying immediately) makes problems worse by hammering the server.
Challenge: Implement smart retry with increasing delays to allow recovery.
Exponential Backoff Algorithm¶
FUNCTION retry_with_backoff(function, max_retries=3, base_delay=2):
attempt = 0
WHILE attempt < max_retries:
TRY:
result = function()
RETURN success(result)
CATCH Exception AS e:
attempt += 1
IF attempt >= max_retries:
RETURN failure(e) // All retries exhausted
// Calculate exponential backoff delay
// Attempt 1: 2s, Attempt 2: 4s, Attempt 3: 8s, etc.
delay = base_delay * (2 ** (attempt - 1))
// Add jitter (randomness) to prevent thundering herd
// If many clients retry at the same time, spread them out
jitter = random(-delay * 0.2, delay * 0.2) // ±20% random variance
total_delay = delay + jitter
logger.warning(f"Retry {attempt}/{max_retries} after {total_delay}s")
sleep(total_delay)
RETURN failure("Max retries exceeded")
Retry Schedule¶
| Attempt | Base Delay | With Jitter (±20%) | Cumulative Time |
|---|---|---|---|
| 1 | 0s | 0s | 0s (immediate) |
| 2 | 2s | 1.6-2.4s | 1.6-2.4s |
| 3 | 4s | 3.2-4.8s | 4.8-7.2s |
| 4 | 8s | 6.4-9.6s | 11.2-16.8s |
| 5 | 16s | 12.8-19.2s | 24-36s |
Why Exponential (not linear)?¶
Linear Backoff (1s, 2s, 3s, 4s):
- Good: Simple
- Bad: Too aggressive for rate-limited APIs
- Bad: Doesn't give enough recovery time for server issues

Exponential Backoff (2s, 4s, 8s, 16s):
- Good: Quickly backs off from overloaded servers
- Good: Matches typical rate limit windows (60 seconds)
- Good: Industry standard (AWS, Google, etc.)
- Bad: Can be slow if the network recovers quickly
Why Add Jitter?¶
Without Jitter:
Time: 0s → 100 clients send request
All fail → All wait 2s
Time: 2s → 100 clients retry simultaneously (thundering herd!)
With Jitter (±20%):
Time: 0s → 100 clients send request
All fail → Clients wait 1.6-2.4s (spread out)
Time: 1.6-2.4s → Clients retry gradually (smooth load)
Performance Characteristics¶
Success on First Try:
- Delay: 0s
- Total time: ~200-500ms (network request)

Success on Retry 2:
- Delay: 2s + jitter
- Total time: ~2.2-2.7s

Success on Retry 3:
- Delay: 2s + 4s + jitter
- Total time: ~6-7.5s

All Retries Failed (max_retries=3):
- Delay: 2s + 4s = 6s (no sleep after the final attempt)
- Total time: ~6-8s
Code Example¶
async def fetch_trending(self):
"""Fetch trending models with retry logic."""
async def _fetch():
response = await self.http_client.get(
"https://api.printables.com/trending",
timeout=10
)
if response.status_code == 429: # Rate limited
raise RateLimitException("API rate limit exceeded")
if response.status_code >= 500: # Server error
raise ServerErrorException("API server error")
return response.json()
return await self._retry_with_backoff(
_fetch,
max_retries=3,
base_delay=2
)
9. JOB-TO-TIMELAPSE MATCHING¶
Location: src/services/timelapse_service.py:579
Function: TimelapseService._match_to_job()
Complexity: C-13 (Cyclomatic Complexity)
Problem Statement¶
Bambu Lab printers create timelapse videos with filenames like:
- 202511081423.mp4 (timestamp format: YYYYMMDDHHmm)
- 20251108_142315.mp4 (timestamp + seconds)
Need to match these videos to print jobs in the database to enable:
- A "Show timelapse for this job" feature
- Job completion confirmation
- Print quality review

Challenge: The filename has no job ID, only a timestamp. Matching must rely on:
- Print start/end time correlation
- Printer ID
- Filename pattern analysis
Algorithm¶
FUNCTION match_timelapse_to_job(timelapse_filename, printer_id):
// PHASE 1: Parse timestamp from filename
// Format: YYYYMMDDHHmm.mp4 or YYYYMMDD_HHmmss.mp4
timestamp = extract_timestamp_from_filename(timelapse_filename)
IF NOT timestamp:
RETURN no_match("Cannot parse timestamp")
// PHASE 2: Query jobs around timestamp (±30 minutes)
// Timelapse creation may lag job completion by 5-15 minutes
time_window_start = timestamp - 30_minutes
time_window_end = timestamp + 30_minutes
candidate_jobs = query_jobs_in_time_window(
printer_id=printer_id,
start_time_min=time_window_start,
end_time_max=time_window_end
)
IF len(candidate_jobs) == 0:
RETURN no_match("No jobs in time window")
IF len(candidate_jobs) == 1:
RETURN exact_match(candidate_jobs[0]) // Only one candidate
// PHASE 3: Multiple candidates - find best match
best_match = None
best_score = 0
FOR EACH job IN candidate_jobs:
score = 0
// SCORING FACTOR 1: Time proximity (0-100 points)
// Timelapse timestamp usually matches job end time within 5-15 min
time_diff = abs(timestamp - job.end_time).minutes
IF time_diff < 5:
score += 100 // Perfect match
ELIF time_diff < 15:
score += 80 // Very likely
ELIF time_diff < 30:
score += 50 // Possible
ELSE:
score += 0 // Unlikely
// SCORING FACTOR 2: Job completed successfully (0-50 points)
// Timelapses usually only for successful prints
IF job.status == "completed":
score += 50
ELIF job.status == "failed":
score += 0 // Unlikely, but timelapse may still exist
// SCORING FACTOR 3: Job duration vs timelapse duration (0-30 points)
// If available, compare job print time to video length
IF timelapse.duration AND job.print_time:
duration_ratio = timelapse.duration / (job.print_time * 60) // seconds
// Timelapses are typically 10-60 seconds per hour
IF 0.2 < duration_ratio < 1.5: // Reasonable timelapse ratio
score += 30
IF score > best_score:
best_score = score
best_match = job
// PHASE 4: Confidence threshold (matches the confidence table below)
IF best_score >= 150:
RETURN high_confidence_match(best_match)
ELIF best_score >= 100:
RETURN medium_confidence_match(best_match)
ELIF best_score >= 50:
RETURN low_confidence_match(best_match)
ELSE:
RETURN no_match("Score below threshold")
Timestamp Parsing Patterns¶
| Filename Format | Example | Parsed Timestamp |
|---|---|---|
| YYYYMMDDHHmm | 202511081423.mp4 | 2025-11-08 14:23:00 |
| YYYYMMDD_HHmmss | 20251108_142315.mp4 | 2025-11-08 14:23:15 |
| Timestamp_suffix | timelapse_202511081423.mp4 | 2025-11-08 14:23:00 |
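All three patterns can be handled by one regular expression, as in this sketch; the pattern and helper name are illustrative, not the actual parser.

```python
import re
from datetime import datetime
from typing import Optional

# YYYYMMDD, optional underscore, HHmm, optional seconds, anywhere in the name
TS_RE = re.compile(r"(\d{8})_?(\d{4})(\d{2})?")

def extract_timestamp(filename: str) -> Optional[datetime]:
    """Parse the embedded timestamp from a timelapse filename, or None."""
    m = TS_RE.search(filename)
    if not m:
        return None
    date, hhmm, ss = m.group(1), m.group(2), m.group(3) or "00"
    try:
        return datetime.strptime(date + hhmm + ss, "%Y%m%d%H%M%S")
    except ValueError:
        return None  # digits present but not a real date

# extract_timestamp("202511081423.mp4")           -> 2025-11-08 14:23:00
# extract_timestamp("20251108_142315.mp4")        -> 2025-11-08 14:23:15
# extract_timestamp("timelapse_202511081423.mp4") -> 2025-11-08 14:23:00
```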
Match Confidence Levels¶
| Score | Confidence | Action |
|---|---|---|
| 150+ | High | Auto-link to job |
| 100-149 | Medium | Suggest to user |
| 50-99 | Low | List as candidate |
| <50 | Very Low | No suggestion |
Real-World Example¶
Scenario: Timelapse created at 2025-11-08 14:45:00
Candidate Jobs:
1. Job A: Started 14:15, Ended 14:43, Status: completed
- Time diff: 2 minutes (100 pts)
- Status completed: (50 pts)
- Total: 150 pts → HIGH CONFIDENCE
2. Job B: Started 13:00, Ended 14:20, Status: completed
- Time diff: 25 minutes (50 pts)
- Status completed: (50 pts)
- Total: 100 pts → MEDIUM CONFIDENCE
3. Job C: Started 14:30, Ended 14:42, Status: failed
- Time diff: 3 minutes (100 pts)
- Status failed: (0 pts)
- Total: 100 pts → MEDIUM CONFIDENCE
Result: Match to Job A (highest score, high confidence)
Known Limitations¶
- Multiple simultaneous prints: If two printers finish at same time, ambiguous
- Clock skew: If printer clock wrong, timestamps won't match
- Failed prints: May have timelapse even though job failed
- Manual timelapses: User-triggered timelapses don't correspond to jobs
- Missing job data: If job not tracked, timelapse can't be matched
10. MATERIAL COST ESTIMATION¶
Location: src/services/analytics_service.py, src/services/material_service.py
Complexity: Various functions
Problem Statement¶
Calculate accurate material costs for 3D prints to:
- Track expenses for business accounting
- Estimate customer quotes
- Monitor material inventory value
- Optimize print settings for cost

Factors:
- Filament weight used (grams)
- Material type (PLA, PETG, ABS, TPU, etc.)
- Material cost per kg
- Waste factor (purge, stringing, failed prints)
Algorithm¶
FUNCTION calculate_material_cost(job_metadata, material_inventory):
// PHASE 1: Extract material usage from metadata
filament_weight_g = job_metadata.total_filament_weight
IF NOT filament_weight_g:
RETURN 0 // No material data
// PHASE 2: Identify material type
material_type = job_metadata.material_type OR "PLA" // Default to PLA
// PHASE 3: Look up material cost from inventory
material_entry = material_inventory.find(type=material_type)
IF NOT material_entry:
// Use default cost if not in inventory
cost_per_kg = DEFAULT_COSTS[material_type]
ELSE:
cost_per_kg = material_entry.cost_per_kg
// PHASE 4: Convert weight to kg and calculate base cost
filament_weight_kg = filament_weight_g / 1000
base_cost = filament_weight_kg * cost_per_kg
// PHASE 5: Apply waste factor
// Typical waste: 5-10% for purge tower, stringing, failed starts
waste_factor = 1.10 // 10% waste allowance
adjusted_cost = base_cost * waste_factor
// PHASE 6: Add markup for business prints
IF job_metadata.is_business:
markup_factor = business_settings.material_markup OR 2.0 // 2x default
final_cost = adjusted_cost * markup_factor
ELSE:
final_cost = adjusted_cost
RETURN round(final_cost, 2) // Round to cents
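A compact Python sketch of the single-material calculation; the default costs come from the table below, and the parameter defaults mirror the pseudocode.

```python
from typing import Optional

DEFAULT_COSTS = {"PLA": 20.0, "PETG": 25.0, "ABS": 22.0, "TPU": 40.0}  # EUR per kg

def material_cost(weight_g: float, material_type: str = "PLA",
                  cost_per_kg: Optional[float] = None,
                  is_business: bool = False,
                  waste_factor: float = 1.10,
                  markup_factor: float = 2.0) -> float:
    """Sketch: weight -> base cost -> waste allowance -> optional business markup."""
    if not weight_g:
        return 0.0                                   # no material data
    if cost_per_kg is None:                          # fall back to the default table
        cost_per_kg = DEFAULT_COSTS.get(material_type, DEFAULT_COSTS["PLA"])
    cost = (weight_g / 1000) * cost_per_kg           # grams -> kg -> EUR
    cost *= waste_factor                             # purge/stringing allowance
    if is_business:
        cost *= markup_factor
    return round(cost, 2)

# Worked example from below: material_cost(35, "PETG", is_business=True)
# 0.035 * 25 = 0.875 -> * 1.10 = 0.9625 -> * 2.0 = 1.925 -> ~1.93 EUR
```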
Default Material Costs (EUR per kg)¶
| Material | Cost/kg | Typical Use Case |
|---|---|---|
| PLA | €20 | General purpose, beginner-friendly |
| PETG | €25 | Functional parts, outdoor use |
| ABS | €22 | Strong parts, automotive |
| TPU | €40 | Flexible parts, phone cases |
| ASA | €28 | Outdoor durability, UV resistance |
| Nylon | €45 | High strength, engineering |
| PC | €60 | Extreme strength, heat resistance |
Cost Calculation Example¶
Example: Business Print of Phone Stand
Metadata:
- Filament weight: 35g
- Material type: PETG
- Is business: true
Calculation:
1. Base cost: 35g = 0.035kg × €25/kg = €0.875
2. Waste factor: €0.875 × 1.10 = €0.963
3. Business markup: €0.963 × 2.0 = €1.93
Final cost: €1.93
Multi-Material Cost Calculation¶
FUNCTION calculate_multi_material_cost(job_metadata, material_inventory):
// Job uses multiple materials (e.g., PLA + support material)
material_weights = job_metadata.material_weights // [35g, 10g, 0g]
material_types = job_metadata.material_types // ["PLA", "BVOH", ""]
total_cost = 0
FOR i IN range(len(material_weights)):
IF material_weights[i] > 0:
weight_kg = material_weights[i] / 1000
cost_per_kg = lookup_material_cost(material_types[i])
total_cost += weight_kg * cost_per_kg
// Apply waste and markup as before
total_cost *= waste_factor
IF is_business:
total_cost *= markup_factor
RETURN total_cost
ADDITIONAL ALGORITHMS¶
The following algorithms are documented inline in their respective files:
- Job Status State Machine - src/services/job_service.py:291
- Event Loop Monitoring - src/services/event_service.py:111
- Database Migration - src/services/migration_service.py:25
- File Discovery Sync - src/services/file_discovery_service.py:144
- Preview Rendering - src/services/preview_render_service.py:372
PERFORMANCE TUNING GUIDE¶
General Optimization Strategies¶
- Cache Expensive Operations
  - Metadata extraction: Cache for 24 hours
  - Thumbnail processing: Store all sizes at once
  - Search results: Cache for 5 minutes
- Batch Operations
  - File discovery: Batch printer queries
  - Database updates: Batch inserts/updates
  - Event emissions: Batch non-critical events
- Async Where Possible
  - Network I/O: Always async
  - File I/O: Use aiofiles for large files
  - Database: Use async drivers
- Fail Fast
  - Validate inputs early
  - Check prerequisites before expensive ops
  - Use timeouts on all network calls
Algorithm-Specific Tuning¶
Auto-Download Filename Matching:
- Cache printer file lists for 60 seconds (see the sketch after this list)
- Skip variant generation if the exact match succeeds
- Implement an LRU cache for successful mappings

Search Filtering:
- Push filters to database WHERE clauses
- Add database indexes on filtered fields
- Implement result count estimation

Metadata Extraction:
- Parallel processing for multiple files
- Skip re-extraction if the file hash is unchanged
- Stream large files instead of loading them fully
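For the 60-second file-list cache, a minimal time-based cache is enough. The `TTLCache` below is a hypothetical helper sketched for illustration, not an existing Printernizer class.

```python
import time

class TTLCache:
    """Minimal time-based cache, e.g. for per-printer file lists (60 s default)."""
    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self._store: dict = {}          # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry and entry[0] > time.monotonic():
            return entry[1]
        self._store.pop(key, None)      # drop expired or missing entries
        return None

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)

# Usage: cache file lists per printer so repeated variant attempts skip the FTP call
# files = cache.get(printer_id) or fetch_file_list_from_printer(printer_id)
```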
TROUBLESHOOTING¶
Algorithm Not Working as Expected¶
- Enable debug logging
- Check input data quality:
  - Validate all inputs before running the algorithm
  - Log intermediate results
- Compare with expected test cases
- Profile performance
- Test with edge cases:
  - Empty inputs
  - Very large inputs
  - Malformed data
  - Null values
Common Issues¶
Issue: Filename matching always fails
Cause: Printer firmware version changed the filename format
Solution: Add a new variant generation strategy

Issue: Search results slow
Cause: Too many filters applied in memory
Solution: Push filters to the database level

Issue: Metadata extraction returns wrong values
Cause: Slicer changed field names
Solution: Add fallback mapping for the new field names
REFERENCES¶
- Radon Complexity Tool: https://radon.readthedocs.io/
- Big-O Cheat Sheet: https://www.bigocheatsheet.com/
- Python Performance Tips: https://wiki.python.org/moin/PythonSpeed/PerformanceTips
- Exponential Backoff: https://aws.amazon.com/blogs/architecture/exponential-backoff-and-jitter/
For questions or improvements to these algorithms, see:
- COMPLEX_LOGIC_INVENTORY.md - Full complexity analysis
- TECHNICAL_DEBT_ASSESSMENT.md - Code quality assessment
- GitHub Issues - Report algorithm bugs or suggest optimizations