unilab.ipc.collector_error
Cross-process error propagation for collector subprocesses.
Uses a Pipe + ExceptionWrapper pattern (proven by PyTorch DataLoader and CPython’s ProcessPoolExecutor) to ensure collector tracebacks always reach the parent process — even when stderr is lost or interleaved.
Functions
- collector_error_guard: Context manager that catches all exceptions in collector subprocess.
- create_error_pipe: Create a unidirectional pipe for error reporting.
- Format a human-readable death report for a collector process.
Classes
- ExceptionWrapper: Picklable exception + traceback for cross-process propagation.
- class unilab.ipc.collector_error.ExceptionWrapper
Bases: object
Picklable exception + traceback for cross-process propagation.
Stores the exception type and a pre-formatted traceback string (not exc_info — that would create reference cycles preventing GC of objects in the exception scope).
- Parameters:
where (str)
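As a rough illustration of the idea described above, the following sketch formats the traceback eagerly and keeps only the exception type and strings. The class name `ExceptionWrapperSketch`, its attributes, the default value of `where`, and the `reraise` method are all hypothetical; the real `ExceptionWrapper` may differ in every detail.

```python
import sys
import traceback


class ExceptionWrapperSketch:
    """Hypothetical stand-in for ExceptionWrapper, not the library's code."""

    def __init__(self, exc_info=None, where="in collector"):
        if exc_info is None:
            exc_info = sys.exc_info()
        self.exc_type = exc_info[0]
        # Format eagerly: storing exc_info itself would keep the traceback
        # frames (and every local in them) alive via reference cycles.
        self.exc_msg = "".join(traceback.format_exception(*exc_info))
        self.where = where

    def reraise(self):
        """Raise a RuntimeError that carries the original traceback text."""
        raise RuntimeError(
            f"Caught {self.exc_type.__name__} {self.where}.\n"
            f"Original traceback:\n{self.exc_msg}"
        )


def wrap_failure():
    """Trigger a KeyError and capture it while the exception is active."""
    try:
        {}["missing"]
    except KeyError:
        return ExceptionWrapperSketch(where="in demo")
```

Because the instance holds only a class reference and plain strings, it can be pickled as long as the exception class is importable in the receiving process.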
- unilab.ipc.collector_error.create_error_pipe()
Create a unidirectional pipe for error reporting.
Returns (recv_conn, send_conn). The parent keeps recv_conn; the child gets send_conn.
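A one-way pipe with this return order can be obtained from the standard library, since `multiprocessing.Pipe(duplex=False)` returns the receive-only end first and the send-only end second. The sketch below assumes that is roughly what happens internally; `create_error_pipe_sketch` and `demo_roundtrip` are hypothetical names, and the round trip runs inside one process only to keep the example short.

```python
import multiprocessing as mp


def create_error_pipe_sketch():
    """Return (recv_conn, send_conn) for a one-way error channel."""
    # duplex=False gives a receive-only end and a send-only end, in that order.
    recv_conn, send_conn = mp.Pipe(duplex=False)
    return recv_conn, send_conn


def demo_roundtrip():
    """Send one message through the pipe and read it back."""
    recv_conn, send_conn = create_error_pipe_sketch()
    send_conn.send("boom")       # the child's side of the channel
    if recv_conn.poll(1.0):      # the parent polls instead of blocking forever
        return recv_conn.recv()
    return None
```

Polling with a timeout on the parent side avoids blocking indefinitely when the child exits without reporting anything.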
- unilab.ipc.collector_error.collector_error_guard(error_conn=None, metrics_queue=None, stop_event=None, label='collector')
Context manager that catches all exceptions in a collector subprocess.
Sends a picklable ExceptionWrapper through the error pipe so the parent process can surface the full traceback. Also pushes to metrics_queue for fast-path detection by the training loop.
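The reporting flow described above might look roughly like the following sketch. Several things here are assumptions rather than documented behavior: a plain dict stands in for `ExceptionWrapper`, the `("collector_error", label)` tuple pushed to the queue is an invented message shape, setting `stop_event` on failure is a guess at what that parameter is for, and the guard swallows the exception after reporting it. The name `error_guard_sketch` is hypothetical.

```python
import contextlib
import multiprocessing as mp
import queue
import threading
import traceback


@contextlib.contextmanager
def error_guard_sketch(error_conn=None, metrics_queue=None,
                       stop_event=None, label="collector"):
    """Hypothetical guard: report any exception raised in the body."""
    try:
        yield
    except Exception as exc:
        # Assumption: a dict replaces the real ExceptionWrapper payload.
        payload = {
            "label": label,
            "exc_type": type(exc).__name__,
            "traceback": traceback.format_exc(),
        }
        if error_conn is not None:
            try:
                error_conn.send(payload)   # full traceback for the parent
            except (OSError, ValueError):
                pass                       # pipe already closed
        if metrics_queue is not None:
            try:
                # Assumption: invented message shape for fast-path detection.
                metrics_queue.put_nowait(("collector_error", label))
            except queue.Full:
                pass
        if stop_event is not None:
            stop_event.set()               # assumption: signal shutdown
        # Assumption: the exception is swallowed once it has been reported.


def demo():
    """Run the guard in-process and collect what it reported."""
    recv_conn, send_conn = mp.Pipe(duplex=False)
    metrics = queue.Queue()
    stop = threading.Event()
    with error_guard_sketch(send_conn, metrics, stop, label="collector-0"):
        raise ValueError("bad rollout")
    report = recv_conn.recv() if recv_conn.poll(1.0) else None
    return report, metrics.get_nowait(), stop.is_set()
```

In a real deployment the body of the `with` block would be the collector's main loop running in the child process, and the parent would poll `recv_conn` when it notices the child has died or the queue reports an error.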