Execution Configuration
File: app/config/execution.py
This module maps model:version keys to executor names. Any key not listed falls back to DEFAULT_EXECUTOR.
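As a rough illustration of the lookup described above, the sketch below assumes EXECUTION_POLICY is a plain dict keyed by "model:version" strings and that DEFAULT_EXECUTOR is "cpu". The model keys and the resolve_executor_name helper are hypothetical and are not taken from app/config/execution.py.

```python
# Hypothetical sketch of app/config/execution.py; all entries are illustrative.
DEFAULT_EXECUTOR = "cpu"  # assumed default

# Assumed shape: "model:version" -> executor name.
EXECUTION_POLICY: dict[str, str] = {
    "resnet50:1": "gpu",         # hypothetical model key
    "tiny-classifier:2": "cpu",  # hypothetical model key
}


def resolve_executor_name(model: str, version: str) -> str:
    """Return the executor name for a model/version, or the default if unlisted."""
    return EXECUTION_POLICY.get(f"{model}:{version}", DEFAULT_EXECUTOR)
```

With these assumed entries, a listed key resolves to its configured executor, and any other key resolves to DEFAULT_EXECUTOR.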
Available executors
| Name | Class | Workers |
|---|---|---|
| cpu | InferenceExecutor(device="cpu") | 8 |
| gpu | InferenceExecutor(device="gpu") | 2 |
Adding a custom executor
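Register it in app/adapters/http/deps.py:

    from functools import lru_cache


    @lru_cache
    def get_onnx_executor():
        # Local import: the module is loaded only when the executor is first built.
        from app.execution.onnx_executor import OnnxExecutor

        # lru_cache ensures the same executor instance is reused across calls.
        return OnnxExecutor(max_workers=4)


    @lru_cache
    def get_execution_policy() -> ExecutionPolicy:
        # Each key is the executor name that EXECUTION_POLICY entries refer to.
        return ExecutionPolicy(
            executors={
                "cpu": get_cpu_executor(),
                "gpu": get_gpu_executor(),
                "onnx": get_onnx_executor(),
            },
            policy=EXECUTION_POLICY,
            default=DEFAULT_EXECUTOR,
        )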
Then reference "onnx" in EXECUTION_POLICY.
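For instance, routing one model to the new executor might look like the following. This is a minimal sketch: the model key is hypothetical, DEFAULT_EXECUTOR is assumed to be "cpu", and unregistered_names is an assumed convenience helper for catching names that were never registered, not part of the codebase.

```python
# Hypothetical EXECUTION_POLICY entry routing one model to the "onnx" executor.
DEFAULT_EXECUTOR = "cpu"  # assumed default
EXECUTION_POLICY: dict[str, str] = {
    "bert-base:3": "onnx",  # hypothetical model:version key
}

# Executor names registered in get_execution_policy().
REGISTERED_EXECUTORS = {"cpu", "gpu", "onnx"}


def unregistered_names(
    policy: dict[str, str], default: str, registered: set[str]
) -> set[str]:
    """Return executor names used by the policy or the default that are not registered."""
    return ({default} | set(policy.values())) - registered
```

An empty result means every name referenced in the policy, including the default, has a registered executor.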
See Execution Backends for executor details.