Deploying a Model

There are two ways to deploy a model: let the CLI generate the code automatically, or write the pipeline definition by hand.

Option A: CLI (recommended)

uv sync --extra cli
export GROQ_API_KEY="<your-key>"

inference-engine deploy ./my_model.pkl

The CLI inspects the artifact, generates the model's load() and predict() methods, validates the resulting pipeline, and writes all files. See CLI: deploy for the full reference.

Option B: Manual

1. Implement the model

# app/domain/models/my_model.py
from app.domain.models.base import BaseModel

class MyModel(BaseModel):
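    def load(self) -> None:
        import joblib  # imported lazily, so it is only needed when the model loads
        # The path is relative to the process's working directory.
        self._clf = joblib.load("model_artifacts/my_model/v1/model.pkl")

    def predict(self, x):
        # The estimator expects a batch, so wrap the single sample in a list
        # and unwrap the single prediction.
        return self._clf.predict([x])[0]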
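
The load()/predict() contract can be exercised without a real artifact. The sketch below is self-contained: BaseModel here is a stand-in for app.domain.models.base.BaseModel (its real interface may include more), and ThresholdModel is a hypothetical model invented for illustration. It also shows one way to guard against predict() being called before load().

```python
from abc import ABC, abstractmethod
from typing import Any, Optional, Sequence


class BaseModel(ABC):
    """Stand-in for app.domain.models.base.BaseModel (assumed interface)."""

    @abstractmethod
    def load(self) -> None: ...

    @abstractmethod
    def predict(self, x: Any) -> Any: ...


class ThresholdModel(BaseModel):
    """Hypothetical model: returns 1 if the feature sum exceeds a threshold."""

    def __init__(self) -> None:
        self._threshold: Optional[float] = None  # populated by load()

    def load(self) -> None:
        # A real model would read its artifact here, e.g. with joblib.load(...).
        self._threshold = 5.0

    def predict(self, x: Sequence[float]) -> int:
        if self._threshold is None:
            raise RuntimeError("call load() before predict()")
        return int(sum(x) > self._threshold)


model = ThresholdModel()
model.load()
print(model.predict([1.0, 2.0, 3.0]))  # feature sum 6.0 is above the 5.0 threshold
```

Keeping artifact loading out of __init__ means the object can be constructed cheaply, and the cost of reading the artifact is paid only when load() is called.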

2. Add preprocessor / postprocessor

from app.domain.processing.pre import BasePreprocessor

class MyPreprocessor(BasePreprocessor):
    def transform(self, raw_input):
        return [float(v) for v in raw_input["features"]]

Use IdentityPreprocessor / IdentityPostprocessor if no transformation is needed.
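
A preprocessor is also a natural place to reject malformed input, and a postprocessor can turn a raw prediction into a response-friendly shape. The sketch below is a self-contained illustration under assumptions: both base classes are stand-ins, the postprocessor's transform() method name is assumed to mirror the preprocessor's, and StrictPreprocessor and LabelPostprocessor are hypothetical names.

```python
from typing import Any, Dict, List, Mapping


class BasePreprocessor:
    """Stand-in for the project's BasePreprocessor (assumed interface)."""

    def transform(self, raw_input: Any) -> Any:
        raise NotImplementedError


class BasePostprocessor:
    """Stand-in for a postprocessor base class (name and method are assumptions)."""

    def transform(self, prediction: Any) -> Any:
        raise NotImplementedError


class StrictPreprocessor(BasePreprocessor):
    """Hypothetical preprocessor that validates input before converting it."""

    def transform(self, raw_input: Mapping[str, Any]) -> List[float]:
        if "features" not in raw_input:
            raise ValueError("input must contain a 'features' key")
        return [float(v) for v in raw_input["features"]]


class LabelPostprocessor(BasePostprocessor):
    """Hypothetical postprocessor that maps a class index to a label."""

    LABELS = {0: "negative", 1: "positive"}

    def transform(self, prediction: int) -> Dict[str, str]:
        return {"label": self.LABELS[int(prediction)]}


features = StrictPreprocessor().transform({"features": ["1", 2, 3.5]})
print(features)
print(LabelPostprocessor().transform(1))
```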

3. Write the pipeline definition

# models/my_model/v1/definition.py
from app.domain.pipelines.base import InferencePipeline
from app.domain.processing.post import IdentityPostprocessor

MODEL_NAME = "my_model"
MODEL_VERSION = "v1"

def build_pipeline() -> InferencePipeline:
    from app.domain.models.my_model import MyModel
    # Adjust this import to the module where MyPreprocessor is defined.
    from app.domain.processing.pre import MyPreprocessor
    model = MyModel()
    model.load()
    return InferencePipeline(
        preprocessor=MyPreprocessor(),
        model=model,
        postprocessor=IdentityPostprocessor(),
    )

Place this file at models/my_model/v1/definition.py. The registry auto-discovers it at startup.
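
To make the directory convention concrete, here is a rough sketch of how a registry could discover definition files. It is not the project's registry implementation: discover_definitions is a hypothetical helper, and the only things taken from this page are the <model>/<version>/definition.py layout and the MODEL_NAME and MODEL_VERSION constants.

```python
import importlib.util
from pathlib import Path
from types import ModuleType
from typing import Dict, Tuple


def discover_definitions(root: Path) -> Dict[Tuple[str, str], ModuleType]:
    """Import every <root>/<model>/<version>/definition.py and index the
    modules by (MODEL_NAME, MODEL_VERSION). Hypothetical helper."""
    found: Dict[Tuple[str, str], ModuleType] = {}
    for path in sorted(root.glob("*/*/definition.py")):
        # Give each module a unique name so separate definitions do not collide.
        mod_name = f"_definition_{path.parent.parent.name}_{path.parent.name}"
        spec = importlib.util.spec_from_file_location(mod_name, path)
        if spec is None or spec.loader is None:
            continue  # not importable as a Python source file
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        found[(module.MODEL_NAME, module.MODEL_VERSION)] = module
    return found
```

Calling discover_definitions(Path("models")) from the project root would return a mapping such as {("my_model", "v1"): <module>}, from which a registry could call build_pipeline() for each entry.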

4. Add a routing rule (optional)

# app/config/routing.py
ROUTES = {
    "my_model": {"strategy": "static", "version": "v1"},
}
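
The effect of a static rule can be sketched as a small lookup. This is an illustration under assumptions, not the project's router: resolve_version is a hypothetical function, and it assumes that an explicitly requested version takes precedence over the routing rule.

```python
from typing import Dict, Optional

ROUTES: Dict[str, Dict[str, str]] = {
    "my_model": {"strategy": "static", "version": "v1"},
}


def resolve_version(
    model: str,
    requested: Optional[str] = None,
    routes: Dict[str, Dict[str, str]] = ROUTES,
) -> str:
    """Pick the version to serve (hypothetical helper)."""
    if requested is not None:
        return requested  # assumption: an explicit version wins
    rule = routes.get(model)
    if rule is None:
        raise KeyError(f"no version given and no route for {model!r}")
    if rule["strategy"] != "static":
        # Only the "static" strategy appears on this page.
        raise NotImplementedError(f"unsupported strategy {rule['strategy']!r}")
    return rule["version"]


print(resolve_version("my_model"))        # falls back to the static rule
print(resolve_version("my_model", "v2"))  # explicit version is used as-is
```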

5. Restart and test

uvicorn app.adapters.http.app:app --reload

curl -X POST http://localhost:8000/predict \
  -H "X-API-Key: dev-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "my_model", "version": "v1", "data": {"features": [1,2,3]}}'
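
The same request can be issued from Python, which is convenient in smoke tests. The sketch below uses only the standard library and mirrors the curl call above. The helper names build_predict_request and send are hypothetical, and send() assumes the endpoint answers with a JSON body, which this page does not specify.

```python
import json
import urllib.request


def build_predict_request(features, model="my_model", version="v1",
                          base_url="http://localhost:8000", api_key="dev-key"):
    """Build the POST /predict request without sending it."""
    payload = {"model": model, "version": version, "data": {"features": features}}
    return urllib.request.Request(
        f"{base_url}/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=5.0):
    """Send the request and decode the body as JSON (assumed response format).
    Requires the server from the previous step to be running."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_predict_request([1, 2, 3])
print(request.get_method(), request.full_url)
```

With the server running, send(request) performs the call; urllib raises urllib.error.HTTPError for non-2xx responses, so a rejected API key or unknown model surfaces as an exception.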

Checklist

[ ] BaseModel subclass with load() and predict()
[ ] Preprocessor (or IdentityPreprocessor)
[ ] Postprocessor (or IdentityPostprocessor)
[ ] Validator (or omit for NoOpValidator)
[ ] definition.py with MODEL_NAME, MODEL_VERSION, build_pipeline()
[ ] Routing rule (if clients should be able to omit the version)
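
The definition.py item in the checklist can be verified mechanically. A minimal sketch, assuming only the three names listed above; missing_definition_items is a hypothetical helper and not part of the project.

```python
from types import ModuleType
from typing import List


def missing_definition_items(module: ModuleType) -> List[str]:
    """Return the names a definition module still lacks (hypothetical helper)."""
    missing = []
    for name in ("MODEL_NAME", "MODEL_VERSION"):
        # Both constants are expected to be strings.
        if not isinstance(getattr(module, name, None), str):
            missing.append(name)
    if not callable(getattr(module, "build_pipeline", None)):
        missing.append("build_pipeline")
    return missing


# Example with an in-memory module that only defines MODEL_NAME.
incomplete = ModuleType("definition")
incomplete.MODEL_NAME = "my_model"
print(missing_definition_items(incomplete))
```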