AI

Async AI client with support for OpenAI-compatible providers, fal (image generation), and Modal (serverless inference). Access it via derp.ai.

Config

# derp.toml
[ai]
api_key = "$OPENAI_API_KEY"
# base_url = "https://openrouter.ai/api/v1"  # for other OpenAI-compatible providers
# fal_api_key = "$FAL_API_KEY"

# [ai.modal]
# token_id = "$MODAL_TOKEN_ID"
# token_secret = "$MODAL_TOKEN_SECRET"
# endpoint_url = "https://your-app.modal.run"

Chat

response = await derp.ai.chat(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.content)  # "Hi there!"
print(response.usage)    # Usage(prompt_tokens=10, ...)

Returns a ChatResponse with content, role, model, usage, and finish_reason.

Streaming

async for chunk in derp.ai.stream_chat(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
):
    print(chunk.delta, end="")

Each ChatChunk carries delta, role, model, and lifecycle flags (is_first, is_last). The final chunk includes finish_reason and usage (when the provider supports it).
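
A common pattern is to join the deltas into the full reply while watching the lifecycle flags. The sketch below is one way to do that. `FakeChunk` and `fake_stream` are hypothetical stand-ins that only mirror the fields named above, so the example runs without a provider. With the real client you would pass `derp.ai.stream_chat(...)` to `collect` instead.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Optional


@dataclass
class FakeChunk:
    """Hypothetical stand-in mirroring the ChatChunk fields described above."""
    delta: str
    is_first: bool = False
    is_last: bool = False
    finish_reason: Optional[str] = None


async def fake_stream() -> AsyncIterator[FakeChunk]:
    # Stand-in for derp.ai.stream_chat(...); yields three chunks.
    yield FakeChunk("Hi", is_first=True)
    yield FakeChunk(" there")
    yield FakeChunk("!", is_last=True, finish_reason="stop")


async def collect(stream: AsyncIterator[FakeChunk]) -> tuple[str, Optional[str]]:
    """Join all deltas and return (text, finish_reason of the last chunk)."""
    parts: list[str] = []
    finish_reason = None
    async for chunk in stream:
        parts.append(chunk.delta)
        if chunk.is_last:
            finish_reason = chunk.finish_reason
    return "".join(parts), finish_reason


text, reason = asyncio.run(collect(fake_stream()))
```

Reading `finish_reason` only when `is_last` is set follows the description above, where the final chunk is the one that carries it.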

Protocol Adapters

Both ChatResponse and ChatChunk have vercel_ai_json() and tanstack_ai_json() methods that return SSEEvent objects ready for streaming.

Vercel AI SDK:

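from uuid import uuid4

from fastapi.responses import StreamingResponse

# Assumes `app` is your FastAPI instance and `ChatRequest` is a request model
# with `model` and `messages` fields.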

@app.post("/api/chat")
async def chat(request: ChatRequest):
    mid = f"msg-{uuid4().hex}"

    async def sse():
        async for chunk in derp.ai.stream_chat(
            model=request.model, messages=request.messages
        ):
            for event in chunk.vercel_ai_json(message_id=mid):
                yield event.dump()

    return StreamingResponse(sse(), media_type="text/event-stream")

TanStack AI (same endpoint as above; only the adapter call changes):

async def sse():
    async for chunk in derp.ai.stream_chat(
        model=request.model, messages=request.messages
    ):
        for event in chunk.tanstack_ai_json(message_id=mid):
            yield event.dump()

Non-streaming responses work the same way:

response = await derp.ai.chat(model="gpt-4o-mini", messages=messages)

async def sse():
    for event in response.vercel_ai_json():
        yield event.dump()

Each SSEEvent is a dict subclass with a .dump() method that serializes it to data: {...}\n\n. The final event in a complete sequence is always SSEDone, which dumps to data: [DONE]\n\n.
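
To make the wire format concrete, here is a minimal sketch of how a dict subclass could produce those frames. `SketchSSEEvent` and `SketchSSEDone` are hypothetical names, not the library's classes, and the payload keys are invented for illustration rather than taken from either protocol.

```python
import json


class SketchSSEEvent(dict):
    """Hypothetical dict subclass that serializes itself as one SSE frame."""

    def dump(self) -> str:
        # An SSE frame is a "data:" line terminated by a blank line.
        return f"data: {json.dumps(self)}\n\n"


class SketchSSEDone(SketchSSEEvent):
    """Hypothetical terminator event; its contents are ignored when dumped."""

    def dump(self) -> str:
        return "data: [DONE]\n\n"


frames = [
    SketchSSEEvent(type="text-delta", delta="Hi").dump(),  # illustrative payload
    SketchSSEDone().dump(),
]
```

Because each frame is a complete string, a generator can yield `event.dump()` directly into a `text/event-stream` response, as the endpoint examples above do.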

Fal (Image Generation)

# One-shot: submit, poll, and return the result
result = await derp.ai.fal_call(
    "fal-ai/flux",
    inputs={"prompt": "a cat in space"},
)

# Or manage the lifecycle yourself
request_id = await derp.ai.fal_submit(
    "fal-ai/flux",
    inputs={"prompt": "a cat in space"},
)
status = await derp.ai.fal_poll("fal-ai/flux", request_id)
if status.is_completed:
    ...
elif status.is_queued:
    print(f"Position: {status.position}")

# Cancel
result = await derp.ai.fal_cancel("fal-ai/flux", request_id)
if result.is_cancelled and result.job_queued:
    # Never started, safe to skip billing
    ...
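
When you manage the lifecycle yourself, you will usually poll in a loop until the job completes. The sketch below shows one way to do that. `poll_until_done`, `FakeStatus`, and `fake_poll` are hypothetical helpers written for this example, and the `interval` and `timeout` parameters are assumptions rather than library options.

```python
import asyncio
import time
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass
class FakeStatus:
    """Hypothetical stand-in exposing the is_completed flag used above."""
    is_completed: bool


async def poll_until_done(
    poll: Callable[[], Awaitable[FakeStatus]],
    interval: float = 1.0,
    timeout: float = 60.0,
) -> FakeStatus:
    """Call poll() every `interval` seconds until completion or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        status = await poll()
        if status.is_completed:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not complete in time")
        await asyncio.sleep(interval)


# Demo: a fake poller that reports completion on the third call.
calls = 0


async def fake_poll() -> FakeStatus:
    global calls
    calls += 1
    return FakeStatus(is_completed=calls >= 3)


final = asyncio.run(poll_until_done(fake_poll, interval=0.01))
```

With the real client, the `poll` argument could be `lambda: derp.ai.fal_poll("fal-ai/flux", request_id)`, assuming the returned status exposes `is_completed` as shown earlier.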