Chat

llm.chat(...) is the smallest entry point and fits most text-only workloads.

Prompt Mode

from republic import LLM

llm = LLM(model="openrouter:openrouter/free", api_key="<API_KEY>")
out = llm.chat("Output exactly one word: ready", max_tokens=8)
print(out)

Messages Mode

# Reuses the llm instance created in the Prompt Mode example.
messages = [
    {"role": "system", "content": "Be concise."},
    {"role": "user", "content": "Explain tape-first in one sentence."},
]
out = llm.chat(messages=messages, max_tokens=48)
print(out)

Structured Error Handling

from republic import ErrorPayload, LLM

llm = LLM(model="openrouter:openrouter/free", api_key="<API_KEY>")

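try:
    out = llm.chat("Write one sentence.", max_tokens=32)
    print(out)
except ErrorPayload as error:
    # error.kind separates transient failures from permanent ones.
    if error.kind == "temporary":
        print("retry later")
    else:
        # Any other kind is treated as non-retryable here.
        print("fail fast:", error.message)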

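The "temporary" branch above can be turned into a small retry loop with backoff. The following is a minimal sketch rather than part of republic: retry_temporary, is_temporary, and FakeError are hypothetical names, and the stand-in function flaky takes the place of llm.chat(...) so the block runs without an API key.

```python
import time


def retry_temporary(call, is_temporary, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call `call()` up to `attempts` times, retrying only temporary errors.

    Hypothetical helper, not a republic API. `is_temporary(exc)` decides
    whether an exception is worth retrying; anything else is re-raised at once.
    """
    for attempt in range(attempts):
        try:
            return call()
        except Exception as exc:
            last_attempt = attempt == attempts - 1
            if last_attempt or not is_temporary(exc):
                raise  # fail fast, or give up after the final attempt
            sleep(base_delay * 2 ** attempt)  # exponential backoff: 0.5s, 1.0s, ...


class FakeError(Exception):
    """Stand-in for an error object that carries `kind` and `message`."""

    def __init__(self, kind, message):
        super().__init__(message)
        self.kind = kind
        self.message = message


calls = {"count": 0}


def flaky():
    """Stand-in for llm.chat(...): fails twice with a temporary error, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise FakeError("temporary", "rate limited")
    return "ready"


delays = []
result = retry_temporary(
    flaky,
    is_temporary=lambda exc: getattr(exc, "kind", None) == "temporary",
    sleep=delays.append,  # record delays instead of sleeping, to keep the demo fast
)
print(result, delays)
```

With a real client, call would be something like lambda: llm.chat("Write one sentence.", max_tokens=32). If the client's built-in max_retries already covers your case, an outer loop like this is unnecessary.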
Retries and Fallback

Note: max_retries is the number of retries after the first attempt (total attempts per model is 1 + max_retries).

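from republic import LLM

llm = LLM(
    model="openai:gpt-4o-mini",  # primary model
    fallback_models=["anthropic:claude-3-5-sonnet-latest"],  # used if the primary keeps failing
    max_retries=3,  # up to 1 + 3 = 4 attempts per model
    api_key={"openai": "<OPENAI_KEY>", "anthropic": "<ANTHROPIC_KEY>"},  # one key per provider
)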

out = llm.chat("Give me one deployment checklist item.")

Recommendation: keep max_retries small (for example, 2-4), since each retry adds latency to a request that is already failing, and choose fallback models that are more reliable than the primary while still meeting your quality requirements.
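To see why a small max_retries matters, it helps to count the worst-case number of requests. The sketch below assumes that every fallback model gets the same retry budget as the primary and that models are tried one after another. The note above states the per-model count, but the scheduling of fallbacks is an assumption here, and worst_case_attempts is a hypothetical helper, not a republic function.

```python
def worst_case_attempts(max_retries: int, fallback_count: int) -> int:
    """Upper bound on requests for one chat call (assumed model, see lead-in)."""
    attempts_per_model = 1 + max_retries  # first attempt plus retries
    models = 1 + fallback_count  # primary model plus each fallback
    return attempts_per_model * models


# The configuration above: max_retries=3 with one fallback model.
print(worst_case_attempts(max_retries=3, fallback_count=1))  # 8
```

Under these assumptions the example configuration can issue up to 8 requests before giving up, and each additional retry raises that bound by one request per configured model.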