Super AI Engineer LLM documentation

Reference material for model usage, integration, deployment, and limitations.

Quickstart

Install the OpenAI SDK and point it at the Super AI Engineer LLM endpoint.

pip install openai

from openai import OpenAI

client = OpenAI(
    base_url="https://api.example.com/v1",
    api_key="YOUR_API_KEY"
)

response = client.chat.completions.create(
    model="thai-llm-chat-v0.1",
    messages=[
        # Thai prompt: "Please explain machine learning in simple Thai."
        {"role": "user", "content": "ช่วยอธิบาย Machine Learning เป็นภาษาไทยง่าย ๆ"}
    ],
    temperature=0.7
)

print(response.choices[0].message.content)

API Reference

Base URL: https://api.example.com/v1

POST /v1/chat/completions: chat completion (streaming optional)
GET /v1/models: list available models

Paths are written in full; the base URL above already includes the /v1 prefix.
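
As a rough sketch, the models endpoint can be queried through the same SDK client used in the Quickstart. `available_model_ids` is a hypothetical helper name, and the code assumes the endpoint returns OpenAI-style model objects that carry an `id` field.

```python
from typing import Any, List


def available_model_ids(client: Any) -> List[str]:
    """Return the model IDs reported by GET /v1/models.

    `client` is expected to be an openai.OpenAI instance configured with
    the base URL and API key shown in the Quickstart.
    """
    # Iterating the result of client.models.list() yields model objects.
    return [model.id for model in client.models.list()]
```

Calling `available_model_ids(client)` before sending chat requests is one way to confirm that the model name you plan to use is actually being served.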

Authentication

Pass your key as a Bearer token. Keep it server-side; never expose it in a browser.

Authorization: Bearer YOUR_API_KEY
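
One way to keep the key server-side is to read it from an environment variable when the request is built. The sketch below assumes a variable named `SUPER_AI_API_KEY`; that name and the `auth_headers` helper are hypothetical, not part of the API.

```python
import os
from typing import Dict


def auth_headers(env_var: str = "SUPER_AI_API_KEY") -> Dict[str, str]:
    """Build the Authorization header from a server-side environment variable.

    The default variable name is a placeholder; use whatever your
    deployment defines.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        # Fail loudly rather than send an unauthenticated request.
        raise RuntimeError(f"Environment variable {env_var} is not set")
    return {"Authorization": f"Bearer {api_key}"}
```

Because the key never appears in source code or in a client-side bundle, rotating it only requires changing the environment of the server process.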

Chat Completion

cURL example (the Thai message means "Hello"):

curl https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "thai-llm-chat-v0.1",
    "messages": [
      {"role": "user", "content": "สวัสดีครับ"}
    ],
    "temperature": 0.7
  }'
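
For environments without the SDK, the same request can be assembled with the Python standard library. This is a minimal sketch that mirrors the cURL call above; `build_chat_request` is a hypothetical helper, and it only builds the request object without sending it.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com/v1"  # base URL from the API Reference


def build_chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) the POST request from the cURL example."""
    payload = {
        "model": "thai-llm-chat-v0.1",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        # json.dumps escapes non-ASCII characters, so Thai text is sent safely.
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends the request. Assuming the endpoint follows the OpenAI response format, as the Quickstart implies, the reply text is at `choices[0].message.content` in the returned JSON.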

Streaming

Set stream=True to receive tokens as they are generated.

stream = client.chat.completions.create(
    model="thai-llm-chat-v0.1",
    # Thai prompt: "Tell a short story."
    messages=[{"role": "user", "content": "เล่านิทานสั้น ๆ"}],
    stream=True
)

for chunk in stream:
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)
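
If the full reply is needed after streaming finishes, the deltas can be accumulated instead of printed. The sketch below uses a hypothetical `collect_stream` helper and also skips chunks whose `choices` list is empty, which some OpenAI-compatible servers emit (whether this endpoint does is an assumption).

```python
from typing import Any, Iterable, List


def collect_stream(stream: Iterable[Any]) -> str:
    """Concatenate the text deltas of a streamed chat completion."""
    parts: List[str] = []
    for chunk in stream:
        # Skip chunks with no choices (for example a usage-only chunk).
        if not chunk.choices:
            continue
        # content is None on chunks that carry only a role or finish reason.
        parts.append(chunk.choices[0].delta.content or "")
    return "".join(parts)
```

Usage would be `text = collect_stream(stream)`, where `stream` is the object returned by `client.chat.completions.create(..., stream=True)`.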

Deployment

Serve the model with vLLM or TGI on B200 or LANTA, then switch the portal to real mode by setting LLM_API_MODE=real in its .env file.

# Serve with vLLM (OpenAI-compatible) on B200 / LANTA
python -m vllm.entrypoints.openai.api_server \
  --model thai-llm/thai-llm-chat-v0.1 \
  --served-model-name thai-llm-chat-v0.1 \
  --port 8000

# Point the portal at it (.env)
# LLM_API_MODE=real
# LLM_API_BASE_URL=http://your-b200-host:8000
# LLM_API_KEY=your-key
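
Before switching the portal over, it can help to confirm that the server answers on its models route. The following is a sketch under stated assumptions: `models_url` and `served_models` are hypothetical helpers, the vLLM OpenAI-compatible server exposes `/v1/models`, and the base URL may be given with or without a `/v1` suffix.

```python
import json
import urllib.request
from typing import Callable, List, Optional


def models_url(base_url: str) -> str:
    """Return the /v1/models URL for a base URL with or without /v1."""
    base = base_url.rstrip("/")
    if not base.endswith("/v1"):
        base += "/v1"
    return base + "/models"


def served_models(
    base_url: str,
    api_key: Optional[str] = None,
    opener: Callable = urllib.request.urlopen,
) -> List[str]:
    """Fetch the model IDs that the server reports as available."""
    request = urllib.request.Request(models_url(base_url))
    if api_key:
        request.add_header("Authorization", f"Bearer {api_key}")
    # `opener` is injectable so the function can be exercised offline.
    with opener(request) as response:
        body = json.load(response)
    return [entry["id"] for entry in body["data"]]
```

With the command above, `served_models("http://your-b200-host:8000")` should include the name passed to `--served-model-name` once the server has finished loading the model.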

Model Limitations

This is an early version; outputs may contain errors or hallucinations.
Not intended for medical, legal, or financial advice.
The model's knowledge ends at its training data cutoff.
Always review generated content before production use.