Super AI Engineer LLM documentation
Reference material for model usage, integration, deployment, and limitations.
Quickstart
Install the OpenAI SDK and point it at the Super AI Engineer LLM endpoint.
pip install openai

from openai import OpenAI

client = OpenAI(
    base_url="https://api.example.com/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="thai-llm-chat-v0.1",
    messages=[
        # Thai prompt: "Please explain machine learning in simple Thai."
        {"role": "user", "content": "ช่วยอธิบาย Machine Learning เป็นภาษาไทยง่าย ๆ"}
    ],
    temperature=0.7,
)
print(response.choices[0].message.content)
API Reference
Base URL: https://api.example.com/v1
POST /v1/chat/completions - chat completion (streaming optional)
GET /v1/models - list available models
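As a rough illustration of the model-listing endpoint, the sketch below builds a GET /v1/models request with Python's standard library and extracts model IDs from a response body. It assumes the response follows the OpenAI-style list shape ({"data": [{"id": ...}]}), which this page does not confirm. The helper names build_models_request, model_ids, and list_models are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com/v1"  # base URL given on this page


def build_models_request(api_key: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a GET request for the model list."""
    url = base_url.rstrip("/") + "/models"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def model_ids(body: str) -> list[str]:
    """Extract model IDs, ASSUMING an OpenAI-style {"data": [{"id": ...}]} body."""
    payload = json.loads(body)
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(api_key: str) -> list[str]:
    """Send the request; needs network access and a valid key."""
    with urllib.request.urlopen(build_models_request(api_key), timeout=30) as response:
        return model_ids(response.read().decode("utf-8"))
```

Building the request separately from sending it makes the URL and headers easy to inspect without network access; list_models("YOUR_API_KEY") performs the actual call.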
Authentication
Pass your key as a Bearer token. Keep it server-side; never expose it in a browser.
Authorization: Bearer YOUR_API_KEY
Chat Completion
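The Bearer header applies to every request, including chat completions. One way to keep the key server-side is to read it from an environment variable instead of hard-coding it. The sketch below is an illustration under stated assumptions, not an official client: the variable name LLM_API_KEY and the helpers build_chat_request and send_chat are hypothetical, and the response is assumed to have the same choices[0].message.content shape that the SDK example on this page reads.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.example.com/v1"  # base URL given on this page


def build_chat_request(prompt, api_key, model="thai-llm-chat-v0.1", temperature=0.7):
    """Build (but do not send) a chat completion POST request."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }
    return urllib.request.Request(
        BASE_URL + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat(prompt):
    """Send the request using a key from the environment (assumed name)."""
    api_key = os.environ["LLM_API_KEY"]  # raises KeyError if the variable is unset
    request = build_chat_request(prompt, api_key)
    with urllib.request.urlopen(request, timeout=60) as response:
        reply = json.load(response)
    # Assumed OpenAI-style response shape.
    return reply["choices"][0]["message"]["content"]
```

Calling send_chat("สวัสดีครับ") from server-side code sends the request; the key never has to appear in source files or in anything shipped to a browser.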
cURL example (the Thai prompt means "Hello"):
curl https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "thai-llm-chat-v0.1",
    "messages": [
      {"role": "user", "content": "สวัสดีครับ"}
    ],
    "temperature": 0.7
  }'
Streaming
Set stream=True to receive tokens as they are generated.
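stream = client.chat.completions.create(
    model="thai-llm-chat-v0.1",
    # Thai prompt: "Tell a short tale."
    messages=[{"role": "user", "content": "เล่านิทานสั้น ๆ"}],
    stream=True,
)
for chunk in stream:
    # Skip chunks that carry no choices; treat a missing delta as empty text.
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)
Deployment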
Serve the model with vLLM or TGI on a B200 host or on LANTA, then switch the portal to real mode by setting LLM_API_MODE=real.
# Serve with vLLM (OpenAI-compatible) on B200 / LANTA
python -m vllm.entrypoints.openai.api_server \
    --model thai-llm/thai-llm-chat-v0.1 \
    --served-model-name thai-llm-chat-v0.1 \
    --port 8000

# Point the portal at it (.env)
# LLM_API_MODE=real
# LLM_API_BASE_URL=http://your-b200-host:8000
# LLM_API_KEY=your-key
Model Limitations
- This is an early version; outputs may contain errors or hallucinations.
- Not intended for medical, legal, or financial advice.
- Knowledge is limited to its training data cutoff.
- Always review generated content before production use.