Research - Super AI Engineer LLM

Research

How Super AI Engineer LLM is built

We share our methods openly — from tokenizer design to serving infrastructure.

Overview Coming soon

An overview of the model architecture, training methods, and full evaluation results.

Tokenizer Draft

Tokenizer design for Thai, aimed at reducing token counts and cost.

Data Draft

The process for collecting, cleaning, and filtering high-quality Thai data.

Training Coming soon

The recipe for pretraining from scratch, including hyperparameters and schedule.

Safety Draft

Alignment and safety approaches for the Thai linguistic and cultural context.

Eval Open source

A toolkit and benchmarks for measuring the quality of Thai language models.

Infra In progress

Deployment on B200 / LANTA using vLLM / TGI, with a streaming API.