

Together AI
Cloud platform for training, fine-tuning, and serving open-source models such as Llama and DeepSeek. Offers serverless inference endpoints, dedicated GPU clusters on Blackwell hardware, and custom pre-training. Its research team is behind FlashAttention and ThunderKittens, and its inference stack uses speculative decoding and custom GPU kernels for low-latency, high-throughput serving.
