The most interesting part of LlamaCon was the release of the Llama API, which should enable Meta to reach more business users and developers
The most impactful announcement from the first session of LlamaCon, chaired by Chris Cox, was the launch of the Llama API in limited preview (min 46:55), which allows developers to build applications directly on top of Meta’s Llama 3 and Llama 4 models. This move could significantly expand third-party access to Llama models, beyond Meta’s own platforms such as Facebook, Instagram, and WhatsApp. The API supports model fine-tuning, evaluation, and seamless integration through lightweight SDKs.
I felt that the interview with Databricks CEO Ali Ghodsi was more of a PR exercise, mainly intended to showcase how Databricks clients are using Llama. That said, Zuckerberg seemed increasingly drawn to distillation. In my view, distillation reduces the risk of Meta’s Llama models underperforming competitors, since the capabilities of superior models, such as DeepSeek V3 or GPT-4o, can be distilled into Llama. Ghodsi pointed out that when DeepSeek R1 was released, Databricks customers began distilling it into Llama (min 1:24:52).
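For readers unfamiliar with the mechanics, the core idea of distillation is that a smaller "student" model is trained to match the softened output distribution of a larger "teacher" model. The sketch below is purely illustrative (the logit values are hypothetical, and real pipelines use frameworks like PyTorch over large corpora); it shows the temperature-softened softmax and the KL-divergence loss that the student minimizes:

```python
import math

def softmax(logits, temperature=1.0):
    # A temperature > 1 softens the distribution, exposing the teacher's
    # relative preferences among non-top tokens ("dark knowledge").
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL divergence from the teacher's softened distribution to the
    # student's; training drives this toward zero.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token logits over a 3-token vocabulary.
teacher = [4.0, 1.5, 0.2]
aligned_student = [3.8, 1.4, 0.3]      # mimics the teacher well
misaligned_student = [0.2, 1.5, 4.0]   # disagrees with the teacher

close = distillation_loss(teacher, aligned_student)
far = distillation_loss(teacher, misaligned_student)
```

A student whose logits track the teacher's produces a much smaller loss, which is the signal gradient descent follows during distillation.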
“But I think the reality is, part of the value around open source is that you can mix and match. So if another model like DeepSeek is better, or Qwen is better at something, as developers, you have the chance to take the best parts of the intelligence from the different models and produce exactly what you need, which is going to be very powerful,” Zuckerberg said (min 1:23:34).
“The distillation thing, I think is going to be a real big deal as we get into the world of different models,” Zuckerberg added (min 1:30:37).