Jesus Rodriguez in Towards AI | Inside NuminaMath: The AI Model that Took The First Place In the AI Math Olympiad | The model used strong data curation, fine-tuning processes, and algorithmic improvements to reach the top of the AIMO leaderboard. | 5d ago
Jesus Rodriguez | Understanding FlashAttention-3: One of the Most Important Algorithms to Make Transformers Fast | The new version takes full advantage of H100 capabilities to improve attention in transformer models. | Jul 15
Jesus Rodriguez in Towards AI | Inside 4M-21: Apple Small Model that Works Across 21 Modalities | The new model could be the foundation of Apple’s on-device AI strategy. | Jul 8
Jesus Rodriguez | Inside DSPy: A Framework for Algorithmic Prompt Optimization | Launched a few months ago, the framework has rapidly become one of the most complete LMP stacks in the market. | Jun 24
Jesus Rodriguez in Towards AI | Meet HUSKY: A New Agent Optimized for Multi-Step Reasoning | New research from Meta AI, Allen AI, and the University of Washington tackles one of the most important problems in LLM reasoning. | Jun 18
Jesus Rodriguez in Towards AI | The Method OpenAI Uses to Extract Interpretable Concepts from GPT-4 | Highly scalable sparse autoencoders might be an interesting solution to one of the toughest challenges in generative AI. | Jun 11
Jesus Rodriguez in Towards AI | Synthetic Data Generation in Foundation Models and Differential Privacy: Three Papers from… | A reference architecture, security challenges and some recipes are some of the methods outlined in Microsoft’s papers. | Jun 3
Jesus Rodriguez in Towards AI | Inside One of the Most Important Papers of the Year: Anthropic’s Dictionary Learning is a… | The model builds on research from last year and tries to understand interpretable features in LLMs. | May 28
Jesus Rodriguez in Towards AI | Inside Infini Attention: Google DeepMind’s Technique Powering Gemini 2M Token Window | The method combines compressive memory and attention mechanisms in a single structure. | May 20
Jesus Rodriguez in Towards AI | Inside AlphaFold 3: A Technical View Into the New Version of Google DeepMind’s BioScience Model | A highly improved architecture drastically expands the capabilities of AlphaFold. | May 13