Jesus Rodriguez in Towards AI | Anthropic New Research Shows that AI Models Can Sabotage Human Evaluations | The new research proposes a framework for assessing a model’s ability to subvert human evaluations. | Oct 28
Jesus Rodriguez | Inside Meta AI’s New Method to Build LLMs that Think Before they Speak | Thought Preference Optimization could be the new foundation for “Thinking LLMs”. | Oct 22
Jesus Rodriguez in Towards AI | Inside OpenAI’s MLE-Bench: A New Benchmark for Evaluating Machine Learning Engineering Capabilities… | The new benchmark evaluates AI agents in areas such as pretraining, evaluation and others. | Oct 15
Jesus Rodriguez in Towards AI | Learn About Movie Gen: Meta AI Upcoming Video Generation Model | The new model represents an important milestone in video and audio generation. | Oct 7
Jesus Rodriguez in Towards AI | Inside AlphaProteo, Google DeepMind’s New Model for Next Generation Protein Design | The new model focuses on the design of protein binders which could have major implications in modeling protein interactions. | Oct 1
Jesus Rodriguez in Towards AI | Inside EUREKA: Microsoft Research’s New Framework for Evaluating Foundation Models | The framework provides an evaluation pipeline as well as a collection of benchmarks for evaluating language and vision capabilities. | Sep 23
Jesus Rodriguez in Towards AI | Inside DataGemma: Google DeepMind’s Initiative to Ground LLMs in Factual Knowledge | The model comes accompanied by DataCommons, a data repository based on factual data. | Sep 16
Jesus Rodriguez in Towards AI | Inside xLAM: Salesforce’s Models Specialized for Agentic Tasks | The family of models is highly optimized for function calling. | Sep 10
Jesus Rodriguez in Towards AI | Inside GameNGen: Google DeepMind’s New Model that can Simulate Entire 1993’s DOOM Game in Real Time | GameNGen represents a major milestone in creating generative AI models that can interact with complex real world environments. | Sep 2
Jesus Rodriguez in Towards AI | How NVIDIA Pruned and Distilled Llama 3.1 to Create Minitron 4B and 8B | The new models are using state of the art pruning and distillation techniques. | Aug 26