2025-01-27 · Large Language Models

Navigating the Landscape of Large Language Models

2025-01-17 · sebastianraschka.com/blog/

Implementing A Byte Pair Encoding (BPE) Tokenizer From Scratch

This blog post provides an educational, from-scratch implementation of the Byte Pair Encoding (BPE) tokenization algorithm used in models like GPT-2 and Llama 3. It explains the BPE algorithm and contrasts the implementation with others, such as OpenAI's open-source version. The post includes code examples for training, encoding, and decoding text, as well as for loading pre-trained GPT-2 tokenizers. While the implementation prioritizes readability over performance, it is a valuable resource for understanding BPE tokenization.
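To give a flavor of the core idea, here is a minimal byte-level BPE training loop. This is a sketch for illustration, not Raschka's actual code; the function name train_bpe and the num_merges parameter are invented for this example.

```python
from collections import Counter

def train_bpe(text: str, num_merges: int) -> list[tuple[int, int]]:
    """Learn BPE merge rules over a byte sequence (toy illustration)."""
    ids = list(text.encode("utf-8"))  # start from raw bytes (ids 0..255)
    merges = []
    next_id = 256                     # new token ids start above the byte range
    for _ in range(num_merges):
        pairs = Counter(zip(ids, ids[1:]))   # count adjacent-pair frequencies
        if not pairs:
            break
        best = pairs.most_common(1)[0][0]    # most frequent adjacent pair
        merges.append(best)
        # replace every occurrence of the best pair with the new token id
        merged, i = [], 0
        while i < len(ids):
            if i + 1 < len(ids) and (ids[i], ids[i + 1]) == best:
                merged.append(next_id)
                i += 2
            else:
                merged.append(ids[i])
                i += 1
        ids = merged
        next_id += 1
    return merges

# Hypothetical usage: repeated substrings like "the" get merged first.
merges = train_bpe("the theme of the thesis", num_merges=10)
```

Encoding then replays the learned merges in order on new text, and decoding maps token ids back to their byte sequences; the blog post covers both directions in detail.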
2025-01-23 · sebastianraschka.com/blog/

Noteworthy LLM Research Papers of 2024: 12 influential AI papers from January to December 2024

This article highlights key advancements in Large Language Models (LLMs) throughout 2024, focusing on one impactful research paper per month. It covers diverse topics such as Mixtral's Mixture of Experts approach, the LoRA and DoRA finetuning methods, continual pretraining strategies, and the debate between DPO and PPO for LLM alignment. It also discusses the FineWeb dataset, the Llama 3 models, scaling inference-time compute, multimodal LLM paradigms, OpenAI o1's reasoning capabilities, and scaling laws for precision. Finally, it touches on Phi-4 and synthetic data.
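For context on the LoRA entry: LoRA freezes the pretrained weights and learns a low-rank additive update, so only a small fraction of parameters are trained. Below is a minimal PyTorch sketch of the idea; the class name LoRALinear and the rank/alpha defaults are chosen for this illustration, not taken from the paper's code.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update (toy LoRA)."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # freeze pretrained weights
        # A is initialized small, B starts at zero, so the update is initially a no-op
        self.A = nn.Parameter(torch.randn(base.in_features, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, base.out_features))
        self.scale = alpha / rank

    def forward(self, x):
        # base(x) + (x A) B: only A and B receive gradients during finetuning
        return self.base(x) + (x @ self.A @ self.B) * self.scale
```

DoRA, also covered in the article, builds on this by decomposing the weight into magnitude and direction components before applying the low-rank update.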
2025-01-16 · huyenchip.com/blog/

Common pitfalls when building generative AI applications

Building applications with foundation models is still in its early stages, so mistakes are common. One frequent pitfall is using generative AI where simpler solutions suffice, such as in energy optimization or anomaly detection. Confusing a bad product with bad AI is another: user experience is often the critical differentiator. Other common mistakes include adopting new frameworks or finetuning too early, over-indexing on initial success, skipping human evaluation, and lacking a big-picture strategy. Teams with the best products use human evaluations to improve their AI judges.
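One way to act on that last point (a sketch inspired by the article, not code from it): periodically score the AI judge against a small human-labeled set and track agreement before trusting judge scores at scale. The helper name judge_agreement is hypothetical.

```python
def judge_agreement(judge_labels: list[str], human_labels: list[str]) -> float:
    """Fraction of examples where the AI judge matches the human verdict."""
    assert len(judge_labels) == len(human_labels)
    matches = sum(j == h for j, h in zip(judge_labels, human_labels))
    return matches / len(human_labels)

# Hypothetical spot-check: if agreement is low, revise the judge's prompt
# or rubric before relying on its scores in automated evaluation.
print(judge_agreement(["good", "bad", "good"], ["good", "good", "good"]))  # ~0.67
```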