LlamaIndex • 2024-04-02
LlamaIndex Newsletter 2024-04-02
Greetings, LlamaIndex community! 🦙
Welcome to another exciting weekly update from LlamaGalaxy! We're thrilled to share a range of fantastic updates with you, including the introduction of RAFT LlamaPack, enhanced memory and cost efficiency in RAG with Cohere's embeddings, and much more.
🤩 The highlights:
- DeepLearningAI Course: JavaScript RAG Web Apps with LlamaIndex, a collaborative course with DeepLearningAI. Course, Tweet.
- RAFTDatasetPack LlamaPack: Introduced RAFTDatasetPack for dataset generation using RAFT (Retrieval Augmented Fine Tuning), which trains models to differentiate between relevant 'oracle' documents and 'distractor' documents. LlamaPack, Tweet.
- Memory Efficiency with Cohere Embeddings: Utilize Cohere's Int8 and binary embeddings for cost-effective and low-memory RAG operations. Notebook, Tweet.
- Python Docs Makeover: Revamped Python documentation with accessible example notebooks, advanced search, and comprehensive API details. API Ref, Tweet, Docs
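To make the oracle/distractor idea behind RAFT concrete, here is a minimal Python sketch of how one RAFT-style training example might be assembled. This is not the RAFTDatasetPack API: the names `RaftExample`, `build_raft_example`, `num_distractors`, and `oracle_prob` are hypothetical, and occasionally leaving the oracle document out of the context is an assumption of this sketch.

```python
import random
from dataclasses import dataclass


@dataclass
class RaftExample:
    question: str
    context: list[str]        # shuffled mix of oracle and distractor documents
    oracle_in_context: bool   # whether the relevant document was kept


def build_raft_example(question, oracle, corpus,
                       num_distractors=3, oracle_prob=0.8, rng=None):
    """Assemble one training example (hypothetical helper, not a library call).

    `oracle` is the document that actually answers `question`; distractors
    are sampled from the rest of `corpus`. With probability `oracle_prob`
    the oracle is included, otherwise the context holds only distractors.
    """
    rng = rng or random.Random()
    candidates = [doc for doc in corpus if doc != oracle]
    distractors = rng.sample(candidates, k=min(num_distractors, len(candidates)))
    keep_oracle = rng.random() < oracle_prob
    context = distractors + ([oracle] if keep_oracle else [])
    rng.shuffle(context)  # do not let position reveal which document is the oracle
    return RaftExample(question, context, keep_oracle)
```

A fine-tuning set would pair each such context with a reference answer, so the model learns to rely on the oracle document and ignore the rest.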
✨ Feature Releases and Enhancements:
- We introduced RAFT (Retrieval Augmented Fine Tuning), a method from Tianjun Zhang and Shishir Patil to enhance domain-specific RAG performance in LLMs. By training models to differentiate between relevant 'oracle' documents and 'distractor' documents, RAFT improves how well they use retrieved context. Try it out with our new RAFTDatasetPack LlamaPack for dataset generation. LlamaPack, Tweet.
- We collaborated with DeepLearningAI for a course that goes beyond teaching RAG techniques; it guides you on integrating RAG into a full-stack application. Learn to construct a backend API, develop an interactive React component, and tackle the unique challenges of deploying RAG on a server rather than just in a notebook. Course, Tweet.
- We integrated Cohere's Int8 and Binary Embeddings to give your RAG pipeline a memory-efficient option. This addresses the high memory usage and cost of running RAG over large datasets. Notebook, Tweet.
- We launched revamped Python docs with top-level example notebooks, improved search with previews, and overhauled API documentation. API Ref, Tweet, Docs
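To see why Int8 and binary embeddings cut memory use, the sketch below works out per-vector storage and shows one generic way to quantize a float vector. It does not call Cohere's API and is not necessarily how Cohere produces its compressed embeddings; `embedding_bytes`, `quantize_int8`, and `quantize_binary` are hypothetical helpers, and the calibration range (`lo`, `hi`) is an assumption.

```python
BITS_PER_DIM = {"float32": 32, "int8": 8, "binary": 1}


def embedding_bytes(dim: int, kind: str) -> int:
    """Bytes needed to store one embedding with `dim` dimensions."""
    return (dim * BITS_PER_DIM[kind] + 7) // 8


def quantize_int8(vec: list[float], lo: float, hi: float) -> list[int]:
    """Map values from the assumed range [lo, hi] onto integers in [-128, 127]."""
    scale = 255 / (hi - lo)
    return [max(-128, min(127, round((x - lo) * scale) - 128)) for x in vec]


def quantize_binary(vec: list[float]) -> bytes:
    """Keep only the sign of each dimension, packed 8 dimensions per byte."""
    out = bytearray((len(vec) + 7) // 8)
    for i, x in enumerate(vec):
        if x > 0:
            out[i // 8] |= 0x80 >> (i % 8)  # most significant bit first
    return bytes(out)
```

For an illustrative 1024-dimensional embedding, `embedding_bytes` gives 4096 bytes as float32, 1024 bytes as int8, and 128 bytes as binary, which is 4x and 32x smaller respectively.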
🎥 Demos:
- RestAI, a project by Pedro Dias, is a platform that offers RAG, advanced text-to-SQL, and multimodal inference as a service with a nifty UI.
- Ragdoll and Ragdoll Studio by bennyschmidt: Create AI Personas for characters, web assistants, or game NPCs using LlamaIndex TS, local LLMs, and image generation with Ollama and StabilityAI.
🗺️ Guides:
- Guide to Designing RAG Systems by Michał Oleszak: an in-depth look at crucial design decisions in building efficient RAG systems, spanning five key areas: Indexing, Storing, Retrieval, Synthesis, and Evaluation.
✍️ Tutorials:
- Sujit Patil's tutorial on combining semantic chunking with hierarchical clustering and indexing for RAG content enrichment.
- Florian June's tutorial on crafting a dynamic RAG system with integrated reflection, a guide to building Self-RAG from scratch.
- Laurie's video tutorial on using LlamaParse's LLM-powered parsing to turn complex insurance policies into clear yes-or-no statements, improving LLM responses on coverage queries.
- Akriti’s tutorial on building a real-time financial news RAG chatbot with Gemini and Qdrant.
- Marco Bertelli's tutorial on deploying a RAG server for real-time use, covering efficient embedding serving, concurrent request handling, and failure resilience.
- Sudarshan Koirala’s tutorial on building advanced PDF RAG with LlamaParse and purely local models for embedding, LLMs, and reranking.
🎥 Webinars:
- Register for a webinar with Tianjun Zhang and Shishir Patil on how to do retrieval-augmented fine-tuning (RAFT).
- Webinar with Daniel on CodeGPT, a platform for AI Copilots that help your coding workflows, built on top of LlamaIndex components.
- Vectara’s Panel Discussion on 'Why RAG will Never Die?'.