Announcing our Document Research Assistant, a collaboration with NVIDIA!
LlamaIndex

Ravi Theja 2023-08-01

LlamaIndex Update — 08/01/2023

Greetings once again, LlamaIndex community!

Welcome to the third installment of our LlamaIndex Update series. Your active participation continues to drive our open-source community forward. We appreciate every contribution whether you’re an experienced LlamaIndex contributor or a newcomer!

In our latest edition, we’ve prepared an assortment of updates for you. From advancements in Data Agents and LlamaIndex TS, benchmarking, and a host of inspiring events, webinars, blog posts, and demos, we’ve got plenty in store.

Without more ado, let’s dive into these updates.

New Features:

  1. We heard you! LlamaIndex has completely revamped our documentation. The update includes new clearer documents on high-level concepts, detailed module guides, comprehensive tutorials, and an all-inclusive API reference. Docs, Tweet
  2. LlamaIndex launched Data Agents, an innovative feature that combines AI agents with data. This launch introduces components like an agent reasoning loop and tool abstractions. Accompanied by an extensive upgrade to LlamaHub, the new feature offers more than 15 tool specs for easy integration. Data Agents enhance query capabilities and are designed to handle varied data applications. Docs, Tweet, Blog Post
  3. LlamaIndex launched LlamaIndex.TS, a lean Typescript package for building robust Retrieval Augmented Generation (RAG) systems. It simplifies tasks like document parsing and tackling context window limitations. LlamaIndex.TS is ideal for quickly building apps like using frameworks like Next.JS to chat over your data. Docs, Tweet, Blogpost
  4. LlamaIndex teams up with Zapier Natural Language API (NLA), reducing the cognitive load on the data agent when handling APIs with multiple parameters. Zapier NLA translates complex third-party APIs into simpler interfaces using a single natural language parameter: instruction. This helps the data agent concentrate on tool selection and action orchestration. Tweet, Blogpost
  5. LlamaIndex’s ContextChatEngineaddresses the issue of conversational agents hallucinating information by ensuring retrieval of context with every user interaction. This feature, compatible with all ReAct and OpenAI Function agent types, prepends retrieved-context as a system message. Docs, Tweet
  6. This month marked the launch of two new exciting LLMs. First off was Anthropic Claude 2.0. We launched with day 1 support of the new model. Docs, Tweet. The other one was Llama2, and LlamaIndex now offers best-in-class integration with the Llama2 model on Replicate. Docs, Tweet
  7. LlamaIndex is day one compatible with Chroma v0.4.0, enhancing support for in-memory, persisted, and self-hosted databases. This upgrade simplifies the use of Chroma within LlamaIndex, making database handling easier and more efficient. Tweet
  8. LlamaIndex’s newly launched Data Agents can automatically interact with any API defined via an OpenAPI spec. It handles indexing/loading of large data from API specs and facilitates easy integration of the OpenAPI Tool, enhancing the ability to call web services. Docs, Tweet
  9. LlamaIndex now utilizes the rebel-large model for high-speed relation extraction. Combined with CUDA, you can generate knowledge graphs from your text data. Tweet
  10. LlamaIndex introduced a code interpreter tool. This feature equips any LLM with the ability to analyze data and generate visualizations, expanding their capabilities similar to those of ChatGPT. Tweet
  11. LlamaIndex now integrates with Eduardo Reis’s Llama 2 functions API at llama-api.com. Tweet
  12. LlamaIndex TS now supports integration with OpenAI Whisper. Docs, Tweet
  13. LlamaIndex now seamlessly integrates with Kùzudb, allowing users to directly store extracted knowledge graphs/triples for advanced processing, querying, and visualization. Docs, Tweet
  14. LlamaIndex combines data agents with text-to-image models enhancing user prompts with relevant context from a knowledge base. This integration allows for more advanced multimodal reasoning by merging LLM RAG systems with text-to-image tools. Docs, Tweet

Benchmarking:

  1. LlamaIndex now supports BEIR, an Information Retrieval benchmark. Users can define custom retrievers within LlamaIndex, apply the vector index, or implement reranking steps, and then easily evaluate their methods using any dataset from BEIR. Tweet
  2. LlamaIndex’s Llama2 agents have shown promising performance in our agent task benchmark. Especially notable is their capability to appropriately use tools within a ReACT loop. However, the tasks’ difficulty varies, with both 13B and 70B models notably refraining from dialing a phone number, underlining the AI’s limitations. Tweet
  3. LlamaIndex now has integration with the HotpotQA benchmark! This enables rigorous testing of LLM’s multi-hop reasoning capabilities by providing the full context to the models, helping you evaluate their performance more accurately. Perfect for stress-testing LLMs like ChatGPT, Claude 2, PaLM, and more. Plus, explore how context reordering can simplify tasks for your LLMs. Tweet
  4. LlamaIndex now supports over 20 vector databases, each with unique features and capabilities. To help understand their differences, we have compiled a comprehensive comparison table, guiding the choice of the optimal database for the use case. Tweet

Tutorials:

We were excited to see so many people making tutorials for LlamaIndex this month!

  1. Adam Hofmann’s blog post on Building Better Tools for LLM Agents.
  2. Weaviate’s tutorial on using the Llama2 model with LlamaIndex and Weaviet on external data.
  3. Erika’s tutorial on VectorStore Index, List Index, and Tree Index.
  4. James Maslek’s tutorial on Breaking Barriers with OpenBB and LlamaIndex: Simplifying data access to 100+ trusted sources.
  5. Ayush Thakur’s tutorial on Building Advanced Query Engine and Evaluation with LlamaIndex and W&B.
  6. Trulens’s tutorial on using LlamaIndex Yelp agent to answer queries using Yelp data, and evaluate it for definitiveness and accuracy using custom feedback functions, compare its performance against a standalone LLM.
  7. Airbyte’s tutorial on Chat with your data warehouse without writing SQL.
  8. Anil Chandra Naidu’s tutorial on Retrievers and QueryEngines.
  9. Wenqi Glantz’s tutorial on Exploring Snowflake and Streamlit With LlamaIndex Text-to-SQL.

And from the LlamaIndex team:

  1. Logan’s tutorial on a comprehensive understanding of embedding models, their benchmarking, and their implementation in LlamaIndex, with a focus on OpenAI and Instructor embeddings, enabling semantic search through numerical text representations.
  2. Logan’s tutorial on the evaluation of query engines using LlamaIndex, learn to handle uncontrolled outputs and runtime costs while measuring performance with GPT-4.
  3. Ravi Theja’s tutorial on Key Components to build QA Systems.

Webinars:

  1. Webinar with Didier Lopes, CEO/Co-Founder at OpenBB on LLMs for Investment Research.
  2. Webinar on Building & Evaluating an Advanced Query Engine Over Your Data with Weights and Biases.
  3. Webinar with Jason Liu on From Prompt to Schema Engineering with Pydantic.

Events:

  1. LlamaIndex and Arize workshop on LLM Search & Retrieval Systems with Arize and LlamaIndex: Powering LLMs on Your Proprietary Data.
  2. LlamaIndex and TruLens workshop on building an LLM App.
  3. TPF (The Product Folks) workshop session on Building QA Systems With LlamaIndex by Ravi Theja.
  4. Ravi Theja talk at the Speciale VC GenAI meetup in Chennai on Beyond the Basics: Leveraging LlamaIndex from Concept to Production.
  5. Data Agents session at TPF X Nexus VC Buildathon by Ravi Theja.

Demos:

  1. Tali.AI at the Augment hackathon dove into the future of support roles by developing an Autonomous Support Bot using LlamaIndex. Tweet
  2. SuperAGI integrated with LlamaIndex which enables AI agents to process a wide variety of data from both structured and unstructured sources including Docx, PDF, CSV files, videos, and images. Tweet