Hello Llama Enthusiasts 🦙!
Another week has flown by, and we’re back with a jam-packed newsletter filled with updates on hackathons, guides, integrations, features, webinars, tutorials, blogs, and demos. If you have a project, blog post, or video that deserves a spotlight, we’d love to feature it! Just reach out to us at news@llamaindex.cloud.
Bonus: You can now get all these updates straight to your inbox! Simply visit our homepage and sign up for our email updates.
🤩 First, the highlights:
- AI.Engineer Summit: Jerry Liu gave a talk on RAG applications, and Simon led a workshop on RAG app optimization. (Jerry’s slides, Simon’s slides)
- Text to pgVector: We launched PGVectorSQLQueryEngine for combined SQL and vector queries on PostgreSQL. (Docs, Tweet)
- Hugging Face Integration: Integrated with HuggingFace’s text-embeddings-inference server for high-speed, large-scale BERT model serving. (Docs, Tweet)
- Multi-Document Agents: New V1 agents support advanced multi-document retrieval and async query planning. (Docs, Tweet)
- Unstructured Parsing: Unveiled UnstructuredElementNodeParser, a hierarchical parser for embedded tables/text using UnstructuredIO. (Docs, Tweet)
- LLM Compatibility: We charted LLM performance on various tasks and found that the Zephyr-7b-alpha model stands out as the top-performing 7B model in advanced RAG tasks. (Docs)
🏆 Congratulations to our AGI House Hackathon Winners!
We love seeing people build amazing things with LlamaIndex!
Build:
- Demostify
- Stick with Fit, SafeQuery, Cherry
Break:
Test:
- X-Ray Insight
Honorable Mentions:
- KindleGPT
- PenTest
🎤 LlamaIndex at AI.Engineer Summit:
- Jerry Liu gave a talk on Building production-ready RAG applications. Slides.
- Simon conducted a workshop on Building, Evaluating, and Optimizing your RAG App for Production with LlamaIndex. Slides, Code.
🗺️ Guides:
- LLM Compatibility Tracking: We’ve charted LLM performance on various tasks, revealing zephyr-7b-alpha as the only current 7B model that excels at advanced RAG and agentic tasks. Docs.
- Evaluations: Adjusting chunk size is essential for RAG apps. Having more chunks isn’t necessarily better, and re-ranking might be counterproductive. To fine-tune, experiment with different chunk sizes and top-k values. The Arize AI team has provided a guide to help you evaluate using Arize AI Phoenix and LlamaIndex. Slides, Notebook.
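To make the chunk-size and top-k experiment above a little more concrete, here is a minimal sketch of a parameter sweep in plain Python. It is not the Arize AI Phoenix or LlamaIndex workflow from the guide: the word-based chunker, the word-overlap scorer, and the names `chunk_words`, `overlap_score`, `hit_rate`, and `sweep` are all hypothetical stand-ins for a real splitter, embedding retriever, and evaluation set.

```python
# Toy sweep over chunk size and top-k. Everything here is an illustrative
# assumption: a real pipeline would use an embedding-based retriever and a
# proper evaluation dataset instead of word overlap.

def chunk_words(text, chunk_size):
    """Split text into consecutive chunks of `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]

def overlap_score(query, chunk):
    """Count distinct query words that also appear in the chunk."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))

def retrieve(query, chunks, top_k):
    """Return the top_k chunks ranked by word overlap with the query."""
    ranked = sorted(chunks, key=lambda c: overlap_score(query, c), reverse=True)
    return ranked[:top_k]

def hit_rate(text, eval_set, chunk_size, top_k):
    """Fraction of (query, answer) pairs whose answer appears in a retrieved chunk."""
    chunks = chunk_words(text, chunk_size)
    hits = 0
    for query, answer in eval_set:
        retrieved = retrieve(query, chunks, top_k)
        if any(answer.lower() in c.lower() for c in retrieved):
            hits += 1
    return hits / len(eval_set)

def sweep(text, eval_set, chunk_sizes, top_ks):
    """Evaluate every (chunk_size, top_k) combination and return a score table."""
    return {
        (cs, k): hit_rate(text, eval_set, cs, k)
        for cs in chunk_sizes
        for k in top_ks
    }

if __name__ == "__main__":
    sample_text = "the cat sat on the mat the dog ate the bone"
    sample_eval = [("where did the cat sit", "mat"), ("what did the dog eat", "bone")]
    for params, score in sorted(sweep(sample_text, sample_eval, [2, 4, 11], [1, 2]).items()):
        print(params, score)
```

Comparing the scores across the grid is the point of the exercise: the best setting depends on your data, so measure it rather than assuming that more or larger chunks will help.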
✍️ Tutorials:
- Shahul’s tutorial demonstrates how to choose the best embeddings for your data using the LlamaIndex and RAGAS libraries, emphasizing that retriever performance and embedding quality are crucial to a RAG system’s efficacy.
- Wenqi Glantz’s tutorial on Evaluation Driven Development for RAG Pipelines.
- Wenqi Glantz’s tutorial on Masking PII Data in the RAG Pipeline.
- Ofer Mendelevitch from Vectara has a tutorial on Retrieval Augmented Generation with LlamaIndex, comparing Vectara’s new Boomerang model to OpenAI and Cohere.
- Patrick Loeber from AssemblyAI has a tutorial on Build LlamaIndex Audio Apps.
- Pradip Nichite made a tutorial on NL2SQL with LlamaIndex: Querying Databases Using Natural Language.
- Mayo Oshin has a tutorial on How to Compare Multiple Large PDF Files.
- Sudarshan Koirala made a tutorial on Chat With Documents with LlamaIndex and Pinecone.
💡 Demos:
- Siva Surendira built YC Bot to get instant startup advice from your favorite YC mentors.
✨ Feature Releases and Enhancements:
- Text to pgVector: We introduced the PGVectorSQLQueryEngine, which allows you to query a PostgreSQL database using both full SQL and vector search simultaneously. Docs, Tweet.
- Multi-Document Agents: We introduced Multi-Document Agents (V1), which can retrieve across multiple documents and plan queries asynchronously, offering deeper analysis than standard RAG. Docs, Tweet.
- UnstructuredIO: We’ve partnered with UnstructuredIO to enhance LLM/RAG applications. By extracting tables from PDFs, we’ve improved query methods beyond basic vector indexing, enabling hybrid queries and cross-document comparisons, especially for tabular questions. Docs, Tweet.
- UnstructuredElementNodeParser: Going beyond basic text-splitting, we introduce the UnstructuredElementNodeParser. It models embedded tables/text hierarchically in a data graph using UnstructuredIO. Docs, Tweet.
- Cross-Encoder Fine-Tuning: Cross-encoders enhance RAG by refining post-embedding search results. With LlamaIndex, you can now fine-tune cross-encoders on any document, boosting performance. Docs, Tweet.
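For readers curious what "full SQL and vector search simultaneously" in the Text to pgVector item can look like, here is a rough sketch of one such statement built in Python. It is not the SQL that PGVectorSQLQueryEngine generates; the table and column names and the helpers `build_hybrid_query` and `to_pgvector_literal` are hypothetical. The only library fact it relies on is pgvector’s `<=>` cosine-distance operator and its `[x,y,z]` text format for vectors.

```python
# Hypothetical sketch: combine an ordinary SQL filter with a pgvector
# nearest-neighbour ordering in one statement. Names are illustrative only.

def to_pgvector_literal(vec):
    """Format a sequence of numbers in pgvector's text form, e.g. [1.0,2.5]."""
    return "[" + ",".join(str(float(x)) for x in vec) + "]"

def build_hybrid_query(table, filter_column, embedding_column="embedding", limit=5):
    """Return a parameterized SQL string with two %s placeholders:
    the filter value, then the query vector literal."""
    for name in (table, filter_column, embedding_column):
        # Identifiers cannot be passed as query parameters, so only accept
        # plain identifier-like names here.
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    return (
        f"SELECT * FROM {table} "
        f"WHERE {filter_column} = %s "
        f"ORDER BY {embedding_column} <=> %s::vector "  # <=> is cosine distance
        f"LIMIT {int(limit)}"
    )

if __name__ == "__main__":
    sql = build_hybrid_query("document_chunks", "section_name")
    params = ("Risk Factors", to_pgvector_literal([0.1, 0.2, 0.3]))
    print(sql)
    print(params)
```

The resulting string and parameter tuple could be handed to a PostgreSQL driver such as psycopg; in the query engine itself, the SQL is produced from a natural-language question rather than written by hand.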
⚙️ Integrations & Collaborations:
- AssemblyAI: We introduced a new data reader for audio data, built on AssemblyAI. It makes loading audio effortless and lets you build vector store indices and query engines over the transcribed content. Docs, Tweet.
- Nougat (Meta AI): We integrated Nougat, an OCR tool from Meta that excels at interpreting scientific papers, notably mathematical notation and LaTeX, as a loader in LlamaHub, allowing streamlined processing of ArXiv papers within the RAG pipeline. Docs, Tweet.
- Hugging Face Text Embeddings Inference: We integrated with the new text-embeddings-inference server from HuggingFace, which offers fast, production-scale serving with distributed tracing for all BERT models. Docs, Tweet.