LLM

AI Engineering For AI Error Resolution

- Dr. Panos Patros tl;dr: Discover how this engineering team used Large Language Models (LLMs) for smarter debugging with AI Error Resolution, a feature that preloads prompts with relevant data, offering instant AI-powered solutions to production issues. Learn about their development journey, key requirements, and the impact on enhancing application reliability and security. Think AI Engineering has nothing new to offer? Read on to see how skilful software engineering still plays a crucial role when working with AI components.
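
The core move here, preloading the prompt with the context a developer would otherwise gather by hand, is easy to sketch. Below is a minimal illustration, not the team's actual implementation; the helper names, prompt wording, and model choice are all assumptions, using the OpenAI Python client.

    # Hypothetical sketch of an "AI Error Resolution" style prompt preload.
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    def build_prompt(stack_trace: str, source_snippet: str, runtime_info: str) -> str:
        # Preload the prompt with the data a developer would gather by hand.
        return (
            "You are helping debug a production error.\n\n"
            f"Stack trace:\n{stack_trace}\n\n"
            f"Relevant source:\n{source_snippet}\n\n"
            f"Runtime environment:\n{runtime_info}\n\n"
            "Explain the likely root cause and suggest a fix."
        )

    def resolve_error(stack_trace: str, source_snippet: str, runtime_info: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o",  # placeholder model
            messages=[{"role": "user", "content": build_prompt(
                stack_trace, source_snippet, runtime_info)}],
        )
        return resp.choices[0].message.content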

featured in #525


Postgres Is All You Need, Even For Vectors

- Eric Zakariasson tl;dr: “When working with LLMs, you usually want to store embeddings, a vector space representation of some text value. During the last few years, we’ve seen a lot of new databases pop up, making it easier to generate, store, and query embeddings: Pinecone, Weaviate, Chroma, Qdrant. The list goes on. But having a separate database where I store a different type of data has always seemed off to me. Do I really need it?”
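
The extension the article's argument rests on is pgvector, which adds a vector column type and distance operators to plain Postgres. A minimal sketch with the psycopg driver; the table name, connection string, and toy 3-dimensional embedding are placeholders (real embeddings typically run to hundreds or thousands of dimensions):

    # Store and query embeddings in plain Postgres via pgvector.
    import psycopg

    with psycopg.connect("dbname=app") as conn:  # commits on clean exit
        conn.execute("CREATE EXTENSION IF NOT EXISTS vector")
        conn.execute(
            "CREATE TABLE IF NOT EXISTS docs ("
            "id bigserial PRIMARY KEY, body text, embedding vector(3))"
        )
        # pgvector accepts vectors formatted as '[x, y, z]' strings.
        conn.execute(
            "INSERT INTO docs (body, embedding) VALUES (%s, %s)",
            ("hello world", "[0.1, 0.2, 0.3]"),
        )
        # Nearest neighbours by cosine distance (the <=> operator).
        rows = conn.execute(
            "SELECT body FROM docs ORDER BY embedding <=> %s LIMIT 5",
            ("[0.1, 0.2, 0.3]",),
        ).fetchall()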

featured in #524


Let's Reproduce GPT-2 (124M)

- Andrej Karpathy tl;dr: “We reproduce the GPT-2 (124M) from scratch. This video covers the whole process: First we build the GPT-2 network, then we optimize its training to be really fast, then we set up the training run following the GPT-2 and GPT-3 paper and their hyperparameters, then we hit run, and come back the next morning to see our results, and enjoy some amusing model generations.” 
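
For reference, the "124M" falls out of the GPT-2 small configuration: 12 layers, 12 heads, 768-dimensional embeddings, a 50,257-token vocabulary, and a 1,024-token context, with the output head sharing weights with the token embedding. A back-of-the-envelope count:

    # Parameter count for GPT-2 small from its published architecture.
    n_layer, n_embd = 12, 768
    vocab_size, block_size = 50257, 1024

    embeddings = vocab_size * n_embd + block_size * n_embd  # token + position
    per_layer = (
        n_embd * 3 * n_embd + 3 * n_embd    # attention qkv projection (+ bias)
        + n_embd * n_embd + n_embd          # attention output projection
        + n_embd * 4 * n_embd + 4 * n_embd  # MLP up-projection
        + 4 * n_embd * n_embd + n_embd      # MLP down-projection
        + 2 * 2 * n_embd                    # two layernorms (scale + shift)
    )
    final_ln = 2 * n_embd
    total = embeddings + n_layer * per_layer + final_ln
    print(f"{total:,}")  # 124,439,808, i.e. the "124M"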

featured in #523


What We’ve Learned From A Year of Building With LLMs

tl;dr: “We’ve spent the past year building, and have discovered many sharp edges along the way. While we don’t claim to speak for the entire industry, we’d like to share what we’ve learned to help you avoid our mistakes and iterate faster. These are organized into three sections: tactical, operational and strategic.”

featured in #520


Don't Worry About LLMs

- Vicki Boykis tl;dr: Vicki shares the challenges of working with LLMs and offers advice: focus on specific use cases, establish clear evaluation metrics, build modular systems, and troubleshoot complex issues by getting "close to the metal."
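
One concrete way to act on the evaluation advice is a small golden set scored on every prompt or model change. The sketch below is generic, not from the article; the data and the containment check are placeholders for whatever metric fits your use case.

    # Minimal golden-set evaluation harness.
    GOLDEN = [
        ("Capital of France?", "Paris"),
        ("2 + 2 =", "4"),
    ]

    def evaluate(model_fn) -> float:
        # model_fn is any str -> str callable wrapping your LLM call.
        hits = sum(1 for q, want in GOLDEN if want.lower() in model_fn(q).lower())
        return hits / len(GOLDEN)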

featured in #518


Did GitHub Copilot Really Increase My Productivity?

- Yuxuan Shui tl;dr: “I had free access to GitHub Copilot for about a year, I used it, got used to it, and slowly started to take it for granted, until one day it was taken away. I had to re-adapt to a life without Copilot, but it also gave me a chance to look back at how I used Copilot, and reflect - had Copilot actually been helpful to me?”

featured in #513


What Can LLMs Never Do?

- Rohit Krishnan tl;dr: “Over the past few weeks I have been obsessed by trying to figure out the failure modes of LLMs. This started off as an exploration of what I found. It is admittedly a little wonky but I think it is interesting. The failures of AI can teach us a lot more about what it can do than the successes.”

featured in #511


How Does ChatGPT Work? As Explained By The ChatGPT Team

- Gergely Orosz tl;dr: When you ask ChatGPT a question, several steps happen: (1) Input: We take your text from the text input. (2) Tokenization: We chunk it into tokens. A token roughly maps to a couple of Unicode characters. You can think of it as a word. (3) Create embeddings: We turn each token into a vector of numbers. These are called embeddings. (4) Multiply embeddings by model weights: We then multiply these embeddings by hundreds of billions of model weights. (5) Sample a prediction: the result is a probability distribution over the next token, from which the reply is sampled one token at a time.
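
Steps (1)–(2) can be tried directly with tiktoken, OpenAI's open-source tokenizer, and steps (3)–(5) reduce to an embedding lookup, matrix multiplies, and sampling from the resulting distribution. The toy model below uses random weights purely to make the shape of the computation visible; nothing about it matches the real model beyond the steps themselves.

    # Steps 1-2: tokenize text into integer ids with tiktoken.
    import numpy as np
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode("How does ChatGPT work?")  # a short list of token ids

    # Steps 3-5 in miniature, with random stand-in weights.
    rng = np.random.default_rng(0)
    vocab, d = enc.n_vocab, 64
    embed = rng.normal(size=(vocab, d))       # step 3: one vector per token
    W = rng.normal(size=(d, vocab))           # stand-in for billions of weights
    h = embed[tokens].mean(axis=0)            # crude summary of the context
    logits = h @ W                            # step 4: multiply by weights
    p = np.exp(logits - logits.max())
    p /= p.sum()                              # softmax -> next-token distribution
    next_token = rng.choice(vocab, p=p)       # step 5: sample a prediction
    print(enc.decode([int(next_token)]))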

featured in #508


How We Built Text-to-SQL At Pinterest

tl;dr: “We took the rise in availability of LLMs as an opportunity to explore whether we could assist our data users with this task by developing a Text-to-SQL feature which transforms these analytical questions directly into code.” The authors describe the tool’s evolution and implementation.
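
The basic shape of any such feature, packing table schemas and the analytical question into a prompt and asking the model for SQL, fits in a few lines. The sketch below is generic, not Pinterest's system (their article covers table selection, validation, and other production concerns); the model and prompt wording are placeholders.

    # Bare-bones Text-to-SQL: schema + question in, SQL out.
    from openai import OpenAI

    client = OpenAI()

    def text_to_sql(schema_ddl: str, question: str) -> str:
        prompt = (
            f"Given these tables:\n{schema_ddl}\n\n"
            f"Write a SQL query answering: {question}\n"
            "Return only the SQL."
        )
        resp = client.chat.completions.create(
            model="gpt-4o",  # placeholder model
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content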

featured in #507


Lessons After A Half-Billion GPT Tokens

- Ken Kantzer tl;dr: “I thought I’d share some of the more “surprising” lessons after churning through just north of 500 million tokens, by my estimate.” Lessons include: (1) When it comes to prompts, less is more. (2) You don’t need langchain. You probably don’t even need anything else OpenAI has released in their API in the last year. (3) Improving the latency with streaming API and showing users variable-speed typed words is actually a big UX innovation with ChatGPT.
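
Lesson (3) is cheap to adopt: the OpenAI API can stream tokens as they are generated, so users see output immediately instead of waiting for the full completion. A minimal example with the OpenAI Python client (the model choice is a placeholder):

    # Stream a completion and render tokens as they arrive.
    from openai import OpenAI

    client = OpenAI()
    stream = client.chat.completions.create(
        model="gpt-4o",  # placeholder model
        messages=[{"role": "user", "content": "Explain streaming in one line."}],
        stream=True,
    )
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:  # some chunks carry no content (e.g. role or finish events)
            print(delta, end="", flush=True)
    print()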

featured in #506