/AI

How Does ChatGPT Work? As Explained By The ChatGPT Team

- Gergely Orosz tl;dr: When you ask ChatGPT a question, several steps happen: (1) Input: We take your text from the text input. (2) Tokenization: We chunk it into tokens. A token roughly maps to a couple of Unicode characters. You can think of it as a word. (3) Create embeddings: We turn each token into a vector of numbers. These are called embeddings. (4) Multiply embeddings by model weights: We then multiply these embeddings by hundreds of billions of model weights. (5) Sample a prediction.

featured in #508
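
The five steps in the summary above can be sketched as a toy pipeline. This is a minimal illustration only, not how ChatGPT is implemented: the six-entry vocabulary, the whitespace tokenizer, the random matrices, and the averaging step (a stand-in for the transformer layers) are all assumptions made for the example.

```python
import math
import random

# Hypothetical toy vocabulary; real tokenizers use subword units, not whole words.
VOCAB = ["<unk>", "how", "does", "chatgpt", "work", "?"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}
EMBED_DIM = 4


def tokenize(text):
    """Step 2: chunk the input into token ids (whitespace split as a stand-in)."""
    words = text.lower().replace("?", " ?").split()
    return [TOKEN_TO_ID.get(word, 0) for word in words]  # 0 is <unk>


def make_matrix(rows, cols, rng):
    """Random numbers standing in for learned embeddings and weights."""
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_distribution(token_ids, embeddings, weights):
    # Step 3: look up one embedding vector per token.
    vectors = [embeddings[t] for t in token_ids]
    # Assumption: average the vectors into one context vector.
    context = [sum(col) / len(vectors) for col in zip(*vectors)]
    # Step 4: multiply by the weights to get one score per vocabulary entry.
    logits = [sum(c * w for c, w in zip(context, row)) for row in weights]
    return softmax(logits)


def sample(probs, rng):
    """Step 5: sample a token id according to its probability."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


rng = random.Random(0)
embeddings = make_matrix(len(VOCAB), EMBED_DIM, rng)
weights = make_matrix(len(VOCAB), EMBED_DIM, rng)

ids = tokenize("How does ChatGPT work?")  # Step 1: the input text
probs = next_token_distribution(ids, embeddings, weights)
next_id = sample(probs, rng)
print(ids, VOCAB[next_id])
```

Because the weights here are random, the sampled token is meaningless; the point is only the shape of the computation.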


I Accidentally Built A Meme Search Engine

- Harper Reed tl;dr: “I built a meme search engine using siglip / CLIP and vector encoding images. It was fun and I learned a lot. I have been building a lot of applied AI tools for a while. One of the components that always seemed the most magical has always been vector embeddings. Word2Vec and the like have straight blown my mind. It is like magic.” Harper describes his journey and shares the results. 

featured in #508
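
The core idea behind embedding-based search, as in the meme search engine above, can be sketched with cosine similarity. This is a hedged illustration, not Harper's code: the file names and three-dimensional vectors are made up, and in a real system the vectors would come from an image/text model such as CLIP or SigLIP.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, index, top_k=2):
    """Return the names of the top_k entries most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


# Hypothetical index: file name -> embedding (made-up values for illustration).
MEME_INDEX = {
    "grumpy_cat.jpg": [0.9, 0.1, 0.0],
    "doge.jpg": [0.1, 0.9, 0.1],
    "distracted_boyfriend.jpg": [0.0, 0.2, 0.9],
}

# Pretend this vector is the embedding of the text query "annoyed cat".
query = [0.8, 0.2, 0.0]
results = search(query, MEME_INDEX)
print(results)
```

A linear scan like this is fine for a few thousand images; larger collections usually move to a vector database or an approximate nearest-neighbor index.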


Lessons After A Half-Billion GPT Tokens

- Ken Kantzer tl;dr: “I thought I’d share some of the more ‘surprising’ lessons after churning through just north of 500 million tokens, by my estimate.” Lessons include: (1) When it comes to prompts, less is more. (2) You don’t need langchain. You probably don’t even need anything else OpenAI has released in their API in the last year. (3) Improving latency with the streaming API and showing users variable-speed typed words is actually a big UX innovation with ChatGPT.

featured in #506


Using GitHub Copilot In Your IDE: Tips, Tricks, And Best Practices

tl;dr: 15 tips include: (1) Open your relevant files. (2) Provide a top-level comment. (3) Set includes and references. (4) Meaningful names matter. (5) Provide specific and well-scoped function comments. (6) Provide sample code. (7) Inline chat with GitHub Copilot. (8) Remove irrelevant requests. (9) Navigate through your conversation. (10) Use the @workspace agent. 

featured in #503


Claude And ChatGPT For Ad-Hoc Sidequests

- Simon Willison tl;dr: The author demonstrates a quick “sidequest” task in which he converted the shapefile of the largest park in NY to a GeoJSON polygon in just 6 minutes. “One of the greatest misconceptions concerning LLMs is that they’re easy to use. They aren’t: getting great results requires a great deal of experience and hard-fought intuition, combined with deep domain knowledge of the problem you are applying them to.”

featured in #501


What I Learned From Looking At 900 Most Popular Open Source AI Tools

- Chip Huyen tl;dr: I think of the AI stack as consisting of 4 layers: (1) Infrastructure: Tooling for serving, vector search and databases. (2) Model development: Tooling for developing models and anything that involves changing a model’s weights. (3) Application development with readily available models. This is the layer that has seen the most action in the last 2 years and is still rapidly evolving. (4) Applications: The most popular types of applications are coding, workflow automation and information aggregation.

featured in #499


How To Measure The Impact Of Generative AI Code

- Ben Lloyd Pearson tl;dr: What’s the ROI of your GenAI code? By the end of 2024, GenAI is projected to generate 20% of all code – or 1 in every 5 lines. Learn how to use PR labels to get telemetry on GenAI code, allowing metric tracking that compares AI-generated code against unlabeled PRs. With this free automation, you can track the ROI of your GenAI investments and identify potential security and compliance risks.

featured in #489 and #485


A Developer’s Second Brain: Reducing Complexity Through Partnership With AI

- Eirini Kalliamvakou tl;dr: “As we look to empower developers with AI tools, we inadvertently integrate AI deeper into the way developers work. How do developers feel about that? And what are the most impactful ways to introduce more AI into workflows? We recently conducted 25 in-depth interviews with developers to understand exactly that.”

featured in #482


Reshaping The Tree: Rebuilding Organizations For AI

- Ethan Mollick tl;dr: “AI is impacting organizations, and managers need to start taking an active role in shaping what that looks like. There is no central authority that can tell you the best ways to use AI - every organization will need to figure it out for themselves.” Ethan proposes some principles: (1) Let teams develop their own methods. Given that AIs perform more like people than software, they are often best managed as additional team members. (2) Build for the oncoming future. It is clear that advanced models are coming fast. (3) Organizations that wait to experiment will fall behind very quickly.

featured in #470