/Search

How Levels.fyi Built Scalable Search With PostgreSQL

- Tanishq Singh tl;dr: The post outlines the key steps Levels.fyi took to build a scalable fuzzy search solution on PostgreSQL that handles over 10 million search queries per month with p99 query latency under 20ms.

featured in #504


A Search Engine In 80 Lines Of Python

- Alex Molas tl;dr: “Ever heard of the “Small Website Discoverability Crisis”? The problem is basically that small websites, ones like this one, are impossible to find using Google or any other search engine. My mission? Making those tiny websites great again. In this post I will walk you through the journey of building a search engine from scratch using Python. This implementation doesn’t pretend to be a production-ready search engine, just a usable toy example showing how a search engine works under the hood.”

featured in #487


The Largest Money-Printing UI Element Ever Made

- Jim Nelson tl;dr: "The largest source of money flowing into the world of programming languages comes from Google paying to be the default search engine... Google took in $283bn in revenue in one year. Of that, $49bn went towards “traffic acquisition costs” which includes Google paying other browsers for the preference of being the default search engine."

featured in #473


Building In-Video Search

tl;dr: "Suppose it’s Christmas, and you want to create a great Instagram piece out of all the best scenes across Netflix films of people shouting “Merry Christmas”! Or suppose it’s Anya Taylor Joy’s birthday, and you want to create a highlight reel of all her most iconic and dramatic shots. Creating these involves sifting through hundreds of thousands of movies and TV shows to find the right line of dialogue or the appropriate visual elements (objects, scenes, emotions, actions, etc.). We have built an internal system that allows someone to perform in-video search across the entire Netflix video catalog, and we’d like to share our experience in building this system."

featured in #464


Create An Advanced Search Engine With PostgreSQL

- Tudor Golubenco tl;dr: “The Postgres approach to full-text search offers building blocks that you can combine to create your own search engine. This is quite flexible but it also means it generally feels lower-level compared to search engines like Elasticsearch, Typesense, or Meilisearch.” The article covers: (1) The tsvector and tsquery data types. (2) The match operator @@ to check if a tsquery matches a tsvector. (3) Functions to rank each match (ts_rank, ts_rank_cd). (4) The GIN index type, an inverted index to efficiently query tsvector.

featured in #430
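
The four building blocks listed above can be pictured with a toy Python analogy. This is only a sketch of the ideas, not how PostgreSQL implements them: `to_terms` is a hypothetical stand-in for turning text into a tsvector (it skips stemming), `TinyIndex.postings` plays the role of a GIN-style inverted index, `search` requires every query term to match (roughly what `@@` does with an AND query), and the term-count score is a much cruder ranking than ts_rank. All names here are made up for illustration.

```python
import re
from collections import defaultdict

# Assumption: a tiny hand-picked stop-word list, purely for illustration.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "with"}


def to_terms(text):
    """Lowercase, split on non-alphanumerics, drop stop words (no stemming)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


class TinyIndex:
    """Toy inverted index: term -> {doc_id: occurrences of term in that doc}."""

    def __init__(self):
        self.postings = defaultdict(dict)

    def add(self, doc_id, text):
        for term in to_terms(text):
            counts = self.postings[term]
            counts[doc_id] = counts.get(doc_id, 0) + 1

    def search(self, query):
        """Return ids of docs containing every query term, best score first."""
        terms = to_terms(query)
        if not terms:
            return []
        # AND semantics: intersect the posting lists of all query terms.
        doc_sets = [set(self.postings.get(t, {})) for t in terms]
        matches = set.intersection(*doc_sets)
        # Crude rank: total occurrences of the query terms in the document.
        scored = [(sum(self.postings[t][d] for t in terms), d) for d in matches]
        # Highest score first; ties broken by doc id for a stable order.
        return [d for score, d in sorted(scored, key=lambda p: (-p[0], p[1]))]


index = TinyIndex()
index.add(1, "Full-text search in Postgres")
index.add(2, "Postgres search, search, and more search")
index.add(3, "Cooking with cast iron")
print(index.search("postgres search"))
```

The lookup only touches the posting lists for the query terms rather than scanning every document, which is the property that makes an inverted index such as GIN efficient for this kind of query.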


Semantic Search In iMessage, iMessage Wrapped, And AI Conversations

- JonLuca DeCaro tl;dr: “I realized that iMessage just stores its database locally as a sqlite file, so I went about building an alternate UI for searching, and adding in a few features that I thought would be interesting. These include: (1) Semantic Search (2) Wrapped: stats about my life on iMessage (3) AI conversations with friends. And more.”

featured in #406


Image Stacks And iPhone Racks - Building An Internet Scale Meme Search Engine

- Matthew Bryant tl;dr: "There’s an ironic duality to most memes: the more niche they are, the more funny they tend to be… This presented an extremely common problem: I could never find the niche memes I wanted to send folks when I needed them most. Mid-conversation, spur-of-the-moment memes were always impossible to find. Scrolling through hundreds of saved images in my phone is not efficient searching as it turns out, so I decided to try to better solve the problem."

featured in #391


The Technology Behind GitHub’s New Code Search

- Timothy Clem tl;dr: "We were motivated to create our own solution by three things: (1) We’ve got a vision for an entirely new user experience that’s about being able to ask questions of code and get answers through iteratively searching, browsing, navigating, and reading code. (2) We understand that code search is uniquely different from general text search. (3) GitHub’s scale is truly a unique challenge... north of 200 million repositories."

featured in #388


A Look At Search Engines With Their Own Indexes

- Rohan Kumar tl;dr: "I decided to test and catalog all the different indexing search engines I could find. I prioritized breadth over depth, and encourage readers to try the engines out themselves if they’d like more information."

featured in #328


Solving The Three Stooges Problem

- Rajiv Shah tl;dr: "In this blog post, we’ll talk about how traffic to Reddit’s search infrastructure is reminiscent of The Three Stooges’ doorway sketch, and we’ll outline our approach to remediate these request patterns. We’ll walk through our methodology step-by-step, and we hope that you’ll use it to make your own microservice boundary doorways more resilient to rowdy slapstick traffic."

featured in #238