The Zipline Playbook (David vs. Goliath)

In the race to automate the future, the loudest voices often lose. The real transformation happens quietly, forged by teams who are out of money, out of time, and backed into a corner. This is the story of how a small startup, once dismissed as a sideshow, is becoming a quiet giant poised to take on the $14 trillion global logistics industry. It is the story of how they out-executed Amazon, the most powerful and logistics-obsessed company on Earth.

They didn’t win by having more. They won by having less. This is the Zipline Playbook, and its core lesson is this: adapting to constraints forces clarity, speed, and precision. It builds urgency, and urgency is how you win.

In engineering, five constraints surface again and again:

  1. Time: When delay means irrelevance or death
  2. Budget: When every dollar must perform miracles
  3. Resources: When a 20-person team must achieve what a 2,000-person division cannot
  4. Technical: When the laws of physics make no exceptions
  5. Regulatory: When seeking permission is part of the engineering process itself

These are not blockers. They are the blueprint.

On December 1st, 2013, during an episode of 60 Minutes, Jeff Bezos introduced Amazon Prime Air, a new drone delivery concept that reflected the company’s growing ambitions. At the time, Amazon was already a dominant force in global commerce, with a market cap exceeding $119 billion, annual revenue approaching $74.5 billion, a workforce of more than 117,000, and a rapidly expanding network of 50 fulfillment centers worldwide. Bezos presented Prime Air not as a distant dream, but as a natural extension of Amazon’s logistics capabilities.

The underlying math was shocking. More than 86% of Amazon’s orders weigh under 5 lb. Swap a 4,000-lb gas van for a 15-lb electric drone, and you trade traffic for direct flight, diesel for electrons, and “arriving in 2 days” for “landing in 30 minutes.” Operating costs would plummet. Bezos told Charlie Rose that this wasn’t a distant dream; he predicted drones would be delivering packages to customers in as little as four to five years.

But as Amazon’s vision captivated the media, a fledgling startup was approaching the same problem from the opposite direction. Its founder, Keller Rinaudo Cliffton, had seen the orange Kiva robots automating warehouses for Amazon and had a simple, world-changing thought: someone needed to build Kiva for outside the warehouse. He envisioned an automated, on-demand logistics network that could serve everyone on Earth.

Keller and his team knew that to achieve this grand vision, they first needed a beachhead, a single, critical use case where the need was desperate and the value of their solution was undeniable. They strategically chose healthcare logistics. A life-or-death delivery commands more willingness to pay, and, crucially, they believed it would give them a compelling case for regulatory approval.

And therein lay the regulatory constraint. Between 2014 and 2016, FAA rules made Zipline’s model of long-range, autonomous delivery effectively impossible in the United States. With no money, no reputation, and no operational data, they couldn’t afford to spend years trapped in a regulatory holding pattern. While Amazon stayed domestic and became mired in delays, Zipline made the hard call to leave. Rwanda offered something the U.S. couldn’t: a government willing to move fast, a healthcare system in urgent need, and a real-world proving ground for their technology.

Armed with this strategy, they arrived in Rwanda with an audacious pitch for the Minister of Health: Zipline would build and operate a national, on-demand delivery service for all medical products, to every hospital and clinic in the country. The Minister listened, then cut through their sprawling vision with a simple, focusing order: “Keller, shut up. Just do blood.”

Blood was a logistical nightmare and a matter of life and death. Platelets last only five days and require constant agitation. Red blood cells need refrigeration and expire in 42 days. It was the perfect, intensely painful problem to solve.

The contract with the Rwandan government was both a lifeline and a ticking clock. Zipline was operating on a shoestring budget, having raised a small Series A in 2012 and a couple of extensions in 2015, a rounding error for a company like Amazon. This severe budget constraint left no room for error or expensive R&D detours; every dollar had to be stretched to its limit. Compounding this was an intense time constraint: they had to get a functional system operational, and quickly, to prove their concept. Failure meant the company would die.

The technical constraint was just as unforgiving. Designing an autonomous fixed-wing aircraft that could catapult-launch, navigate mountainous terrain, drop with meter-level precision, and survive tropical storms was already a daunting engineering challenge, especially in 2015 when the tech was still new. But the drone itself made up only 15% of the real technical complexity. The remaining 85% lived behind the scenes: air-traffic-control dashboards for regulators, computer-vision pre-flight checks, detect-and-avoid systems so multiple drones could share airspace safely, and a data pipeline that logged a gigabyte from every flight to improve reliability. These challenges couldn’t be solved by theorizing in a lab or designing in an ivory tower. They had to be solved by building in the field, side by side with customers, and stripping away anything that didn’t matter. The sheer amount of technical complexity demanded brutal focus. Every decision had to serve the mission. Everything else had to go.

The resource constraint showed up immediately. They had only 20 people. That alone was crazy. For nine long months, they served just one hospital. The system was fragile. Every delivery was a fight. Unlike at Amazon, where massive teams handle tightly scoped problems, Zipline’s engineers had to do everything themselves. There were no handoffs or buffers. The same people who wrote the flight code also built launchers, tested recovery systems, and built hardware out of shipping containers. All-nighters, working weekends, and live-debugging were the baseline. Amazon had forklifts, global infrastructure, and billions of dollars. Zipline had twenty people, a mission, and no other option. This constraint forced them to do ten times more with one-tenth the people.

After nine months, it was running reliably. In the next three months, Zipline expanded service to the remaining 20 hospitals in the contract. Then 50. Then 400 primary care clinics across Rwanda. From that single, fragile beachhead, they scaled across the world: Ghana, Nigeria, Côte d’Ivoire, Japan, the United States. Walmart signed on. So did Intermountain Health. And the NHS. By 2023, Zipline had flown over 50 million autonomous miles and served more than 3,000 hospitals and clinics across four continents. Rwanda awarded them a $61 million national contract, making Zipline the backbone of its healthcare logistics.

Here, the paradoxical power of constraint reveals itself. An unconstrained company with a massive budget has the luxury of chasing complexity, building elegant systems in labs, stacking features no one asked for, and perfecting technology that drifts further from real customer needs. A constrained startup has no such luxury. Zipline had to adapt by solving the single most important problem for a real customer, because that was the only path to survival. Every decision had to earn its place. While Amazon spent billions developing a flashy rotor-based drone, Zipline built a fixed-wing aircraft because it was the best way to solve their customer’s problem. A fixed-wing airframe covers more ground, uses less power, and is simpler to maintain. That choice, driven by necessity, shaped everything.

Ironically, the company famous for being customer-obsessed built something no real customer asked for, while Zipline, under pressure and constraint, listened closely. A customer couldn’t have been more direct: shut up, just do blood.

This became their philosophy. Zipline was never the best funded or flashiest team. Their edge came from adapting faster by being relentlessly practical. They did not optimize for prestige. They optimized for reality. And they understood that the most important engineering insights do not come from whiteboards or design reviews. They come from customers, from real-world use, and from the painful, humbling lessons that only surface when your product is live and every mistake matters.

Abundance breeds complexity. Constraint forces a brutal elegance. Stripping away everything non-essential didn’t just make the system cheaper. It made it better. Today, Zipline has flown over 100 million autonomous miles, completed more than 1.5 million deliveries, and most impressively, done it all without a single human safety incident. Meanwhile, Amazon is still demoing. Prime Air made its first deliveries in late 2022, nearly a decade after its TV debut. In January 2025, it grounded flights after a rain-induced crash exposed a software fault. Today, it has completed fewer than one hundred deliveries. Zipline crossed a million deliveries before Amazon crossed a hundred. While Amazon’s drones dropped boxes in suburban cul-de-sacs, Zipline was flying through tropical storms, over mountains, and under dense air-traffic control. Checkmate.

The divergence wasn’t an accident. It was the technology hype cycle in action. The initial excitement for drone delivery obscured the immense, underlying complexity of the problem. While others operated in the bubble of hype, Zipline spent a decade in the “trough of disillusionment,” the long, painful period where the actual work gets done.

Everything is Robot

In 2011, Marc Andreessen wrote that software would eat the world. He was right. Fourteen years later, the world runs on software. Now he’s saying something bigger. “Robotics is going to be the biggest industry in the history of the planet. There are going to be hundreds of billions of robots of all shapes and sizes.” But here’s the uncomfortable truth: software isn’t done eating. It hasn’t even had breakfast. Software will keep eating the world, and the next course on the table is the physical world of atoms.

We think of robots as humanoids. Boston Dynamics. Tesla Optimus. Figure. Machines that walk and wave and weld. But that’s just one shape. Robots don’t need arms and legs. Tesla is already building robots on wheels. Zipline is building robots in the sky. Gecko Robotics builds wall-crawling robots that inspect power plants and refineries. A robot is not a body type. A robot is autonomy wrapped in hardware.

That’s the real shift. Hardware is the entry point. Autonomy is the multiplier. The body without the brain is nothing. The chassis is the shell. The moat is the mind.

Take Tesla. On the surface, it looks like a car company. But its moat isn’t the vehicle. It’s Autopilot, Full Self-Driving, and the massive neural nets trained on billions of miles of data. Tesla is a software AI company that happens to sell hardware. Over the next decade, cars will transform from tools of transportation to autonomous robots. And the companies that own the autonomy will own the industry.

Or look at Apple. The iPhone in 2025 isn’t what keeps people locked in. It’s not the aluminum or the glass. The real moat is iOS, the integration across apps, payments, cloud, and services. The iPhone is not a phone. It’s a software ecosystem you can’t leave.

NVIDIA is another example. The GPU was a leap in hardware engineering. But the thing that made NVIDIA untouchable wasn’t the chip. It was CUDA, the software layer that locked in developers and created an ecosystem. That’s why NVIDIA went from graphics cards to the most important AI company in the world.

This is the pattern. Hardware is a wedge. Software is the empire.

The world of honeybees

If you have ever thought of yourself as useless, then you must read this…

The world of honeybees is one of nature’s most astonishing miracles, where every individual is busy performing a specific role. Inside this tiny hive operates a complex social system whose two most important figures are the queen and the drones. Their lives are so different that it is hard to believe they belong to the same species.

The queen bee is the heart of the colony, a monarch who lives among her subjects yet is so unique that she stands apart in every way. Her body is noticeably longer and more elegant than other bees, and when she moves inside the hive, worker bees surround her like guards protecting their queen. Interestingly, however, the queen is not born a princess. She comes from an ordinary egg, just like any worker bee.

The only difference is that when the colony needs a new queen, worker bees begin feeding certain larvae a special substance called “royal jelly.” This milky white food is so powerful that it changes the entire developmental destiny of the larva. A larva that might have become an ordinary worker instead becomes a queen. It is a masterpiece of nature’s chemistry: simply by changing the diet, a bee’s entire life, size, lifespan, and abilities are transformed.

One of the most fascinating chapters of the queen’s life is her mating flight, which happens in daylight rather than at night. When the queen matures, she leaves the hive only once in her lifetime for this purpose. She flies high into the sky, and drones chase after her. It becomes an extraordinary competition, with hundreds of male bees pursuing one queen, though only a few reach her. The queen mates with several drones in the air and stores their sperm inside her body, enabling her to lay millions of eggs over the next five years.

Now consider the story of the drones, perhaps among nature’s strangest creations. These male bees are born without a father. When the queen lays unfertilized eggs, they develop into drones containing only the mother’s genes. These bees are larger, have big eyes, and live carefree lives. They have no sting, do not make honey, do not collect pollen, and do not even gather their own food. Worker bees feed them, almost as if idle princes are being raised.

A drone’s entire life has only one purpose: to mate with a queen. But here lies the irony—any drone that succeeds dies immediately after mating because his reproductive organs remain inside the queen. Success itself leads to death. And the drones that fail in this competition do not fare much better.

When winter approaches and food becomes scarce, worker bees throw the lazy drones out of the hive. It is a harsh decision but necessary for the colony’s survival. The drones die from cold and hunger, while the workers and the queen survive the winter safely inside the hive.

This system is strange yet remarkably successful. A queen, who can live up to five years, lays a daily mass of eggs greater than her own body weight. Imagine laying around two thousand eggs per day, roughly one every 45 seconds, continuing for months. She releases special chemical signals that keep the entire colony united. If the queen dies or becomes weak, the colony realizes it within hours and immediately begins raising a new queen.

Though the drone’s story may seem tragic, their existence is just as essential as the queen’s. They bring genetic diversity to the colony. By mating with multiple drones, the queen ensures the next generation carries a mix of traits, improving resistance to diseases and increasing survival chances in different conditions.

Honeybees possess an astonishing ability that scientists call the “waggle dance.” When a worker bee finds a good source of flowers or food, she doesn’t simply bring honey back. Instead, she returns to the hive and performs a special dance.

Through this dance, she tells the other bees the direction of the food source, how far away it is, and how abundant it is. The angle of the dance indicates the direction relative to the sun, while the duration of the dance communicates the distance.

The most fascinating part is that bees also account for the movement of the sun. Even if the sun has shifted position, the bee still communicates the correct direction in her dance, as if she possesses both a natural compass and an internal clock.

In other words, this tiny insect uses a system involving mathematics, navigation, time calculation, and collective communication—one that has functioned for millions of years without any teacher or school.

Thus, within a tiny hive exists a complete kingdom: a queen who rules, thousands of hardworking workers serving the colony’s welfare, and drones who play a temporary but essential role in producing the next generation. This system has functioned for millions of years without law books, police, or armies—perhaps far more organized and successful than human societies.

OpenAI Unveils GPT-5.3-Codex: A Coding Model That Helps Build Its Own Successors

OpenAI has introduced GPT-5.3-Codex, its latest flagship coding model, marking a major step forward in both programming capability and AI self-improvement. The new release combines advanced coding skills with stronger reasoning performance in a faster and more efficient package — and notably, it is already being used within OpenAI to improve its own systems.

A Model That Improves Its Own Development

One of the most striking aspects of GPT-5.3-Codex is how it contributes to OpenAI’s internal workflows. According to the company, early versions of the model were already deployed to:

  • Identify bugs in training runs
  • Assist with rollout and deployment management
  • Analyze evaluation results and system performance

In effect, the model helped accelerate and refine the development process of the very systems that produced it. This signals a growing shift where advanced AI models play an active role in improving their successors.

Benchmark Gains Across the Board

Performance results highlight the model’s leap in capability, particularly in agentic coding tasks where AI must independently reason and execute programming actions.

GPT-5.3-Codex reportedly leads benchmarks such as SWE-Bench Pro and Terminal-Bench 2.0, outperforming competing models and surpassing Opus 4.6 by around 12% on Terminal-Bench shortly after release.

Improvements extend beyond coding. On OSWorld, a benchmark measuring how effectively AI systems control desktop environments, GPT-5.3-Codex achieved a 64.7% score, nearly doubling the 38.2% achieved by the previous Codex generation. This indicates rapid progress toward AI systems that can operate computers more autonomously.

Security Risks and Defensive Investment

OpenAI also classified GPT-5.3-Codex with its first “High” cybersecurity risk rating, acknowledging that more capable coding models can potentially be misused. In response, the company committed $10 million in API credits to support defensive security research.

The move reflects an industry trend: as AI models become more powerful in software generation and system control, proactive security investment becomes essential.

The Bigger Picture: AI Designing AI

The broader significance of the announcement lies in the growing evidence that frontier AI systems are beginning to assist in designing and refining future models. Industry leaders have recently echoed this trend, signaling that next-generation AI development may increasingly involve AI collaboration.

The competitive landscape among leading AI labs is also intensifying, with rapid-fire releases demonstrating escalating capability gains. Debates about product features or monetization strategies now appear secondary to the accelerating race to build more capable and self-improving models.

Why It Matters

GPT-5.3-Codex represents more than a coding upgrade. It showcases a turning point where AI models are becoming part of their own development cycle. As systems grow better at debugging, optimizing, and deploying software—including AI software—the pace of progress may accelerate further.

The frontier is no longer just about who builds the best model, but who builds models that help create the next breakthrough.

https://openai.com/index/introducing-gpt-5-3-codex

Semantic Search in SQL Server 2025

In today’s fast-moving world, where every organization is rapidly adopting AI technologies, Microsoft SQL Server is keeping pace. In its latest release, SQL Server 2025, Microsoft promises to deliver one of the most enterprise-ready relational databases. SQL Server can now integrate Large Language Models (LLMs) directly within the database, allowing developers to combine AI with business data without leaving SQL Server.

In this article, we will cover some basic concepts of AI and ML, such as vectors, embeddings, and LLM models. Then we will see how to integrate an external LLM model with SQL Server and perform semantic, natural-language queries on your data.

Note: It is recommended that the reader have a basic knowledge of AI, machine learning, and T-SQL.

Vector Data Type

Vectors are the backbone of LLMs. They represent data such as text, images, and songs as an ordered list of numbers that ML algorithms can process.

SQL Server introduces a new data type, Vector, to support AI and machine learning workloads. This data type stores multi-dimensional data, making it useful for semantic search, similarity matching, and embedding storage. A vector stores values as an ordered numerical array; for example, a three-dimensional vector is represented as ‘[0,1,2]’. The Vector data type supports up to 1998 dimensions and uses the familiar JSON array format for creating and storing values.

The following example creates a table with a Vector column and inserts data into it.

CREATE TABLE dbo.Vectors(
    ProductID INT PRIMARY KEY
    ,ProductName NVARCHAR(50)
    ,ProductVectorRepresentation VECTOR(3)  -- three-dimensional vector
);

INSERT INTO dbo.Vectors VALUES
    (1, 'Pen', '[0,4,3]')
    ,(2, 'Eraser', '[4,9,-2]')
    ,(3, 'Sharpener', '[0,-9,2]');

We can see the results of this below, with the vector data stored as an array of floating-point values.

For additional details about vectors, please refer to this article.

Embeddings

Embeddings are the numerical values stored in vectors that represent features of the data. They are generated using a deep learning model, and ML/AI models use them to measure similarity between vectors. For example, two related words like ‘Animal’ and ‘Dog’ have similar embeddings (vector representations).

You can generate embeddings using LLM providers such as OpenAI or Ollama. The generated embeddings can then be stored in a Vector column in a SQL Server database alongside the original data they represent. You can then perform vector searches using T-SQL to find semantically similar words or concepts.

In the screenshot below, the Documents table contains two columns: Content (nvarchar) and Embedding (vector). The Content column stores textual data, while the Embedding column stores the corresponding embedding values, generated using an OpenAI embedding model. In the demo section, we will go through how to generate the embeddings and store them in a database table.

Vector Searches

Vector search is the process of finding vectors in a dataset that closely match a given input vector. It is similar to searching text using the LIKE operator in SQL Server. The difference is that, instead of simply matching text patterns, it scans the vector column and returns the vectors whose embeddings are closest to the input vector’s embeddings. Hence it performs a semantic search.

For example, if the input vector represents the word ‘cat’, it returns vectors of semantically similar words, such as ‘Animal’, as close matches.

The closer two embeddings are, the more similar they are. This closeness is measured using different metrics, such as cosine distance, Euclidean distance, or dot product. SQL Server offers two functions, VECTOR_DISTANCE() and VECTOR_SEARCH(), to measure the similarity between vectors.
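These metrics are simple to compute by hand. The sketch below is plain Python and purely illustrative (SQL Server computes the same quantities natively); it shows cosine distance, Euclidean distance, and dot product for the toy three-dimensional vectors from the dbo.Vectors example above.

```python
import math

def dot(a, b):
    # Element-wise products summed: the raw similarity signal.
    return sum(x * y for x, y in zip(a, b))

def euclidean_distance(a, b):
    # Straight-line distance between the two points.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    # 1 - cosine similarity: 0 means "same direction", 2 means "opposite".
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    return 1 - dot(a, b) / (norm_a * norm_b)

# Toy vectors matching the sample rows inserted earlier.
pen, eraser = [0, 4, 3], [4, 9, -2]

print(dot(pen, eraser))                 # 30
print(cosine_distance(pen, pen))        # 0.0 (identical vectors)
print(round(euclidean_distance(pen, eraser), 3))
```

Cosine distance depends only on direction, not magnitude, which is why it is the most common choice for text embeddings.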

VECTOR_DISTANCE()

It calculates the exact distance between two vectors using the metric you specify (cosine, dot, or Euclidean) and returns a scalar value. It does not use a vector index to find similarity, so this function is best suited to smaller datasets and to finding the exact distance between two vectors.

Below is the syntax:

VECTOR_DISTANCE('distance_metric', [Vector1], [Vector2])

and the arguments are:

  • distance_metric: The distance metric to use when calculating the distance between the two vectors. Supported metrics are cosine, Euclidean, and dot product.
  • Vector1: The first vector (vector data type).
  • Vector2: The second vector (vector data type).

VECTOR_SEARCH()

It finds similar vectors using an ANN (approximate nearest neighbor) algorithm. Unlike VECTOR_DISTANCE(), it does not calculate the exact distance between vectors; it returns only the nearest vectors. This function requires a vector index on the vector column of the table, and because it uses vector indexes, it is best suited to larger datasets.

Please note that, at the time of writing, this function is still in preview and subject to change.

Below is the Syntax:

 VECTOR_SEARCH(
        TABLE = table_name, 
        COLUMN = embedding_column_name, 
        SIMILAR_TO = query_vector, 
        METRIC = 'cosine | euclidean | dot', 
        TOP_N = k
    )

and the arguments are:

  • TABLE: The table where the search will be performed.
  • COLUMN: The vector column (with a vector index) that stores the embeddings and on which the search will be performed.
  • SIMILAR_TO: The vectorized value of the input query text, used for finding similarity.
  • METRIC: The metric to use when calculating the similarity between two vectors. Supported metrics are cosine, Euclidean, and dot product.
  • TOP_N: The maximum number of similar rows to return in the result.
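Conceptually, VECTOR_SEARCH() answers the question “give me the TOP_N rows whose embeddings are nearest to the query vector.” The sketch below shows an exact, brute-force version of that question in plain Python (the names here are illustrative, not part of SQL Server); the real function replaces this full scan with an ANN index so it scales to large tables, at the cost of exactness.

```python
import math

def cosine_distance(a, b):
    # 1 - cosine similarity; smaller means more similar.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1 - dot / (na * nb)

def top_n_nearest(rows, query_vector, n):
    # Brute-force scan: score every row, keep the n closest.
    # An ANN index avoids scanning everything but may miss a true neighbor.
    scored = [(cosine_distance(vec, query_vector), key) for key, vec in rows]
    return [key for _, key in sorted(scored)[:n]]

# Toy rows mirroring the dbo.Vectors sample table.
rows = [
    ("Pen",       [0, 4, 3]),
    ("Eraser",    [4, 9, -2]),
    ("Sharpener", [0, -9, 2]),
]

print(top_n_nearest(rows, [0, 5, 2], n=2))  # ['Pen', 'Eraser']
```

TOP_N plays the role of `n` here, and METRIC selects which distance function scores the rows.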

Demo – Integrating External LLM Models and Semantic Search

Now that we understand the basic concepts, let’s see how we can integrate an external LLM model with SQL Server and perform semantic search.

Configuring the OpenAI API Key

We will use OpenAI’s web API in this demo. If you do not have an API key, you will need to create one and purchase credits. To generate a new key, follow the steps below:

First, register at https://platform.openai.com/ and create a new project (e.g., SQL2025-SemanticSearch), as shown below.

Then navigate to API keys from the Settings option and create a new secret key, assigning it a name. Select the project you created in the previous step.

It is important to note down the value of the secret key at creation time, as its complete value is never shown again. If you miss it, you will have to delete the key and create a new one.

Your key will look something like sk-proj……..B7wA.

Next, add a payment method in the Billing section to purchase credits. You can also track pricing, billing history, and usage limits from the same section.

Also, keep track of credit limits and usage. You may refer to this article for details on credit limit usage.

Creating an External LLM in SQL Server

Now we will see how to create an OpenAI model in SQL Server.

To begin, create a master key and a database-scoped credential using the SQL statements below. The database credential will be required when creating the external model.

USE [SQL-2025];
GO

-- Create a master key (use a strong password you will remember)
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'Str0ngP@ssword!';

IF EXISTS (SELECT * FROM sys.database_scoped_credentials WHERE name = 'https://api.openai.com')
    DROP DATABASE SCOPED CREDENTIAL [https://api.openai.com];
GO

CREATE DATABASE SCOPED CREDENTIAL [https://api.openai.com]   --- For OpenAI API
WITH IDENTITY = 'HTTPEndpointHeaders',
SECRET = '{"Authorization": "Bearer sk-proj........B7wA"}';  --- Provide the API Key secret which you created in previous section

Next, create the external model using T-SQL’s new CREATE EXTERNAL MODEL command. Also, make sure to enable the ‘external rest endpoint enabled’ advanced configuration option so that SQL Server can make calls to the external OpenAI API.

EXEC sp_configure 'external rest endpoint enabled',1;
RECONFIGURE WITH OVERRIDE;

CREATE EXTERNAL MODEL OpenAIEmbeddingModel
WITH (
    LOCATION = 'https://api.openai.com/v1/embeddings',
    CREDENTIAL = [https://api.openai.com],   -- Database Scoped Credential
    API_FORMAT = 'OpenAI',    -- For OpenAI. Other examples - Azure OpenAI,Ollama
    MODEL_TYPE = EMBEDDINGS,
    MODEL = 'text-embedding-3-small'  -- For Text Embeddings
);

If you want to alter or drop any external model you can use ALTER EXTERNAL MODEL and DROP EXTERNAL MODEL commands respectively.

Creating Documents Table with Vector Column for textual data

Once the external model is ready, create a table named dbo.Documents for semantic search using the T-SQL script below. It contains the following columns:

  • Id – Primary key column for each document
  • Content (nvarchar) – Stores some random textual data
  • Embedding (vector) – Stores embeddings of the textual data in the vector data type

CREATE TABLE dbo.Documents (
    Id INT IDENTITY(1,1) PRIMARY KEY,
    Content NVARCHAR(MAX),
    Embedding VECTOR(1536) NULL  -- for storing embeddings
);
GO

Next, insert some sample data into the table using the script below.

INSERT INTO dbo.Documents (Content)
VALUES
(N'The capital of France is Paris.'),
(N'Python is a popular programming language used in AI and data science.'),
(N'SQL Server 2025 supports vector search natively.'),
(N'Mount Everest is the tallest mountain in the world.'),
(N'The Great Wall of China is one of the wonders of the world.'),
(N'Tesla is a leading company in electric vehicles.'),
(N'Microsoft Azure provides cloud services worldwide.'),
(N'Football is also known as soccer in some countries.'),
(N'Amazon is the largest e-commerce company in the world.'),
(N'Water boils at 100 degrees Celsius under normal conditions.'),
(N'ChatGPT is an AI model developed by OpenAI.'),
(N'India is the largest democracy in the world.'),
(N'Bananas are rich in potassium.'),
(N'Shakespeare wrote the play Romeo and Juliet.'),
(N'The human brain contains billions of neurons.'),
(N'Coffee is one of the most popular beverages in the world.'),
(N'Cristiano Ronaldo is a famous football player.'),
(N'Google was founded by Larry Page and Sergey Brin.'),
(N'Bitcoin is a type of cryptocurrency.'),
(N'The Pacific Ocean is the largest ocean on Earth.'),
(N'Leonardo da Vinci painted the Mona Lisa.'),
(N'The speed of light is approximately 299,792 km per second.'),
(N'Tokyo is the capital city of Japan.'),
(N'The Sahara Desert is the largest hot desert in the world.'),
(N'COVID-19 pandemic started in 2019.'),
(N'Einstein developed the theory of relativity.'),
(N'Lionel Messi is regarded as one of the greatest footballers.'),
(N'J.K. Rowling wrote the Harry Potter series.'),
(N'Water is composed of hydrogen and oxygen atoms.'),
(N'YouTube is a video sharing platform owned by Google.'),
(N'Apple produces the iPhone, iPad, and MacBook.'),
(N'The Amazon Rainforest is known as the lungs of the Earth.'),
(N'Facebook was founded by Mark Zuckerberg.'),
(N'The currency of the United States is the US Dollar.'),
(N'Mona Lisa is displayed at the Louvre Museum in Paris.'),
(N'The Statue of Liberty is located in New York.'),
(N'Elon Musk founded SpaceX.'),
(N'The Nile is the longest river in the world.'),
(N'Ice cream is a frozen dessert loved worldwide.'),
(N'The Sun is a star at the center of our solar system.'),
(N'BMW is a German luxury car manufacturer.'),
(N'Mahabharata is one of the oldest epics in Indian history.'),
(N'The Internet revolutionized communication.'),
(N'Mars is called the Red Planet.'),
(N'Venus is the hottest planet in our solar system.'),
(N'Microsoft was founded by Bill Gates and Paul Allen.'),
(N'The Taj Mahal is located in Agra, India.'),
(N'The Eiffel Tower is an iconic landmark in Paris.'),
(N'Kolkata was formerly known as Calcutta.'),
(N'The first man on the moon was Neil Armstrong.'),
(N'Giraffes are the tallest land animals.'),
(N'Penguins live mostly in the Southern Hemisphere.'),
(N'The Earth orbits around the Sun in 365 days.'),
(N'Samsung is a South Korean multinational company.'),
(N'Opera is a style of theatre performance.'),
(N'Rome was the center of the Roman Empire.'),
(N'The human heart has four chambers.'),
(N'Einstein won the Nobel Prize in Physics in 1921.'),
(N'Photosynthesis occurs in plant leaves.'),
(N'The first World War began in 1914.'),
(N'Second World War ended in 1945.'),
(N'The iPhone was first released in 2007.'),
(N'Manufacturing industries are part of the secondary sector.'),
(N'The Pyramids of Giza are in Egypt.'),
(N'Pluto was reclassified as a dwarf planet.'),
(N'The Great Fire of London happened in 1666.'),
(N'Football World Cup is held every four years.'),
(N'The brain controls the central nervous system.'),
(N'China has the largest population in the world.'),
(N'Canada is the second-largest country by land area.'),
(N'The Berlin Wall fell in 1989.'),
(N'WhatsApp is a popular messaging application.'),
(N'Sound travels faster in water than in air.'),
(N'Volcanoes erupt when magma rises to the surface.'),
(N'Milk is a source of calcium.'),
(N'The internet was first developed for military use.'),
(N'The first programming language was Fortran.'),
(N'Vegetables are essential for a healthy diet.'),
(N'Saturn is famous for its rings.'),
(N'Airplanes fly due to the principle of lift.'),
(N'Chess originated in India.'),
(N'The Cold War was a period of geopolitical tension.'),
(N'The periodic table organizes chemical elements.'),
(N'You need oxygen to breathe.'),
(N'Mahatma Gandhi was a leader of Indian independence.'),
(N'The Himalayas are the youngest mountain range.'),
(N'Sri Lanka is an island nation in South Asia.'),
(N'The Indian Ocean lies south of India.'),
(N'George Washington was the first US president.'),
(N'The Eiffel Tower was built in 1889.'),
(N'Newton discovered the law of gravitation.'),
(N'The Milky Way is our galaxy.'),
(N'The Moon causes ocean tides.'),
(N'The Leaning Tower of Pisa is in Italy.'),
(N'Polar bears live in the Arctic region.'),
(N'Lightning is caused by static electricity.'),
(N'Kangaroos are native to Australia.'),
(N'Amazon Alexa is a virtual assistant.'),
(N'Machine learning is a subset of artificial intelligence.'),
(N'The Golden Gate Bridge is in San Francisco.'),
(N'The ozone layer protects Earth from UV rays.'),
(N'Cricket is very popular in India.'),
(N'The brain uses electrical signals to transmit information.'),
(N'Global warming is caused by greenhouse gases.'),
(N'The first computer was ENIAC.'),
(N'Blockchain technology is used in cryptocurrencies.'),
(N'The UN was founded in 1945.');

GO

You can see the data is in my table:

After inserting the textual data, we need to generate an embedding for each row of content using the external model we created earlier. T-SQL's new AI_GENERATE_EMBEDDINGS() function generates the embeddings, which we then store in the documents table:

UPDATE dbo.documents
SET Embedding = AI_GENERATE_EMBEDDINGS([Content] USE MODEL OpenAIEmbeddingModel);

Now if you run a SELECT query on the documents table, you will see numerical values (the embeddings) in the Embedding column for each row of content.
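Because AI_GENERATE_EMBEDDINGS calls an external endpoint, it is worth verifying that every row actually received an embedding before moving on. A quick sanity check of my own (not part of the original walkthrough):

```sql
-- Count rows that did not receive an embedding; expect 0
SELECT COUNT(*) AS MissingEmbeddings
FROM dbo.documents
WHERE Embedding IS NULL;
```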

Semantic Search of Textual Query

Now that our embeddings are ready, we can perform some semantic searches on the documents table and see how the model behaves. First, let's look at a couple of examples using VECTOR_DISTANCE().

The first example searches for content related to the automobile industry. The query text is shown in the first line of the code:

DECLARE @query NVARCHAR(MAX) = N'auto mobile industry';

-- Generate embedding for query
DECLARE @queryEmbedding VECTOR(1536) =  (SELECT AI_GENERATE_EMBEDDINGS(@query USE MODEL OpenAIEmbeddingModel));
 
 SELECT TOP 3
       Content
      ,VECTOR_DISTANCE('cosine', @queryEmbedding, Embedding) AS CosineDistance
FROM dbo.Documents
ORDER BY CosineDistance;

The output is impressive: all the returned results relate to the automobile industry.

The next example conducts a specific search for Mona Lisa:

DECLARE @query NVARCHAR(MAX) = N'leonardo da vinci''s famous painting';
-- Generate embedding for query
DECLARE @queryEmbedding VECTOR(1536) =  (SELECT AI_GENERATE_EMBEDDINGS(@query USE MODEL OpenAIEmbeddingModel));
 
SELECT TOP 3
       [Content]
      ,VECTOR_DISTANCE('cosine', @queryEmbedding, Embedding) AS CosineDistance
FROM dbo.Documents
ORDER BY CosineDistance;

The returned result with the highest similarity correctly answers our query.

Here is how the code works. In both examples above, we first take an input text: 'auto mobile industry' and 'leonardo da vinci's famous painting'. We then generate a vector embedding for that input text using the same external AI model and store it in the @queryEmbedding variable. Finally, the VECTOR_DISTANCE() function calculates the cosine distance between the input vector and each value in the Embedding column of the documents table, row by row, and the query returns the top 3 most similar rows from the Content column. Note that the smaller the cosine distance, the greater the similarity.
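To build intuition for the distance values, you can compare small hand-made vectors directly. This sketch assumes the VECTOR(n) type supports casting from a JSON-array string, as documented for the preview: a vector pointing in nearly the same direction yields a cosine distance close to 0, while an orthogonal one yields a distance close to 1.

```sql
DECLARE @a    VECTOR(3) = CAST('[1.0, 0.0, 0.0]' AS VECTOR(3));
DECLARE @near VECTOR(3) = CAST('[0.9, 0.1, 0.0]' AS VECTOR(3));
DECLARE @far  VECTOR(3) = CAST('[0.0, 1.0, 0.0]' AS VECTOR(3));

-- Smaller cosine distance means more similar
SELECT VECTOR_DISTANCE('cosine', @a, @near) AS NearDistance,  -- close to 0
       VECTOR_DISTANCE('cosine', @a, @far)  AS FarDistance;   -- close to 1
```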

Next, we will perform some searches using VECTOR_SEARCH(). Because this feature is still in preview, we need to enable trace flags with the query below to make the function work:

DBCC TRACEON(466, 474, 13981, -1)

Then create a vector index on the Embedding column; an index is mandatory for vector search to work.

CREATE VECTOR INDEX vec_idx ON [dbo].[Documents]([Embedding])
WITH (METRIC = 'cosine', TYPE = 'diskann', MAXDOP = 4);
GO

A little explanation of the arguments:

  • METRIC: defines how similarity is calculated (cosine, dot, or Euclidean)
  • TYPE: defines the algorithm used to find nearest neighbors; currently only 'diskann' is supported
  • MAXDOP: the maximum number of parallel threads used to build the index
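After creating the index, you can confirm it exists with the standard catalog views (the exact type_desc reported for vector indexes may vary across preview builds):

```sql
SELECT name, type_desc
FROM sys.indexes
WHERE object_id = OBJECT_ID('dbo.Documents')
  AND name = 'vec_idx';
```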

Let us see when the iPhone was released.

DECLARE @querytext NVARCHAR (3000) = 'when did Apple release its first phone?'
DECLARE @queryEmbedding vector(1536) =  AI_GENERATE_EMBEDDINGS(@querytext USE MODEL OpenAIEmbeddingModel);
 
SELECT 
    t.Content, s.distance
FROM
    VECTOR_SEARCH(
        TABLE = [dbo].[Documents] as t, 
        COLUMN = [embedding], 
        SIMILAR_TO = @queryEmbedding, 
        METRIC = 'cosine', 
        TOP_N = 3
    ) AS s

It did well: the result with the highest similarity correctly answers the query.

The next example finds the nationality of Cristiano Ronaldo; in effect, we ask which famous football player is from Portugal.

DECLARE @querytext NVARCHAR (3000) ='who is the famous footballer from portugal?' 
DECLARE @queryEmbedding vector(1536) =  AI_GENERATE_EMBEDDINGS(@querytext USE MODEL OpenAIEmbeddingModel);
 
SELECT 
    t.Content, s.distance
FROM
    VECTOR_SEARCH(
        TABLE = [dbo].[Documents] as t, 
        COLUMN = [embedding], 
        SIMILAR_TO = @queryEmbedding, 
        METRIC = 'cosine', 
        TOP_N = 3
    ) AS s

Again the results were impressive and the search worked quite well. The row with the highest similarity correctly answers the query.

As in the previous examples, we first generate the embedding of the input text using the external AI model and store the result in the @queryEmbedding variable. The VECTOR_SEARCH() function then performs an approximate nearest-neighbor search using the DiskANN algorithm defined in the vector index and returns the top 3 rows with the highest cosine similarity.
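Since both searches follow the same pattern, it can be convenient to wrap the embedding call and the search in a stored procedure. This is a sketch of my own, not from the official documentation; it assumes VECTOR_SEARCH accepts the same arguments shown above, and TOP_N is kept as a constant since it is unclear whether the preview accepts a variable there.

```sql
CREATE OR ALTER PROCEDURE dbo.SemanticSearch
    @querytext NVARCHAR(3000)
AS
BEGIN
    -- Embed the query text with the same external model used for the table
    DECLARE @queryEmbedding VECTOR(1536) =
        AI_GENERATE_EMBEDDINGS(@querytext USE MODEL OpenAIEmbeddingModel);

    SELECT t.Content, s.distance
    FROM VECTOR_SEARCH(
        TABLE = [dbo].[Documents] AS t,
        COLUMN = [Embedding],
        SIMILAR_TO = @queryEmbedding,
        METRIC = 'cosine',
        TOP_N = 3
    ) AS s;
END
GO

-- Usage
EXEC dbo.SemanticSearch @querytext = N'largest ocean on Earth';
```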

Conclusion

Microsoft has taken a big leap by integrating AI within SQL Server: it is now possible to call external AI models directly from the database. However, these features (embeddings, external AI models, vector search, and so on) are still very new and evolving, so we need to be careful before implementing them in our organizations, especially in production environments. Here are some points to consider:

  1. Vector search uses approximate algorithms, trading accuracy for speed, so be careful about using it where deterministic results are required.
  2. Vector indexes consume significant memory and disk I/O; inappropriate VM sizing can affect overall database performance.
  3. The AI models are not hosted on SQL Server, so every model call leaves the database server, which can introduce latency into database transactions.
  4. An inappropriate or poorly chosen model can lead to undesired results; select the external model for generating embeddings carefully.
  5. Vector tables or databases should not be co-located with critical OLTP workloads, as this may create resource contention and affect database performance.