I Put My Search Engine on My Own PC and the Storefront on Vercel

17 June 2026 | By Antonio | Part of NORDHEM

Tags: Search, Full Stack, Next.js

The NORDHEM storefront lives on Vercel; the Elasticsearch engine lives on my PC behind a free tunnel. A circuit breaker drops search back to a Postgres-backed lite mode when the machine is off. For a portfolio search demo, that split is the right trade.

The NORDHEM storefront runs on Vercel. The Elasticsearch engine that powers its best features runs on my own PC, in my apartment, reached over a free Cloudflare tunnel. When the PC is off, each call to the engine gives up after 800 milliseconds, a circuit breaker opens after two consecutive failures, and search falls back to Postgres full-text search, which I call lite mode. For a portfolio search demo this split is the right trade. It is free, it is honest about its own state, and it is real graceful-degradation engineering instead of a screenshot of it. For a product with paying customers I would put the engine in the cloud. For showing what I can build, this is better: the public URL is never down, and full mode (semantic search, facets, the learning loop) lights up the moment my machine is online.

The question worth answering is not "can you self-host Elasticsearch?" Anyone can. The question is what happens to the live site when the self-hosted half is asleep, and whether the answer is something you would be proud to put in front of an interviewer. Here is what I got, what it cost, and where the line is.

What I Got

A Public URL That Is Never Down

The storefront, the cart, accounts, checkout, order history, all of it reads and writes Neon Postgres directly. None of that touches the search engine. So the only path that can degrade is search itself, and even that does not go dark. When the PC is unreachable the site serves Postgres full-text search over the same shop catalog. No facets, no synonyms, no semantic ranking, but a shopper can still type oak bed and get oak beds back.

The retrieval policy is one small function. It tries the full engine, and on any failure it records the failure and serves the fallback. While the breaker is open it skips the engine entirely and goes straight to Postgres.

typescript

export async function resolveSearch(
  deps: ResolveDeps,
): Promise<{ response: SearchResponse; lite: boolean }> {
  if (deps.breaker.canRequest()) {
    try {
      const response = await deps.full();
      deps.breaker.recordSuccess();
      return { response, lite: false };
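    } catch {
      // Any failure (timeout, bad status, invalid payload) counts against the breaker.
      deps.breaker.recordFailure();
    }
  }
  // Breaker open, or the engine call just failed: serve Postgres and flag it as lite.
  return { response: await deps.fallback(), lite: true };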
}

That boolean it returns, lite, is the whole honesty story. It rides back to the UI so the page can say, plainly, that advanced search is degraded right now. The site never pretends to be something it is not.

Full Mode For Free, Whenever The Machine Is On

Hosting Elasticsearch in the cloud costs real money. A managed cluster that can hold the 43k-product benchmark index and run embeddings is not a free-tier thing. My PC already exists, already has the RAM, and already runs Docker. So the expensive half of the system costs me nothing to run, and it turns on exactly when I am demoing it.

Connecting the two halves is a tunnel and a shared token. Vercel points at the tunnel URL; the Fastify service on the PC requires a matching bearer token on every route except health. The web app reads whichever backend applies to the request and attaches the token.

typescript

const full = async (): Promise<SearchResponse> => {
  const res = await fetch(`${backend.url}/search?${queryString}`, {
    signal: AbortSignal.timeout(TIMEOUT_MS),
    headers: authHeaders(backend),
  });
  if (!res.ok) throw new Error(`search service responded ${res.status}`);
  return SearchResponseSchema.parse(await res.json());
};
const fallback = () => ftsSearchShop(db(), query, fts);
return resolveSearch({ breaker: searchBreaker(), full, fallback });

The AbortSignal.timeout(TIMEOUT_MS) is load-bearing. TIMEOUT_MS is 800. A sleeping PC must not hold a request hostage while a TCP connection times out on its own schedule. Eight hundred milliseconds is my budget: the engine answers within it, or for this request it does not exist.
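The authHeaders helper in that call is not shown in this post. A minimal sketch of what it could look like, assuming the backend is a URL plus an optional token (the shape the default backend takes later in this article) and that the token travels as a standard Bearer header. The accept header is an illustrative assumption.

```typescript
// Assumed shape: a base URL and an optional shared token.
type EngineBackend = { url: string; token?: string };

// Hypothetical helper: build request headers for the search service.
// When no token is configured, no Authorization header is sent at all.
function authHeaders(backend: EngineBackend): Record<string, string> {
  const headers: Record<string, string> = { accept: "application/json" };
  if (backend.token) {
    headers.authorization = `Bearer ${backend.token}`;
  }
  return headers;
}
```

Leaving the header out when there is no token keeps a local, unauthenticated engine usable in development without sending an empty credential.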

Graceful Degradation You Can Actually Point At

Plenty of systems claim a fallback. Far fewer have a state machine you can read in one screen. Mine is a circuit breaker with three states (closed, open, and half-open) and a 10-second cooldown. After two consecutive failures it opens, stops calling the dead engine, and serves Postgres until the cooldown elapses. Then it lets a trial request through to see if the PC woke up.

typescript

canRequest(): boolean {
  if (this.state === "open") {
    if (this.now() - this.openedAt >= this.opts.cooldownMs) {
      this.state = "half-open";
      return true; // allow a single trial
    }
    return false;
  }
  return true; // closed or half-open
}

This is the part I am happiest to talk through in an interview. It is not a library. It is a tested class with an injectable clock, so I can prove the open-to-half-open transition without sleeping in a test. The point of the whole split is that it forced me to build this, and a thing you build is worth more in a portfolio than a thing you configure.
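To make the injectable clock concrete, here is a minimal sketch of how a breaker like this could be laid out. Only canRequest mirrors the excerpt above. The class name CircuitBreaker, the failureThreshold option, the currentState accessor, and the bodies of recordSuccess and recordFailure are assumptions for illustration, not the project's actual code.

```typescript
type BreakerState = "closed" | "open" | "half-open";

interface BreakerOptions {
  failureThreshold: number; // consecutive failures before opening (assumed name)
  cooldownMs: number; // how long to stay open before allowing a trial
}

class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly opts: BreakerOptions;
  private readonly now: () => number;

  // The clock is injected so tests can move time forward without sleeping.
  constructor(opts: BreakerOptions, now: () => number = () => Date.now()) {
    this.opts = opts;
    this.now = now;
  }

  canRequest(): boolean {
    if (this.state === "open") {
      if (this.now() - this.openedAt >= this.opts.cooldownMs) {
        this.state = "half-open";
        return true; // cooldown elapsed: allow a trial
      }
      return false;
    }
    return true; // closed or half-open
  }

  recordSuccess(): void {
    this.state = "closed";
    this.failures = 0;
  }

  recordFailure(): void {
    this.failures += 1;
    // A failed trial re-opens at once; otherwise wait for the threshold.
    if (this.state === "half-open" || this.failures >= this.opts.failureThreshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }

  currentState(): BreakerState {
    return this.state;
  }
}

// Usage with a fake clock: time only moves when the test says so.
let fakeNow = 0;
const demo = new CircuitBreaker({ failureThreshold: 2, cooldownMs: 10_000 }, () => fakeNow);
demo.recordFailure();
demo.recordFailure(); // second consecutive failure opens the breaker
fakeNow = 10_000; // "wait" out the cooldown instantly
demo.canRequest(); // moves the breaker to half-open and allows a trial
```

In this sketch every caller that arrives while the breaker is half-open is let through until a result is recorded, which is fine at demo traffic but would need a guard under real concurrency.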

What It Costs

The Best Features Are Up Only When I Am

This is the honest one. Semantic search, hybrid ranking, live facet counts, the chatbot, all of it needs the PC. If someone opens the live URL at 3 a.m. while my machine is off, they get lite mode. The flagship work, the relevance lab numbers and the kNN-plus-BM25 fusion I am proudest of, is exactly the work that is sometimes asleep.

I decided this was acceptable because the alternative was not building those features at all, or paying monthly for a cluster to host a demo nobody is hammering. A feature that exists and is sometimes off beats a feature I skipped to fit a free tier. But it is a real cost, and the banner exists precisely so I never have to pretend otherwise.

A Tunnel Is One More Thing That Breaks

The path from Vercel to my PC is longer than a path inside one cloud region: Vercel, the public internet, Cloudflare's edge, the tunnel daemon on my machine, then Fastify. Every hop is somewhere a request can die. A quick Cloudflare tunnel also hands out a new hostname each time it starts, so a fresh tunnel means updating SEARCH_API_URL on Vercel. A named tunnel fixes the hostname but is more setup.

The breaker absorbs most of this. If the tunnel is mid-restart, requests fail fast, the breaker opens, and lite mode carries the site. The cost is not downtime; it is that I own more operational surface than a pure-cloud deploy would give me, and I have to keep the token in sync on both ends or every search 401s straight into lite mode.

Two Search Implementations To Keep Honest

Lite mode is not free to build. It is a second, independent search path: Postgres full-text search over the shop catalog, with its own query construction and its own quirks. The two paths have to agree on enough of the contract that the UI cannot tell which one answered, beyond the lite flag. The fallback ignores the full-mode-only parts of the querystring (scope, mode, filters) and runs a plain text search with the same paging.
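One way to picture "ignores the full-mode-only parts" is a small function that reads only what the fallback understands. This is a sketch under assumptions: the parameter names q, page, and size, the default page size, and the clamping limits are all hypothetical, since the real querystring contract is not shown here.

```typescript
// Hypothetical options object for the Postgres path.
type FtsOptions = { page: number; size: number };

// Parse an integer parameter, falling back to a default and clamping to a range.
function clampInt(raw: string | null, fallback: number, min: number, max: number): number {
  const n = Number.parseInt(raw ?? "", 10);
  if (Number.isNaN(n)) return fallback;
  return Math.min(max, Math.max(min, n));
}

// Keep the text query and paging; scope, mode, and filters are simply never read.
function toLiteQuery(params: URLSearchParams): { query: string; fts: FtsOptions } {
  const query = (params.get("q") ?? "").trim();
  const page = clampInt(params.get("page"), 1, 1, 1000);
  const size = clampInt(params.get("size"), 24, 1, 100);
  return { query, fts: { page, size } };
}
```

Reading an allow-list of parameters, rather than stripping a deny-list, means a new full-mode option added later cannot leak into the fallback by accident.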

The Postgres path also has its own relevance behavior, and getting it to behave like a shopper expects took a real fix. Postgres plainto_tsquery ANDs every term, so a two-word query returned far fewer results than the engine did. The lite path rewrites the query to OR semantics so recall stays sane. That is a whole second mental model I have to maintain, and it only earns its keep because the PC is sometimes off.
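The exact rewrite is not shown in this post. A minimal sketch of the idea, assuming the terms are joined with the | operator and passed to Postgres to_tsquery instead of plainto_tsquery. The function name toOrTsQuery and the tokenizing rule are illustrative assumptions.

```typescript
// Hypothetical rewrite: turn free text into an OR-joined tsquery string.
// Only letters and digits survive, so tsquery operators typed by a shopper
// (&, |, !, parentheses, quotes) can never reach the query parser.
function toOrTsQuery(input: string): string {
  const terms: string[] = input.toLowerCase().match(/[\p{L}\p{N}]+/gu) ?? [];
  // Drop repeated terms while keeping first-seen order.
  const unique = terms.filter((term, i) => terms.indexOf(term) === i);
  return unique.join(" | ");
}
```

The result would be bound as a query parameter, for example to to_tsquery('english', $1), where the 'english' configuration is also an assumption. An empty string means there is nothing to search for, so the caller should skip the query rather than send it.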

Where The Line Moves

Because the PC is not always on, I added an escape hatch: a visitor can drive the live site from their own engine for the length of their session. They run the search service locally, expose it with a tunnel, and paste the URL and shared password on the status page. That stores a per-session override in an https-only cookie, with the URL validated as a well-formed https URL, and routes their requests through the same breaker as the default.

typescript

export async function getSearchBackend(): Promise<EngineBackend> {
  try {
    const override = parseEngineCookie((await cookies()).get(ENGINE_COOKIE)?.value);
    if (override) return override;
  } catch {
    // not in a request scope
  }
  return { url: SEARCH_API_URL, ...(SEARCH_API_TOKEN ? { token: SEARCH_API_TOKEN } : {}) };
}
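The parseEngineCookie function is not shown above. A minimal sketch of the validation it describes, assuming the cookie holds JSON with a url and an optional token. That storage format is a hypothetical choice for illustration, and the real cookie may be encoded differently.

```typescript
// Assumed shape, matching the default backend returned by getSearchBackend.
type EngineBackend = { url: string; token?: string };

// Hypothetical parser: accept only a well-formed https URL, reject everything else.
function parseEngineCookie(raw: string | undefined): EngineBackend | null {
  if (!raw) return null;
  try {
    const data: unknown = JSON.parse(raw);
    if (typeof data !== "object" || data === null) return null;
    const { url, token } = data as { url?: unknown; token?: unknown };
    if (typeof url !== "string") return null;
    const parsed = new URL(url); // throws on malformed input
    if (parsed.protocol !== "https:") return null;
    // Keep only the origin so a pasted path or trailing slash cannot
    // produce a doubled slash when "/search" is appended later.
    const backend: EngineBackend = { url: parsed.origin };
    if (typeof token === "string" && token.length > 0) backend.token = token;
    return backend;
  } catch {
    return null; // invalid JSON or invalid URL: behave as if no override exists
  }
}
```

Returning null on every failure path means a corrupted or tampered cookie quietly falls through to the default backend instead of breaking the request.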

Per-session is the whole reason a shared password is safe enough here: one visitor's choice never touches anyone else's session. This is a portfolio affordance, not a product feature, and that is exactly where the verdict moves. The moment search has an SLA, the engine belongs in the cloud, the breaker stays (you still want it in front of anything you do not fully control), and the bring-your-own-engine cookie goes in the bin.

The Call

For a portfolio search demo built by one person on a free budget, put the storefront on Vercel, the database on Neon, and the engine on your own machine behind a tunnel, with a circuit breaker and a real fallback between them. You get a public URL that is never down, the expensive features for free whenever you are around to show them, and a degradation story you can defend line by line. The price is that your best features keep your hours, you own a tunnel, and you maintain a second search path.

I would not ship this for a store that takes real orders against real search traffic. There the engine goes in the cloud and the math changes. But for proving I can build a search system, including the part where I decide what happens when half of it is asleep, the split on my desk beats the cluster I would have to pay for. A degraded but honest site beats a spinner, and building the fallback taught me more than renting the uptime ever would.