
Test a library against a real database, not a mock

By Antonio

When you mock a third-party library in a test, you end up asserting that your own mock got called, which proves nothing about the library. Spin up a real disposable database with Testcontainers instead, run the real code, and check the rows it actually wrote. Here is why that catches the bugs mocks hide.

I was about to write a test for a sign-up flow, and I caught myself reaching for the usual trick: mock the library that does the work, then check that I called it. I stopped, because I have done that before and the test taught me nothing. The question I had was: when a third-party library writes to a database for me, how do I test that it actually did the right thing? The short answer is that you stop mocking the library and you let it talk to a real database that you throw away after the test. The interesting part is everything in between.

This post is about that choice. It is not specific to any one stack. If you have ever written a test that mocked an ORM, an auth library, a payment SDK, or a storage client, and then felt a little hollow about what the green checkmark actually proved, this is for you.

What is wrong with mocking the library?

A mock is a fake stand-in you write yourself. You tell it what to return and you record how it was called. That is genuinely useful when the real thing is slow, costs money, or reaches across the network to a service you do not control. The problem is what happens when you mock the very thing you are trying to test.

Say a library is supposed to take a password, hash it, and store the hash in a users table. If you mock the library, your test looks like this in spirit: I call the fake, and then I assert that the fake was called with the password. That assertion passes. But think about what it actually checked. It checked that you wrote a mock and then called it. The real hashing never ran. The real table was never touched. If the library secretly stored the password in plaintext, your mock would still report success, because your mock is not the library. It is a puppet doing exactly what you told it to do.

This is the trap. A test that mocks its own subject becomes a mirror. You assert that the thing did what you said it would do, and of course it did, because you are the one who said it. The test turns green and tells you nothing about whether the real library behaves the way you think it does. Think of it like checking that your phone works by calling a friend who agreed in advance to say “yes, I can hear you” no matter what. The call connects. You learned nothing about the microphone.
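The mirror effect is easy to reproduce in a few lines. The sketch below is purely illustrative: `UserLibrary`, `signUp`, and the hand-rolled recording mock are made-up names, not any real library's API. It shows a call-recording assertion passing even though no hashing ran and no table was touched.

```typescript
import { strict as assert } from "node:assert";

// Hypothetical shape of a third-party library that stores users.
interface UserLibrary {
  createUser(email: string, password: string): void;
}

// Code under test: the sign-up handler delegates to the library.
function signUp(lib: UserLibrary, email: string, password: string): void {
  lib.createUser(email, password);
}

// A hand-rolled mock: it records calls, hashes nothing, writes nothing.
function createMockLibrary() {
  const calls: Array<[string, string]> = [];
  const lib: UserLibrary = {
    createUser(email, password) {
      calls.push([email, password]);
    },
  };
  return { lib, calls };
}

const { lib, calls } = createMockLibrary();
signUp(lib, "ada@example.com", "hunter2");

// Passes, but it only proves that we called our own puppet.
assert.deepEqual(calls, [["ada@example.com", "hunter2"]]);
```

Nothing in that assertion can fail if the real library stores plaintext, because the real library never ran.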

There is a second, quieter cost. Mocks freeze your assumptions in place. The day the library changes its column names, or your schema drifts, or the adapter that maps objects to rows starts behaving differently, the mock keeps returning the old happy answer. The test stays green while production breaks. The mock is now actively lying to you.

Why test against a real database instead?

When you point the real code at a real database, you stop asserting on your own stub and start asserting on reality. You run the library for real, let it write whatever it writes, and then you go look at the rows. The thing you check is the effect the library had on the world, not the calls you imagine it made.

You get three things from this that a mock can never give you. First, you verify the library's actual behaviour: did it really hash the password, did it really create the session row, did it really set the foreign key. Second, you catch the mismatches a mock hides, the schema drift and adapter mapping bugs, because the real insert either succeeds against your real schema or it blows up loudly. Third, you exercise real query and constraint behaviour: unique constraints fire, foreign keys are enforced, default values get filled in, and the types are the ones the database actually stores, not the ones you assumed in a stub.

There is a bonus that took me a while to appreciate. A test like this survives refactors. If the library reorganizes its internals, renames a private method, or changes how it batches writes, a mock-based test breaks because it was wired to those internals. A real-database test does not care. It only checks the rows at the end, so as long as the right rows show up, the test stays green through any internal rewrite. You are testing behaviour through the public surface, which is exactly where tests should live.
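Here is a small sketch of why the row check survives a rewrite. Everything in it is hypothetical: `Table` is an in-memory stand-in so the example runs without Docker, and `saveAllV1` / `saveAllV2` imitate a library before and after it switches to batched writes.

```typescript
import { strict as assert } from "node:assert";

type Row = { email: string };

// In-memory stand-in for a table, only so the sketch runs without Docker.
class Table {
  rows: Row[] = [];
  insertCalls = 0;

  insert(...batch: Row[]): void {
    this.insertCalls += 1;
    this.rows.push(...batch);
  }
}

// Hypothetical library, version 1: one insert per row.
function saveAllV1(table: Table, rows: Row[]): void {
  for (const row of rows) table.insert(row);
}

// Version 2, after an internal rewrite: a single batched insert.
function saveAllV2(table: Table, rows: Row[]): void {
  table.insert(...rows);
}

const input: Row[] = [{ email: "a@example.com" }, { email: "b@example.com" }];

const before = new Table();
saveAllV1(before, input);

const after = new Table();
saveAllV2(after, input);

// Interaction-style check: tied to how the writes were issued, so it changes.
assert.notEqual(before.insertCalls, after.insertCalls);

// State-style check: the rows at the end are identical across the rewrite.
assert.deepEqual(after.rows, before.rows);
```

A test that asserted "insert was called twice" would have gone red on version 2 even though the stored data is the same.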

How do you run a real database in a test without it being a pain?

The objection everyone raises here is fair: a real database sounds slow, stateful, and annoying to set up. Nobody wants a test that depends on a Postgres someone installed on their laptop three years ago, with leftover rows from the last run polluting today's results. The answer is to make the database real but disposable, and the tool that does this is Testcontainers.

Testcontainers is a library that starts a real database inside a throwaway Docker container for the duration of your test run, then stops and removes it when you are done. Think of it like a hotel room for your test: you get a real room with real plumbing, you use it, and when you check out it is wiped clean for the next guest. The database is genuinely Postgres, not an emulation, so it behaves like production as long as the image matches the version you deploy. But it exists only while the test runs, so no state leaks between runs and there is nothing to clean up by hand.

The shape is always the same. Before the tests, you start a container, connect to it, and apply your schema. Then you point the code under test at that connection. After the tests, you tear the container down. Here is what that setup looks like in one of my projects, using Vitest and a Postgres container:

typescript
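import {
  PostgreSqlContainer,
  type StartedPostgreSqlContainer,
} from "@testcontainers/postgresql";
import { afterAll, beforeAll } from "vitest";
// `Db`, `createDb`, and `ensureSchema` are this project's own helpers.

let container: StartedPostgreSqlContainer;
let db: Db;
let close: () => Promise<void>;

beforeAll(async () => {
  // Boot a fresh Postgres 17 and connect to it.
  container = await new PostgreSqlContainer("postgres:17").start();
  ({ db, close } = createDb(container.getConnectionUri()));
  // Apply the tables to the empty database.
  await ensureSchema(db);
});

afterAll(async () => {
  // Optional calls: setup may have failed before these were assigned.
  await close?.();
  await container?.stop();
});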

That is the whole trick. new PostgreSqlContainer("postgres:17").start() pulls the postgres:17 image if it is not already cached and boots a real Postgres 17 server. getConnectionUri() hands you a connection string to that fresh instance, and ensureSchema(db) applies my tables to it. The first run is slow because Docker downloads the image, so the setup hook needs a generous timeout (omitted from the snippet above). After that the image is cached and startup is fast. The same approach works for any database with a Docker image, and Testcontainers ships ready-made modules for many of them.

Why is this actually good, and not just slower?

The payoff is that your assertions point at reality. You ran the real library against a real schema, and now you read the rows it wrote and check them. If the library messed up the foreign key, the insert failed and your test is red. If it stored the wrong type, the row looks wrong and your test is red. If your schema and the library's expectations drifted apart, you find out in the test instead of in production at two in the morning. None of these failures are reachable when you mock the library, because the mock never touches the schema.

And it runs in CI. A throwaway container is just as disposable on a build server as it is on your machine, so the same test that gives you confidence locally gives the whole team confidence on every pull request. There is no “works on my machine” database to provision and no manual cleanup step that someone forgets.

What does this look like on something real?

On NORDHEM, my Nordic home-goods storefront, sign-up runs through Better Auth, which hashes the password and stores the user, an account row, and a session for me. I never wanted to mock the hasher, because a mocked hasher proves nothing about whether real passwords are safe. So the test starts a real Postgres in a container, points the same Better Auth config that the app mounts at that database, calls the real sign-up, then reads the account row back and checks that the stored password is present but is not equal to the plaintext I sent in. If it equals the plaintext, the password was never hashed and the test fails. That one assertion shows, end to end, that the plaintext never reached the table, without a mock anywhere in sight:

typescript
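// Excerpt from inside the test body. `auth`, `db`, and the table objects come
// from the project; EMAIL, PASSWORD, and NAME are test constants.
const res = await auth.api.signUpEmail({
  body: { email: EMAIL, password: PASSWORD, name: NAME },
  asResponse: true,
});
expect(res.status).toBe(200);

// Assumption: the original excerpt elides how `users` is loaded. `userTable`
// is a hypothetical name for the project's user table.
const users = await db
  .select()
  .from(userTable)
  .where(eq(userTable.email, EMAIL));
expect(users).toHaveLength(1);

const accounts = await db
  .select()
  .from(accountTable)
  .where(eq(accountTable.userId, users[0].id));

// The password is stored on `account`, hashed, never the plaintext.
expect(accounts[0].password).toBeTruthy();
expect(accounts[0].password).not.toBe(PASSWORD);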

Things that surprised me

A few things were not obvious until I had done this a handful of times:

The first container start is genuinely slow because Docker pulls the image, so the setup needs a long timeout. After that it is cached and the tests feel snappy. Set the timeout once and forget it.
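One way to set that timeout once is in the Vitest config. This is a sketch, not my actual config: the 120-second value is an assumption sized to cover a cold image pull, so tune it to your network and CI runner.

```typescript
// vitest.config.ts
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    // Applies to beforeAll/afterAll hooks. 120 s is an assumed value,
    // chosen to leave room for the first image download.
    hookTimeout: 120_000,
  },
});
```

If only one suite needs the extra time, Vitest also accepts a timeout in milliseconds as the second argument to `beforeAll`.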

Passing the database in explicitly is what makes this clean. My auth setup takes the connection as an argument, so the test can hand it the container while the app hands it the real pool. Same config, different database. If the library hard-codes its own connection, you have to find that seam first.
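The seam can be sketched like this. All names are hypothetical (`Db`, `createMemoryDb`, `createAuth`), and the in-memory database is only a stand-in so the example runs without Docker; in a real test the second handle would come from the container.

```typescript
import { strict as assert } from "node:assert";

// Hypothetical minimal database handle. In a real project this would be
// the ORM or driver connection type.
interface Db {
  insert(table: string, row: Record<string, unknown>): void;
  select(table: string): Record<string, unknown>[];
}

// In-memory stand-in so the sketch runs without Docker.
function createMemoryDb(): Db {
  const tables = new Map<string, Record<string, unknown>[]>();
  return {
    insert(table, row) {
      const rows = tables.get(table) ?? [];
      rows.push(row);
      tables.set(table, rows);
    },
    select(table) {
      return tables.get(table) ?? [];
    },
  };
}

// The seam: the auth setup receives its database instead of creating one.
function createAuth(db: Db) {
  return {
    signUp(email: string): void {
      db.insert("user", { email });
    },
  };
}

const appDb = createMemoryDb(); // the app would pass its real pool
const testDb = createMemoryDb(); // the test would pass the container's db

createAuth(testDb).signUp("ada@example.com");

// Same factory, different database: only the one we passed in was written to.
assert.equal(testDb.select("user").length, 1);
assert.equal(appDb.select("user").length, 0);
```

Because `createAuth` never reaches for a global connection, the test and the app share every line of configuration except the handle they pass in.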

The negative assertion is the one that carries the weight. Checking that the stored password is not equal to the plaintext is the whole test. It is tempting to only check that a password column has something in it, but “something” would also pass if the plaintext were stored as-is. The interesting assertion is the one that fails when the bug you actually fear is present.
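To make the difference concrete, here is a minimal sketch that compares the two checks against a buggy and a correct way of storing the password. `storePlaintext` and `storeHashed` are hypothetical stand-ins for a library, and the scrypt parameters are illustrative, not a recommendation.

```typescript
import { strict as assert } from "node:assert";
import { randomBytes, scryptSync } from "node:crypto";

const PASSWORD = "correct horse battery staple";

// Hypothetical buggy library: stores the password as-is.
function storePlaintext(password: string): string {
  return password;
}

// Hypothetical correct library: stores "salt:hash", both hex-encoded.
function storeHashed(password: string): string {
  const salt = randomBytes(16);
  const hash = scryptSync(password, salt, 32);
  return `${salt.toString("hex")}:${hash.toString("hex")}`;
}

// Weak check: "the column has something in it".
const weak = (stored: string): boolean => Boolean(stored);

// Strong check: also fails when the stored value is the plaintext.
const strong = (stored: string): boolean =>
  Boolean(stored) && stored !== PASSWORD;

const buggy = storePlaintext(PASSWORD);
const good = storeHashed(PASSWORD);

assert.equal(weak(buggy), true); // the weak check misses the bug
assert.equal(strong(buggy), false); // the strong check catches it
assert.equal(strong(good), true); // and still passes for real hashing
```

The strong check only rules out plaintext storage. It says nothing about how good the hash is, which is a separate question for the library's documentation.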

Not everything belongs in a real-database test. Things you genuinely do not control, like a third-party redirect flow or an external payment API, still get mocked or skipped, because the point of the real database is the part you own. In my auth test I deliberately skip the Google login path for exactly this reason: it is a redirect to a service I cannot run in a container.

When is this worth it?

Here is when I reach for a real disposable database: any time a library writes to my own storage and I care that it wrote the right thing. Auth, ORMs, anything that persists data through an adapter I did not write. In those cases a mock asserts on a fiction, and a container asserts on the rows. The container is a little slower and needs Docker available, and that is the entire cost.

Here is when it is not worth it: pure functions with no side effects, and true externals you cannot run, like a payment provider or an email service. Mock those, because there is no real thing to spin up. For everything that touches a database, though, I would rather wait a few extra seconds for a container than ship a green test that only ever proved my mock works. If a test is going to give me confidence, it has to be allowed to fail when the real code is wrong. A mock of the system under test can never fail that way. A real database can, and that is exactly why I trust it.
