
Build vs. Buy: The Real Cost of Maintaining Custom Web Scrapers

Richard Zhang

A fintech team built three Puppeteer integrations in Q1. By Q3, they'd spent 40 hours fixing broken scrapers, one engineer's full sprint. Here's the real math behind custom web scraper maintenance, and a framework for deciding when to stop absorbing th...

At Integuru, we talk to engineering teams every week who are in the same situation: they built a few web integrations, shipped them, and assumed they were done. Six months later, maintaining those integrations has quietly consumed more engineering time than building them did.

Here is a scenario we see often. A fintech startup builds Puppeteer-based integrations with three financial platforms in Q1. It takes about two weeks per integration, across scoping, auth flow mapping, network analysis, and edge case testing. By Q3, two of those integrations have broken twice each, with causes spanning a platform UI redesign, an authentication flow change, and tightened anti-bot detection. The engineering team estimates roughly 40 hours spent diagnosing, fixing, and re-testing across incidents that were never on the roadmap. That's one engineer's full sprint, absorbed by infrastructure that was supposed to be finished.

This post puts numbers on that pattern and gives you a framework for deciding when that cost is worth carrying yourself and when it isn't. Updated June 2026.

What "Maintenance Burden" Actually Means in Hours

For any integration targeting an actively developed third-party platform, ongoing maintenance runs 3–5x the initial build effort over the first 12 months, based on Integuru's data across production deployments. A two-week initial build realistically means six to ten additional weeks of engineering time in year one, distributed across unscheduled incidents rather than planned sprint work.
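To make the ratio concrete, here is a minimal sketch of the arithmetic. The function name and its defaults are ours for illustration; the only inputs taken from this post are the 3–5x multiplier and the two-week build.

```python
def year_one_maintenance_weeks(build_weeks, low=3.0, high=5.0):
    """Estimate year-one maintenance (engineer-weeks) from the 3-5x rule of thumb.

    `low` and `high` are the multipliers quoted in this post; adjust them
    if your own incident history suggests a different ratio.
    """
    if build_weeks < 0:
        raise ValueError("build_weeks must be non-negative")
    return (build_weeks * low, build_weeks * high)


# A two-week build, as in the scenario above.
low_weeks, high_weeks = year_one_maintenance_weeks(2)
print(f"{low_weeks:g} to {high_weeks:g} additional weeks in year one")
```

Treat the output as a planning range, not a forecast: the hours arrive as unscheduled incidents, so the budget question is about interruptions as much as totals.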

Most teams frame the decision as "How long will this take to build?" That's the bounded cost, and it's real: one engineer, 1–3 weeks per platform, covering scoping, auth flow mapping, endpoint reverse-engineering, and edge case testing. After shipping, most teams close the ticket. The maintenance cost doesn't appear on a roadmap. It arrives as a pager alert at the wrong time, assigned to whoever is closest to the code.

Rule of thumb: plan for 3–5x your initial build effort in year-one maintenance for any integration that targets an actively developed third-party platform. Most teams that come to us have discovered this ratio the hard way.

That math changes the question from "Can we build it?" to "Can we keep it running without derailing the roadmap?"

The Four Cost Drivers of Custom Web Scraping

Not all breakages are the same. Understanding what causes them is the first step to pricing them correctly.

At Integuru, we track breakage causes across integrations we maintain and across the patterns teams describe when they first reach out to us. The distribution is consistent enough to be useful:

  1. Platform UI and API changes (~40% of all breakages). Third-party platforms ship front-end changes constantly. CSS selectors get renamed. Forms get restructured. Internal API endpoints migrate to new paths or change their response schema. For browser automation integrations, any of these events can produce a selector error or a silent failure. These incidents require detection, diagnosis, a fix, and a re-test cycle.

  2. Authentication and session changes. Login flows change: 2FA methods rotate, SSO integrations update, token refresh intervals shorten, session cookie formats shift. Each of these produces a production incident that looks like a network error until someone traces it back to an auth failure. Teams that don't have robust session health monitoring often find out when customers complain.

  3. Anti-bot escalation (~10% of breakages, rising). Platforms continuously improve their ability to detect automation traffic. A Puppeteer session that passed last quarter may now trigger a CAPTCHA challenge or a silent request block. This category is particularly painful to debug because the failure doesn't surface a useful error: the request just stops returning the expected response.

  4. Scale failures. An integration that handles 100 calls per day without issue can break down completely at 10,000. Connection pooling assumptions, rate limit thresholds, session concurrency limits, and server-side pagination behavior all behave differently under load. Teams that don't discover this until they're scaling a feature often face a rewrite under pressure.

  • 3–5x the initial build effort in year-one maintenance (Integuru data across production deployments)

  • ~40% of all breakages come from platform UI and API changes alone

  • ~10% come from anti-bot detection, a category that grows as platforms invest more in protecting their interfaces
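Because auth failures and anti-bot blocks often look like generic network errors, it can help to label each failed call with a probable cause at the moment it fails. The sketch below is one hypothetical way to do that, not a description of how Integuru detects breakages. The marker strings, category names, and function signature are all assumptions you would tune per platform.

```python
from enum import Enum


class Failure(Enum):
    NONE = "ok"
    ANTI_BOT = "anti_bot"
    AUTH = "auth_or_session"
    SCHEMA = "ui_or_api_change"
    OTHER = "other"


# Hypothetical markers; real platforms need their own lists.
BOT_MARKERS = ("captcha", "unusual traffic", "verify you are human")
LOGIN_MARKERS = ("/login", "sign in", "session expired")


def classify_response(status, body, final_url, required_fields, parsed):
    """Guess why a scraper call failed.

    status          -- HTTP status code of the final response
    body            -- response body as text
    final_url       -- URL after redirects (a bounce to /login hints at auth)
    required_fields -- keys the integration expects in the parsed payload
    parsed          -- parsed payload as a dict, or None if parsing failed
    """
    text = body.lower()
    if any(marker in text for marker in BOT_MARKERS):
        return Failure.ANTI_BOT
    if status in (401, 403) or any(
        marker in final_url.lower() or marker in text for marker in LOGIN_MARKERS
    ):
        return Failure.AUTH
    if status >= 400:
        return Failure.OTHER
    # A 200 response that lacks the expected fields is the "silent failure" case.
    if parsed is None or any(field not in parsed for field in required_fields):
        return Failure.SCHEMA
    return Failure.NONE
```

Counting these labels over a quarter gives you your own version of the distribution above, which is more useful for pricing maintenance than any industry average.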

What You're Actually Trading When You Build In-House

The 40 hours in the scenario above isn't free. It has a direct cost that most engineering budgets don't make visible, because it's absorbed as normal sprint work rather than billed against the integration line item.

At a $150,000–200,000 loaded annual salary for a senior engineer, the hourly cost is roughly $75–100. Forty hours of integration maintenance in one quarter costs $3,000–4,000 in absorbed engineering time, before you account for the interruption cost to whatever sprint work got pushed.

If each of three integrations consumed 40 hours a quarter, that's $9,000–12,000 per quarter in engineering capacity spent keeping infrastructure running. Over a full year, that comes to $36,000–48,000 in salary cost alone, for a set of integrations that cost about $15,000–20,000 to build in the first place.
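The figures above can be reproduced with a few lines of arithmetic. This sketch assumes roughly 2,000 working hours per year, which is what the $75–100 hourly range implies; the function names are illustrative.

```python
HOURS_PER_YEAR = 2_000  # assumption: ~50 working weeks x 40 hours


def hourly_cost(loaded_salary):
    """Loaded annual salary converted to an hourly figure."""
    return loaded_salary / HOURS_PER_YEAR


def maintenance_cost(hours, loaded_salary):
    """Salary cost of `hours` of maintenance work."""
    return hours * hourly_cost(loaded_salary)


for salary in (150_000, 200_000):
    one_quarter = maintenance_cost(40, salary)  # 40 hours in one quarter
    three_integrations = one_quarter * 3        # same rate on three integrations
    full_year = three_integrations * 4          # four quarters
    print(
        f"${salary:,}: ${hourly_cost(salary):,.0f}/hour, "
        f"${one_quarter:,.0f} for 40 hours, "
        f"${three_integrations:,.0f}/quarter for three, "
        f"${full_year:,.0f}/year"
    )
```

Swap in your own loaded salary and incident hours. The structure of the calculation matters more than the specific range used here.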

The question is not whether to pay for integration reliability. It's whether to pay in engineering hours or in dollars.

The opportunity cost is the part that's hardest to make legible to finance, but it's the most real. Every hour a senior engineer spends tracing an auth failure or re-mapping a broken selector is an hour they're not building the features your customers are waiting for. For most companies, the integration is not the product. It's the plumbing. You want plumbing that doesn't need a plumber on call.

What a Managed Integration Service Costs and What It Buys

Integuru generates production-ready API endpoints by analyzing a target platform's network traffic, reverse-engineering its private API, and covering edge cases across branching logic, account states, and authentication flows. The resulting integration uses direct HTTP requests rather than browser automation, which removes the entire category of UI-selector breakages (roughly 40% of incidents) before they can occur.

Setup takes 10–20 minutes via the CLI. There's no services engagement, no weeks-long scoping process.

Pricing is tiered by call volume and maintenance level:

  • Free ($0/month): 100 API calls/month. Suitable for validating an integration before committing to a plan.

  • Developer ($30/month): 1,000 calls/month. Includes manual maintenance for breakages.

  • Production ($300/month): 10,000 calls/month. Includes 24/7 on-call maintenance, auth auto-healing, and 99.9%+ reliability guarantees.
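As a rough guide to how the tiers map to volume, here is a small sketch that picks the lowest-priced plan covering a given monthly call count. The prices and limits come from the list above; the function itself is illustrative, and it returns nothing above 10,000 calls because this post does not describe pricing beyond that point.

```python
# (name, price in USD per month, included calls per month), cheapest first
PLANS = [
    ("Free", 0, 100),
    ("Developer", 30, 1_000),
    ("Production", 300, 10_000),
]


def cheapest_plan(calls_per_month):
    """Return (name, monthly_price) for the first plan whose limit covers the volume."""
    for name, price, limit in PLANS:
        if calls_per_month <= limit:
            return name, price
    return None  # volume exceeds the tiers listed in this post


print(cheapest_plan(750))
```

Volume is only one axis. A low-volume integration that blocks a customer-facing workflow may still need the Production tier for its on-call maintenance and auth auto-healing.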

The cost comparison against in-house maintenance looks like this:

| | Custom build: Year 1 | Custom build: Year 2+ | Integuru Production: Year 1 |
|---|---|---|---|
| Initial build cost | ~2–4 weeks engineer time | n/a | 10–20 minutes via CLI |
| Annual maintenance | 3–5x build effort (unscheduled) | Same clock resets | Included in plan |
| Auth handling | Manual rebuild on each change | Manual rebuild on each change | Auto-healing on Production plan |
| Reliability | No SLA; depends on sprint availability | No SLA | 99.9%+ |
| Breakages from UI changes | ~40% of incidents | ~40% of incidents | Eliminated (direct HTTP, no DOM selectors) |
| Annual cash cost | Engineering salary absorbed | Engineering salary absorbed | $3,600/year |
| Who fixes incidents | Your engineers | Your engineers | Integuru on-call team |

Last verified: June 2026

The cash cost comparison is straightforward. Forty hours of a senior engineer's time, the maintenance load from a single quarter in the scenario above, costs $3,000–4,000 in salary. Integuru's Production plan costs $3,600 for an entire year.
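Another way to read the comparison is as a break-even point: how many hours of in-house maintenance per year cost the same as the plan. A minimal sketch, using the $300/month Production price and the $75–100 hourly range from earlier in this post (the function name is illustrative):

```python
def break_even_hours(annual_plan_cost, hourly_rate):
    """Maintenance hours per year at which salary cost equals the plan cost."""
    if hourly_rate <= 0:
        raise ValueError("hourly_rate must be positive")
    return annual_plan_cost / hourly_rate


annual_plan_cost = 300 * 12  # Production plan, billed monthly

for rate in (75, 100):
    hours = break_even_hours(annual_plan_cost, rate)
    print(f"At ${rate}/hour, break-even is {hours:g} hours per year")
```

If your integrations absorb more than that in a year, the plan is cheaper in cash terms before opportunity cost enters the picture.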

A Decision Framework: When Building Makes Sense and When It Doesn't

There are real situations where building and maintaining a custom integration is the right call. Outside those situations, the maintenance math tends to resolve the question.

Build in-house when:

  • You own or control the target platform. If you're connecting your own systems or a platform where you have direct backend access, there's no external fragility. The integration doesn't break when a third party ships a front-end update.

  • Call volume is under 1,000/month and downtime is tolerable. Low-frequency internal workflows don't require production SLAs. If an occasional break can wait for business hours to fix, the overhead of a managed service isn't justified.

  • The integration is genuine competitive IP. If the integration itself is what you're selling, or it enables a capability no external service can replicate, the engineering investment is defensible. The maintenance burden is part of the moat.

Use a managed service when:

  • You're integrating two or more third-party platforms. Each additional platform multiplies the maintenance surface. Breakages don't arrive in sequence; they arrive simultaneously, during the same on-call rotation. Three platforms means three independent maintenance clocks.

  • Production reliability is a hard requirement. Recurring, unscheduled breakages become a business risk when broken integrations block customer-facing workflows. On Integuru's Production plan, 24/7 on-call maintenance keeps that risk off your engineering team's plate.

  • The integration is above 10,000 calls/month. This is where scale failures and browser-automation infrastructure costs start to add up. Direct HTTP scales on standard infrastructure; a browser pool does not.

  • Engineering bandwidth is the constraint. A team absorbing 3–5x their initial build effort in maintenance every year is a team that's not shipping product. That time has a direct opportunity cost measured in features delayed and customers waiting.
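If it helps to run the checklist against a specific integration, the criteria above can be written down as a set of signals. This is one possible encoding, not an official scoring model: the field names are hypothetical, and the sketch deliberately returns both lists instead of a verdict, because the framework does not say how to weigh conflicting signals.

```python
from dataclasses import dataclass


@dataclass
class Integration:
    own_target_platform: bool
    third_party_platforms: int
    calls_per_month: int
    downtime_tolerable: bool
    is_competitive_ip: bool
    needs_production_reliability: bool
    engineering_constrained: bool


def decision_signals(i):
    """Return (build_signals, managed_signals) for the framework above."""
    build, managed = [], []
    if i.own_target_platform:
        build.append("you own or control the target platform")
    if i.calls_per_month < 1_000 and i.downtime_tolerable:
        build.append("low volume and downtime is tolerable")
    if i.is_competitive_ip:
        build.append("the integration is competitive IP")
    if i.third_party_platforms >= 2:
        managed.append("two or more third-party platforms")
    if i.needs_production_reliability:
        managed.append("production reliability is a hard requirement")
    if i.calls_per_month > 10_000:
        managed.append("above 10,000 calls/month")
    if i.engineering_constrained:
        managed.append("engineering bandwidth is the constraint")
    return build, managed


# The fintech scenario from the top of this post; the 5,000 calls/month is assumed.
fintech = Integration(False, 3, 5_000, False, False, True, True)
build, managed = decision_signals(fintech)
print("Build:", build)
print("Managed:", managed)
```

When both lists come back non-empty, the call is a judgment about which constraint dominates for your team.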

Get Started

Integuru generates production-ready direct HTTP integrations for web platforms that have no official API. The fastest way to evaluate it against your use case is the CLI:

npm install -g integuru

Or open the web app at app.integuru.com. For a full breakdown of plan pricing, see integuru.com/#pricing.

If you'd prefer to talk through your stack first, schedule a call here or email us.

For a deeper technical comparison of what makes direct HTTP faster and more reliable than browser automation, see Browser Automation vs. Direct HTTP: A Reliability and Speed Comparison.