The Art of Web Performance: From 8s to 800ms Load Times

Performance Impact on Business Metrics

Web performance directly affects revenue and engagement. Amazon measured a 1% sales decrease per 100ms of added latency. Google observed a 20% traffic drop from a 0.5-second increase in search result load time. Smaller applications commonly report the same direction of effect, though the magnitude varies by audience and product.

The gap between development environments and real user conditions is substantial. A developer on a MacBook Pro with gigabit fiber experiences a fundamentally different application than a user on a Redmi Note 10 with 1.6 Mbps throughput and 150ms baseline latency on a Jio network in Lucknow. Chrome DevTools network throttling to "Slow 3G" (400ms RTT, 400 Kbps) provides a closer approximation of these conditions.

Core Web Vitals thresholds for "good" ratings:

Metric	Good	Needs Improvement	Poor
LCP (Largest Contentful Paint)	< 2.5s	2.5s - 4.0s	> 4.0s
INP (Interaction to Next Paint)	< 200ms	200ms - 500ms	> 500ms
CLS (Cumulative Layout Shift)	< 0.1	0.1 - 0.25	> 0.25

These thresholds are measured at the 75th percentile of field data, not lab data. A Lighthouse score of 95 in a lab environment does not guarantee acceptable field performance.
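
As a rough illustration of how a 75th-percentile rating is derived from field samples, the sketch below applies the thresholds from the table above. The helper names (`percentile`, `rateMetric`) are hypothetical, the nearest-rank percentile method is an assumption, and boundary values are treated as the better rating.

```typescript
type Rating = "good" | "needs-improvement" | "poor";

interface Threshold {
  good: number; // upper bound of the "good" range
  poor: number; // values above this are "poor"
}

// Thresholds from the table above (LCP and INP in milliseconds, CLS unitless).
const THRESHOLDS: Record<"lcp" | "inp" | "cls", Threshold> = {
  lcp: { good: 2500, poor: 4000 },
  inp: { good: 200, poor: 500 },
  cls: { good: 0.1, poor: 0.25 },
};

// Nearest-rank percentile: the smallest sample with at least p% of samples at or below it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(Math.ceil((p / 100) * sorted.length), 1);
  return sorted[rank - 1];
}

// Rates a metric the way field data is rated: by its 75th percentile, not its average.
function rateMetric(metric: keyof typeof THRESHOLDS, samples: number[]): Rating {
  const p75 = percentile(samples, 75);
  const { good, poor } = THRESHOLDS[metric];
  if (p75 <= good) return "good";
  if (p75 <= poor) return "needs-improvement";
  return "poor";
}
```

A single fast lab run corresponds to one sample here; the rating only reflects real users once the sample list comes from field measurements.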

The Critical Rendering Path

When a browser navigates to a URL, it executes a sequential pipeline. Each stage has specific bottlenecks.

 User enters URL
       |
       v
+--------------------+
|   DNS Resolution   |  20-120ms (uncached)
|   Maps domain to   |  <1ms (cached)
|   IP address       |
+--------------------+
       |
       v
+--------------------+
|   TCP Handshake    |  1 RTT (20-200ms depending
|   SYN > SYN-ACK    |  on server distance)
|   > ACK            |
+--------------------+
       |
       v
+--------------------+
|   TLS Handshake    |  2 RTTs for TLS 1.2
|   Key exchange,    |  1 RTT for TLS 1.3
|   cipher setup     |
+--------------------+
       |
       v
+--------------------+
|   HTTP Request     |  Server processing time
|   GET /index.html  |  + response transfer time
+--------------------+
       |
       v
+--------------------+
|   HTML Parsing     |  Builds DOM incrementally.
|   Tokenize > DOM   |  Blocks on synchronous
|   construction     |  <script> tags.
+--------------------+
       |
  +----+----+
  |         |
  v         v
+-------+ +-------+
|  CSS  | |  JS   |
| Fetch | | Fetch |    Parallel if resources are
| Parse | | Parse |    discovered early enough.
| CSSOM | | Exec  |
+-------+ +-------+
  |         |
  +----+----+
       |
       v
+--------------------+
|   Render Tree      |  DOM + CSSOM merged.
|   Visible elements |  display:none excluded.
|   only.            |
+--------------------+
       |
       v
+--------------------+
|   Layout (Reflow)  |  Computes position and
|                    |  size of every element.
+--------------------+
       |
       v
+--------------------+
|   Paint            |  Rasterizes pixels for
|                    |  each layer.
+--------------------+
       |
       v
+--------------------+
|   Composite        |  Combines layers.
|                    |  GPU-accelerated.
+--------------------+
       |
       v
  Pixels on screen

The pipeline is largely sequential. The render tree requires both the DOM and CSSOM. Layout requires the render tree. Paint requires layout. Any synchronous <script> tag pauses DOM parsing entirely until the script is downloaded, parsed, and executed.

Unoptimized Request Waterfall

Time (ms)  0    200   400   600   800  1000  1200  1400  1600  1800  2000  2200
           |     |     |     |     |     |     |     |     |     |     |     |
DNS        [===]
TCP             [====]
TLS                  [=======]
HTML                          [=============================]
                                |           |
CSS (render-blocking)           [===================]
                                |                   |
JS (parser-blocking)            [===================================]
                                                                     |
Font (late discovery)                               [================|======]
                                                                            |
                                                                      FIRST PAINT
                                                                      (~2200ms)

CSS loading begins only when the HTML parser encounters the <link> tag. JavaScript loading begins at the <script> tag. Font loading begins only after CSS is parsed and a matching @font-face rule is needed. The user sees a blank screen for the entire duration.

Optimized Request Waterfall

Time (ms)  0    200   400   600   800  1000
           |     |     |     |     |     |
DNS        [=]  (pre-resolved via dns-prefetch)
TCP          [==]
TLS             [===] (TLS 1.3, single RTT)
HTML                 [========]
                      |   |
Critical CSS (inline) |   (no network request)
                      |
Preloaded font        [=========]
                      |
JS (defer)            [============]
                               |
                         FIRST PAINT
                         (~650ms)

The improvement is approximately 3.4x (about 2,200ms to 650ms to first paint), achieved entirely by restructuring resource loading, with zero application code changes.

TCP Slow Start

TCP does not transmit data at full bandwidth immediately. The congestion control algorithm starts with a small congestion window (typically 10 TCP segments, approximately 14.6KB) and doubles it after each successful round trip.

  Congestion Window (bytes)
  ^
  |                                            ________________
  |                                     ______/
  |                               _____/
  |                          ____/
  |                     ____/
  |                ____/
  |            ___/
  |         __/
  |       _/
  |     _/
  |   _/
  |  /
  | /
  +----------------------------------------------------> Time
  RTT1   RTT2   RTT3   RTT4   RTT5   RTT6   RTT7

  RTT1:  ~14.6KB   (10 segments, initial window)
  RTT2:  ~29.2KB   (20 segments)
  RTT3:  ~58.4KB   (40 segments)
  RTT4:  ~116.8KB  (80 segments)
  RTT5:  ~233.6KB  (160 segments)

The practical consequence: if the critical HTML response (including inlined CSS) fits within 14KB compressed, it arrives in a single round trip after connection establishment. At 15KB, a second round trip is required. On a mobile connection with 200ms RTT, that is an additional 200ms before the browser can begin rendering.

This makes 14KB a hard budget for the initial HTML response containing inlined critical CSS.
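
The 14KB boundary can be checked with a small model of slow start. This is a simplified sketch: it assumes a 1,460-byte segment payload, a window that doubles every round trip, and no packet loss or protocol overhead. The function name is hypothetical.

```typescript
const MSS_BYTES = 1460; // assumed payload bytes per TCP segment
const INITIAL_WINDOW_SEGMENTS = 10; // initial congestion window from the text above

// Counts the round trips needed to deliver a payload under idealized slow start.
function roundTripsToDeliver(
  payloadBytes: number,
  initialWindow: number = INITIAL_WINDOW_SEGMENTS,
  mss: number = MSS_BYTES
): number {
  let windowSegments = initialWindow;
  let remaining = payloadBytes;
  let roundTrips = 0;
  while (remaining > 0) {
    remaining -= windowSegments * mss; // everything in the current window is sent
    windowSegments *= 2; // window doubles after each successful round trip
    roundTrips += 1;
  }
  return roundTrips;
}

console.log(roundTripsToDeliver(14 * 1024)); // 1
console.log(roundTripsToDeliver(15 * 1024)); // 2
```

Under this model a 100KB response needs four round trips, which at 200ms RTT is 600ms more than a response that fits in the initial window.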

V8 JavaScript Parsing Costs

JavaScript has a processing cost beyond download time. The V8 engine (Chrome, Edge, Node.js) processes JavaScript through multiple stages, each consuming CPU time.

Source Code (text)
      |
      v
+-------------+
|   Scanner   |   Tokenizes source into tokens
|  (Lexer)    |   ~1.5 MB/s on mid-range mobile
+-------------+
      |
      v
+-------------+
|   Parser    |   Builds Abstract Syntax Tree
|             |   Full parse: ~600 KB/s on mobile
|             |   Lazy parse: ~1.5 MB/s on mobile
+-------------+
      |
      v
+-------------+
|  Ignition   |   Compiles AST to bytecode
|  (Interp.)  |   ~100 MB/s
+-------------+
      |
      v
+-------------+
|  Sparkplug  |   Baseline (non-optimizing) JIT
|  (Baseline) |   Compiles from bytecode
+-------------+
      |
      v  (hot functions only)
+-------------+
|  TurboFan   |   Optimizing compiler
|  (Opt. JIT) |   Type feedback guided
+-------------+

A 1.2MB JavaScript bundle (gzipped) decompresses to approximately 3.5MB of raw source. Processing costs on a Moto G Power (mid-range device, Snapdragon 665):

Stage	Time
Download (3G, 1.6 Mbps)	~6,000ms
Decompression	~50ms
Parse	~1,400ms
Compile (bytecode)	~350ms
Initial execution	~200ms
Total	~8,000ms

On a MacBook Pro M2, the same bundle parses in approximately 120ms. The 12x disparity between development hardware and target user hardware is the source of most performance blind spots.
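
The download row and the total in the table follow from simple arithmetic. The sketch below reproduces them; the function names are hypothetical, and the CPU stage timings are copied from the table rather than measured.

```typescript
interface StageCost {
  stage: string;
  ms: number;
}

// Transfer time for a payload at a given bandwidth, ignoring latency and slow start.
function downloadMs(bytes: number, megabitsPerSecond: number): number {
  return (bytes * 8 * 1000) / (megabitsPerSecond * 1_000_000);
}

// Combines the computed download time with per-stage CPU costs.
function startupCost(gzippedBytes: number, mbps: number, cpuStages: StageCost[]): StageCost[] {
  return [{ stage: "download", ms: downloadMs(gzippedBytes, mbps) }, ...cpuStages];
}

function totalMs(stages: StageCost[]): number {
  return stages.reduce((sum, s) => sum + s.ms, 0);
}

// CPU timings from the Moto G Power table above.
const motoGStages = startupCost(1_200_000, 1.6, [
  { stage: "decompression", ms: 50 },
  { stage: "parse", ms: 1400 },
  { stage: "compile", ms: 350 },
  { stage: "initial execution", ms: 200 },
]);

console.log(Math.round(totalMs(motoGStages))); // 8000
```

Changing the bundle size in this model shows that the download dominates on slow networks, while parse and compile dominate once the bundle is cached.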

The Chrome DevTools Performance panel exposes these costs. Record a page load, then examine the "Parse Script" and "Compile Script" entries in the flame chart. The "Bottom-Up" tab aggregates total time per function.

Measurement Tools and Methodology

Chrome DevTools Performance Panel

Record a page load with CPU throttling set to 4x or 6x slowdown to approximate mobile device performance. Key areas to examine:

Main thread flame chart: Identifies long tasks (>50ms) blocking interactivity
Network waterfall: Shows resource loading sequence and timing
Web Vitals lane: Marks LCP, INP, and CLS events directly on the timeline
Bottom-Up tab: Aggregates self-time per function across the recording

Lighthouse

Run from DevTools (Lighthouse tab, formerly Audits) or via CLI:

npx lighthouse https://example.com \
  --preset=perf \
  --throttling-method=simulate \
  --chrome-flags="--headless" \
  --output=json \
  --output-path=./report.json

Lighthouse simulated throttling applies a 4x CPU slowdown and simulates a slow 4G connection (150ms RTT, 1.6 Mbps). The "Treemap" view (accessible from the report footer) shows JavaScript module sizes and coverage.

WebPageTest

WebPageTest (webpagetest.org) provides testing from real devices in real locations. Configuration for realistic mobile testing:

Test location: Mumbai, India
Browser: Chrome on Motorola G (gen 4)
Connection: 3G Slow (400ms RTT, 400 Kbps down)

The filmstrip view shows visual progress at 100ms intervals. The connection view shows individual TCP connections and HTTP/2 streams. The "Waterfall" tab provides more detail than DevTools, including DNS, TCP, TLS, and TTFB broken out per request.

Coverage Tab (DevTools)

Open via Ctrl+Shift+P > "Show Coverage." Reload the page. The tab displays byte-level usage for every CSS and JS file. A typical finding: 60-80% of CSS bytes are unused on any given page when using a monolithic stylesheet or large framework like Bootstrap.

Custom Performance Instrumentation

// Performance marks for application-specific milestones
function trackPerformance(): void {
  // Largest Contentful Paint: the browser may report several candidates,
  // and the most recent entry is the current LCP
  const lcpObserver = new PerformanceObserver((list) => {
    const entries = list.getEntries();
    const lastEntry = entries[entries.length - 1];
    if (lastEntry) {
      sendMetric("lcp", lastEntry.startTime);
    }
  });
  lcpObserver.observe({ type: "largest-contentful-paint", buffered: true });

  // Long tasks (>50ms) blocking the main thread
  const longTaskObserver = new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) {
      sendMetric("long_task", entry.duration, {
        startTime: entry.startTime,
      });
    }
  });
  longTaskObserver.observe({ type: "longtask", buffered: true });

  // Application-specific: time until primary action is available.
  // Call this once the search input is actually interactive.
  performance.mark("search-input-ready");
  // Measure from the time origin (start: 0) to the mark. This avoids the
  // legacy "navigationStart" name from the deprecated PerformanceTiming API.
  const measure = performance.measure("time-to-interactive-search", {
    start: 0,
    end: "search-input-ready",
  });
  sendMetric("tti_search", measure.duration);
}
 
function sendMetric(
  name: string,
  value: number,
  metadata?: Record<string, unknown>
): void {
  if (navigator.sendBeacon) {
    navigator.sendBeacon(
      "/api/metrics",
      JSON.stringify({ name, value, metadata, timestamp: Date.now() })
    );
  }
}

Inlining Critical CSS

Critical CSS is the minimum set of styles required to render above-the-fold content. Inlining it in the HTML <head> eliminates a render-blocking network request.

<head>
  <!-- Critical CSS: inlined, renders without additional requests -->
  <style>
    .hero { display: flex; align-items: center; min-height: 100vh; }
    .nav { position: fixed; top: 0; width: 100%; z-index: 100; }
    .nav-logo { height: 32px; width: auto; }
    h1 { font-size: clamp(2rem, 5vw, 4rem); line-height: 1.1; }
    .cta-button { padding: 12px 24px; background: #2563eb; color: #fff; }
    /* Target: under 14KB total HTML including this block */
  </style>
 
  <!-- Full stylesheet: loaded asynchronously, non-render-blocking -->
  <link rel="preload" href="/styles.css" as="style"
        onload="this.onload=null;this.rel='stylesheet'">
  <noscript><link rel="stylesheet" href="/styles.css"></noscript>
 
  <!-- Preconnect: performs DNS + TCP + TLS for origins needed immediately -->
  <link rel="preconnect" href="https://fonts.googleapis.com">
  <link rel="preconnect" href="https://cdn.example.com" crossorigin>
 
  <!-- DNS prefetch: only DNS resolution, for origins that may be needed -->
  <link rel="dns-prefetch" href="https://analytics.example.com">
</head>

The distinction between preconnect and dns-prefetch: preconnect completes DNS resolution, TCP handshake, and TLS negotiation (the full connection). dns-prefetch only resolves the domain name to an IP address. Use preconnect for origins that will definitely be needed within seconds. Use dns-prefetch for origins that may be needed. Each preconnect holds an open connection, so limit usage to 2-4 origins.
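
One way to enforce the preconnect limit is to generate the hint tags from a list of origins. The sketch below is an assumption about how that might look: the `ThirdPartyOrigin` shape and `buildResourceHints` name are hypothetical, and the URLs are assumed to be trusted values that need no HTML escaping.

```typescript
interface ThirdPartyOrigin {
  url: string;
  critical: boolean; // needed within the first seconds of the page load
  crossorigin?: boolean; // set for origins fetched in CORS mode (for example fonts)
}

// Emits preconnect for critical origins up to a cap, dns-prefetch for the rest.
function buildResourceHints(origins: ThirdPartyOrigin[], maxPreconnect = 4): string[] {
  let preconnects = 0;
  return origins.map((o) => {
    if (o.critical && preconnects < maxPreconnect) {
      preconnects += 1;
      const cors = o.crossorigin ? " crossorigin" : "";
      return `<link rel="preconnect" href="${o.url}"${cors}>`;
    }
    // Non-critical origins, and critical ones beyond the cap, only get DNS resolution.
    return `<link rel="dns-prefetch" href="${o.url}">`;
  });
}
```

Listing origins in priority order matters here, because origins beyond the cap are downgraded to dns-prefetch.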

Tools for extracting critical CSS automatically:

critical (npm package): Renders the page in a headless browser, extracts above-the-fold styles
critters (webpack/Vite plugin): Inlines critical CSS at build time

Eliminating Parser-Blocking Scripts

A <script> tag without async or defer pauses the HTML parser until the script is downloaded, parsed, and executed.

Without async or defer:

HTML Parser:  [=========|          BLOCKED          |============]
Script:                  [==== Download ====|= Parse+Exec =]

With defer:

HTML Parser:  [==============================================]
Script:                  [==== Download ====]
                                            [= Parse+Exec =]
                                            (after HTML parsing completes)

With async:

HTML Parser:  [===============|  BLOCKED  |===================]
Script:           [==== Download ====]
                                  [= Parse+Exec =]
                                  (immediately when download finishes)

Key differences:

Attribute	Download	Execution	Order Guaranteed
(none)	Blocks parser	Blocks parser	Yes
`async`	Parallel	Interrupts parser	No
`defer`	Parallel	After HTML parsed	Yes

Use defer for application scripts that depend on DOM or execution order. Use async only for independent third-party scripts (analytics, ads) where execution order is irrelevant.

Resource Preloading

The browser discovers resources as it parses HTML. Resources referenced inside CSS files (fonts, background images) are not discovered until the CSS itself is downloaded and parsed. <link rel="preload"> moves discovery earlier.

<!-- Font: normally discovered only after CSS is parsed -->
<link rel="preload" href="/fonts/Inter-Bold.woff2"
      as="font" type="font/woff2" crossorigin>
 
<!-- LCP image: starts loading before the <img> tag is parsed -->
<link rel="preload" href="/hero.avif"
      as="image" type="image/avif"
      imagesrcset="/hero-400.avif 400w,
                   /hero-800.avif 800w,
                   /hero-1200.avif 1200w"
      imagesizes="100vw">

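The crossorigin attribute on font preloads is required even for same-origin fonts, because browsers always fetch fonts in anonymous CORS mode. Without it, the preload does not match the later font request, so the browser makes a separate (non-preloaded) request and the preloaded resource is discarded.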

Limit preloads to 3-4 resources. Each preload competes for bandwidth. Preloading 15 resources effectively negates the prioritization benefit. Target: the LCP image, the primary font file, and at most one critical script.

JavaScript Bundle Optimization

Code Splitting

Route-based splitting is the baseline. Component-level splitting provides additional granularity for heavy dependencies.

import dynamic from "next/dynamic";
 
// Component-level splitting: Chart.js (~60KB gz) loads only when needed
const Chart = dynamic(() => import("../components/Chart"), {
  loading: () => <div className="h-64 animate-pulse bg-gray-100" />,
  ssr: false, // Charts require DOM APIs unavailable during SSR
});
 
// Markdown renderer (~45KB gz) loaded on demand
const MarkdownRenderer = dynamic(
  () => import("../components/MarkdownRenderer"),
  { loading: () => <div className="prose animate-pulse h-96" /> }
);

Prefetching on hover intent eliminates perceived loading delay for split chunks. The average time between mouseenter and click is 200-300ms, sufficient to fetch a small chunk.

function DashboardLink(): JSX.Element {
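  const prefetch = (): void => {
    // Fire-and-forget: a failed prefetch is non-fatal, so swallow the rejection
    void import("../components/Chart").catch(() => {});
    void import("../components/DashboardWidgets").catch(() => {});
  };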
 
  return (
    <a
      href="/dashboard"
      onMouseEnter={prefetch}
      onFocus={prefetch}
    >
      Dashboard
    </a>
  );
}

Bundle Analysis

Use webpack-bundle-analyzer or rollup-plugin-visualizer (for Vite) to identify oversized dependencies. Common findings and replacements:

Dependency	Size (gzipped)	Replacement	Replacement Size
moment.js	67.9KB	date-fns (individual functions)	3-5KB
lodash (full)	24.5KB	lodash-es (tree-shakeable) or native	1-4KB
react-icons (full)	41KB	Individual SVG imports	<1KB
Chart.js (full)	60KB	Dynamic import, loaded on demand	0KB initial
numeral.js	14KB	Intl.NumberFormat (built-in)	0KB

Before/after example for a typical React application:

Before optimization:
+---------------------------------------------------+
|              moment.js (67.9KB gz)                 |
|  Used for: formatting one date string              |
+-------------------------+-------------------------+
|   lodash (24.5KB gz)    | react-icons (41KB gz)   |
|   Used: _.get,          | Used: 3 icons out of    |
|   _.debounce            | ~4,000 available         |
+-------------------------+-------------------------+
|         Application code (38KB gz)                 |
+---------------------------------------------------+
Total: 171.4KB gzipped

After optimization:
+---------------------------------------------------+
|         Application code (38KB gz)                 |
+--------------------+------------------------------+
| date-fns/format    | Custom debounce (0.3KB)      |
| (2.8KB gz)         +------------------------------+
+--------------------+ 3 inline SVG icons (0.8KB)   |
+---------------------------------------------------+
Total: 41.9KB gzipped (75.6% reduction)
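
The totals in the two diagrams can be recomputed from the individual sizes. The sketch below does that with hypothetical helper names; the sizes are the gzipped figures quoted above.

```typescript
interface BundlePart {
  label: string;
  gzippedKB: number;
}

// Sums part sizes, rounded to one decimal place to avoid floating-point noise.
function totalKB(parts: BundlePart[]): number {
  const sum = parts.reduce((acc, part) => acc + part.gzippedKB, 0);
  return Math.round(sum * 10) / 10;
}

// Percentage reduction from beforeKB to afterKB, rounded to one decimal place.
function reductionPercent(beforeKB: number, afterKB: number): number {
  return Math.round(((beforeKB - afterKB) / beforeKB) * 1000) / 10;
}

const beforeParts: BundlePart[] = [
  { label: "moment.js", gzippedKB: 67.9 },
  { label: "lodash", gzippedKB: 24.5 },
  { label: "react-icons", gzippedKB: 41 },
  { label: "application code", gzippedKB: 38 },
];

const afterParts: BundlePart[] = [
  { label: "application code", gzippedKB: 38 },
  { label: "date-fns/format", gzippedKB: 2.8 },
  { label: "custom debounce", gzippedKB: 0.3 },
  { label: "inline SVG icons", gzippedKB: 0.8 },
];

console.log(totalKB(beforeParts), totalKB(afterParts)); // 171.4 41.9
console.log(reductionPercent(totalKB(beforeParts), totalKB(afterParts))); // 75.6
```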

Tree Shaking Requirements

Tree shaking (dead code elimination) removes unused exports from the bundle. It fails when modules contain side effects.

// Side effect: modifies a global. Prevents tree shaking.
Array.prototype.flatMapCustom = function <T, U>(
  fn: (item: T) => U[]
): U[] {
  return this.reduce((acc: U[], item: T) => acc.concat(fn(item)), []);
};
 
// Side effect: function call at module scope. Prevents removal.
registerComponents();
 
// No side effects: pure exports. Tree shaking works correctly.
export function formatCurrency(value: number, locale: string): string {
  return new Intl.NumberFormat(locale, {
    style: "currency",
    currency: "USD",
  }).format(value);
}

Mark packages as side-effect-free in package.json:

{
  "sideEffects": false
}

Or specify files with side effects explicitly:

{
  "sideEffects": ["./src/polyfills.ts", "*.css"]
}

Image Optimization Pipeline

Images typically account for 50-70% of total page weight. An optimization pipeline addresses format, dimensions, loading behavior, and placeholders.

Format Selection

Format	Compression vs JPEG	Browser Support	Use Case
AVIF	50% smaller	92%+ (2024)	Primary format
WebP	25-30% smaller	97%+	Fallback
JPEG	Baseline	100%	Final fallback
PNG	Larger (lossless)	100%	Transparency required

Responsive Images with srcset

<picture>
  <source
    type="image/avif"
    srcset="/hero-400.avif 400w,
            /hero-800.avif 800w,
            /hero-1200.avif 1200w,
            /hero-1600.avif 1600w"
    sizes="(max-width: 640px) 100vw,
           (max-width: 1024px) 75vw,
           50vw"
  />
  <source
    type="image/webp"
    srcset="/hero-400.webp 400w,
            /hero-800.webp 800w,
            /hero-1200.webp 1200w,
            /hero-1600.webp 1600w"
    sizes="(max-width: 640px) 100vw,
           (max-width: 1024px) 75vw,
           50vw"
  />
  <img
    src="/hero-800.jpg"
    width="1600"
    height="900"
    alt="Product showcase"
    loading="eager"
    decoding="async"
    fetchpriority="high"
  />
</picture>
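
The srcset strings above follow a regular pattern, so they can be generated rather than typed by hand. The sketch below assumes the `/hero-<width>.<ext>` naming used in the markup. `pickCandidate` is only a simplified approximation of browser behavior, since actual candidate selection is left to the browser.

```typescript
// Builds a width-descriptor srcset string such as "/hero-400.avif 400w, /hero-800.avif 800w".
function buildSrcset(basePath: string, extension: string, widths: number[]): string {
  return [...widths]
    .sort((a, b) => a - b)
    .map((w) => `${basePath}-${w}.${extension} ${w}w`)
    .join(", ");
}

// Approximates selection: the smallest candidate that covers the slot at the
// given device pixel ratio, or the largest candidate if none is big enough.
function pickCandidate(widths: number[], slotCssPx: number, dpr: number): number {
  if (widths.length === 0) throw new Error("no candidates");
  const sorted = [...widths].sort((a, b) => a - b);
  const neededPx = slotCssPx * dpr;
  return sorted.find((w) => w >= neededPx) ?? sorted[sorted.length - 1];
}

console.log(buildSrcset("/hero", "avif", [400, 800]));
// "/hero-400.avif 400w, /hero-800.avif 800w"
```

For a 375px-wide phone at 2x density with sizes of 100vw, this model selects the 800w candidate instead of the 1600w file.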

Next.js Image Component

import Image from "next/image";
 
function HeroSection(): JSX.Element {
  return (
    <Image
      src="/hero.jpg"
      width={1600}
      height={900}
      sizes="(max-width: 640px) 100vw,
             (max-width: 1024px) 75vw,
             50vw"
      placeholder="blur"
      blurDataURL="data:image/svg+xml;base64,..."
      priority    // Sets fetchpriority="high", disables lazy loading
      alt="Product showcase"
    />
  );
}

The priority prop should be set only on the LCP image. It adds fetchpriority="high" and removes loading="lazy". All other images default to lazy loading.

Image Quality

The visual difference between quality 80 and quality 100 is imperceptible in most photographs, while the file size difference is large: in the sample below, quality 80 is roughly 74% smaller than quality 100.

Quality Setting	File Size (1200x800 photo)	SSIM vs q100
100	485KB	1.000
90	198KB	0.998
80	124KB	0.994
70	96KB	0.989
50	68KB	0.971

Quality 80 provides the optimal balance for photographic content. Quality 70 is acceptable for thumbnails and background images.

CDN-Based Image Transformation

For user-uploaded content, URL-based image CDNs (Cloudinary, imgix, Cloudflare Images) generate optimized variants on the fly:

https://cdn.example.com/uploads/photo.jpg?w=800&h=600&fit=cover&format=auto&quality=80

The format=auto parameter serves AVIF to supporting browsers, WebP as a fallback, and JPEG otherwise, based on the Accept header. This eliminates the need to pre-generate multiple format variants.
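
The negotiation behind format=auto can be sketched as a lookup on the Accept header. This is an illustration of the idea, not any particular CDN's implementation; the function name is hypothetical and q-values are ignored for brevity.

```typescript
type ImageFormat = "avif" | "webp" | "jpeg";

// Chooses the best format the client advertises, preferring AVIF, then WebP.
function negotiateImageFormat(acceptHeader: string): ImageFormat {
  const accepted = acceptHeader
    .toLowerCase()
    .split(",")
    .map((part) => part.split(";")[0].trim()); // drop parameters such as ;q=0.8
  if (accepted.includes("image/avif")) return "avif";
  if (accepted.includes("image/webp")) return "webp";
  return "jpeg"; // universal fallback
}

console.log(negotiateImageFormat("image/avif,image/webp,image/*,*/*;q=0.8")); // "avif"
console.log(negotiateImageFormat("image/webp,*/*")); // "webp"
console.log(negotiateImageFormat("*/*")); // "jpeg"
```

Because the response body depends on the request's Accept header, responses negotiated this way should carry `Vary: Accept` so shared caches do not serve AVIF to a client that cannot decode it.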

HTTP/2 Multiplexing

Browsers limit HTTP/1.1 to roughly 6 parallel connections per origin. HTTP/2 multiplexes all requests over a single TCP connection using streams.

HTTP/1.1 (6 connection limit per origin):

Conn 1: [=== styles.css ===]    [=== app.js ===]
Conn 2: [=== vendor.js ===]     [=== page.js ===]
Conn 3: [=== hero.jpg ===========================]
Conn 4: [=== font.woff2 ===]   [=== logo.svg ==]
Conn 5: [=== img2 ===]          [=== img4 ===]
Conn 6: [=== img3 ===]          [=== img5 ===]

Queued:  analytics.js (blocked until a connection frees)


HTTP/2 (single multiplexed connection):

        [== styles.css ==]
Stream  [===== vendor.js =====]
  ||    [= app.js =]
  ||    [= page.js =]
  ||    [======= hero.jpg =======]
  ||    [== font.woff2 ==]
  ||    [= logo.svg =]
  ||    [= analytics.js =]
  ||    [= img2 =][= img3 =][= img4 =][= img5 =]

All resources requested concurrently over one connection.

HTTP/2 enables granular code splitting without per-request overhead. A bundle split into 20 chunks can be fetched in parallel over a single connection.

HTTP/2 Head-of-Line Blocking

HTTP/2 multiplexes at the application layer, but the underlying TCP connection treats all data as a single byte stream. A single lost TCP packet stalls all HTTP/2 streams until retransmission completes.

HTTP/2 over TCP (head-of-line blocking):

Stream A: [====][====][    STALLED    ][====]
Stream B: [====][====][    STALLED    ][====]
Stream C: [====][====][    STALLED    ][====]
                       ^
                 Packet loss on Stream A
                 blocks ALL streams


HTTP/3 over QUIC (independent streams):

Stream A: [====][====][    STALLED    ][====]
Stream B: [====][====][====][====][====]       <-- unaffected
Stream C: [====][====][====][====]             <-- unaffected
                       ^
                 Packet loss on Stream A
                 only blocks Stream A

HTTP/3 (QUIC) uses UDP with per-stream loss recovery, eliminating transport-layer head-of-line blocking. Most CDNs (Cloudflare, Fastly, AWS CloudFront) support HTTP/3.

Font Loading Strategies

Custom fonts create a hidden dependency chain. The browser does not begin downloading a font until it has parsed CSS containing a matching @font-face rule and encountered an element using that font family.

Typical font loading timeline:

HTML download:           [======]
  CSS download:                  [==========]
    CSS parsed:                              |
    Font discovered:                         [font request starts]
      Font download:                         [================]
        Text rendered:                                         HERE
                                                              (2.4s after navigation)

Optimized Font Loading

/* Primary font with swap behavior */
@font-face {
  font-family: "Inter";
  src: url("/fonts/Inter-Regular.woff2") format("woff2");
  font-weight: 400;
  font-style: normal;
  font-display: swap;
  unicode-range: U+0000-00FF, U+0131, U+0152-0153, U+02BB-02BC,
                 U+02C6, U+02DA, U+02DC, U+2000-206F, U+2074,
                 U+20AC, U+2122, U+2191, U+2193, U+2212, U+2215,
                 U+FEFF, U+FFFD;
}
 
/* Fallback font sized to match the custom font, reducing CLS */
@font-face {
  font-family: "Inter-fallback";
  src: local("Arial");
  size-adjust: 107.64%;
  ascent-override: 90.49%;
  descent-override: 22.56%;
  line-gap-override: 0%;
}
 
body {
  font-family: "Inter", "Inter-fallback", system-ui, sans-serif;
}

The font-display: swap declaration renders text immediately with the fallback font, then swaps in the custom font once loaded. The size-adjust, ascent-override, and descent-override properties on the fallback minimize layout shift during the swap.

Preload the font in the HTML <head> to start downloading before CSS parsing:

<link rel="preload" href="/fonts/Inter-Regular.woff2"
      as="font" type="font/woff2" crossorigin>

As with the font preload shown under Resource Preloading, the crossorigin attribute is required even for same-origin font files; without it the preloaded resource goes unused.

System Font Stack Alternative

Using system fonts eliminates font loading entirely:

body {
  font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto,
               Oxygen-Sans, Ubuntu, Cantarell, "Helvetica Neue", sans-serif;
}

Result: zero font download time, zero font-related CLS, zero FOIT/FOUT.

Font Subsetting

Full font files often include glyphs for Latin, Cyrillic, Greek, and other scripts. Subsetting to Latin-only reduces file size significantly:

Font File	Full	Latin Subset	Reduction
Inter Regular (woff2)	98KB	18KB	81.6%
Roboto Regular (woff2)	84KB	15KB	82.1%
Open Sans Regular (woff2)	92KB	17KB	81.5%

Use unicode-range in @font-face to declare which character ranges each file covers. The browser only downloads files for ranges present in the page content.
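
The download decision can be modeled as a range check over the page text. The sketch below is a simplification with hypothetical function names: it handles single code points and `U+XXXX-YYYY` ranges, but not wildcard ranges such as `U+4??`.

```typescript
type CodePointRange = [number, number];

// Parses a unicode-range value such as "U+0000-00FF, U+0131" into numeric ranges.
function parseUnicodeRange(value: string): CodePointRange[] {
  return value.split(",").map((token): CodePointRange => {
    const [start, end] = token.trim().replace(/^U\+/i, "").split("-");
    const lo = parseInt(start, 16);
    const hi = end === undefined ? lo : parseInt(end, 16);
    return [lo, hi];
  });
}

// True when at least one character of the text falls inside the declared ranges,
// which is the condition under which the browser would fetch that font file.
function textNeedsFont(text: string, ranges: CodePointRange[]): boolean {
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (cp !== undefined && ranges.some(([lo, hi]) => cp >= lo && cp <= hi)) {
      return true;
    }
  }
  return false;
}

const latinRanges = parseUnicodeRange("U+0000-00FF, U+0131, U+0152-0153");
console.log(textNeedsFont("Hello", latinRanges)); // true
console.log(textNeedsFont("Привет", latinRanges)); // false
```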

Edge Computing and Latency

The speed of light in fiber optic cable is approximately 200,000 km/s. A round trip from Sydney to us-east-1 (Virginia) covers approximately 30,000 km, producing a minimum theoretical latency of 150ms. Practical latency with routing hops: 220-280ms.

A page load requiring 4 round trips (DNS + TCP + TLS + HTTP) accumulates 880-1,120ms of pure network latency before the first byte of content arrives.
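
The arithmetic behind these figures is sketched below, with hypothetical function names. It treats every round trip as costing the same RTT, which is the same simplification the paragraph above makes.

```typescript
const FIBER_KM_PER_SECOND = 200_000; // approximate speed of light in fiber

// Theoretical minimum round-trip time for a given round-trip distance.
function minRoundTripMs(roundTripKm: number): number {
  return (roundTripKm * 1000) / FIBER_KM_PER_SECOND;
}

// Latency accumulated before the first byte arrives, given sequential round trips.
function setupLatencyMs(rttMs: number, roundTrips: number): number {
  return rttMs * roundTrips;
}

console.log(minRoundTripMs(30_000)); // 150
console.log(setupLatencyMs(220, 4), setupLatencyMs(280, 4)); // 880 1120
console.log(setupLatencyMs(10, 4)); // 40
```

The last line uses a 10ms RTT to a nearby edge location: the same four round trips then cost about 40ms.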

Single-origin architecture:

User in         Origin in
Sydney  ------  Virginia
        ~250ms RTT


Edge architecture:

User in      Edge in       Origin in
Sydney  ---  Sydney  ----  Virginia
       ~10ms  (cache hit: serve directly)
        RTT   (cache miss: fetch from origin, ~250ms)

Edge Middleware

import { NextRequest, NextResponse } from "next/server";
 
export function middleware(request: NextRequest): NextResponse {
  // Bot detection first: reject known crawlers before doing any other work
  const ua = request.headers.get("user-agent") ?? "";
  if (/AhrefsBot|SemrushBot|DotBot/.test(ua)) {
    return new NextResponse(null, { status: 403 });
  }

  // request.geo is populated by the hosting platform and was removed from
  // NextRequest in Next.js 15; fall back to "US" when it is absent
  const country = request.geo?.country ?? "US";

  // Geo-redirect to localized content, preserving the query string
  if (country === "DE" && !request.nextUrl.pathname.startsWith("/de")) {
    const url = request.nextUrl.clone();
    url.pathname = `/de${url.pathname}`;
    return NextResponse.redirect(url);
  }

  // A/B test assignment at the edge, no origin round trip
  const bucket =
    request.cookies.get("ab-experiment")?.value ??
    (Math.random() > 0.5 ? "control" : "variant");

  const response = NextResponse.next();
  response.cookies.set("ab-experiment", bucket, {
    httpOnly: true,
    sameSite: "lax",
    maxAge: 60 * 60 * 24 * 30,
  });
  response.headers.set("x-ab-bucket", bucket);

  return response;
}
 
export const config = {
  matcher: ["/((?!api|_next/static|_next/image|favicon.ico).*)"],
};

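Edge functions execute in V8 isolates, not full Node.js environments. Available APIs are limited (no fs, no native modules). Execution limits are tight and vary by platform and plan: Cloudflare Workers' free tier, for example, allows 10ms of CPU time per request, while Vercel requires an Edge function to begin sending its response within 25 seconds. Check the provider's current limits before moving heavy work to the edge.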

Caching Architecture

Request flow through cache layers:

Request
   |
   v
+---------------------+
|   Browser Cache     |    Fastest: 0ms latency.
|   (HTTP headers)    |    Cache-Control: max-age, immutable.
|                     |    No network request at all.
+---------------------+
   | MISS
   v
+---------------------+
|   CDN / Edge Cache  |    Fast: 10-50ms latency.
|   (s-maxage, SWR)   |    Surrogate-Key based purging.
|                     |    No origin hit.
+---------------------+
   | MISS
   v
+---------------------+
|   Application Cache |    Medium: origin latency + lookup.
|   (Redis/Memcached) |    Cached computations, DB results.
+---------------------+
   | MISS
   v
+---------------------+
|   Origin Server     |    Slowest: full computation.
|   (Database, APIs)  |    DB queries, template rendering.
+---------------------+

Cache-Control Headers for Static Assets

Content-hashed filenames (e.g., app.a1b2c3d4.js) enable aggressive caching. If the content changes, the hash changes, generating a new URL. The old URL is safe to cache indefinitely.

Cache-Control: public, max-age=31536000, immutable

max-age=31536000: Cache for one year (effectively forever for hashed assets)
immutable: Do not revalidate even on page refresh (supported by Firefox, Safari)

Cache-Control Headers for HTML

HTML pages need to serve fresh content while remaining fast for repeat visits:

Cache-Control: public, max-age=60, stale-while-revalidate=600

max-age=60: Serve from cache for 60 seconds without revalidation
stale-while-revalidate=600: For up to 600 seconds after max-age expires, serve the stale copy while fetching a fresh one in the background.

This produces near-instant page loads on repeat visits with eventual consistency (content is at most ~11 minutes stale: 60 seconds fresh plus the 600-second revalidation window).
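
The three states a cached HTML response moves through can be sketched as a small classifier. This is an illustrative model rather than a full Cache-Control parser: the type and function names are hypothetical, and only the max-age and stale-while-revalidate directives are read.

```typescript
type CacheState = "fresh" | "stale-while-revalidate" | "must-refetch";

interface CachePolicy {
  maxAge: number; // seconds
  staleWhileRevalidate: number; // seconds
}

// Extracts the two directives used above from a Cache-Control header value.
function parseCacheControl(header: string): CachePolicy {
  const policy: CachePolicy = { maxAge: 0, staleWhileRevalidate: 0 };
  for (const directive of header.split(",")) {
    const [rawKey, rawValue] = directive.trim().split("=");
    const key = rawKey.toLowerCase();
    const seconds = Number(rawValue); // NaN for valueless directives such as "public"
    if (!Number.isFinite(seconds)) continue;
    if (key === "max-age") policy.maxAge = seconds;
    if (key === "stale-while-revalidate") policy.staleWhileRevalidate = seconds;
  }
  return policy;
}

// Classifies a cached response by its age in seconds.
function cacheState(ageSeconds: number, policy: CachePolicy): CacheState {
  if (ageSeconds < policy.maxAge) return "fresh";
  if (ageSeconds - policy.maxAge <= policy.staleWhileRevalidate) {
    return "stale-while-revalidate"; // serve stale, refresh in the background
  }
  return "must-refetch"; // too old: wait for the network
}

const htmlPolicy = parseCacheControl("public, max-age=60, stale-while-revalidate=600");
console.log(cacheState(30, htmlPolicy)); // "fresh"
console.log(cacheState(300, htmlPolicy)); // "stale-while-revalidate"
console.log(cacheState(700, htmlPolicy)); // "must-refetch"
```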

Compression

Algorithm	Compression Ratio (JS)	Decompression Speed	Support
None	1.0x	N/A	100%
gzip	3-4x	500 MB/s	100%
Brotli (level 6)	3.5-4.5x	400 MB/s	97%+
Brotli (level 11)	4-5.5x	400 MB/s	97%+

Brotli at compression level 11 (static, pre-compressed) provides 15-25% better compression than gzip for JavaScript and CSS. Use Brotli for static assets compressed at build time. Use gzip or Brotli level 4-6 for dynamic responses where compression speed matters.

Performance Budget Enforcement

Automated performance budgets prevent regressions. Integrate into CI with Lighthouse CI or bundlesize.

// lighthouserc.js (Lighthouse CI reads .js, .json, or .yml config files, not TypeScript)
module.exports = {
  ci: {
    collect: {
      url: ["http://localhost:3000", "http://localhost:3000/products"],
      numberOfRuns: 3,
    },
    assert: {
      assertions: {
        "categories:performance": ["error", { minScore: 0.9 }],
        "first-contentful-paint": ["error", { maxNumericValue: 1500 }],
        "largest-contentful-paint": ["error", { maxNumericValue: 2500 }],
        "cumulative-layout-shift": ["error", { maxNumericValue: 0.1 }],
        "total-byte-weight": ["error", { maxNumericValue: 400000 }],
        "mainthread-work-breakdown": [
          "warn",
          { maxNumericValue: 3000 },
        ],
      },
    },
  },
};

Bundle size budgets using bundlesize:

{
  "bundlesize": [
    {
      "path": ".next/static/chunks/main-*.js",
      "maxSize": "80KB",
      "compression": "gzip"
    },
    {
      "path": ".next/static/chunks/pages/_app-*.js",
      "maxSize": "50KB",
      "compression": "gzip"
    },
    {
      "path": ".next/static/css/*.css",
      "maxSize": "30KB",
      "compression": "gzip"
    }
  ]
}

Before/After: Full Optimization Pass

Results from applying the techniques described above to a React e-commerce application:

Metric	Before	After	Change
HTML response size	142KB	12.8KB	-91.0%
Total JS (gzipped)	1.2MB	148KB	-87.7%
Total CSS (gzipped)	89KB	14KB	-84.3%
Total images	4.8MB	620KB	-87.1%
HTTP requests	47	23	-51.1%
LCP	6.2s	0.8s	-87.1%
INP	320ms	45ms	-85.9%
CLS	0.34	0.02	-94.1%
Lighthouse Performance	12	98	+717%
Bounce rate	67%	34%	-49.3%
Conversion rate (India)	0.3%	2.1%	+600%

The conversion rate change in India (0.3% to 2.1%) corresponds to the application becoming usable on low-bandwidth, high-latency connections. Those users were present before the optimization, but the page did not load within their patience threshold.

Pre-Launch Performance Checklist

NETWORK AND PROTOCOL
====================
[ ] HTTP/2 or HTTP/3 enabled on the server and CDN
[ ] Brotli compression enabled for static assets
[ ] gzip or Brotli (level 4-6) for dynamic responses
[ ] preconnect hints for critical third-party origins (max 2-4)
[ ] dns-prefetch for non-critical third-party origins

CRITICAL RENDERING PATH
========================
[ ] HTML response under 14KB compressed (fits in TCP initial window)
[ ] Critical CSS inlined in <head>
[ ] Full CSS loaded asynchronously via preload pattern
[ ] All <script> tags use defer or async
[ ] No synchronous third-party scripts in <head>
[ ] LCP resource has <link rel="preload">

JAVASCRIPT
==========
[ ] Route-based code splitting active
[ ] Heavy components use dynamic imports
[ ] Main bundle under 100KB gzipped
[ ] No full-library imports (lodash, moment, icon libraries)
[ ] Bundle analysis reviewed (webpack-bundle-analyzer or equivalent)
[ ] Tree shaking verified, sideEffects declared in package.json
[ ] Third-party scripts audited and lazy-loaded

IMAGES
======
[ ] AVIF as primary format, WebP fallback, JPEG final fallback
[ ] srcset and sizes attributes on all <img> tags
[ ] Explicit width and height on all images (CLS prevention)
[ ] loading="lazy" on below-fold images
[ ] LCP image NOT lazy-loaded, has fetchpriority="high"
[ ] Image quality set to 80 for photographs
[ ] CDN with format=auto for user-uploaded content

FONTS
=====
[ ] Font files preloaded with crossorigin attribute
[ ] font-display: swap on all @font-face rules
[ ] Fallback font size-adjusted to match custom font
[ ] Font files subsetted to required character ranges
[ ] woff2 format used (best compression)

CACHING
=======
[ ] Static assets: Cache-Control immutable + content-hashed filenames
[ ] HTML: stale-while-revalidate caching strategy
[ ] CDN cache configured with appropriate TTLs

TESTING
=======
[ ] Tested with Chrome DevTools throttled to Slow 3G
[ ] Tested on WebPageTest from a non-US location (e.g., Mumbai)
[ ] Core Web Vitals passing at 75th percentile in CrUX data
[ ] Lighthouse CI in CI/CD pipeline with performance budget
[ ] Bundle size budget enforced in CI/CD pipeline