The Limits of Vibe Coding: Open Source Is Not Free Real Estate
The Papermark controversy shows why AI-assisted speed runs cannot bypass the legal realities of software licensing.
When the term "vibe coding" entered the developer lexicon, it was celebrated as a triumph of productivity. The promise was simple: describe what you want, let AI agents or rapid prototyping frameworks generate the boilerplate, and ship products at a fraction of the traditional cost. But a recent public clash between the founders of Papermark and a startup called Corgi highlights the dark side of this high-speed development culture.
The controversy began when Nico Laqua, founder of Corgi, announced the release of "DataRoom," pitching it as a low-cost alternative to expensive document-sharing platforms. Almost immediately, Marc Seitz, the creator of Papermark, an open-source virtual data room, said that DataRoom was not a novel creation. By his account, it was a near 1:1 reskin of Papermark's open-source and enterprise-licensed codebase.
The defense from the Corgi team followed a predictable trajectory. First came the assertion that the product was built entirely from scratch. Then, as community members pointed out identical UI elements and structural matches, the narrative shifted. The team announced they would "clean up" the product and update the UI, even as they maintained that the allegations of copying were false.
This incident is not just startup drama. It is a cautionary tale for the modern developer. Vibe coding, whether done by a human copying open-source repositories or an AI agent pulling from its training data, does not exempt you from copyright law.
The Illusion of "Inspiration" in the AI Era
In the traditional software world, copying someone else's proprietary or copyleft code and selling it as your own was a clear, deliberate boundary violation. Today, that boundary is blurred by the sheer speed of development. When developers rely on AI agents to generate entire features, they often lose visibility into where the underlying logic originates.
An LLM does not invent code out of thin air. It predicts tokens based on its training data, which includes millions of open-source repositories. If an agent is tasked with building a "secure document viewer with tracking," it may well output code that closely mirrors the dominant open-source solutions in that niche, such as Papermark.
If you ship that generated code without auditing it, you are still legally responsible. Copyright infringement generally does not require intent, so not knowing where the code came from is not a defense. Whether the code was copied by a junior developer looking for a shortcut or emitted by an AI agent optimizing for a prompt, the entity distributing the software carries the liability for license infringement.
The Legal Reality of Commercial Open Source
Many modern developer tools and SaaS platforms operate on a Commercial Open Source Software (COSS) model. They make their code publicly available on GitHub to encourage community trust, self-hosting, and contributions, but they protect their commercial viability using specific licenses.
These licenses typically fall into two categories:
- Copyleft Licenses (e.g., AGPLv3): These require derivative works to be released under the same license. The AGPL extends that obligation to network use: if you run a modified version as a service, you must offer its source code to the users who interact with it. You cannot take an AGPL-licensed project, wrap a new UI around it, and sell it as a closed-source SaaS.
- Source-Available or Enterprise Licenses: These allow users to view the code but explicitly forbid commercial exploitation or competitive hosting without a paid agreement.
If DataRoom is, as alleged, a reskin of Papermark, it runs afoul of these terms. Even if a product is launched as "mostly free," distributing or hosting code in violation of a copyleft or enterprise license constitutes copyright infringement.
The Developer Angle: How to Audit Your Codebase
If you are leading an engineering team or building a product using AI assistance, you must establish guardrails to ensure you do not accidentally ship infringing code.
1. Configure AI Code Generation Guardrails
Many enterprise code generation tools allow you to block suggestions that match public code. In GitHub Copilot, for example, administrators can set the "Suggestions matching public code" policy to blocked. Copilot then checks each suggestion, together with about 150 characters of surrounding code, against public code on GitHub and suppresses the suggestion if it finds a match or near match.
2. Implement Continuous License Scanning
Do not wait for a launch day call-out on social media to find out what licenses are hiding in your dependency tree. Integrate license compliance tools like FOSSA or Snyk directly into your CI/CD pipeline.
These tools scan your package manifests (such as package.json, go.mod, or Cargo.toml) and flag any dependencies that use restrictive licenses like GPL or AGPL. You can configure your build pipeline to fail if an unapproved license is detected:
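# Example using FOSSA CLI in a CI pipeline
# (assumes the FOSSA CLI is installed and FOSSA_API_KEY is set in the CI environment)
fossa analyze   # collect dependency data and upload it to FOSSA
fossa test      # exit non-zero if FOSSA reports policy issues, failing the build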
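To make the idea of a license gate concrete, here is a minimal sketch of the kind of check such tools automate. It is not how FOSSA or Snyk work internally. The function name, the denylist, and the assumption that every installed package declares a single SPDX identifier in the "license" field of its package.json are all illustrative. It only looks at the top level of node_modules and does not parse compound expressions such as "(MIT OR GPL-3.0-only)".

```python
import json
from pathlib import Path

# Hypothetical policy: SPDX identifiers this team will not ship without review.
DENIED_LICENSES = {
    "GPL-3.0-only",
    "GPL-3.0-or-later",
    "AGPL-3.0-only",
    "AGPL-3.0-or-later",
}


def find_flagged_packages(node_modules, denied=DENIED_LICENSES):
    """Return sorted (package name, license) pairs whose declared license is denied."""
    root = Path(node_modules)
    # Covers node_modules/pkg/package.json and node_modules/@scope/pkg/package.json.
    manifests = list(root.glob("*/package.json"))
    manifests += list(root.glob("@*/*/package.json"))

    flagged = []
    for manifest in manifests:
        try:
            data = json.loads(manifest.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or malformed manifest: skipped in this sketch
        if not isinstance(data, dict):
            continue
        license_id = data.get("license")
        if isinstance(license_id, str) and license_id in denied:
            flagged.append((data.get("name", manifest.parent.name), license_id))
    return sorted(flagged)
```

In a pipeline, a wrapper script could call find_flagged_packages("node_modules") and exit with a non-zero status when the returned list is not empty. A dedicated scanner remains the better choice in practice, because it also resolves transitive dependencies and missing or ambiguous license metadata.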
3. Perform AST-Based Code Audits
If you suspect that a contractor, a team member, or an AI agent has copied code from an existing open-source project, simple text searches are easily bypassed by changing variable names. Instead, use Abstract Syntax Tree (AST) comparison tools.
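An AST represents the structural logic of the code rather than the literal text. A lighter-weight relative of this approach is token-based clone detection. jscpd, a copy/paste detector that supports many languages, compares token sequences rather than raw lines, so it can find duplicates across codebases even if formatting has changed. Renamed identifiers can still hide a clone from a token-based tool, so treat a clean report as a first pass rather than proof of originality.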
To run a quick check for duplicate blocks in a Node.js project:
npm install -g jscpd
jscpd ./src ./path-to-suspected-source-code
This prints a report showing the percentage of duplicated code and the location of each matching block.
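To see why structural comparison survives renaming where a text search does not, consider the following minimal sketch. It uses Python's standard ast module and therefore only handles Python source. For a JavaScript or TypeScript codebase you would need a parser for that language. The class and function names are hypothetical, and the normalization is deliberately crude: it renames variables, arguments, and function names to positional placeholders and leaves everything else alone.

```python
import ast


class _IdentifierNormalizer(ast.NodeTransformer):
    """Rename identifiers to v0, v1, ... in order of first appearance."""

    def __init__(self):
        self._names = {}

    def _placeholder(self, name):
        # The same original name always maps to the same placeholder.
        return self._names.setdefault(name, f"v{len(self._names)}")

    def visit_FunctionDef(self, node):
        node.name = self._placeholder(node.name)
        self.generic_visit(node)
        return node

    def visit_arg(self, node):
        node.arg = self._placeholder(node.arg)
        self.generic_visit(node)
        return node

    def visit_Name(self, node):
        node.id = self._placeholder(node.id)
        return node


def structural_fingerprint(source):
    """Dump the normalized tree; line numbers, comments, and layout are excluded."""
    tree = _IdentifierNormalizer().visit(ast.parse(source))
    return ast.dump(tree, include_attributes=False)


def same_structure(source_a, source_b):
    """True when both snippets have identical ASTs after identifier normalization."""
    return structural_fingerprint(source_a) == structural_fingerprint(source_b)
```

Two functions that differ only in names, comments, and blank lines compare as equal under same_structure, while changing a constant or an operator makes them differ. An exact fingerprint match like this is stricter than what a real audit needs, since copied code is usually also reordered or partially edited, so it is best read as an illustration of the principle rather than a detection tool.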
The True Cost of Skipping the Audit
The appeal of vibe coding is obvious: it minimizes the time between idea and execution. But the Papermark incident shows that speed without governance is a liability.
When you bypass the engineering discipline of code reviews, dependency auditing, and license verification, you are not actually moving fast. You are simply deferring the cost of compliance. When that debt comes due, it rarely arrives as a quiet bug report. It arrives as a public accusation of fraud, a damaged reputation, and a forced teardown of your product.
If you are going to build on the shoulders of open-source giants, you have to play by their rules. Read the license, audit your code, and remember that "the AI wrote it" is not a get-out-of-jail-free card.
Lenn writes about cloud platforms, Kubernetes internals, and the infrastructure decisions that quietly make or break engineering organizations. Based in Berlin's vibrant tech scene, they have a talent for turning dense platform-engineering topics into prose that people actually finish reading.