LLMs Move Into CAD with Progressive Refinement Pipelines
A new research paper introduces PR-CAD, unifying text-to-CAD generation and editing into a single reinforcement learning agent.
For software developers building tools for engineering, manufacturing, or 3D design, computer-aided design (CAD) has long been difficult to automate. Traditional CAD modeling is highly manual, relying on precise geometric constraints, parametric histories, and specialized desktop software. Generative AI has advanced quickly in text-to-image and text-to-3D mesh generation, but translating natural language into precise, editable CAD files has remained a bottleneck.
The core issue is that existing text-to-CAD tools typically treat model generation and model editing as two separate problems. You either generate a static model from scratch or edit an existing one through a different pipeline. This disjointed approach makes iterative design, which is how human engineers actually work, hard to automate.
A new research paper published on arXiv by Jiyuan An and a team of researchers introduces PR-CAD (Progressive Refinement CAD), a framework designed to bridge this gap. By unifying generation and editing into a single workflow, the system is meant to support agents that iteratively refine CAD models based on natural language feedback.
The Problem with One-Shot CAD Generation
In a typical developer workflow, generating an asset is only the first step. If an LLM generates a piece of code, you don't just throw it away when there is a bug; you feed the error message back to the LLM to patch it.
With CAD, however, representation is everything. Traditional generative models often output dense boundary representations (B-reps) or polygonal meshes that lack an underlying parametric history. If a user asks to "make the mounting holes 5 mm wider," a standard text-to-3D model cannot easily locate those specific holes, let alone adjust their parameters without distorting the surrounding geometry.
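To make the contrast concrete, here is a minimal Python sketch of why a parametric history makes that kind of request tractable. The `Feature` class, the feature names, and the `widen_holes` helper are all hypothetical illustrations, not anything taken from the paper. The point is only that a named, parameterized feature can be found and changed without touching the rest of the model.

```python
from dataclasses import dataclass, replace

@dataclass
class Feature:
    """One step in a hypothetical parametric history (illustrative only)."""
    name: str
    kind: str
    params: dict

def widen_holes(history, name_prefix, delta_mm):
    """Return a new history in which matching holes are delta_mm wider."""
    edited = []
    for feat in history:
        if feat.kind == "hole" and feat.name.startswith(name_prefix):
            new_params = {**feat.params,
                          "diameter_mm": feat.params["diameter_mm"] + delta_mm}
            feat = replace(feat, params=new_params)  # copy, do not mutate
        edited.append(feat)
    return edited

# A toy bracket: a plate, two mounting holes, and an unrelated cut.
history = [
    Feature("base_plate", "extrude", {"width_mm": 80.0, "height_mm": 6.0}),
    Feature("mount_hole_1", "hole", {"x_mm": 10.0, "diameter_mm": 4.0}),
    Feature("mount_hole_2", "hole", {"x_mm": 70.0, "diameter_mm": 4.0}),
    Feature("cable_slot", "cut", {"x_mm": 40.0, "length_mm": 20.0}),
]

# "Make the mounting holes 5 mm wider" becomes a localized parameter change.
edited = widen_holes(history, "mount_hole", 5.0)
```

A raw mesh has no equivalent of `mount_hole_1` to look up, which is the gap the article describes.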
PR-CAD addresses this by treating CAD generation not as a single-shot translation task, but as an ongoing, interactive dialogue.
Inside the PR-CAD Architecture
According to the researchers, PR-CAD achieves its "all-in-one" generation and editing capability through three core components:
- An LLM-Friendly CAD Representation: Standard CAD formats are designed for geometric kernels, not tokenizers. PR-CAD uses a CAD representation optimized for LLM reasoning, so the model can parse and output geometric commands more reliably.
- A High-Fidelity Interaction Dataset: To train the model, the researchers curated a dataset spanning the entire CAD lifecycle. This dataset includes multiple CAD representations, qualitative descriptions (e.g., "make it look sleeker"), quantitative descriptions (e.g., "increase the length by 12 mm"), and systematically defined edit operations.
- An RL-Enhanced Reasoning Agent: Instead of relying purely on supervised fine-tuning, PR-CAD uses a reinforcement learning-enhanced reasoning framework. This single agent is responsible for three critical tasks: understanding user intent, estimating exact geometric parameters, and localizing where edits need to occur on the model.
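The article does not specify PR-CAD's actual representation or dataset schema, so the following Python sketch is purely an assumption about what the first two components could look like. It uses a made-up line-oriented command format (`op key=value ...`) that is easy to tokenize and round-trip, plus a hypothetical training record that pairs a quantitative instruction with a structured edit operation. Every field name here is invented for illustration.

```python
# Hypothetical text format: one command per line, key=value arguments.
PROGRAM = """
sketch shape=rect w=80 d=50
extrude h=6
hole x=10 y=10 dia=4
"""

def parse_program(text):
    """Parse the toy format into a list of (op, args) tuples."""
    commands = []
    for line in text.strip().splitlines():
        op, *pairs = line.split()
        args = {}
        for pair in pairs:
            key, value = pair.split("=", 1)
            try:
                args[key] = float(value)   # numeric parameter
            except ValueError:
                args[key] = value          # symbolic parameter, e.g. a shape name
        commands.append((op, args))
    return commands

def serialize_program(commands):
    """Inverse of parse_program for the toy format."""
    lines = []
    for op, args in commands:
        parts = [op]
        for key, value in args.items():
            text = f"{value:g}" if isinstance(value, float) else str(value)
            parts.append(f"{key}={text}")
        lines.append(" ".join(parts))
    return "\n".join(lines)

def apply_edit(commands, edit):
    """Apply one structured edit without mutating the input commands."""
    updated = [(op, dict(args)) for op, args in commands]
    _, args = updated[edit["command_index"]]
    args[edit["param"]] += edit["delta"]
    return updated

commands = parse_program(PROGRAM)

# Hypothetical dataset record: instruction, its type, and the edit it implies.
record = {
    "instruction": "increase the plate thickness by 2 mm",
    "instruction_kind": "quantitative",
    "edit": {"command_index": 1, "param": "h", "delta": 2.0},
    "before": serialize_program(commands),
}
record["after"] = serialize_program(apply_edit(commands, record["edit"]))
```

A qualitative record would carry the same fields with `instruction_kind` set to `"qualitative"` and no explicit number in the instruction, leaving the parameter value for the model to estimate.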
By combining these elements, the agent can handle both the initial creation of a model and the subsequent, highly localized edits required to refine it.
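The control flow this implies can be sketched in Python with a rule-based stand-in for the agent. Nothing below reflects how PR-CAD is implemented: the three helper functions are hypothetical placeholders for the intent, parameter, and localization steps the paper assigns to a learned model, and the model itself is reduced to a dictionary of dimensions in millimetres.

```python
import re

def understand_intent(instruction):
    """Stand-in for intent understanding: classify the request."""
    text = instruction.lower()
    if "increase" in text:
        return "increase"
    if "decrease" in text:
        return "decrease"
    return "create"

def estimate_parameter(instruction):
    """Stand-in for parameter estimation: pull out a value in mm."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*mm", instruction)
    return float(match.group(1)) if match else None

def localize(model, instruction):
    """Stand-in for edit localization: find which dimension is meant."""
    text = instruction.lower()
    for key in model:
        if key in text:
            return key
    return None

def refine(model, instruction):
    """One step of the loop: generate if there is no model, else edit it."""
    if model is None:
        # Placeholder "generation": a fixed starting block.
        return {"length": 40.0, "width": 20.0, "height": 5.0}
    intent = understand_intent(instruction)
    amount = estimate_parameter(instruction)
    target = localize(model, instruction)
    if intent == "create" or amount is None or target is None:
        return dict(model)  # nothing actionable; leave the model unchanged
    sign = 1.0 if intent == "increase" else -1.0
    updated = dict(model)
    updated[target] += sign * amount
    return updated

# The same entry point handles the initial generation and later edits.
model = None
for instruction in [
    "create a bracket",
    "increase the length by 12 mm",
    "decrease the height by 1.5 mm",
]:
    model = refine(model, instruction)
```

The design point is the single `refine` entry point: generation is just the first turn of the same dialogue that later carries the edits.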
Mutual Reinforcement: Why Unification Works
One of the paper's central findings is that generation and editing are not just compatible tasks; they actively help each other. The researchers observed a strong "mutual reinforcement" effect when training the model on both tasks simultaneously.
Learning how to edit a model teaches the LLM about the underlying structure of CAD geometry, which in turn makes its initial generations more robust and structurally sound. Similarly, understanding how to build a model from scratch gives the agent the context it needs to perform precise edit localizations later on.
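At the data level, training on both tasks at once means each batch contains both kinds of example. The Python sketch below shows one simple way to interleave them. It is an assumption for illustration only: the article does not describe the paper's sampling scheme or its reinforcement learning setup, and the `mixed_batches` helper and `edit_ratio` parameter are hypothetical.

```python
import random

def mixed_batches(generation, editing, batch_size, edit_ratio, seed=0):
    """Yield shuffled batches that mix generation and editing examples.

    edit_ratio is the fraction of each batch drawn from the editing set.
    """
    rng = random.Random(seed)
    n_edit = round(batch_size * edit_ratio)
    n_gen = batch_size - n_edit
    while True:
        batch = rng.sample(editing, n_edit) + rng.sample(generation, n_gen)
        rng.shuffle(batch)  # avoid a fixed task order inside the batch
        yield batch

# Toy examples; real records would carry CAD programs and instructions.
generation = [{"task": "generate", "id": i} for i in range(4)]
editing = [{"task": "edit", "id": i} for i in range(4)]

batches = mixed_batches(generation, editing, batch_size=4, edit_ratio=0.5)
first_batch = next(batches)
```

With `edit_ratio=0.5` and `batch_size=4`, every batch holds two generation and two editing examples, so a single set of model weights sees both tasks in each update.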
This synergy also extends across instruction types. The model showed improved performance on both qualitative, subjective requests and precise, quantitative engineering constraints. On public benchmarks, the authors report that this unified approach achieves state-of-the-art controllability and faithfulness in both generation and refinement scenarios.
What This Means for Developers
For developers looking to build the next generation of design automation tools, PR-CAD points to a future where CAD software can be driven through natural language interfaces.
Instead of maintaining fragile, hard-coded scripts to alter CAD files, developers could use a unified LLM pipeline as middleware between user intent and the CAD model. That opens up possibilities for integrating CAD generation into broader software ecosystems, such as automated hardware testing pipelines, generative architecture tools, or interactive AI assistants embedded within existing CAD suites.
Priya covers AI frameworks, developer productivity tooling, and the startup ecosystem across South and Southeast Asia, bringing a researcher's rigour and a practitioner's empathy to every story. She is deeply sceptical of benchmarks and asks hard questions so her readers don't have to.
Discussion (2 comments)
i'm both excited and terrified by the idea of pr-cad, mainly because i can already imagine the compute costs of training a reinforcement learning agent to handle cad files - guess it's time to upgrade from running models on a potato
i'm curious to see how pr-cad handles the nuances of parametric histories, as that's often where manual cad work gets really tricky - can the reinforcement learning agent learn to preserve those relationships during refinement?