The review process
When you upload a PDF, it passes through a multi-agent pipeline. Each agent focuses on one aspect of your requirements specification. The entire process runs on EU-based infrastructure — your document never leaves Europe.
Legend: ■ Rule-based (instant) · ■ AI model (2–4 min) · ■ Extraction · ■ Merge
How your PDF is read
Before the review agents run, the system needs to find and extract the requirements section from your PDF. This is harder than it sounds — reports typically contain introductions, methodology, test results, and appendices alongside the actual requirements.
The system scans every page for requirements signals: section headings ("Requirements", "Eisen"), table headers with requirement-related columns, bold headings like "Functional Requirements", and requirement IDs (FR-01, NF1, etc.). It scores each page and selects the best match. It also auto-detects your project domain and finds the problem statement section for separate review.
When it can't confidently find your requirements, it bounces the submission back with formatting tips rather than reviewing the wrong section.
What each agent does
Six specialized agents review your document sequentially. The first two are instant (rule-based), the next four use AI models. After merging, a separate ML model scores your content quality. The rule-based checks also detect redundant requirements by comparing all pairs for content overlap.
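The pairwise redundancy check can be pictured as a word-overlap comparison over every pair of requirements. The similarity metric and threshold below are stand-ins for illustration; the actual check may use a different overlap measure.

```python
from itertools import combinations

def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two requirement texts."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def find_redundant(reqs: dict[str, str], threshold: float = 0.8) -> list[tuple[str, str]]:
    """Flag every pair of requirements whose content overlap exceeds the threshold."""
    return [(i, j) for (i, a), (j, b) in combinations(reqs.items(), 2)
            if jaccard(a, b) >= threshold]
```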
The six review steps
| Step | What it checks | How |
|---|---|---|
| 1. Structure | Table format, requirement IDs, MoSCoW priorities, measurable thresholds, F/NF split, verification methods | Rule-based (instant) |
| 2. INCOSE rules | Vague terms, escape clauses, subjective language, compound requirements, EARS pattern compliance, redundancy detection, SMART time-bound check, decomposition depth | Rule-based (instant) |
| 3. Quality | 13 quality anti-patterns per requirement with SMART labels: ambiguity, untestability, missing context, boilerplate, unrealistic absolutes, no evidence of RE process | AI model (Gemma 27B) |
| 4. Coverage | Missing non-functional requirements for your domain (ML, embedded, embedded vision) | AI model (Phi-4 14B) |
| 5. Traceability | Can each requirement be traced to a test? V-model alignment. | AI model (Phi-4 14B) |
| 6. Alignment | Do your requirements actually address the problem statement? Finds uncovered needs, orphan requirements, and scope mismatches. | AI model (Gemma 27B) |
| After merge | Content quality: how your writing compares to 122 human-scored specifications | ML classifier |
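The sequential flow in the table above can be sketched as a simple agent loop. The `Agent` structure and the toy checks are hypothetical; the real orchestration and merge logic are internal.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    name: str
    kind: str                    # "rule-based" or "ai-model"
    run: Callable[[str], dict]   # takes the extracted requirements text

def review(text: str, agents: list[Agent]) -> dict[str, dict]:
    """Run each agent in order and collect its findings; a separate merge
    step (not shown) combines them into scores and the feedback email."""
    return {a.name: a.run(text) for a in agents}
```

Rule-based agents return instantly; the AI-model agents account for the 2–4 minute turnaround.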
How scoring works
Your overall score is the average of four structure categories. Each category is determined by which structural elements are present in your document — not by the AI's opinion.
The content quality score is computed by a separate machine learning model that compares your requirements text to the best specifications we've reviewed. When it's higher than the overall score, it means your writing is stronger than your structure suggests — improving the format will raise your score.
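The arithmetic is deliberately simple, as this sketch shows. The category names and the 0–10 scale are assumptions for the example; only the averaging rule and the "content higher than overall" interpretation come from the description above.

```python
def overall_score(categories: dict[str, float]) -> float:
    """Overall score: the plain average of the four structure categories."""
    return sum(categories.values()) / len(categories)

def reading(overall: float, content: float) -> str:
    """Interpret the gap between the two scores."""
    if content > overall:
        return "writing is stronger than structure: improve the format"
    return "structure is on par with or ahead of the writing"

# Hypothetical category scores on a 0-10 scale (the real scale may differ):
cats = {"structure": 8.0, "ids": 6.0, "priorities": 4.0, "measurability": 6.0}
```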
Why multiple models?
Different tasks need different tools. Using one large model for everything would be slower, less reliable, and more expensive.
| Approach | Used for | Why |
|---|---|---|
| Rule-based (Python regex + heuristics) | Structure detection, INCOSE rule checks | Deterministic: same input always gives the same result. Instant (<0.1 s). An AI model would be overkill for checking "does this document have a table?" |
| Gemma 3 27B (large language model, local) | Quality review, problem statement–requirements alignment | Needs deep understanding of each requirement's meaning, context, and testability. The alignment check requires nuanced semantic matching between problem statement and requirements. The largest model produces the most accurate results. |
| Phi-4 14B (language model, local) | Coverage gaps, traceability | Focused checklist tasks that don't need the full 27B capacity. Faster to load, freeing GPU time for the quality review. |
| ML classifier (sentence embeddings + Ridge regression) | Content quality score | Compares your text to 122 human-scored specifications using semantic similarity. Not a language model: it reads the overall shape of your document, not individual sentences. Trained on real engineering reports. |
The AI models both analyze your requirements and write the feedback text (summary, findings, suggestions, rewrites) in a single pass. There is no separate "writing" model — the analysis and the explanation come from the same model run. A Python merge step then combines the outputs from all agents, computes your scores, and formats the email.
All models run locally on a GPU server in Europe. No data is sent to external AI services.
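The embeddings-plus-Ridge approach behind the content quality score can be sketched with a toy stand-in. The `embed` function below is a hashed bag-of-words placeholder, not the sentence-embedding model the system actually uses, and the training texts and scores are invented; only the Ridge-regression idea reflects the description above.

```python
import numpy as np

def embed(text: str, dim: int = 16) -> np.ndarray:
    """Stand-in embedding: hash words into a fixed-size bag-of-words
    vector (the production system uses real sentence embeddings)."""
    v = np.zeros(dim)
    for word in text.lower().split():
        v[hash(word) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

def fit_ridge(X: np.ndarray, y: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Closed-form ridge regression: w = (X^T X + alpha*I)^-1 X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

# Fit on (toy) scored specifications, then score a new submission
# by projecting its embedding onto the learned weights.
texts = ["the system shall respond within 200 ms",
         "it should be fast and nice"]
scores = np.array([9.0, 3.0])
w = fit_ridge(np.vstack([embed(t) for t in texts]), scores)
```

The real model was fit on 122 human-scored specifications rather than a two-item toy set, but the scoring mechanics are the same: embed, then apply a linear model.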
EARS requirement patterns
The EARS (Easy Approach to Requirements Syntax) patterns are used by Airbus, Bosch, NASA, and Rolls-Royce. We check what percentage of your requirements follow these patterns.
| Pattern | Template | Example |
|---|---|---|
| Ubiquitous | The <system> shall <response> | The system shall respond within 200 ms |
| Event-driven | When <trigger>, the <system> shall <response> | When the user presses stop, the system shall halt playback |
| State-driven | While <condition>, the <system> shall <response> | While in standby, the device shall consume less than 1 W |
| Optional | Where <feature>, the <system> shall <response> | Where GPS is available, the system shall log position |
| Unwanted | If <error>, then the <system> shall <response> | If the connection fails, then the system shall retry 3 times |
| Complex | While <state>, when <trigger>, the <system> shall <response> | While connected, when the user presses stop, the system shall disconnect |
Reading your feedback email
Your email follows a three-phase structure based on educational research (Hattie & Timperley, 2007):
| Section | What it tells you | What to do with it |
|---|---|---|
| What we look for | The criteria: table structure, IDs, priorities, measurability, testability | Compare your document against these criteria before reading further |
| Structure checklist | Which structural elements are present or missing, with an explanation of why each matters | Fix the missing elements first — they have the biggest impact on your score |
| Scores | Overall (average of 4 structure categories) + Content quality (how your writing compares) | If Content is higher than Overall, your writing is good — focus on improving the format |
| Top 3 improvements | The three highest-impact changes you can make | Start here. These are your next steps. |
| Improvement suggestions | Per-requirement findings with severity, explanation, suggestion, and standards reference | Work through these after fixing the top 3. Each links to the relevant standard. |
| Specification profile | EARS syntax %, time-bound %, decomposition depth, redundancy, alignment score, V-model readiness | Background stats showing how your spec compares to best practices. Focus on items at 0%. |
Tip: Don't try to fix everything at once. Focus on the top 3 improvements, resubmit, and iterate.
Standards we reference
Findings in your feedback are linked to international requirements engineering standards:
- IEEE 29148 — International standard for requirements engineering (ISO/IEC/IEEE 29148:2018)
- INCOSE Guide — 42 rules for writing quality requirements
- EARS — Easy Approach to Requirements Syntax (Rolls-Royce, 2009)
- IREB — International Requirements Engineering Board quality criteria (consistency, non-redundancy, completeness, traceability)
- V-model — Systems development model linking requirements to verification