AEC-Bench

A 2025 multimodal benchmark for evaluating AI agents on real-world AEC tasks including drawing understanding, cross-sheet reasoning, and project-level document coordination.

Definition

AEC-Bench is a rigorous evaluation framework published in 2025 for testing multimodal AI agents on authentic architecture, engineering, and construction tasks. The benchmark fills a critical gap: while general AI benchmarks like MMMU and DocVQA evaluate broad visual reasoning, none capture the specific challenges of construction documentation—tightly packed drawing annotations, cross-sheet reference chains, project-level discipline coordination, and the specialized visual grammar of AEC documents. Tasks include drawing understanding (reading dimensions, identifying elements, interpreting symbols), cross-sheet reasoning (following reference bubbles across detail, plan, and elevation sheets), and project-level coordination (identifying conflicts between structural, architectural, and MEP drawings). AEC-Bench reveals that state-of-the-art multimodal LLMs still struggle with AEC-specific tasks that experienced engineers handle routinely—particularly cross-sheet reasoning and dense annotation interpretation. Results are already influencing the architecture of platforms like BIMgent and multimodal construction document AI tools.
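The page does not specify AEC-Bench's exact answer format or metric, but a benchmark of this kind is typically scored per task category. The sketch below is a hypothetical illustration of per-category exact-match scoring; the task records, category names, and answer strings are all invented for the example and are not from AEC-Bench itself.

```python
from collections import defaultdict

# Hypothetical task records: (category, model_answer, reference_answer).
# These categories mirror the three task families described above.
TASKS = [
    ("drawing_understanding", "W14x30", "W14x30"),
    ("cross_sheet_reasoning", "Detail 5/A-501", "Detail 3/A-501"),
    ("project_coordination", "clash at grid B-4", "clash at grid B-4"),
]

def score_by_category(tasks):
    """Compute exact-match accuracy for each task category."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for category, predicted, reference in tasks:
        total[category] += 1
        if predicted.strip().lower() == reference.strip().lower():
            correct[category] += 1
    return {c: correct[c] / total[c] for c in total}

print(score_by_category(TASKS))
```

Reporting accuracy per category rather than a single aggregate score is what lets a benchmark like this surface the gap the definition describes: a model can do well on single-sheet drawing understanding while failing cross-sheet reasoning.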

Examples

1. Testing whether an AI can correctly identify that a window schedule references a detail contradicting the elevation
2. Benchmarking five multimodal models on extracting and cross-checking structural beam sizes across 20 drawing sheets
3. Using AEC-Bench scores to select the best foundation model for a construction document AI platform

Nomic Use Cases

See how Nomic applies this in production AEC workflows:


Automated Drawing Review: Automatically review drawings against building codes, internal standards, and client requirements.

Automated Code Compliance: Check drawings against 380+ building codes and standards with cited answers.


See AEC-Bench in action

Nomic is purpose-built AI for architecture, engineering, and construction. Connect your project data and start getting answers in minutes.