dataset catalog

AI-ready BIM datasets

Explore synthetic BIM datasets built as semantic IFC graphs and packaged for graph learning, classical ML and BIM LLM training. These catalog entries describe procedurally generated buildings, not real projects. The core export of each dataset stays compatible with Buildata.

Each catalog entry highlights the synthetic origin of the data, the dataset layers, supported AI tasks and the primary file formats available in the package.

Synthetic origin

Every dataset in the catalog is procedurally generated. Buildata does not sell or expose real-building BIM models in the public catalog.

Core dataset

metadata.json, nodes.json, edges.json, tasks.json, statistics.json and dataset_card.md remain the canonical Buildata export.
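
As a rough illustration, the core export could be checked and loaded as follows. This is a minimal sketch, not Buildata tooling: the six file names come from the text above, while the helper names `missing_core_files` and `load_core_json` and the idea of a flat package directory are assumptions made for the example. Nothing is assumed about the contents of the JSON files.

```python
import json
from pathlib import Path

# File names as listed in the catalog text above.
CORE_FILES = (
    "metadata.json",
    "nodes.json",
    "edges.json",
    "tasks.json",
    "statistics.json",
    "dataset_card.md",
)


def missing_core_files(package_dir):
    """Return the core file names that are absent from package_dir.

    Hypothetical helper: assumes all core files sit directly in one folder.
    """
    root = Path(package_dir)
    return [name for name in CORE_FILES if not (root / name).is_file()]


def load_core_json(package_dir):
    """Parse each core .json file and key the result by file stem.

    Hypothetical helper: dataset_card.md is skipped because it is not JSON.
    """
    root = Path(package_dir)
    loaded = {}
    for name in CORE_FILES:
        if name.endswith(".json"):
            with open(root / name, encoding="utf-8") as handle:
                loaded[Path(name).stem] = json.load(handle)
    return loaded
```

Calling `missing_core_files` before `load_core_json` gives a clear list of absent files instead of a `FileNotFoundError` partway through loading.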

Derived AI layers

Optional ML, LLM and graph exports are generated from the same source data as the core dataset, so no information is lost and compatibility with the core export stays intact.
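
To make the idea of a derived layer concrete, the sketch below builds an integer edge index, a layout commonly used in graph learning, from already-loaded node and edge records. The record layout is an assumption: the field names `id`, `source` and `target` are hypothetical and are exposed as parameters so they can be changed to match the actual files. The function only reads its inputs, which mirrors the point that derived layers leave the source data untouched.

```python
def to_edge_index(nodes, edges, id_key="id", src_key="source", dst_key="target"):
    """Derive a [sources, targets] index pair from node and edge records.

    Hypothetical sketch: the default key names are assumptions, not a
    documented schema. The input lists are read but never modified.
    """
    # Map each node identifier to its position in the node list.
    index_of = {node[id_key]: position for position, node in enumerate(nodes)}
    # Translate every edge endpoint into the matching node position.
    sources = [index_of[edge[src_key]] for edge in edges]
    targets = [index_of[edge[dst_key]] for edge in edges]
    return index_of, [sources, targets]
```

An edge that refers to an unknown node identifier raises a `KeyError` here, which surfaces inconsistent input early rather than silently dropping the edge.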