Large-scale synthetic IFC building graphs for machine learning

Buildata: A foundation dataset for BIM AI

Buildata is a large-scale synthetic dataset of IFC (Industry Foundation Classes) building graphs for machine learning. Rather than treating 3D geometry as the primary unit, Buildata organizes each sample as a building graph of BIM elements, relationships, attributes and metadata.

1 sample = 1 building: nodes, edges, metadata

building graph dataset

01 Nodes

Each building contains BIM elements represented as IFC entities such as walls, doors, windows, slabs, spaces and structural components.

02 Edges

Relationships describe how building elements connect, contain, host, support or bound one another inside the graph.

03 Features

Attributes, property sets and metadata make the graph useful for classification, reasoning, prediction and benchmarking workflows.

Unit: 1 sample = 1 building
Graph: Nodes, edges, hierarchy and metadata
Scope: IFC semantic structures for machine learning
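
To make the "1 sample = 1 building" unit concrete, here is a minimal in-memory sketch of a building graph in Python. The class names, field names (`node_id`, `ifc_class`, `relation`), the edge directions and the toy values are all assumptions for illustration; they are not taken from Buildata's published schema.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str                      # hypothetical identifier
    ifc_class: str                    # IFC entity name, e.g. "IfcWall"
    attributes: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str                       # id of the node the edge starts from
    target: str                       # id of the node the edge points to
    relation: str                     # e.g. "bounded_by", "hosted_by"


@dataclass
class BuildingGraph:
    metadata: dict
    nodes: list
    edges: list

    def neighbors(self, node_id, relation=None):
        """Return target ids of edges leaving node_id, optionally filtered by relation."""
        return [
            e.target
            for e in self.edges
            if e.source == node_id and (relation is None or e.relation == relation)
        ]


# Toy building: one space bounded by a wall, and a door hosted by that wall.
sample = BuildingGraph(
    metadata={"typology": "residential"},
    nodes=[
        Node("space-1", "IfcSpace"),
        Node("wall-1", "IfcWall"),
        Node("door-1", "IfcDoor", {"width_mm": 900}),
    ],
    edges=[
        Edge("space-1", "wall-1", "bounded_by"),
        Edge("door-1", "wall-1", "hosted_by"),
    ],
)

print(sample.neighbors("door-1", "hosted_by"))
```

Keeping nodes, edges and metadata as separate fields mirrors the way the page describes a sample, so each part can be inspected or replaced independently.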

what buildata contains

A building graph dataset, not a 3D model library

Buildata is designed for AI systems that need to learn BIM semantic structures. Each sample represents a complete building organized as a graph, with entities, relationships, hierarchy and structured attributes aligned with IFC logic.

Graph learning

Train models on nodes and edges that capture BIM relationships such as contains, bounded_by, hosted_by and supported_by.
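
As a rough sketch of how typed relationships could be prepared for a graph model, the snippet below groups edges by relation name and converts node ids to integer indices. The `(source, relation, target)` tuple layout, the helper name `build_edge_index` and the toy ids are assumptions, not Buildata's actual format.

```python
from collections import defaultdict


def build_edge_index(node_ids, edges):
    """Group edges by relation into parallel (sources, targets) index lists."""
    index_of = {nid: i for i, nid in enumerate(node_ids)}
    by_relation = defaultdict(lambda: ([], []))
    for source, relation, target in edges:
        sources, targets = by_relation[relation]
        sources.append(index_of[source])
        targets.append(index_of[target])
    return dict(by_relation)


# Toy data using the relation names listed above; ids are hypothetical.
node_ids = ["storey-1", "space-1", "wall-1", "door-1", "slab-1"]
edges = [
    ("storey-1", "contains", "space-1"),
    ("space-1", "bounded_by", "wall-1"),
    ("door-1", "hosted_by", "wall-1"),
    ("wall-1", "supported_by", "slab-1"),
    ("storey-1", "contains", "wall-1"),
]

edge_index = build_edge_index(node_ids, edges)
for relation, (sources, targets) in edge_index.items():
    print(relation, sources, targets)
```

One index pair per relation is the shape that heterogeneous graph libraries typically expect, so each relation can later be given its own message-passing weights.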

Semantic classification

Learn to identify IFC entities, predefined types and semantic patterns across complete buildings.

Attribute prediction

Use features and property sets to test metadata completion, validation and BIM reasoning workflows.
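
One way a metadata completion task could be set up is to hide a known attribute on a share of the nodes and keep the hidden values as prediction targets. The sketch below assumes nodes are dictionaries with `id`, `ifc_class` and `attributes` keys and uses a hypothetical `is_external` attribute; none of these names come from Buildata's documentation.

```python
import random


def mask_attribute(nodes, key, fraction, seed=0):
    """Hide `key` on a random share of the nodes that have it.

    Returns (masked_nodes, targets), where targets maps node id to the
    hidden value. The input nodes are left unchanged.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    candidates = [i for i, n in enumerate(nodes) if key in n["attributes"]]
    hidden = set(rng.sample(candidates, round(len(candidates) * fraction)))

    masked, targets = [], {}
    for i, node in enumerate(nodes):
        attrs = dict(node["attributes"])      # copy so the original survives
        if i in hidden:
            targets[node["id"]] = attrs.pop(key)
        masked.append({**node, "attributes": attrs})
    return masked, targets


# Toy nodes; the attribute name and values are illustrative only.
nodes = [
    {"id": "wall-1", "ifc_class": "IfcWall", "attributes": {"is_external": True}},
    {"id": "wall-2", "ifc_class": "IfcWall", "attributes": {"is_external": False}},
    {"id": "wall-3", "ifc_class": "IfcWall", "attributes": {"is_external": True}},
    {"id": "wall-4", "ifc_class": "IfcWall", "attributes": {"is_external": False}},
    {"id": "door-1", "ifc_class": "IfcDoor", "attributes": {}},
]

masked, targets = mask_attribute(nodes, "is_external", fraction=0.5)
print(sorted(targets))
```

A model would then see `masked` as input and be scored on how well it recovers the values in `targets`.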

featured datasets

Explore IFC building graph datasets


technology

Structured for reproducible BIM AI experiments

Buildata organizes each building sample into graph-ready files that separate metadata, nodes, edges, hierarchy and schema information for reproducibility.

01. metadata.json

Describes the building sample, typology, schema, generator profile, counts and reproducibility settings.

02. nodes.json

Represents BIM elements as IFC nodes with attributes, features and selected property sets.

03. edges.json

Captures semantic relationships that convert the building into a BIM graph for machine learning.

04. hierarchy.json

Preserves the IFC hierarchy from project and site down to building, storey and space levels.
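
The four files above could be read into one Python object along the following lines. This is only a sketch: the file names come from the list above, but the JSON structure inside each file (lists of node and edge dictionaries with `id`, `source`, `target` and `relation` keys, a nested hierarchy, a `typology` field) is assumed, and the toy sample is written to a temporary directory purely so the example runs on its own.

```python
import json
import tempfile
from pathlib import Path

FILES = ("metadata", "nodes", "edges", "hierarchy")


def load_sample(sample_dir):
    """Read the four JSON files of one building sample into a dict."""
    sample_dir = Path(sample_dir)
    return {
        name: json.loads((sample_dir / f"{name}.json").read_text(encoding="utf-8"))
        for name in FILES
    }


def dangling_edges(sample):
    """Return edges whose source or target id is missing from nodes."""
    known = {node["id"] for node in sample["nodes"]}
    return [
        e for e in sample["edges"]
        if e["source"] not in known or e["target"] not in known
    ]


# Toy sample with an assumed layout; the second edge is deliberately broken.
toy = {
    "metadata": {"typology": "office"},
    "nodes": [
        {"id": "wall-1", "ifc_class": "IfcWall"},
        {"id": "door-1", "ifc_class": "IfcDoor"},
    ],
    "edges": [
        {"source": "door-1", "target": "wall-1", "relation": "hosted_by"},
        {"source": "door-1", "target": "wall-9", "relation": "hosted_by"},
    ],
    "hierarchy": {"project": {"site": {"building": {"storeys": []}}}},
}

with tempfile.TemporaryDirectory() as tmp:
    for name, payload in toy.items():
        (Path(tmp) / f"{name}.json").write_text(json.dumps(payload), encoding="utf-8")
    sample = load_sample(tmp)

dangling = dangling_edges(sample)
print(len(dangling), "dangling edge(s)")
```

A consistency check like `dangling_edges` is a cheap first step before training, since an edge that points at a missing node usually signals a broken or truncated sample.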

explore buildata

Start with sample building graphs for BIM AI

Download the first sample, inspect its graph structure and evaluate how IFC semantic data can support machine learning workflows.