Multimodal Architecture

Ongoing Project

Harvard Graduate School of Design Thesis 2022

Throughout architectural history, the tools of design have shaped the culture of architectural production. While drawings and visual imagery often act as the primary form of contemporary representation, architecture cannot be reduced to a single mode. The cyclical tension between the conceptual and the material relies on a multimodal process originating in semantics. Built form and text can both be seen as forms of architecture that rely on a necessary conceptual dimension. This thesis asks what opportunities arise from the cyclical translations and differences between these modes of representation, language in particular, through the use of machine-learning algorithms.

Emergent multimodal neural networks are capable of learning visual concepts from natural language supervision. They can be instructed in everyday language to generate images, which marks a historical moment of convergence between image and text processing. These models will bring forward a fundamental shift in the way language and its articulation can be used within the creative process. By exploring the potential and limitations of these machine-learning models, from text to image to 3D geometries, the thesis seeks to uncover new relationships between language and architecture. Ultimately, these new collaborative human-machine interactions augment, rather than limit, the agency and creativity of the architect within this process.

Thesis Advisors: Andrew Witt, Jose Luis Garcia del Castillo Lopez

Awarded the Harvard 2022 Digital Design Prize

Mediums Thesis Article 2022

Project Overview