I am broadly interested in Probabilistic Machine Learning, Perception, and Geometry,
with a focus on Conditional Generative Models, Hierarchical Variational Inference,
Few-Shot Generation, Multitask Language Models, and Diffusion Models.
|
Research
I am interested in the generalization and adaptation capacities of hierarchical generative models.
Large generative models trained on millions of data points exhibit emergent adaptation properties such as in-context learning, solving novel tasks given only a handful of samples.
However, these adaptation properties emerge only at scale and are absent in mid-sized models trained on small datasets,
a typical scenario in engineering design and scientific discovery, where data collection is expensive and compute is limited.
My research goal is to bridge the gap in adaptation capabilities between large and mid-sized generative models.
By leveraging hierarchical inference and structuring training to encourage adaptation at inference time,
we can learn expressive families of conditional generative processes and perform few-shot generation and transfer learning in the small-data regime.
|
Publications
|
 |
Aligning Optimization Trajectories with Diffusion Models for Constrained Design Generation
Giorgio Giannone,
Akash Srivastava,
Ole Winther,
Faez Ahmed
NeurIPS, 2023
Generative models have had a profound impact on vision and language, paving the way for a new era of multimodal generative applications.
While these successes have inspired researchers to explore using generative models in science and engineering to accelerate the design process and reduce the reliance on iterative optimization, challenges remain.
Specifically, engineering optimization methods based on physics still outperform generative models when dealing with constrained environments where data is scarce and precision is paramount.
To address these challenges, we introduce Diffusion Optimization Models (DOM) and Trajectory Alignment (TA),
a learning framework that demonstrates the efficacy of aligning the sampling trajectory of diffusion models with the optimization trajectory derived from traditional physics-based methods.
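As a rough sketch (notation illustrative, not the paper's exact formulation), Trajectory Alignment can be read as a regularizer that matches the model's intermediate denoised estimate at diffusion step t to the corresponding iterate of a physics-based optimizer:
\mathcal{L}_{\mathrm{TA}} \;=\; \mathbb{E}_{t}\,\big\| \hat{x}_0(x_t, t, c) - x^{\mathrm{opt}}_{\tau(t)} \big\|^2,
where \hat{x}_0 is the denoised prediction given constraints c and x^{\mathrm{opt}}_{\tau(t)} is the optimization iterate mapped to step t.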
|
 |
Learning from Invalid Data: On Constraint Satisfaction in Generative Models
Giorgio Giannone*,
Lyle Regenwetter*,
Akash Srivastava*,
Dan Gutfreund,
Faez Ahmed
Under review, 2023
Generative models have demonstrated impressive results in vision, language, and speech.
However, even with massive datasets, they struggle with precision, generating physically invalid or factually incorrect data.
This is particularly problematic when the generated data must satisfy constraints, for example, to meet product specifications in engineering design or to adhere to the laws of physics in a natural scene.
To improve precision while preserving diversity and fidelity, we propose a novel training mechanism that leverages datasets of constraint-violating data points, which we consider invalid.
Our approach minimizes the divergence between the generative distribution and the valid prior while maximizing the divergence with the invalid distribution.
We demonstrate that generative models such as Generative Adversarial Networks and Denoising Diffusion Probabilistic Models, when augmented to train with negative data, vastly outperform their standard counterparts trained solely on valid data points.
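Schematically (notation mine, a sketch of the stated objective rather than the paper's exact loss), training balances an attraction term toward the valid distribution against a repulsion term from the invalid one:
\min_{\theta} \; D\big(p_\theta \,\|\, p_{\mathrm{valid}}\big) \;-\; \lambda \, D\big(p_\theta \,\|\, p_{\mathrm{invalid}}\big), \qquad \lambda > 0,
where D is a divergence between distributions and \lambda trades constraint satisfaction off against diversity and fidelity.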
|
 |
Unifying Molecular and Textual Representations via Multi-task Language Modelling
Dimitrios Christofidellis*,
Giorgio Giannone*,
Jannis Born,
Ole Winther,
Teodoro Laino,
Matteo Manica
International Conference on Machine Learning, ICML, 2023
The recent advances in neural language models have also been successfully applied to the field of chemistry,
offering generative solutions for classical problems in molecular design and synthesis planning. These new methods have the potential to optimize laboratory operations and fuel
a new era of data-driven automation in scientific discovery. However, specialized models are still typically required for each task, leading to the need for problem-specific
fine-tuning and neglecting task interrelations. The main obstacle in this field is the lack of a unified representation between natural language and chemical representations,
complicating and limiting human-machine interaction. Here, we propose a multi-domain, multi-task language model to solve a wide range of tasks in both the chemical and natural
language domains. By leveraging multi-task learning, our model can handle chemical and natural language concurrently, without requiring expensive pre-training on single
domains or task-specific models.
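As a generic illustration of the multi-task idea (a T5-style encoder-decoder prompted with task prefixes; the checkpoint and prompts below are placeholders, not the model released with the paper):

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder checkpoint: any seq2seq LM can serve several tasks when each
# input is prefixed with a short task description.
tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

prompts = [
    "caption the molecule: CC(=O)Oc1ccccc1C(=O)O",                 # chemistry -> language
    "generate a molecule for: an analgesic containing an ester",   # language -> chemistry
]
inputs = tokenizer(prompts, return_tensors="pt", padding=True)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))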
|
 |
Diffusing the Optimal Topology: A Generative Optimization Approach
Giorgio Giannone,
Faez Ahmed
IDETC, 2023
Topology Optimization seeks to find the best design that satisfies a set of constraints while maximizing system performance.
Traditional iterative optimization methods like SIMP can be computationally expensive and get stuck in local minima,
limiting their applicability to complex or large-scale problems.
Learning-based approaches have been developed to accelerate the topology optimization process, but these methods can generate designs with floating material
and low performance when challenged with out-of-distribution constraint configurations. Recently, deep generative models, such as Generative Adversarial Networks and Diffusion Models,
conditioned on constraints and physics fields have shown promise, but they require extensive pre-processing and surrogate models for improving performance.
To address these issues, we propose a Generative Optimization method that integrates classic optimization like SIMP as a refining mechanism for the topology generated by a deep generative model.
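For reference, the classical density-based problem that SIMP solves (standard formulation, independent of this paper) is
\min_{\rho}\; c(\rho) = u^{\top} K(\rho)\, u \quad \text{s.t.} \quad K(\rho)\, u = f, \quad \textstyle\sum_{e} v_e \rho_e \le V, \quad 0 \le \rho_e \le 1,
with the SIMP interpolation E_e(\rho_e) = E_{\min} + \rho_e^{p}\,(E_0 - E_{\min}). In our setting the generative model proposes a near-optimal density field, and a few SIMP iterations refine it.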
|
 |
Accelerating material design with the generative toolkit for scientific discovery
Manica & the GT4SD Team (Core Contributor)
Nature npj Computational Materials, 2023
With the growing availability of data within various scientific domains,
generative models hold enormous potential to accelerate scientific discovery.
They harness powerful representations learned from datasets to speed up the formulation
of novel hypotheses with the potential to impact material discovery broadly.
We present the Generative Toolkit for Scientific Discovery (GT4SD).
This extensible open-source library enables scientists, developers,
and researchers to train and use state-of-the-art generative models
to accelerate scientific discovery focused on organic material design.
|
 |
Few-Shot Diffusion Models
Giorgio Giannone,
Didrik Nielsen,
Ole Winther
Score-Based Methods Workshop, NeurIPS, 2022
Denoising diffusion probabilistic models (DDPM) are powerful hierarchical latent variable models with remarkable sample generation quality and training stability.
These properties can be attributed to parameter sharing in the generative hierarchy, as well as a parameter-free diffusion-based inference procedure.
In this paper, we present Few-Shot Diffusion Models (FSDM), a framework for few-shot generation leveraging conditional DDPMs.
FSDMs are trained to adapt the generative process conditioned on a small set of images from a given class by aggregating image patch information using a set-based Vision Transformer (ViT).
At test time, the model is able to generate samples from previously unseen classes conditioned on as few as 5 samples from that class.
We empirically show that FSDM can perform few-shot generation and transfer to new datasets, taking full advantage of the conditional DDPM.
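Schematically (my notation, at the level of detail of the abstract), the conditional DDPM factorizes as
p_\theta(x_{0:T} \mid X_s) \;=\; p(x_T) \prod_{t=1}^{T} p_\theta(x_{t-1} \mid x_t, c), \qquad c = h_\phi(X_s),
where X_s is the small conditioning set from a class and h_\phi is the set-based ViT that aggregates it into the context c.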
|
 |
SCHA-VAE: Hierarchical Context Aggregation for
Few-Shot Generation
Giorgio Giannone,
Ole Winther
International Conference on Machine Learning, ICML, 2022
A few-shot generative model should be able to generate data from a novel distribution by only observing a limited set of examples.
In few-shot learning the model is trained on data from many sets from distributions sharing some underlying properties such
as sets of characters from different alphabets or objects from different categories.
We extend current latent variable models for sets to a fully hierarchical approach with attention-based
point-to-set-level aggregation and call our method SCHA-VAE, for Set-Context-Hierarchical-Aggregation Variational Autoencoder.
We explore likelihood-based model comparison, iterative data sampling, and adaptation-free out-of-distribution generalization.
Our results show that the hierarchical formulation better captures the intrinsic variability within the sets in the small data regime.
This work generalizes deep latent variable approaches to few-shot learning, taking a step toward large-scale few-shot generation with
a formulation that readily works with current state-of-the-art deep generative models.
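For intuition, the set-latent structure such models build on (a Neural Statistician-style schematic, not the full SCHA-VAE hierarchy) is
p_\theta(X) \;=\; \int p(c) \prod_{i=1}^{n} \Big[ \int p_\theta(x_i \mid z_i, c)\, p_\theta(z_i \mid c)\, dz_i \Big]\, dc,
with a set-level context c shared across the elements x_1, \dots, x_n and per-sample latents z_i; SCHA-VAE deepens both the context and the aggregation into a hierarchy.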
|
 |
Just Mix Once: Worst-group Generalization by Group
Interpolation
Giorgio Giannone,
Serhii Havrylov,
Jordan Massiah,
Emine Yilmaz,
Yunlong Jiao
Distribution Shifts Workshop, NeurIPS, 2021
Advances in deep learning theory have revealed how average generalization relies on superficial patterns in data.
The consequence is brittle models that perform poorly under shifts in the group distribution at test time.
When group annotation is available, we can use robust optimization tools to tackle the problem.
However, identification and annotation are time-consuming, especially on large datasets.
A recent line of work leverages self-supervision and oversampling to improve generalization on minority groups
without group annotation. We propose to unify and generalize these approaches using a class-conditional variant
of mixup tailored for worst-group generalization. Our approach, Just Mix Once (JM1),
interpolates samples during learning, augmenting the training distribution with a continuous mixture of groups.
JM1 is domain agnostic and computationally efficient, can be used with any level of group annotation,
and performs on par or better than the state-of-the-art on worst-group generalization.
Additionally, we provide a simple explanation of why JM1 works.
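A minimal sketch of the class-conditional mixup idea (illustrative only; pair selection, weighting, and scheduling in JM1 differ from this generic version):

import torch

def class_conditional_mix(x, y, alpha=1.0):
    """Mix pairs of samples that share a class label; labels are preserved."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    perm = torch.randperm(x.size(0))
    same_class = (y == y[perm]).to(x.dtype)          # 1.0 where the pair shares a class
    lam_i = same_class * lam + (1.0 - same_class)    # mix only same-class pairs
    lam_i = lam_i.view(-1, *([1] * (x.dim() - 1)))   # broadcast over feature dims
    x_mix = lam_i * x + (1.0 - lam_i) * x[perm]
    return x_mix, y

During training, x_mix simply replaces x in the usual loss; because only same-class pairs are mixed, the labels y remain valid.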
|
 |
Hierarchical Few-Shot Generative Models
Giorgio Giannone,
Ole Winther
Meta-Learning Workshop, NeurIPS, 2021
A few-shot generative model should be able to generate data from a distribution
by only observing a limited set of examples. In few-shot learning the model is
trained on data from many sets from different distributions sharing some underlying
properties such as sets of characters from different alphabets or sets of images
of objects of different types. We study a latent-variable approach that extends the
Neural Statistician to a fully hierarchical approach with attention-based
point-to-set-level aggregation. We extend the previous work to iterative data sampling,
likelihood-based model comparison, and adaptation-free out-of-distribution
generalization. Our results show that the hierarchical formulation better captures
the intrinsic variability within the sets in the small data regime. With this work
we generalize deep latent variable approaches to few-shot learning, taking a step
towards large-scale few-shot generation with a formulation that can readily work
with current state-of-the-art deep generative models.
|
 |
Transformation-aware Variational Autoencoder
Giorgio Giannone,
Saeed Saremi,
Jonathan Masci,
Christian Osendorfer
Technical Report, 2020
We extend the framework of variational autoencoders to represent transformations
explicitly in the latent space. This is achieved in the form of a generative model
structured such that the group of transformations
that act in the input space is instead represented by latent variables
which are linear operators that only act in the latent space.
In the family of hierarchical graphical models that emerges,
the latent space is populated by higher order objects which are inferred
jointly with the latent representations they act on.
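A schematic of the idea (my notation): for a pair (x, x') related by a transformation in input space, the generative model takes the form
p(x, x') \;=\; \int p_\theta(x \mid z)\; p_\theta(x' \mid A z)\; p(A)\, p(z)\; dz\, dA,
so the transformation is represented purely by a latent linear operator A acting on z, inferred jointly with z itself.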
|
 |
Real-time Classification from Short Event-Camera Streams using Input-filtering Neural ODEs
Giorgio Giannone,
Asha Anoosheh,
Alessio Quaglino,
Pierluca D'Oro,
Marco Gallieri,
Jonathan Masci
Interpretable Inductive Biases
and Physically Structured Learning Workshop, NeurIPS, 2020
Event-based cameras are novel, efficient sensors inspired by the human vision system,
generating an asynchronous, pixel-wise stream of data.
Learning from such data is generally performed through heavy preprocessing and event integration into images.
This requires buffering of possibly long sequences and can limit the response time of the inference system.
In this work, we instead propose to directly use events from a DVS camera,
a stream of intensity changes and their spatial coordinates.
This sequence is used as the input for a novel asynchronous RNN-like architecture,
the Input-filtering Neural ODEs.
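In spirit (a schematic, not the exact equations of the paper), the hidden state evolves as an ODE driven by the event stream:
\dot{h}(t) \;=\; f_\theta\big(h(t),\, x(t)\big), \qquad \hat{y} \;=\; g_\theta\big(h(T)\big),
where x(t) is the asynchronous stream of intensity changes and spatial coordinates, and the class prediction is read out from the state at the end of the short stream.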
|
 |
No Representation without Transformation
Giorgio Giannone,
Jonathan Masci,
Christian Osendorfer
Bayesian Deep Learning and Perception as Generative Reasoning Workshops, NeurIPS , 2019
We propose to extend Latent Variable Models with a simple idea:
learn to encode not only samples but also transformations of such samples.
This means that the latent space is not only populated by embeddings
but also by higher order objects that map between these embeddings.
We show how a hierarchical graphical model can be utilized to enforce
desirable algebraic properties of such latent mappings.
|
 |
Learning Common Representation from RGB and Depth Images
Giorgio Giannone,
Boris Chidlovskii
Multimodal Learning and Applications Workshop, CVPR, 2019
We propose a new deep learning architecture for the tasks of semantic segmentation
and depth prediction from RGB-D images.
We revise the state of the art based on RGB and depth feature fusion,
where both modalities are assumed to be available at train and test time.
We propose a new architecture where the feature fusion is replaced with a common deep representation.
Combined with an encoder-decoder network, the architecture can jointly learn models
for semantic segmentation and depth estimation based on their common representation.
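A toy sketch of this family of architectures (a shared representation feeding two task heads; layer sizes and the fusion scheme are placeholders, not the paper's network):

import torch
from torch import nn

class SharedRepMultiTaskNet(nn.Module):
    """Toy encoder with a common representation and two task-specific heads."""
    def __init__(self, in_channels=3, hidden=64, num_classes=13):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, hidden, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, stride=2, padding=1), nn.ReLU(),
        )
        # semantic segmentation head
        self.seg_head = nn.Sequential(
            nn.ConvTranspose2d(hidden, hidden, 4, stride=4), nn.Conv2d(hidden, num_classes, 1),
        )
        # depth regression head
        self.depth_head = nn.Sequential(
            nn.ConvTranspose2d(hidden, hidden, 4, stride=4), nn.Conv2d(hidden, 1, 1),
        )

    def forward(self, rgb):
        shared = self.encoder(rgb)   # common representation for both tasks
        return self.seg_head(shared), self.depth_head(shared)

seg, depth = SharedRepMultiTaskNet()(torch.randn(1, 3, 64, 64))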
|
Datasets
|
 |
2D Topology Optimization
We built a dataset of optimized topologies and intermediate optimization steps, with constraints,
at low resolution (64x64) and high resolution (256x256).
- 50K low-resolution optimized topologies.
- 60K high-resolution optimized topologies.
- 250K low-resolution intermediate steps.
- 300K high-resolution intermediate steps.
|
 |
3D Topology Optimization
We built a multifidelity dataset of 300K optimized topologies with constraints.
- 150K beams.
- 100K plates.
- 50K L-shapes.
|
Theses
|
 |
Learning Common Representation for Scene Understanding
Giorgio Giannone
Master's Thesis, Data Science, Sapienza University of Rome, 2018
|
 |
Bubble Dynamics in Turbulent Shear Flows
Giorgio Giannone
Master's Thesis, Mechanical Engineering, Sapienza University of Rome, 2016
|
|