Posts by Collection

portfolio

projects

CodCad (2016)

CodCad was an online platform created to teach competitive programming for free. I co-founded CodCad in 2016.

Noic (2016)

Noic is a project that promotes scientific olympiads in Brazil and democratizes access to them. I presided over Noic in 2016.

publications

Text-image Alignment for Diffusion-based Perception

Published in CVPR, 2024

We use automatically generated captions to improve the text-image alignment of a diffusion backbone in downstream visual tasks such as semantic segmentation, depth estimation, and object detection. Our method also improves the state of the art in both single-domain and cross-domain tasks.

Recommended citation: Neehar Kondapaneni, Markus Marks, Manuel Knott, Rogerio Guimaraes, Pietro Perona; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 13883-13893 https://arxiv.org/abs/2310.00031

Diffusion-Based Action Recognition Generalizes to Untrained Domains

Published in arXiv preprint, 2025

We propose using features generated by a Vision Diffusion Model (VDM), aggregated via a transformer, to achieve human-like action recognition across domain shifts. We find that generalization improves when the model is conditioned on earlier timesteps of the diffusion process, which highlights semantic information over pixel-level details in the extracted features. Our model sets a new state of the art across three generalization benchmarks, bringing machine action recognition closer to human-like robustness.

Recommended citation: Rogerio Guimaraes, Frank Xiao, Pietro Perona & Markus Marks. (2025). Diffusion-Based Action Recognition Generalizes to Untrained Domains. https://arxiv.org/abs/2509.08908

talks

teaching