Sitemap

A list of all the posts and pages found on the site. For you robots out there, an XML version is also available for digesting.

Pages

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

portfolio

projects

CodCad (2016)

CodCad was an online platform created to teach competitive programming for free. I co-founded CodCad in 2016.

Noic (2016)

Noic is a project that promotes scientific olympiads in Brazil and democratizes access to them. I presided over Noic in 2016.

publications

Text-image Alignment for Diffusion-based Perception

Published in CVPR, 2024

We use automatically generated captions to improve the text-image alignment of a diffusion backbone in downstream visual tasks such as semantic segmentation, depth estimation, and object detection. Our method also improves the SOTA in both single-domain and cross-domain tasks.

Recommended citation: Neehar Kondapaneni, Markus Marks, Manuel Knott, Rogerio Guimaraes, Pietro Perona; Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024, pp. 13883-13893 https://arxiv.org/abs/2310.00031

Diffusion-Based Action Recognition Generalizes to Untrained Domains

Published in arXiv preprint, 2025

We propose using features generated by a Vision Diffusion Model (VDM), aggregated via a transformer, to achieve human-like action recognition across domain shifts. We find that generalization is enhanced by the use of a model conditioned on earlier timesteps of the diffusion process to highlight semantic information over pixel-level details in the extracted features. Our model sets a new state of the art across three generalization benchmarks, bringing machine action recognition closer to human-like robustness.

Recommended citation: Rogerio Guimaraes, Frank Xiao, Pietro Perona & Markus Marks. (2025). Diffusion-Based Action Recognition Generalizes to Untrained Domains. https://arxiv.org/abs/2509.08908

talks

teaching