Ioannis Kakogeorgiou
National Center for Scientific Research "Demokritos" | Athens | CV

About me

I am a Researcher (Grade C) at the Institute of Informatics and Telecommunications, NCSR “Demokritos”. Before joining Demokritos, I was a Postdoctoral Researcher at Archimedes/Athena RC, working with Nikos Komodakis. I completed my Ph.D. at the Remote Sensing Laboratory of the National Technical University of Athens, where I worked under the supervision of Konstantinos Karantzalos. My Ph.D. research focused on Unsupervised Learning and Explainable AI in Computer Vision and Remote Sensing.
I have a Master’s degree in Mathematical Modeling from the National Technical University of Athens and a Bachelor’s degree in Mathematics from the University of Athens.

I serve as a reviewer for CVPR [Outstanding Reviewer, 2025], ICCV, ECCV, IJCV, IEEE TNNLS, Neural Networks, WACV, IEEE GRSL, and IEEE Access.

Teaching (Adjunct Lecturer)
  • Image Analysis & Computer Vision (Undergraduate) | Dept. of Informatics & Telecommunications, UOA (Spring 2024–25)
  • Image Processing (Undergraduate) | Dept. of Informatics & Telecommunications, UOA (Spring 2024–25)
  • Machine Learning (Postgraduate) | Dept. of Informatics & Telematics, HUA (Fall 2024–25)

Papers

Boosting Generative Image Modeling via Joint Image-Feature Synthesis
We introduced a latent-semantic diffusion framework that jointly models low-level VAE image latents and high-level self-supervised semantic features in a single diffusion process. By generating coherent image and feature pairs from pure noise, our method delivers improved image quality and faster convergence with minimal modifications to standard Diffusion Transformer architectures. This approach simplifies training by eliminating complex distillation objectives. We also proposed Representation Guidance, an inference strategy that uses learned semantics to further refine image generation.
Theodoros Kouzelis, Efstathios Karypidis, Ioannis Kakogeorgiou, Spyros Gidaris, Nikos Komodakis
Preprint, 2025
arXiv | code | project page
EQ-VAE: Equivariance Regularized Latent Space for Improved Generative Image Modeling
We introduced EQ-VAE, a simple regularization approach that enforces equivariance to semantic-preserving transformations (e.g., scaling and rotation) in the latent space of autoencoders, reducing its complexity without degrading reconstruction quality. By fine-tuning pre-trained autoencoders with EQ-VAE, we boost the performance of state-of-the-art generative models (i.e., DiT, SiT, REPA, MaskGIT), achieving up to a 7× speedup on DiT-XL/2 with only five epochs of fine-tuning, and supporting both continuous and discrete latent representations.
Theodoros Kouzelis, Ioannis Kakogeorgiou, Spyros Gidaris, Nikos Komodakis
ICML, 2025
paper | arXiv | code | project page
Advancing Semantic Future Prediction through Multimodal Visual Sequence Transformers
We introduced FUTURIST, a unified multimodal visual sequence transformer that uses a masked visual modeling objective and specialized masking to fuse modalities for future semantic prediction. Its VAE-free hierarchical tokenization reduces computational complexity and enables end-to-end high-resolution training, achieving state-of-the-art performance in future semantic segmentation for both short- and mid-term forecasting on Cityscapes.
Efstathios Karypidis, Ioannis Kakogeorgiou, Spyros Gidaris, Nikos Komodakis
CVPR, 2025
paper | arXiv | code
SPOT: Self-Training with Patch-Order Permutation for Object-Centric Learning with Autoregressive Transformers
We designed an unsupervised object-centric learning framework that uses attention-based self-training and a novel patch-order permutation strategy for autoregressive transformers. This approach achieves state-of-the-art performance in unsupervised object segmentation, especially with complex real-world images.
Ioannis Kakogeorgiou, Spyros Gidaris, Konstantinos Karantzalos, Nikos Komodakis
CVPR, 2024 [CVPR Highlight (2.8% of submissions)]
paper | arXiv | code
Composed Image Retrieval for Remote Sensing
We introduced composed image retrieval to remote sensing. It allows querying a large image archive using an image example modified by a textual description. We presented a new evaluation benchmark for this task and proposed a novel method that fuses image-to-image and text-to-image similarity for effective composed image retrieval.
Bill Psomas, Ioannis Kakogeorgiou, Nikos Efthymiadis, Giorgos Tolias, Ondrej Chum, Yannis Avrithis, Konstantinos Karantzalos
IEEE IGARSS, 2024 [Oral]
paper | arXiv | code
A Comparative Study on Sentinel-2 Cloud Detection Algorithms in Marine Environments
We evaluated four well-established cloud detection algorithms (FMASK, SEN2COR, KAPPAMASK, and S2CLOUDLESS) in marine environments using the MARIDA dataset derived from Sentinel-2 satellite imagery. Our evaluation assessed performance across Cloud, Thin Cloud, Cloud Shadow, and Clear categories.
Ioannis Kakogeorgiou, Paraskevi Mikeli, Katerina Kikaki, Emmanouela Prassou, Konstantinos Karantzalos
IEEE IGARSS, 2024 [Oral]
paper
Detecting Marine Pollutants and Sea Surface Features with Deep Learning in Sentinel-2 Imagery
We introduced a new open-access dataset named MADOS, which includes 15 classes, featuring oil spills and marine debris, based on Sentinel-2 multispectral satellite data. Moreover, we proposed a novel deep learning framework named MariNeXt, which outperforms all baselines.
Katerina Kikaki, Ioannis Kakogeorgiou, Ibrahim Hoteit, Konstantinos Karantzalos
ISPRS Journal of Photogrammetry and Remote Sensing, 2024
paper | code | project page
Keep It SimPool: Who Said Supervised Transformers Suffer from Attention Deficit?
We developed a universal attention-based pooling mechanism called SimPool to replace default pooling strategies in both convolutional and transformer encoders, significantly improving performance and generating high-quality attention maps for both supervised and self-supervised settings.
Bill Psomas, Ioannis Kakogeorgiou, Konstantinos Karantzalos, Yannis Avrithis
ICCV, 2023
paper | arXiv | code
What to Hide from Your Students: Attention-Guided Masked Image Modeling
We introduce a novel masking strategy, called attention-guided masking (AttMask), and we demonstrate its effectiveness over random masking for dense distillation-based MIM.
Ioannis Kakogeorgiou, Spyros Gidaris, Bill Psomas, Yannis Avrithis, Andrei Bursuc, Konstantinos Karantzalos, Nikos Komodakis
ECCV, 2022
paper | DOI | arXiv | code
MARIDA: A benchmark for Marine Debris detection from Sentinel-2 remote sensing data
We present Marine Debris Archive (MARIDA), the first open-access dataset based on the multispectral Sentinel-2 (S2) satellite data, which distinguishes Marine Debris from various marine features that co-exist.
Katerina Kikaki, Ioannis Kakogeorgiou, Paraskevi Mikeli, Dionysios E. Raitsos, Konstantinos Karantzalos
PLOS One, 2022 [Sentinel Success Stories]
paper | code | project page
How Challenging Is the Discrimination of Floating Materials on the Sea Surface Using High Resolution Multispectral Satellite Data?
We explored the ability to discriminate marine debris from other floating materials and sea features using high-resolution multispectral satellite data, based on the open-access Marine Debris Archive (MARIDA). We showed that spectral information alone is insufficient to distinguish marine plastic from other floating materials that exhibit similar spectral behavior, such as vessels.
Paraskevi Mikeli, Katerina Kikaki, Ioannis Kakogeorgiou, Konstantinos Karantzalos
ISPRS Archives, 2022 [ISPRS Best Poster Award]
paper
Evaluating explainable artificial intelligence methods for multi-label deep learning classification tasks in remote sensing
We quantitatively and qualitatively evaluated different aspects of ten XAI methods. We assessed their performance on multi-label classification tasks in the BigEarthNet and SEN12MS datasets using various metrics. We extracted significant insights regarding the models’ decisions as well as the datasets’ composition, and concluded that Occlusion, LIME, and Grad-CAM were the most interpretable methods for the specific multi-label remote sensing classification task.
Ioannis Kakogeorgiou, Konstantinos Karantzalos
Int. J. Appl. Earth Obs. Geoinf., 2021
paper | arXiv