Research

Publications

Peer-reviewed papers, workshop presentations, awards, and project links.

2026 International Journal of Computer Vision (IJCV)

US³Net: Ultralightweight Self-Supervised Stereo Matching Network using Depth-Aware Geometric Soft Occlusion

Po-Chung Jen, Tzu-Chi Liu, I-Sheng Fang, Hsiao-Chieh Wen, Chia-Lun Hsu, Ping-Yang Chen, Chang-Hsing Lee, Yong-Sheng Chen

An ultralightweight self-supervised stereo matching network for efficient stereo depth estimation on resource-constrained devices, combining a low-complexity feature extractor with Depth-Aware Geometric Soft Occlusion (DAGSO) to improve occlusion handling while using only 12K parameters.

2026 IEEE International Conference on Image Processing (ICIP)

Adapt2Hide: Leveraging Off-the-Shelf Autoencoder for Reversible Visual Processing

Ernie Chu*, I-Sheng Fang*, Tai-Ming Huang, Pin-Yen Chiu, Vishal Patel, Jun-Cheng Chen

* Equal contribution.

An image steganography approach for reversible visual processing that uses large pre-trained autoencoders and LoRA to support high-quality message reconstruction with minimal additional parameters.

  • ICIP 2026
  • Reversible visual processing
  • Image steganography
2026 IEEE/CVF Conference on Computer Vision and Pattern Recognition Findings Track (CVPRF)

Gen-n-Val: Agentic Image Data Generation and Validation

Jing-En Huang*, I-Sheng Fang*, Tzuhsuan Huang, Chih-Yu Wang, Jun-Cheng Chen

* Equal contribution.

An agentic framework for image data generation and validation using Layer Diffusion, LLM prompt agents, and VLLM validation agents to improve object detection and segmentation training data.

  • CVPRF 2026
  • Agentic data generation
2026 International Conference on Learning Representations (ICLR)

ImagenWorld: Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks

Samin Mahdizadeh Sani*, Max Ku*, Nima Jamali, Matina Mahdizadeh Sani, Paria Khoshtab, Wei-Chieh Sun, Parnian Fazel, Zhi Rui Tam, Thomas Chong, Edisy Kin Wai Chan, Donald Wai Tong Tsang, Chiao-Wei Hsu, Ting Wai Lam, Ho Yin Sam Ng, Chiafeng Chu, Chak-Wing Mak, Keming Wu, Hiu Tung Wong, Yik Chun Ho, Chi Ruan, Zhuofeng Li, I-Sheng Fang, Shih-Ying Yeh, Ho Kei Cheng, Ping Nie, Wenhu Chen

* Equal contribution.

A large-scale benchmark for evaluating modern image-generation models across open-ended real-world tasks with fine-grained human annotations.

  • ICLR 2026
  • Image generation evaluation
2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

Text Slider: Efficient and Plug-and-Play Continuous Concept Control for Image/Video Synthesis via LoRA Adapters

Pin-Yen Chiu, I-Sheng Fang, Jun-Cheng Chen

A lightweight and efficient method for continuous concept control in diffusion models by fine-tuning low-rank directions in the text encoder.

  • WACV 2026
  • CVPRW VisCon 2025
  • Image and video synthesis
2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

KMOPS: Keypoint-Driven Method for Multi-Object Pose and Metric Size Estimation from Stereo Images

Ying-Kun Wu, Yi Shen, Tzuhsuan Huang, I-Sheng Fang, Jun-Cheng Chen

A keypoint-driven method for estimating 6-DoF pose and metric size of multiple objects from a calibrated stereo image pair.

  • WACV 2026
  • Stereo vision
  • Object pose
2025 NeurIPS Workshop on SPACE in Vision, Language, and Embodied AI (SpaVLE)

Every Camera Effect, Every Time, All at Once: 4D Gaussian Ray Tracing for Physics-based Camera Effect Data Generation

Yi-Ruei Liu*, You-Zhe Xie*, Yu-Hsiang Hsu*, I-Sheng Fang, Yu-Lun Liu, Jun-Cheng Chen

* Equal contribution. Work done at Academia Sinica as interns. Internship mentor.

A two-stage pipeline combining 4D Gaussian Splatting with physically based ray tracing to simulate real-world camera effects such as fisheye distortion, rolling shutter, and depth of field.

  • Oral
  • Physics-based data generation
2025 Multimodal Algorithmic Reasoning Workshop (MAR), IEEE/CVF CVPR Workshops

CameraBench: Benchmarking Visual Reasoning in MLLMs via Photography

I-Sheng Fang, Jun-Cheng Chen

A benchmark for photography-related visual reasoning tasks that test how multimodal large language models understand the effects of camera settings on image appearance.

  • CVPRW MAR 2025
  • Photography reasoning
  • MLLM benchmark
2024 SIGGRAPH Asia

Camera Settings as Tokens: Modeling Photography on Latent Diffusion Models

I-Sheng Fang, Yue-Hua Han, Jun-Cheng Chen

A latent diffusion approach that represents camera settings as controllable tokens for photographic image generation.

  • SIGGRAPH Asia
  • Visual Generative Models
  • Photography
2024 International Conference on Pattern Recognition (ICPR)

Best of Both Sides: Integration of Absolute and Relative Depth Sensing Modalities Based on iToF and RGB Cameras

I-Sheng Fang, Wei-Chen Chiu, Yong-Sheng Chen

A depth-sensing integration method that combines active iToF sensing with passive RGB cues to estimate high-resolution metric depth without metric depth supervision.

  • Depth sensing
  • iToF + RGB
  • Multi-modal integration
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

ES3Net: Accurate and Efficient Edge-Based Self-Supervised Stereo Matching Network

I-Sheng Fang, Hsiao-Chieh Wen, Chia-Lun Hsu, Po-Chung Jen, Ping-Yang Chen, Yong-Sheng Chen

An efficient edge-based self-supervised stereo matching network for robust depth estimation on drones and embedded devices.

  • Best Paper Award
  • Embedded Vision Workshop
2020 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

Self-Contained Stylization via Steganography for Reverse and Serial Style Transfer

Hung-Yu Chen*, I-Sheng Fang*, Chia-Ming Cheng, Wei-Chen Chiu

* Equal contribution.

A two-stage model that integrates neural style transfer and deep steganography to support reverse and serial style transfer.

  • Style transfer
  • Steganography
2022 IEEE Signal Processing Letters

Single Image Reflection Removal Based on Knowledge-Distilling Content Disentanglement

Yan-Tsung Peng, Kai-Han Cheng, I-Sheng Fang, Wen-Yi Peng, Jr-Shian Wu

A single-image reflection removal method that disentangles reflection and transmission features with knowledge distillation.

  • Reflection removal
  • Knowledge distillation

For citation indexes and additional metadata, visit Google Scholar.