Research Assistant, CITI, Academia Sinica

I-Sheng Fang

Ethan Fang / 方 宜晟 / Gî-Tshiânn Png

"I-Sheng" is pronounced like "Ethan".

I am currently exploring Fall 2027 PhD opportunities in computer vision, generative AI, and photography-aware visual intelligence.


News

Recent Updates

Gen-n-Val will be presented on June 7, 7:30–9:00 AM. See you in Denver!

Adapt2Hide has been accepted to ICIP 2026.

The Generative AI for Photography (GAIP) Workshop was successfully held at WACV 2026. Thanks to all speakers, authors, reviewers, and participants!

Gen-n-Val has been accepted to the CVPR 2026 Findings track. See you in Denver!

I am honored to have been selected for the Taiwan Government Fellowship for Study Abroad (公費留考), which provides three years of full support for my future PhD study abroad.

About

Researcher working where vision models meet creative tools

I am a Research Assistant in CITI at Academia Sinica, working with Dr. Jun-Cheng Chen on generative models and computer vision. I received my Master's degree in Robotics from National Yang Ming Chiao Tung University in January 2023, advised by Prof. Yong-Sheng Chen and Prof. Wei-Chen (Walon) Chiu. Before that, I studied computer science at National Chengchi University with Prof. Yan-Tsung Peng, worked as a research assistant at the Enriched Vision Applications Lab at National Chiao Tung University, and received my Bachelor's degree in Mathematical Science from National Chengchi University in January 2018.

Publications

Selected Publications

Peer-reviewed and workshop work in generative modeling, camera-aware synthesis, depth sensing, stereo matching, and style transfer.

2026 International Journal of Computer Vision (IJCV)

US\(^{3}\)Net: Ultralightweight Self-Supervised Stereo Matching Network using Depth-Aware Geometric Soft Occlusion

Po-Chung Jen, Tzu-Chi Liu, I-Sheng Fang, Hsiao-Chieh Wen, Chia-Lun Hsu, Ping-Yang Chen, Chang-Hsing Lee, Yong-Sheng Chen

An ultralightweight self-supervised stereo matching network for efficient stereo depth estimation on resource-constrained devices, combining a low-complexity feature extractor with Depth-Aware Geometric Soft Occlusion (DAGSO) to improve occlusion handling while using only 12K parameters.

2026 IEEE International Conference on Image Processing (ICIP)

Adapt2Hide: Leveraging Off-the-Shelf Autoencoder for Reversible Visual Processing

Ernie Chu*, I-Sheng Fang*, Tai-Ming Huang, Pin-Yen Chiu, Vishal Patel, Jun-Cheng Chen

* Equal contribution.

An image steganography approach using large pre-trained autoencoders and LoRA to support high-quality message reconstruction with minimal additional parameters for reversible visual processing applications.

  • ICIP 2026
  • Reversible visual processing
  • Image steganography
2026 IEEE/CVF Conference on Computer Vision and Pattern Recognition Findings Track (CVPRF)

Gen-n-Val: Agentic Image Data Generation and Validation

Jing-En Huang*, I-Sheng Fang*, Tzuhsuan Huang, Chih-Yu Wang, Jun-Cheng Chen

* Equal contribution.

An agentic framework for image data generation and validation using Layer Diffusion, LLM prompt agents, and VLLM validation agents to improve object detection and segmentation training data.

  • CVPRF 2026
  • Agentic data generation
2026 International Conference on Learning Representations (ICLR)

ImagenWorld: Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks

Samin Mahdizadeh Sani*, Max Ku*, Nima Jamali, Matina Mahdizadeh Sani, Paria Khoshtab, Wei-Chieh Sun, Parnian Fazel, Zhi Rui Tam, Thomas Chong, Edisy Kin Wai Chan, Donald Wai Tong Tsang, Chiao-Wei Hsu, Ting Wai Lam, Ho Yin Sam Ng, Chiafeng Chu, Chak-Wing Mak, Keming Wu, Hiu Tung Wong, Yik Chun Ho, Chi Ruan, Zhuofeng Li, I-Sheng Fang, Shih-Ying Yeh, Ho Kei Cheng, Ping Nie, Wenhu Chen

* Equal contribution.

A large-scale benchmark for evaluating modern image-generation models across open-ended real-world tasks with fine-grained human annotations.

  • ICLR 2026
  • Image generation evaluation
2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

Text Slider: Efficient and Plug-and-Play Continuous Concept Control for Image/Video Synthesis via LoRA Adapters

Pin-Yen Chiu, I-Sheng Fang, Jun-Cheng Chen

A lightweight and efficient method for continuous concept control in diffusion models by fine-tuning low-rank directions in the text encoder.

  • WACV 2026
  • CVPRW VisCon 2025
  • Image and video synthesis
2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

KMOPS: Keypoint-Driven Method for Multi-Object Pose and Metric Size Estimation from Stereo Images

Ying-Kun Wu, Yi Shen, Tzuhsuan Huang, I-Sheng Fang, Jun-Cheng Chen

A keypoint-driven method for estimating 6-DoF pose and metric size of multiple objects from a calibrated stereo image pair.

  • WACV 2026
  • Stereo vision
  • Object pose
2025 NeurIPS 2025 Workshop on SPACE in Vision, Language, and Embodied AI (SpaVLE)

Every Camera Effect, Every Time, All at Once: 4D Gaussian Ray Tracing for Physics-based Camera Effect Data Generation

Yi-Ruei Liu*, You-Zhe Xie*, Yu-Hsiang Hsu*, I-Sheng Fang, Yu-Lun Liu, Jun-Cheng Chen

* Equal contribution. Work done at Academia Sinica as interns. Internship mentor.

A two-stage pipeline combining 4D Gaussian Splatting with physically based ray tracing to simulate real-world camera effects such as fisheye distortion, rolling shutter, and depth of field.

  • Oral
  • Physics-based data generation
2025 Multimodal Algorithmic Reasoning Workshop (MAR), IEEE/CVF CVPR Workshops

CameraBench: Benchmarking Visual Reasoning in MLLMs via Photography

I-Sheng Fang, Jun-Cheng Chen

A benchmark for photography-related visual reasoning tasks that test how multimodal large language models understand the effects of camera settings on image appearance.

  • CVPRW MAR 2025
  • Photography reasoning
  • MLLM benchmark
2024 SIGGRAPH Asia

Camera Settings as Tokens: Modeling Photography on Latent Diffusion Models

I-Sheng Fang, Yue-Hua Han, Jun-Cheng Chen

A latent diffusion approach that represents camera settings as controllable tokens for photographic image generation.

  • SIGGRAPH Asia
  • Visual Generative Models
  • Photography
2024 International Conference on Pattern Recognition (ICPR)

Best of Both Sides: Integration of Absolute and Relative Depth Sensing Modalities Based on iToF and RGB Cameras

I-Sheng Fang, Wei-Chen Chiu, Yong-Sheng Chen

A depth-sensing integration method that combines active iToF sensing with passive RGB cues to estimate high-resolution metric depth without metric depth supervision.

  • Depth sensing
  • iToF + RGB
  • Multi-modal integration
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

ES3Net: Accurate and Efficient Edge-Based Self-Supervised Stereo Matching Network

I-Sheng Fang, Hsiao-Chieh Wen, Chia-Lun Hsu, Po-Chung Jen, Ping-Yang Chen, Yong-Sheng Chen

An efficient edge-based self-supervised stereo matching network for robust depth estimation on drones and embedded devices.

  • Best Paper Award
  • Embedded Vision Workshop
2020 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

Self-Contained Stylization via Steganography for Reverse and Serial Style Transfer

Hung-Yu Chen*, I-Sheng Fang*, Chia-Ming Cheng, Wei-Chen Chiu

* Equal contribution.

A two-stage model that integrates neural style transfer and deep steganography to support reverse and serial style transfer.

  • Style transfer
  • Steganography
2022 IEEE Signal Processing Letters

Single Image Reflection Removal Based on Knowledge-Distilling Content Disentanglement

Yan-Tsung Peng, Kai-Han Cheng, I-Sheng Fang, Wen-Yi Peng, Jr-Shian Wu

A single-image reflection removal method that disentangles reflection and transmission features with knowledge distillation.

  • Reflection removal
  • Knowledge distillation

Projects

Creative, Research, and Open Source Work

Creative AI / May 8, 2021 - May 8, 2022

A Century of Heartfelt Sentiment: 100th Anniversary Special Exhibition of the Taiwan Cultural Association

Exhibition Room D, National Museum of Taiwan Literature, Tainan, Taiwan

Deepfake video synthesis

Synthesized deepfake videos of historical figures from the Taiwan Cultural Association for a museum exhibition.

  • Exhibition
  • Generative AI
  • Digital humanities
Research Collection / Ongoing

Typography Research Collection

Curator

A research collection at the intersection of typography, computer graphics, computer vision, and machine learning.

  • Typography
  • Computer graphics
  • Machine learning
Research Collection / Ongoing

Awesome Generative AI for Photography

Curator

A curated list of research that integrates photographic principles, concepts, techniques, and domain knowledge into image and video generative models.

  • Photography
  • Generative AI
  • Survey

Experience

Experience Snapshot

March 2024 - Present

Research Assistant

Research Center for Information Technology Innovation (CITI), Academia Sinica

Working on generative models with Dr. Jun-Cheng Chen.

March 2022 - November 2022

Software Engineer Intern

Microsoft AI R&D Center

Worked on vision transformers, perceptual loss, and generative models with SunDa Yang, Chien-Yi Wang, Prof. Shang-Hong Lai, Dr. Trista Chen, and the face science team.

September 2018 - September 2019

Research Assistant

Enriched Vision Applications Lab, National Chiao Tung University

Worked on style transfer and generative models with Prof. Wei-Chen (Walon) Chiu.

Service

Reviewing and Organizing

Reviewer / 2024 - 2026

Conference Reviewing

NeurIPS 2024, CVPR 2025, ICCV 2025, NeurIPS 2025, WACV 2026, ICRA 2026, CVPR 2026, ICML 2026, ECCV 2026, SIGGRAPH Asia 2026.

Organizer / 2026

Workshop on Generative AI for Photography, WACV 2026

Primary and contact organizer.

Administrator / Ongoing

Enjoyfonts

Administrator for a typography-focused Facebook group.

Personal

Outside the Lab

My work is technical, but my eye is shaped by cameras, letterforms, airports, ballparks, and gyms.

  • 📷 Film photography
  • 🔤 Typography
  • ⚾ Fantasy baseball
  • 🏋️ Strength training
  • ✈️ Aviation