US\(^{3}\)Net: Ultralightweight Self-Supervised Stereo Matching Network using Depth-Aware Geometric Soft Occlusion
An ultralightweight self-supervised stereo matching network for efficient stereo depth estimation on resource-constrained devices, combining a low-complexity feature extractor with Depth-Aware Geometric Soft Occlusion (DAGSO) to improve occlusion handling while using only 12K parameters.
Adapt2Hide: Leveraging Off-the-Shelf Autoencoder for Reversible Visual Processing
* Equal contribution.
An image steganography approach for reversible visual processing that adapts large pre-trained autoencoders with LoRA to support high-quality message reconstruction with minimal additional parameters.
Gen-n-Val: Agentic Image Data Generation and Validation
* Equal contribution.
An agentic framework for image data generation and validation that uses Layer Diffusion, LLM prompt agents, and vision LLM (VLLM) validation agents to improve training data for object detection and segmentation.
ImagenWorld: Stress-Testing Image Generation Models with Explainable Human Evaluation on Open-ended Real-World Tasks
* Equal contribution.
A large-scale benchmark for evaluating modern image-generation models across open-ended real-world tasks with fine-grained human annotations.
Every Camera Effect, Every Time, All at Once: 4D Gaussian Ray Tracing for Physics-based Camera Effect Data Generation
* Equal contribution. Work done at Academia Sinica as interns. † Internship mentor.
A two-stage pipeline combining 4D Gaussian Splatting with physically based ray tracing to simulate real-world camera effects such as fisheye distortion, rolling shutter, and depth of field.
Single Image Reflection Removal Based on Knowledge-Distilling Content Disentanglement
A single-image reflection removal method that disentangles reflection and transmission features with knowledge distillation.
For citation indexes and additional metadata, visit Google Scholar.