Kevin (Yu-Teng) Li

I am an Applied Research Scientist at Adobe Firefly, working on foundation model training and multimodal research. Most recently, I co-led the multimodal pretraining of Firefly Image 5, which enables textual editing, single-image reference generation, layer generation, and text-to-image generation.

Previously, I graduated from the University of California, Berkeley with a B.S. in Electrical Engineering and Computer Sciences, where I researched active learning for semantic segmentation under the supervision of Trevor Darrell.

Email  /  CV  /  Twitter  /  LinkedIn  /  Github

profile photo

Research & Industry Projects

UniFusion: Vision-Language Model as Unified Encoder for Image Generation
Yu-Teng Li*, Manuel Brack*, Sudeep Katakol, Hareesh Ravi, Ajinkya Kale
ArXiv, 2025

The first architecture to use only a VLM as the input-condition encoder for editing, without auxiliary signals from a VAE or CLIP. The unified encoder framework enables emergent capabilities such as zero-shot multi-reference generation when trained only on single-reference pairs.

Firefly Image 5
Multimodal pretraining (textual editing, single-reference generation, layer generation) October 2025

I co-led the training of the Firefly Image 5 model for textual editing, single-image reference, and layer generation workflows. Throughout model development, I led ablation studies on architecture and data combinations, and drove decisions on the final production model's training recipe, running ~1000-GPU distributed training jobs on a daily basis (July–Oct 2025).

Firefly Image 4
Foundation model pretraining & post-training April 2025

As part of the foundation model training team for Image 4, I developed recipes for synthetic data handling and aesthetics fine-tuning (SFT), as well as sampling improvements. As of Oct 2025, Firefly Image 4 remains one of the most advanced text-to-image models in the industry, leading competitors such as Qwen-Image, Runway Gen-4 Image, and Luma Photon in the "General & Photorealistic" category on Text-to-Image Arena.

Firefly Image 3 Custom Models
August 2024

I led the personalization effort of Firefly Image 3, called Custom Models, which enables copyrighted-content generation for Adobe's enterprise customers. I developed the training recipe (e.g., improving DreamBooth's stability with a VLM-predicted superclass and improving optimizer memory efficiency) and integrated the fine-tuning pipeline into production.

Hyperbolic Active Learning for Semantic Segmentation under Domain Shift
Luca Franco*, Paolo Mandica*, Konstantinos Kallidromitis, Devin Guillory, Yu-Teng Li, Trevor Darrell, Fabio Galasso
ICML, 2024

HALO introduces a hyperbolic neural network approach to pixel-based active learning (AL) for semantic segmentation, and is the first AL method to surpass the performance of supervised domain adaptation using just 1% of labeled pixels.

Neighboring state-based RL Exploration
Jeffery Cheng*, Yu-Teng Li*, Justin Lin*, Pedro Pachuca*
ArXiv, 2022

We propose ρ-explore, a model-free exploration algorithm that selects actions based on nearby perturbed states and consistently outperforms the Double DQN baseline in a discrete environment by 49% in terms of validation reward return [code].

Teaching

CS 182/282A  Deep Neural Networks  |  UC Berkeley
Head Teaching Assistant (Discussions)

I led the curriculum design of weekly discussion sections for the Deep Learning course at UC Berkeley, serving 300+ graduate and undergraduate students in Spring 2023. I also designed exam and homework questions on denoising diffusion models (DDPM), Transformers, and more.


Website source code from here.