Lingyu Zhang

Lingyu Zhang 张凌宇

PhD Student
Duke University
lingyu.zhang@duke.edu

I am a PhD student at the General Robotics Lab @ Duke University, advised by Prof. Boyuan Chen.

Previously, I graduated from Columbia with an MS EE degree. At Columbia, I worked on robust computer vision models with Prof. Junfeng Yang and Prof. Carl Vondrick, and worked closely with PhD student Chengzhi Mao. I was also a research assistant in Prof. Shih-Fu Chang's DVMM lab, where I worked on multimodal learning, under the supervision of Dr. Mingyang Zhou.

Prior to Columbia, I earned my Bachelor's degree at Nanjing University, China, where I did my undergraduate thesis on neural image compression, advised by Prof. Qiu Shen.

CV / Scholar / LinkedIn / Twitter / GitHub

News

[2024 Aug] CREW, our platform for Human-AI teaming, is released

[2023 Aug] Joined Duke General Robotics Lab

[2023 Apr] Our paper was accepted at ICML 2023

[2023 Feb] Started working as an RA at Columbia DVMM lab

[2023 Feb] Graduated from Columbia as an MS student

Research

I am broadly interested in machine learning for decision making and perception. In particular, I aim to advance the performance of agents in the real world and to improve the robustness and generalization of machine learning models.

	CREW: Facilitating Human-AI Teaming
	Lingyu Zhang, Zhengran Ji, Boyuan Chen
	arxiv / project page / video / documentation
	We introduce a platform for Human-AI teaming research. CREW offers extensible environment design, enables real-time human-AI communication, supports hybrid Human-AI teaming, parallel sessions, multimodal feedback, and physiological data collection, and features ML community-friendly algorithm design.

	Robust Perception through Equivariance
	Chengzhi Mao, Lingyu Zhang, Abhishek Vaibhav Joshi, Junfeng Yang, Hao Wang, Carl Vondrick
	ICML 2023
	arxiv / project page
	We introduce a framework that uses the dense intrinsic constraints in natural images to robustify inference, allowing the model to adjust dynamically to each individual image's unique and potentially novel characteristics at inference time.

	Adversarially Robust Video Perception by Seeing Motion
	Lingyu Zhang, Chengzhi Mao, Junfeng Yang, Carl Vondrick
	In submission
	arxiv / project page
	We find that adversarial attacks generated to fool video classifiers also collaterally corrupt motion. We propose to defend against attacks at test time by restoring the disrupted motion.

	A Stereo Matching Method for Three-Dimensional Eye Localization of Autostereoscopic Display
	Bangpeng Xiao, Shenyuan Ye, Xicai Li, Min Li, Lingyu Zhang, Yuanqing Wang
	International Conference on Image and Graphics, 2021
	paper
	We improve and optimize the ZNCC stereo matching algorithm for three-dimensional eye localization. We improve the operation logic of the matching and optimize the scanning strategy based on the application scenarios.

Selected Projects

	Noise as Masks
	Lingyu Zhang
	COMS6998 Final Project, 2022
	paper
	We propose to use noise as masks for masked image modeling. While randomized patch masking has yielded decent results in self-supervised learning, it is not at all obvious that it is the optimal design. We show that theoretically inspired, semantically guided noise masks can be a potentially well-performing alternative.

	Entropy Constrained Information Bottleneck
	Lingyu Zhang
	E6876 Final Project, 2022
	paper
	We propose to use deterministic encoding along with actual quantization on latents, rendering the IB problem a source compression problem. By doing so, finite non-trivial mutual information can be estimated.

	Black-box Adversarial Attacks with Style Information
	Christodoulos Constantinides, Lingyu Zhang
	E6691 Final Project, 2022
	paper
	We propose two types of black-box attacks based on style transfer and investigate how robust classifiers behave against them.

	Unsupervised Harmonic Sound Source Separation with Spectral Clustering
	Yiming Lin, Lucy Wang, Lingyu Zhang, Zhaoyuan Deng
	COMS4774 Final Project, 2021
	paper
	We modeled mixed sources of audio signals by sinusoidal modeling with Short-Time Fourier Transforms. Based on selected spectral peaks of sinusoidal parameters, we constructed a similarity function between time and frequency components, and applied spectral clustering to globally partition the data.

	Exploring Diverse Ways To Improve An Agent On Active Object Localization With Deep Reinforcement Learning
	Jiawei Lu, Lingyu Zhang, Xinyi Liu, Yukai Song, Zixuan Yan
	E6885 Final Project, 2021
	paper
	We proposed improvements to DQN-based object detection in four aspects: using advanced CNNs to generate state representations, defining more flexible action spaces, changing the reward function to avoid undesired agent behavior, and using masks instead of crosses for multiple objects.

	Design and Optimization of a Multi-scale Representation based Image Compression Network
	Lingyu Zhang
	Undergraduate Thesis, 2021
	thesis (Chinese)
	Learned image compression has surpassed the rate-distortion performance of hand-crafted traditional image codecs in recent years. However, learned codecs are not yet practical because their decoding is significantly slower than that of classical algorithms. We investigated the possibility of directly performing vision tasks in the latent space and found that using a multi-scale encoder helped preserve semantic meaning in latent codes while maintaining state-of-the-art compression rates.

	Dynamic Disparity Range Semi-Global Matching for Video Stereo Matching
	Lingyu Zhang
	Computer Vision Final Project, 2020
	slides / report (Chinese)
	Implemented an accelerated stereo matching algorithm for video sequences, utilizing a dynamic disparity range search based on temporal correlation between frames, saving 21% of computational time with minimal accuracy loss. Designed a Divided Section cost function that preserves more information than the Census cost, achieving 18% better matching accuracy at the cost of some computational complexity.

Webpage template from Jon Barron