A team from the Image Formation and Processing (IFP) Group at the University of Illinois at Urbana-Champaign’s Beckman Institute has won first place in all three human parsing tracks of the Look Into Person (LIP) Challenge, organized at the 2018 Conference on Computer Vision and Pattern Recognition (CVPR), the premier artificial intelligence conference focused on computer vision, which took place in Salt Lake City earlier this month. The winning team was advised by ECE ILLINOIS and Beckman Research Professor Thomas S. Huang, who has been a leader not only in tackling human and data image challenges, but also in cataloging images from medicine and astronomy.
The Look Into Person (LIP) Challenge asks contestants to build computer models that identify which part of the human body each pixel in an image belongs to, a process called human parsing.
The winning team consists of members from Illinois, Beijing Jiaotong University, and IBM. Its work is part of the research efforts sponsored by the IBM-ILLINOIS Center for Cognitive Computing Systems Research (C3SR), a multi-year collaboration between IBM and the University of Illinois College of Engineering co-led by ECE ILLINOIS Professor Wen-mei Hwu and IBM Program Director Jinjun Xiong. Hwu is also affiliated with the Coordinated Science Laboratory (CSL).
The team won first place in the single-person human parsing track, in which each image shows a single person. The team earned a mean score of 57.90, finishing ahead of a team from the Chinese online shopping giant JD (54.44). The group also claimed first place in the more advanced multi-person human parsing track (scoring 45.41) and the fine-grained multi-person human parsing track (scoring 33.34), which ask each team to create a model that identifies not only which body part each pixel belongs to, but also which person in the image that part belongs to. The team had two months to complete all tasks using the LIP datasets of roughly 80,000 images.
“This was a very difficult challenge,” said Yunchao Wei, a postdoctoral IFP researcher who led the winning team. “It is hard to win one track, not to mention multiple tracks. In semantic parsing or semantic segmentation, you need to know the semantic of each pixel. It is one of the most challenging tasks in computer vision. We are proud to win first place in this category.”
“Computer vision is one of the highest areas of artificial intelligence,” explained team member Humphrey Shi, a member of Huang’s IFP group who is now an IBM researcher. “The IFP group has a long history of solving challenging problems in computer vision and image processing using deep learning in core tasks such as visual classification, detection, and segmentation (which includes human parsing).”
This is not the first victory for the Image Formation and Processing Group. The IFP Group claimed first place in the NVIDIA AI City Challenge in 2017 and in the ImageNet Large Scale Visual Recognition Challenge in both 2015 and 2017.