Last month marked the completion of the 6th annual ImageNet Large Scale Visual Recognition Challenge (ILSVRC 2015) in Beijing, a prestigious global competition often referred to as the “Olympics of Computer Vision.” The ILSVRC is a benchmark in object classification and detection, with millions of images and hundreds of object classes, and the competition is meant to bring out the best of the best in the field. Competitors from major industry and academic institutions, including IBM T.J. Watson Research Center, Microsoft Research, Google, Baidu, Adobe, MIT, UC Berkeley, and Oxford University, faced off to determine whose technology would set the bar for large-scale object detection and image classification.
After the dust settled, three of the competitors supported by AMAX-built GPU platforms had won several #1 rankings in the competition. Those teams included Team AMAX, a collaboration between the University of Technology, Sydney (UTS) and Nanjing University of Information Science and Technology, and Team CUVideo, a collaboration between AMAX and SenseTime, an enterprise leader in next-gen visual understanding and artificial intelligence.
The collaboration between UTS and the Nanjing University of Information Science and Technology was headed by Prof. Tao Dacheng, Dr. Deng Jiangkang, and Prof. Liu Qingshan. Using the AMAX GPU-powered Deep Learning engine, the DL-E400, Team AMAX placed 1st in four categories: Object Detection (DET) with Additional Training Data; Object Localization (LOC), that is, Classification + Localization, with Additional Training Data; and, in Object Detection from Video (VID) with Additional Training Data, both the number of categories won and the mean average precision.
“Our team consisted of UTS and Nanjing University of Information, along with hardware support from AMAX. Our team took the advantage of both the talent and idea from research institutions, as well as the hardware support and resources from enterprise to demonstrate advanced results,” says Prof. Tao Dacheng, UTS. “This kind of collaboration will be the best approach for the future artificial intelligence development.”
Team CUVideo was based on a collaboration between AMAX and SenseTime, a company renowned for its recognition technology. The co-developed technology, called SenseBox, achieved a No. 1 ranking in both the number of categories won and the mean average precision in VID (Object Detection from Video).
In preparation for ILSVRC, CUVideo worked closely with AMAX’s platform development team in China to engineer the powerful Deep Learning platform that would become SenseBox. Based on the DL-E400 from AMAX’s Deep Learning Solutions product line, SenseBox features complex, multi-layered non-linear algorithms based on neural networks. SenseBox was designed by the CUVideo team specifically for this competition, and with it CUVideo demonstrated to the world, for the first time, the strength of MMLAB and SenseTime in the field of video detection. According to the final results, the CUVideo team led the competition by winning an impressive 28 out of 30 object categories and achieved 68% mean average precision in the VID task.
The CUVideo team was headed by top scientists, including Professor Wanli Ouyang from MMLAB at CUHK and Dr. Junjie Yan from SenseTime.
ImageNet 2015 CUHK leader Professor Wanli Ouyang
ImageNet 2015 SenseTime lead scientist Junjie Yan
VID, the newly added video detection task, was generally considered to be the most difficult task at ILSVRC. “There is still a lot of work to do in video processing in computer vision. Compared with images, videos are actually more complicated, and contain more information. Therefore, the difficulty increased by an order of magnitude,” said Professor Xiaogang Wang from MMLAB, in a July interview.
“These wins represent major validation for the technical capabilities of AMAX and SenseTime,” said Jerry Shih, President, AMAX. “Deep learning is the key technology transforming a range of applications from security to retail to social networking to meet the demands of a fast-changing, modern world. The fact that our technology is leading in such a groundbreaking arena when we are only scratching the surface means there are many more exciting developments to look forward to.”
The Future of Deep Learning Is Bright
With major companies such as Google, Facebook, Amazon, Microsoft, and Baidu investing heavily in AI and machine learning, and thousands of potential applications that test the boundaries of creativity and science fiction, we are only scratching the surface of how this technology could change our lives. The collaborative efforts to prepare for ILSVRC 2015 and the 1st-place finishes in multiple major categories validate AMAX’s engineering capabilities in this emerging and exciting field. AMAX will continue to collaborate with leading organizations like SenseTime and the Nanjing University of Information Science and Technology to explore the full potential of these Deep Learning solutions. With advanced technologies and engineering expertise, AMAX intends to take on ever more challenging tasks and further expand the applications of Deep Learning in the field of computer vision. If you have an application that could benefit from deep learning development and implementation and need the most powerful GPU engines to fast-track your development, please don’t hesitate to contact us! We are excited to work with all the players in this field!