Deep neural networks have achieved outstanding results in applications such as vision, language, audio, speech, and reinforcement learning. These powerful function approximators typically require large amounts of data to train, which poses a challenge in the common case where little labeled data is available. Over the last year, multiple solutions to this problem have been proposed based on the concept of self-supervised learning, which can be understood as a specific case of unsupervised learning. This talk will cover its basic principles and provide examples in the field of multimedia.
Be ready for a long and interesting list of the experiences, titles, and recognitions of our guest, Xavier.
Xavier Giro-i-Nieto is an associate professor at the Universitat Politecnica de Catalunya (UPC) in Barcelona, a member of the Intelligent Data Science and Artificial Intelligence Research Center (IDEAI-UPC) and the Image Processing Group (GPI), and a visiting researcher at the Barcelona Supercomputing Center (BSC). He graduated in Telecommunications Engineering from ETSETB (UPC) in 2000, after completing his master's thesis on image compression at the Vrije Universiteit Brussel (VUB) with Prof. Peter Schelkens. After working for one year at Sony Brussels, he started a PhD on computer vision, supervised by Prof. Ferran Marqués. In parallel, he designed and taught courses at the ESEIAAT (video content delivery) and ETSETB (deep learning) schools at UPC, as well as in the Master in Computer Vision of Barcelona (video analysis). Between 2008 and 2014 he made multiple visits to the Digital Video and MultiMedia laboratory directed by Prof. Shih-Fu Chang at Columbia University in New York, with whom he continues to collaborate. He also works closely with the Insight Centre for Data Analytics at Dublin City University, as well as with his industrial partners at Vilynx, Mediapro, and Crisalix. He serves as an associate editor of IEEE Transactions on Multimedia and reviews for top-tier conferences in machine learning (NeurIPS, ICML), computer vision (CVPR, ECCV, ICCV), and multimedia (ACMMM, ICMR).