Multimodal Learning Toward Micro Video Understanding

Multimodal Learning Toward Micro Video Understanding
Author : Liqiang Nie
language : en
Publisher: Morgan & Claypool Publishers
Release Date : 2019-09-17
Multimodal Learning Toward Micro Video Understanding was written by Liqiang Nie and published by Morgan & Claypool Publishers on 2019-09-17 in the Computers category. It is available in PDF, TXT, EPUB, Kindle, and other formats.
Micro-videos, a new form of user-generated content, have been spreading widely across various social platforms, such as Vine, Kuaishou, and TikTok. Different from traditional long videos, micro-videos are usually recorded by smart mobile devices at any place within a few seconds. Due to their brevity and low bandwidth cost, micro-videos are gaining increasing user enthusiasm. The blossoming of micro-videos opens the door to many promising applications, ranging from network content caching to online advertising. Thus, it is highly desirable to develop an effective scheme for high-order micro-video understanding. Micro-video understanding is, however, non-trivial due to the following challenges: (1) how to represent micro-videos that convey only one or a few high-level themes or concepts; (2) how to utilize the hierarchical structure of venue categories to guide micro-video analysis; (3) how to alleviate the influence of low quality caused by complex surrounding environments and camera shake; (4) how to model multimodal sequential data, i.e., textual, acoustic, visual, and social modalities, to enhance micro-video understanding; and (5) how to construct large-scale benchmark datasets for analysis. These challenges have been largely unexplored to date. In this book, we focus on addressing the challenges presented above by proposing some state-of-the-art multimodal learning theories. To demonstrate the effectiveness of these models, we apply them to three practical tasks of micro-video understanding: popularity prediction, venue category estimation, and micro-video routing. Particularly, we first build three large-scale real-world micro-video datasets for these practical tasks. We then present a multimodal transductive learning framework for micro-video popularity prediction. Furthermore, we introduce several multimodal cooperative learning approaches and a multimodal transfer learning scheme for micro-video venue category estimation. Meanwhile, we develop a multimodal sequential learning approach for micro-video recommendation. Finally, we conclude the book and outline future research directions in multimodal learning toward micro-video understanding.
Multimodal Learning Toward Micro Video Understanding
Author : Liqiang Nie
language : en
Publisher: Springer Nature
Release Date : 2022-05-31
Multimodal Learning Toward Micro Video Understanding was written by Liqiang Nie and published by Springer Nature on 2022-05-31 in the Technology & Engineering category. It is available in PDF, TXT, EPUB, Kindle, and other formats.
Micro-videos, a new form of user-generated content, have been spreading widely across various social platforms, such as Vine, Kuaishou, and TikTok. Different from traditional long videos, micro-videos are usually recorded by smart mobile devices at any place within a few seconds. Due to their brevity and low bandwidth cost, micro-videos are gaining increasing user enthusiasm. The blossoming of micro-videos opens the door to many promising applications, ranging from network content caching to online advertising. Thus, it is highly desirable to develop an effective scheme for high-order micro-video understanding. Micro-video understanding is, however, non-trivial due to the following challenges: (1) how to represent micro-videos that convey only one or a few high-level themes or concepts; (2) how to utilize the hierarchical structure of venue categories to guide micro-video analysis; (3) how to alleviate the influence of low quality caused by complex surrounding environments and camera shake; (4) how to model multimodal sequential data, i.e., textual, acoustic, visual, and social modalities, to enhance micro-video understanding; and (5) how to construct large-scale benchmark datasets for analysis. These challenges have been largely unexplored to date. In this book, we focus on addressing the challenges presented above by proposing some state-of-the-art multimodal learning theories. To demonstrate the effectiveness of these models, we apply them to three practical tasks of micro-video understanding: popularity prediction, venue category estimation, and micro-video routing. Particularly, we first build three large-scale real-world micro-video datasets for these practical tasks. We then present a multimodal transductive learning framework for micro-video popularity prediction. Furthermore, we introduce several multimodal cooperative learning approaches and a multimodal transfer learning scheme for micro-video venue category estimation. Meanwhile, we develop a multimodal sequential learning approach for micro-video recommendation. Finally, we conclude the book and outline future research directions in multimodal learning toward micro-video understanding.
Multimodal Learning Toward Recommendation
Author : Fan Liu
language : en
Publisher: Springer Nature
Release Date : 2025-01-17
Multimodal Learning Toward Recommendation was written by Fan Liu and published by Springer Nature on 2025-01-17 in the Mathematics category. It is available in PDF, TXT, EPUB, Kindle, and other formats.
This book presents an in-depth exploration of multimodal learning toward recommendation, along with a comprehensive survey of the most important research topics and state-of-the-art methods in this area. First, it presents a semantic-guided feature distillation method which employs a teacher-student framework to robustly extract effective recommendation-oriented features from generic multimodal features. Next, it introduces a novel multimodal attentive metric learning method to model users’ diverse preferences for various items. Then it proposes a disentangled multimodal representation learning recommendation model, which can capture users’ fine-grained attention to different modalities on each factor in user preference modeling. Furthermore, a meta-learning-based multimodal fusion framework is developed to model the various relationships among multimodal information. Building on the success of disentangled representation learning, it further proposes an attribute-driven disentangled representation learning method, which uses attributes to guide the disentanglement process in order to improve the interpretability and controllability of conventional recommendation methods. Finally, the book concludes with future research directions in multimodal learning toward recommendation. The book is suitable for graduate students and researchers who are interested in multimodal learning and recommender systems. The multimodal learning methods presented are also applicable to other retrieval or sorting related research areas, like image retrieval, moment localization, and visual question answering.
Image Fusion In Remote Sensing
Author : Arian Azarang
language : en
Publisher: Springer Nature
Release Date : 2022-05-31
Image Fusion In Remote Sensing was written by Arian Azarang and published by Springer Nature on 2022-05-31 in the Technology & Engineering category. It is available in PDF, TXT, EPUB, Kindle, and other formats.
Image fusion in remote sensing, or pansharpening, involves fusing spatial (panchromatic) and spectral (multispectral) images that are captured by different sensors on satellites. This book addresses image fusion approaches for remote sensing applications. Both conventional and deep learning approaches are covered. First, the conventional approaches to image fusion in remote sensing are discussed. These approaches include component substitution, multi-resolution, and model-based algorithms. Then, the recently developed deep learning approaches involving single-objective and multi-objective loss functions are discussed. Experimental results are provided comparing conventional and deep learning approaches in terms of both low-resolution and full-resolution objective metrics that are commonly used in remote sensing. The book concludes by stating anticipated future trends in pansharpening, or image fusion in remote sensing.
Advanced Multimodal Compatibility Modeling And Recommendation
Author : Weili Guan
language : en
Publisher: Springer Nature
Release Date : 2025-03-18
Advanced Multimodal Compatibility Modeling And Recommendation was written by Weili Guan and published by Springer Nature on 2025-03-18 in the Computers category. It is available in PDF, TXT, EPUB, Kindle, and other formats.
This Third Edition sheds light on state-of-the-art theories and practices in multimodal compatibility modeling and recommendation, offering comprehensive insights into this evolving field. This topic, and fashion compatibility modeling in particular, has garnered increasing research attention in recent years due to the significant economic impact of e-commerce. Building upon recent research and the prior edition, the authors present a series of graph-learning based multimodal compatibility modeling schemes, all of which have been proven to be effective over several public real-world datasets. This book introduces a number of advanced multimodal compatibility modeling and recommendation methods, including category-guided multimodal compatibility modeling and try-on-guided multimodal compatibility modeling. The authors also provide comprehensive solutions, including correlation-oriented graph learning, modality-oriented graph learning, unsupervised disentangled graph learning, partially supervised disentangled graph learning, and metapath-guided heterogeneous graph learning.
Intelligent And Efficient Video Moment Localization
Author : Meng Liu
language : en
Publisher: Springer Nature
Release Date : 2025-06-19
Intelligent And Efficient Video Moment Localization was written by Meng Liu and published by Springer Nature on 2025-06-19 in the Computers category. It is available in PDF, TXT, EPUB, Kindle, and other formats.
This book provides a comprehensive exploration of video moment localization, a rapidly emerging research field focused on enabling precise retrieval of specific moments within untrimmed, unsegmented videos. With the rapid growth of digital content and the rise of video-sharing platforms, users face significant challenges when searching for particular content across vast video archives. This book addresses how video moment localization uses natural language queries to bridge the gap between video content and semantic understanding, offering an intuitive solution for locating specific moments across diverse domains like surveillance, education, and entertainment. This book explores the latest advancements in video moment localization, addressing key issues such as accuracy, efficiency, and scalability. It presents innovative techniques for contextual understanding and cross-modal semantic alignment, including attention mechanisms and dynamic query decomposition. Additionally, the book discusses solutions for enhancing computational efficiency and scalability, such as semantic pruning and efficient hashing, while introducing frameworks for better integration between visual and textual data. It also examines weakly-supervised learning approaches to reduce annotation costs without sacrificing performance. Finally, the book covers real-world applications and offers insights into future research directions.
ECAI 2023
Author : K. Gal
language : en
Publisher: IOS Press
Release Date : 2023-10-18
ECAI 2023 was written by K. Gal and published by IOS Press on 2023-10-18 in the Computers category. It is available in PDF, TXT, EPUB, Kindle, and other formats.
Artificial intelligence, or AI, now affects the day-to-day life of almost everyone on the planet, and continues to be a perennial hot topic in the news. This book presents the proceedings of ECAI 2023, the 26th European Conference on Artificial Intelligence, and of PAIS 2023, the 12th Conference on Prestigious Applications of Intelligent Systems, held from 30 September to 4 October 2023 and on 3 October 2023 respectively in Kraków, Poland. Since 1974, ECAI has been the premier venue for presenting AI research in Europe, and this annual conference has become the place for researchers and practitioners of AI to discuss the latest trends and challenges in all subfields of AI, and to demonstrate innovative applications and uses of advanced AI technology. ECAI 2023 received 1896 submissions – a record number – of which 1691 were retained for review, ultimately resulting in an acceptance rate of 23%. The 390 papers included here cover topics including machine learning, natural language processing, multi-agent systems, vision, and knowledge representation and reasoning. PAIS 2023 received 17 submissions, of which 10 were accepted after a rigorous review process. Those 10 papers cover topics ranging from fostering better working environments, behavior modeling and citizen science to large language models and neuro-symbolic applications, and are also included here. Presenting a comprehensive overview of current research and developments in AI, the book will be of interest to all those working in the field.
Graph Learning For Fashion Compatibility Modeling
Author : Weili Guan
language : en
Publisher: Springer Nature
Release Date : 2022-11-02
Graph Learning For Fashion Compatibility Modeling was written by Weili Guan and published by Springer Nature on 2022-11-02 in the Computers category. It is available in PDF, TXT, EPUB, Kindle, and other formats.
This book sheds light on state-of-the-art theories for more challenging outfit compatibility modeling scenarios. In particular, this book presents several cutting-edge graph learning techniques that can be used for outfit compatibility modeling. Due to its remarkable economic value, fashion compatibility modeling has gained increasing research attention in recent years. Although great efforts have been dedicated to this research area, previous studies mainly focused on fashion compatibility modeling for outfits that only involved two items and overlooked the fact that each outfit may be composed of a variable number of items. This book develops a series of graph-learning based outfit compatibility modeling schemes, all of which have been proven to be effective over several public real-world datasets. This systematic approach benefits readers by introducing the techniques for compatibility modeling of outfits that involve a variable number of composing items. To deal with the challenging task of outfit compatibility modeling, this book provides comprehensive solutions, including correlation-oriented graph learning, modality-oriented graph learning, unsupervised disentangled graph learning, partially supervised disentangled graph learning, and metapath-guided heterogeneous graph learning. Moreover, this book sheds light on research frontiers that can inspire future research directions for scientists and researchers.
Pattern Recognition And Computer Vision
Author : Shiqi Yu
language : en
Publisher: Springer Nature
Release Date : 2022-10-27
Pattern Recognition And Computer Vision was written by Shiqi Yu and published by Springer Nature on 2022-10-27 in the Computers category. It is available in PDF, TXT, EPUB, Kindle, and other formats.
The 4-volume set LNCS 13534, 13535, 13536 and 13537 constitutes the refereed proceedings of the 5th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2022, held in Shenzhen, China, in November 2022. The 233 full papers presented were carefully reviewed and selected from 564 submissions. The papers have been organized in the following topical sections: Theories and Feature Extraction; Machine learning, Multimedia and Multimodal; Optimization and Neural Network and Deep Learning; Biomedical Image Processing and Analysis; Pattern Classification and Clustering; 3D Computer Vision and Reconstruction, Robots and Autonomous Driving; Recognition, Remote Sensing; Vision Analysis and Understanding; Image Processing and Low-level Vision; Object Detection, Segmentation and Tracking.
Web Information Systems Engineering WISE 2024
Author : Mahmoud Barhamgi
language : en
Publisher: Springer Nature
Release Date : 2024-11-30
Web Information Systems Engineering WISE 2024 was written by Mahmoud Barhamgi and published by Springer Nature on 2024-11-30 in the Computers category. It is available in PDF, TXT, EPUB, Kindle, and other formats.
This five-volume set LNCS 15436-15440 constitutes the proceedings of the 25th International Conference on Web Information Systems Engineering, WISE 2024, held in Doha, Qatar, in December 2024. The 110 full papers and 55 short papers presented in these proceedings were carefully reviewed and selected from 368 submissions. The papers have been organized in the following topical sections: Part I: Information Retrieval and Text Processing; Text and Sentiment Analysis; Data Analysis and Optimisation; Query Processing and Information Extraction; Knowledge and Data Management. Part II: Social Media and News Analysis; Graph Machine Learning on Web and Social; Trustworthy Machine Learning; and Graph Data Management. Part III: Recommendation Systems; Web Systems and Architectures; and Humans and Web Security. Part IV: Learning and Optimization; Large Language Models and their Applications; and AI Applications. Part V: Security, Privacy and Trust; Online Safety and Wellbeing through AI; and Web Technologies.