
Local Part Model For Action Recognition In Realistic Videos


Local Part Model For Action Recognition In Realistic Videos

Author : Feng Shi
language : en
Publisher:
Release Date : 2014

Local Part Model For Action Recognition In Realistic Videos was written by Feng Shi and released in 2014 in the University of Ottawa theses category. Supported file formats: PDF, TXT, EPUB, Kindle, and others.

This thesis presents a framework for automatic recognition of human actions in uncontrolled, realistic video data such as movies, internet videos, and surveillance videos. The human action recognition problem is approached from the perspective of local spatio-temporal features and the bag-of-features representation. The bag-of-features model contains only statistics of unordered low-level primitives, so any information concerning temporal ordering and spatial structure is lost. To address this issue, we proposed a novel multiscale local part model that maintains both the structure information and the ordering of local events for action recognition. The method includes both a coarse, primitive-level root feature covering event-content statistics and higher-resolution overlapping part features incorporating local structure and temporal relationships. To extract the local spatio-temporal features, we investigated a random sampling strategy for efficient action recognition. We also introduced the idea of using very high sampling density for efficient and accurate classification. We further explored the potential of the method through the joint optimization of two constraints: classification accuracy and efficiency. On the performance side, we proposed a new local descriptor, called GBH, based on spatial and temporal gradients. It significantly improved on the performance of the purely spatial gradient-based HOG descriptor for action recognition while preserving high computational efficiency. We have also shown that the performance of the state-of-the-art MBH descriptor can be improved with a discontinuity-preserving optical flow algorithm. In addition, a new method based on the histogram intersection kernel was introduced to combine multiple channels of different descriptors. This method both improves recognition accuracy with multiple descriptors and speeds up the classification process.
On the efficiency side, we applied PCA to reduce the feature dimension, which resulted in fast bag-of-features matching. We also evaluated the FLANN method for real-time action recognition. We conducted extensive experiments on real-world videos from challenging public action datasets. We showed that our methods achieved state-of-the-art results with real-time computational potential, thus highlighting the effectiveness and efficiency of the proposed methods.
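
The abstract mentions combining multiple descriptor channels with a histogram intersection kernel but does not give the formula. The following is a minimal sketch of one common way to do this, assuming each video is already represented by one bag-of-features histogram per descriptor channel. The function names, the L1 normalization, and the equal channel weights are illustrative assumptions, not the thesis's actual formulation.

```python
from typing import Dict, Optional, Sequence


def l1_normalize(hist: Sequence[float]) -> list:
    """Scale a histogram so its bins sum to 1 (all zeros stay all zeros)."""
    total = float(sum(hist))
    if total == 0.0:
        return [0.0] * len(hist)
    return [v / total for v in hist]


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Histogram intersection kernel: the sum of bin-wise minima."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))


def multichannel_kernel(
    x: Dict[str, Sequence[float]],
    y: Dict[str, Sequence[float]],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Weighted sum of per-channel intersection kernels.

    Equal weights are an assumption made for this sketch.
    """
    channels = sorted(x)
    if sorted(y) != channels:
        raise ValueError("both videos must provide the same channels")
    if weights is None:
        weights = {c: 1.0 / len(channels) for c in channels}
    return sum(
        weights[c] * histogram_intersection(l1_normalize(x[c]), l1_normalize(y[c]))
        for c in channels
    )


# Toy bag-of-features histograms (visual-word counts) for two videos.
video_a = {"GBH": [4, 0, 2, 2], "MBH": [1, 3, 0, 0]}
video_b = {"GBH": [2, 2, 2, 2], "MBH": [0, 4, 0, 0]}
similarity = multichannel_kernel(video_a, video_b)
```

With L1-normalized histograms each channel's kernel value lies in [0, 1], so the combined value can be used directly as a precomputed kernel entry for a kernel classifier.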

Video Representation For Fine Grained Action Recognition

Author : Yang Zhou
language : en
Publisher:
Release Date : 2016

Video Representation For Fine Grained Action Recognition was written by Yang Zhou and released in 2016 in the High definition video recording category. Supported file formats: PDF, TXT, EPUB, Kindle, and others.

Recently, fine-grained action analysis has attracted considerable research interest due to its potential applications in smart homes, medical surveillance, daily living assistance, and child/elderly care, where action videos are captured indoors with a fixed camera. Although background motion (one of the main challenges for general action recognition) is more controlled in this setting, it is widely acknowledged that fine-grained action recognition is very challenging due to large intra-class variability, small inter-class variability, a large variety of action categories, complex motions, and complicated interactions. Fine-grained actions, especially manipulation sequences, involve a large number of interactions between hands and objects, so modeling the interactions between human hands and objects (i.e., context) plays an important role in action representation and recognition. We propose to discover the objects manipulated by humans by modeling which objects are being manipulated and how they are being operated.
Firstly, we propose a representation and classification pipeline which seamlessly incorporates localized semantic information into every processing step for fine-grained action recognition. In the feature extraction stage, we explore the geometric information between local motion features and the surrounding objects. In the feature encoding stage, we develop a semantic-grouped locality-constrained linear coding (SG-LLC) method that captures the joint distributions between motion and object-in-use information. Finally, we propose a semantic-aware multiple kernel learning framework (SA-MKL) that utilizes the empirical joint distribution between action and object type for more discriminative action classification. This approach can discover and model the interactions between humans and objects. However, it relies on detailed knowledge of pre-detected objects (e.g. drawer and refrigerator). Thus, the performance of action recognition is constrained by object recognition, not to mention that object detection requires tedious human labor for object annotation.
Secondly, we propose a mid-level video representation suitable for fine-grained action classification. Given an input video sequence, we densely sample a large number of spatio-temporal motion parts by combining temporal segmentation with spatial segmentation, and represent them with local motion features. The dense mid-level candidate parts are rich in localized motion information, which is crucial to fine-grained action recognition. From the candidate spatio-temporal parts, we use an unsupervised approach to discover and learn the representative part detectors for the final video representation. By utilizing the dense spatio-temporal motion parts, we highlight the human-object interactions and localized delicate motion in the local spatio-temporal sub-volumes of the video.
Thirdly, we propose a novel fine-grained action recognition pipeline based on interaction part proposal and discriminative mid-level part mining. We first generate a large number of candidate object regions using an off-the-shelf object proposal tool, e.g., BING. These object regions are then matched and tracked across frames to form a large spatio-temporal graph based on appearance matching and the dense motion trajectories through them. We then propose an efficient approximate graph segmentation algorithm to partition and filter the graph into consistent local dense sub-graphs. These sub-graphs, which are spatio-temporal sub-volumes, represent our candidate interaction parts. Finally, we mine discriminative mid-level part detectors from the features computed over the candidate interaction parts. Bag-of-detection scores based on a novel Max-N pooling scheme are computed as the action representation for a video sample.
Finally, we also focus on the first-person (egocentric) action recognition problem, which involves many hand-object interactions. On the one hand, we propose a novel end-to-end trainable semantic parsing network for hand segmentation. On the other hand, we propose a second end-to-end deep convolutional network that maximally utilizes the contextual information among hand, foreground object, and motion for interactional foreground object detection.
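
The abstract names a Max-N pooling scheme for building bag-of-detection scores but does not define it. As a rough illustration only, the sketch below assumes that Max-N pooling means averaging the N highest responses of each part detector over all candidate parts in a video. That definition, and the names `max_n_pool` and `bag_of_detection_scores`, are assumptions made for this example and may differ from the thesis.

```python
from typing import List, Sequence


def max_n_pool(scores: Sequence[float], n: int) -> float:
    """Average the n largest detector responses (assumed Max-N definition)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not scores:
        return 0.0
    top = sorted(scores, reverse=True)[:n]
    return sum(top) / len(top)


def bag_of_detection_scores(
    score_matrix: Sequence[Sequence[float]], n: int
) -> List[float]:
    """Build one pooled value per part detector.

    score_matrix[d][p] is the response of detector d on candidate part p.
    """
    return [max_n_pool(row, n) for row in score_matrix]


# Two hypothetical part detectors evaluated on four candidate parts.
scores = [
    [0.9, 0.1, 0.7, 0.3],
    [0.2, 0.4, 0.0, 0.6],
]
representation = bag_of_detection_scores(scores, n=2)
```

Under this assumption, N = 1 reduces to ordinary max pooling, while larger N makes the representation less sensitive to a single spurious detection.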

Towards Action Recognition And Localization In Videos With Weakly Supervised Learning

Author : Nataliya Shapovalova
language : en
Publisher:
Release Date : 2014

Towards Action Recognition And Localization In Videos With Weakly Supervised Learning was written by Nataliya Shapovalova and released in 2014. Supported file formats: PDF, TXT, EPUB, Kindle, and others.

Human behavior understanding is a fundamental problem of computer vision. It is an important component of numerous real-life applications, such as human-computer interaction, sports analysis, video search, and many others. In this thesis we work on the problem of action recognition and localization, which is a crucial part of human behavior understanding. Action recognition explains what a human is doing in the video, while action localization indicates where and when in the video the action is happening. We focus on two important aspects of the problem: (1) capturing intra-class variation of action categories and (2) inference of action location. Manual annotation of videos with fine-grained action labels and spatio-temporal action locations is a nontrivial task, thus employing weakly supervised learning approaches is of interest. Real-life actions are complex, and the same action can look different in different scenarios. A single template is not capable of capturing such data variability. Therefore, for each action category we automatically discover small clusters of examples that are visually similar to each other. A separate classifier is learnt for each cluster, so that more class variability is captured. In addition, we establish a direct association between a novel test example and examples from training data and demonstrate how metadata (e.g., attributes) can be transferred to test examples. Weakly supervised learning for action recognition and localization is another challenging task. It requires automatic inference of action location for all the training videos during learning. Initially, we simplify this problem and try to find discriminative regions in videos that lead to a better recognition performance. The regions are inferred in a manner such that they are visually similar across all the videos of the same category. 
Ideally, the regions should correspond to the action location; however, there is a gap between inferred discriminative regions and semantically meaningful regions representing action location. To fill the gap, we incorporate human eye gaze data to drive the inference of regions during learning. This allows inferring regions that are both discriminative and semantically meaningful. Furthermore, we use the inferred regions and learnt action model to assist top-down eye gaze prediction.
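
The idea of learning a separate classifier for each visually similar cluster of a category can be illustrated with a small sketch. It assumes linear classifiers that have already been trained, and it assumes a category's score is the maximum over its cluster classifiers. Both choices, along with all names and weights below, are hypothetical and are not taken from the thesis.

```python
from typing import Dict, List, Sequence, Tuple

# One linear classifier per cluster: (weight vector, bias).
ClusterModel = Tuple[Sequence[float], float]


def linear_score(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Plain linear decision value w . x + b."""
    return sum(w * v for w, v in zip(weights, x)) + bias


def classify(
    x: Sequence[float], models: Dict[str, List[ClusterModel]]
) -> Tuple[str, float]:
    """Score each category by its best-matching cluster classifier."""
    best_label, best_score = "", float("-inf")
    for label, clusters in models.items():
        # Max over clusters is an assumed aggregation rule for this sketch.
        score = max(linear_score(w, b, x) for w, b in clusters)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Hypothetical pre-trained models: "run" has two visual clusters, "jump" has one.
models = {
    "run": [([1.0, 0.0], 0.0), ([0.0, 1.0], -0.5)],
    "jump": [([-1.0, 1.0], 0.0)],
}
label_a, score_a = classify([0.9, 0.1], models)
label_b, score_b = classify([0.2, 0.9], models)
```

Because each cluster only has to explain one appearance mode of an action, a test example needs to match just one of them to be assigned to the category.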

Computer Vision ACCV 2010

Author : Ron Kimmel
language : en
Publisher: Springer
Release Date : 2011-02-28

Computer Vision ACCV 2010, by Ron Kimmel, was published by Springer on 2011-02-28 in the Computers category. Supported file formats: PDF, TXT, EPUB, Kindle, and others.

The four-volume set LNCS 6492-6495 constitutes the thoroughly refereed post-proceedings of the 10th Asian Conference on Computer Vision, ACCV 2010, held in Queenstown, New Zealand, in November 2010. Altogether, the four volumes present 206 revised papers selected from a total of 739 submissions. All current issues in computer vision are addressed, ranging from algorithms that attempt to automatically understand the content of images, through optical methods coupled with computational techniques that enhance and improve images, to capturing and analyzing the world's geometry in preparation for higher-level image and shape understanding. Novel geometry techniques, statistical learning methods, and modern algebraic procedures are dealt with as well.

Computer Vision ECCV 2014

Author : David Fleet
language : en
Publisher: Springer
Release Date : 2014-08-14

Computer Vision ECCV 2014, by David Fleet, was published by Springer on 2014-08-14 in the Computers category. Supported file formats: PDF, TXT, EPUB, Kindle, and others.

The seven-volume set comprising LNCS volumes 8689-8695 constitutes the refereed proceedings of the 13th European Conference on Computer Vision, ECCV 2014, held in Zurich, Switzerland, in September 2014. The 363 revised papers presented were carefully reviewed and selected from 1444 submissions. The papers are organized in topical sections on tracking and activity recognition; recognition; learning and inference; structure from motion and feature matching; computational photography and low-level vision; vision; segmentation and saliency; context and 3D scenes; motion and 3D scene analysis; and poster sessions.

Gesture Recognition

Author : Sergio Escalera
language : en
Publisher: Springer
Release Date : 2017-07-19

Gesture Recognition, by Sergio Escalera, was published by Springer on 2017-07-19 in the Computers category. Supported file formats: PDF, TXT, EPUB, Kindle, and others.

This book presents a selection of chapters, written by leading international researchers, related to the automatic analysis of gestures from still images and multi-modal RGB-Depth image sequences. It offers a comprehensive review of vision-based approaches for supervised gesture recognition methods that have been validated by various challenges. Several aspects of gesture recognition are reviewed, including data acquisition from different sources, feature extraction, learning, and recognition of gestures.

Computer Vision ECCV 2012

Author : Andrew Fitzgibbon
language : en
Publisher: Springer
Release Date : 2012-09-26

Computer Vision ECCV 2012, by Andrew Fitzgibbon, was published by Springer on 2012-09-26 in the Computers category. Supported file formats: PDF, TXT, EPUB, Kindle, and others.

The seven-volume set comprising LNCS volumes 7572-7578 constitutes the refereed proceedings of the 12th European Conference on Computer Vision, ECCV 2012, held in Florence, Italy, in October 2012. The 408 revised papers presented were carefully reviewed and selected from 1437 submissions. The papers are organized in topical sections on geometry, 2D and 3D shapes, 3D reconstruction, visual recognition and classification, visual features and image matching, visual monitoring: action and activities, models, optimisation, learning, visual tracking and image registration, photometry: lighting and colour, and image segmentation.

Human Action Detection Tracking And Segmentation In Videos

Author : Yicong Tian
language : en
Publisher:
Release Date : 2018

Human Action Detection Tracking And Segmentation In Videos was written by Yicong Tian and released in 2018. Supported file formats: PDF, TXT, EPUB, Kindle, and others.

This dissertation addresses the problems of human action detection, human tracking, and segmentation in videos. These are fundamental tasks in computer vision and are extremely challenging to solve in realistic videos. We first propose a novel approach for action detection by exploring the generalization of deformable part models from 2D images to 3D spatiotemporal volumes. By focusing on the most distinctive parts of each action, our models adapt to intra-class variation and show robustness to clutter. This approach deals with detecting actions performed by a single person. When there are multiple humans in the scene, they need to be segmented and tracked from frame to frame before action recognition can be performed. Next, we propose a novel approach for multiple object tracking (MOT) by formulating detection and data association in one framework. Our method allows us to overcome the limitations of data-association-based MOT approaches, whose performance depends on the object detection results provided as input. We show that automatically detecting and tracking targets in a single framework can help resolve the ambiguities due to frequent occlusion and heavy articulation of targets. In this tracker, targets are represented by bounding boxes, which is a coarse representation. However, pixel-wise object segmentation provides fine-level information, which is desirable for later tasks. Finally, we propose a tracker that simultaneously solves three main problems: detection, data association, and segmentation. This is especially important because the outputs of these three problems are highly correlated, and the solution of one can greatly help improve the others. The proposed approach achieves more accurate segmentation results and also helps better resolve typical difficulties in multiple target tracking, such as occlusion, ID switches, and track drifting.
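
For context, the conventional data-association step that the dissertation contrasts itself with can be sketched as follows. This is a generic greedy matcher based on bounding-box overlap (intersection over union), not the dissertation's joint formulation. The box format `(x1, y1, x2, y2)`, the function names, and the 0.3 overlap threshold are assumptions chosen for illustration.

```python
from typing import Dict, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_associate(
    tracks: Sequence[Box], detections: Sequence[Box], min_iou: float = 0.3
) -> Dict[int, int]:
    """Match each track to at most one detection, best overlaps first."""
    pairs = sorted(
        ((iou(t, d), ti, di)
         for ti, t in enumerate(tracks)
         for di, d in enumerate(detections)),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used_detections = set()
    for score, ti, di in pairs:
        if score < min_iou:
            break  # remaining pairs overlap even less
        if ti in matches or di in used_detections:
            continue
        matches[ti] = di
        used_detections.add(di)
    return matches


# Two existing tracks and three detections in the next frame.
tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(21, 21, 31, 31), (1, 1, 11, 11), (50, 50, 60, 60)]
matches = greedy_associate(tracks, detections)
```

A matcher like this can only link detections it is given, which is the dependence on input detections that the abstract describes: a target missed by the detector under occlusion has nothing to be associated with.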

Web Age Information Management

Author : Feifei Li
language : en
Publisher: Springer
Release Date : 2014-06-14

Web Age Information Management, by Feifei Li, was published by Springer on 2014-06-14 in the Computers category. Supported file formats: PDF, TXT, EPUB, Kindle, and others.

This book constitutes the refereed proceedings of the 15th International Conference on Web-Age Information Management, WAIM 2014, held in Macau, China, in June 2014. The 48 revised full papers presented together with 35 short papers were carefully reviewed and selected from numerous submissions. The papers are organized in topical sections on information retrieval; recommender systems; query processing and optimization; data mining; data and information quality; information extraction; mobile and pervasive computing; stream, time-series; security and privacy; semantic web; cloud computing; new hardware; crowdsourcing; social computing.

Trends And Topics In Computer Vision

Author : Kiriakos N. Kutulakos
language : en
Publisher: Springer
Release Date : 2013-01-18

Trends And Topics In Computer Vision, by Kiriakos N. Kutulakos, was published by Springer on 2013-01-18 in the Computers category. Supported file formats: PDF, TXT, EPUB, Kindle, and others.

The two volumes LNCS 6553 and 6554 constitute the refereed post-proceedings of 7 workshops held in conjunction with the 11th European Conference on Computer Vision, held in Heraklion, Crete, Greece in September 2010. The 62 revised papers presented together with 2 invited talks were carefully reviewed and selected from numerous submissions. The first volume contains 26 revised papers and 2 invited talks selected from the following workshops: First International Workshop on Parts and Attributes; Third Workshop on Human Motion Understanding, Modeling, Capture and Animation; and International Workshop on Sign, Gesture and Activity (SGA 2010).
