Table 13 Automatic video summarization

From: An analytical study of information extraction from unstructured and multidimensional big data

| Ref. | Approach | Technique | Purpose | Dataset | Results/limitations |
| --- | --- | --- | --- | --- | --- |
| [97] | Supervised with prior segmentation | SVM-based kernel video segmentation | Category-specific video summarization | MED summaries: training set of 12,249 videos and testing set of 60 videos | With known categories, higher-quality video summaries can be produced than with an unsupervised approach |
| [95] | Unsupervised with web-based prior information | Four baseline algorithms (random sampling, uniform sampling, k-means, and spectral clustering), followed by crowdsourcing | To deal with content sparsity and large-scale evaluation | 180 videos: 25 for training and 155 for evaluation | Content sparsity and poor quality of user-generated videos are major challenges; expert evaluation is not possible for large-scale data, so crowdsourcing was used; adding web images of a category to incorporate knowledge is time-consuming, especially for unknown categories |
| [98] | Supervised | Linear combination of submodular maximization for each objective using structured learning | To implement interestingness, representativeness, and uniformity | Egocentric dataset and SumMe dataset | Shortage of large datasets for summarization |
| [94] | Supervised | vsLSTM to model variable-range temporal dependency | To address the need for a large amount of annotated data | SumMe and TVSum | Domain adaptation can improve learning and reduce discrepancies |
| [93] | Supervised | Sequential determinantal point process: supervised DPP coupled with NN representation | To incorporate human-created summaries for selection of informative and diverse datasets | Open Video Project (50 videos), YouTube (39 videos), Kodak consumer video (18 videos) | Supervised approach with linear representation performed better |
| [99] | NA | MSR (Minimum Sparse Representation) based summarization | To utilize a minimum number of keyframes; to provide flexibility for practical applications | Open Video Project (50 videos), several-genres dataset (50 videos) | Two variants were proposed, for off-line and on-line applications; focused on selection of keyframes |
| [96] | Unsupervised | Generative adversarial framework: summarizer (autoencoder LSTM) + discriminator (LSTM) | To regularize summary length, diversity, and keyframes | SumMe, TVSum, Open Video Project, YouTube | Performance differs across datasets; deep features perform better than shallow features; frames with very slow motion and no scene change gave poor results |
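
As an illustration of two of the baseline algorithms named in the row for [95] (uniform sampling and k-means), the following is a minimal sketch, not the implementation used in any of the cited studies. It assumes each frame has already been reduced to a numeric feature vector. The function names `uniform_sample` and `kmeans_keyframes`, the deterministic centroid seeding, and the fixed iteration count are all assumptions made for this example.

```python
from math import dist  # Euclidean distance, Python 3.8+


def uniform_sample(n_frames, k):
    """Pick k frame indices spread evenly across a video of n_frames frames."""
    if k <= 0 or n_frames <= 0:
        return []
    k = min(k, n_frames)
    step = n_frames / k
    # Take the frame at the middle of each of the k equal-length segments.
    return [int(step * i + step / 2) for i in range(k)]


def kmeans_keyframes(features, k, iters=20):
    """Cluster per-frame feature vectors with plain k-means (Lloyd's algorithm)
    and return, for each cluster, the index of the frame nearest its centroid.

    Assumption for this sketch: centroids are seeded from uniformly sampled
    frames so that the result is deterministic.
    """
    n = len(features)
    if k <= 0 or n == 0:
        return []
    k = min(k, n)
    dim = len(features[0])
    centroids = [list(features[i]) for i in uniform_sample(n, k)]

    for _ in range(iters):
        # Assignment step: attach every frame to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for idx, f in enumerate(features):
            nearest = min(range(k), key=lambda c: dist(f, centroids[c]))
            clusters[nearest].append(idx)
        # Update step: move each centroid to the mean of its members.
        for c, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [
                    sum(features[i][d] for i in members) / len(members)
                    for d in range(dim)
                ]

    # One keyframe per cluster: the real frame closest to the centroid.
    keyframes = {
        min(range(n), key=lambda i: dist(features[i], centroids[c]))
        for c in range(k)
    }
    return sorted(keyframes)


if __name__ == "__main__":
    # Toy one-dimensional "features" forming two obvious groups of frames.
    toy = [[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]]
    print(uniform_sample(100, 4))
    print(kmeans_keyframes(toy, 2))
```

Uniform sampling ignores content entirely, while the k-means variant picks one representative frame per cluster of similar frames, which is why the two are commonly used as reference points when evaluating learned summarizers.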