From: An analytical study of information extraction from unstructured and multidimensional big data
Reference | Approach | Technique | Purpose | Dataset | Results/limitations |
---|---|---|---|---|---|
[97] | Supervised with prior segmentation | SVM-based kernel video segmentation | Category-specific video summarization | MED summaries: training set of 12,249 videos and testing set of 60 videos | Higher-quality video summaries can be produced with known categories than with an unsupervised approach |
[95] | Unsupervised with web-based prior information | Used four baseline algorithms: random and uniform sampling, k-means and spectral clustering, followed by crowdsourcing | To deal with content sparsity and large-scale evaluation | 180 videos: 25 for training and 155 for evaluation | Content sparsity and poor quality of user-generated videos are major challenges. Expert evaluation is not possible for large-scale data, therefore crowdsourcing was used. Adding web images of a category to incorporate knowledge is a time-consuming process, especially for unknown categories |
[98] | Supervised | Linear combination of submodular maximization for each objective using structured learning | To implement interestingness, representativeness, and uniformity | Egocentric dataset and SumMe dataset | Shortage of large datasets for summarization |
[94] | Supervised | vsLSTM to model variable-range temporal dependency | To address the need for large amounts of annotated data | SumMe and TVSum | Domain adaptation can improve learning and reduce discrepancies |
[93] | Supervised | Sequential determinantal point process: supervised DPP coupled with NN representation | To incorporate human-created summaries for selection of informative and diverse datasets | Open video project (50 videos), YouTube (39 videos), Kodak consumer video (18 videos) | Supervised approach with linear representation performed better |
[99] | NA | MSR (Minimum Sparse Representation) based summarization | To utilize a minimum number of keyframes. To provide flexibility for practical applications | Open video project (50 videos), several genres dataset (50 videos) | Two variants were proposed, for off-line and on-line applications. Focused on selection of keyframes |
[96] | Unsupervised | Generative adversarial framework: summarizer (autoencoder LSTM) + discriminator (LSTM) | To regularize the summary length, diversity, and keyframes | SumMe, TVSum, open video project, YouTube | Performance differs across datasets. Deep features perform better than shallow features. Frames with very slow motion and no scene change gave poor results |
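
Two of the baseline algorithms listed for [95], uniform sampling and k-means clustering, are simple enough to sketch. The code below is a minimal illustration only, not the setup used in [95]: the function names `uniform_sample` and `kmeans_keyframes` are hypothetical, and each frame is assumed to be already described by a fixed-length feature vector (for example, a color histogram).

```python
import random
from typing import List, Sequence

Vector = Sequence[float]


def uniform_sample(num_frames: int, k: int) -> List[int]:
    """Pick k frame indices spread evenly across the video."""
    if k <= 0 or num_frames <= 0:
        return []
    k = min(k, num_frames)
    step = num_frames / k
    # Take the middle frame of each of the k equal-length segments.
    return [int(step * i + step / 2) for i in range(k)]


def _sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_keyframes(features: Sequence[Vector], k: int,
                     iters: int = 20, seed: int = 0) -> List[int]:
    """Cluster frames with Lloyd's k-means and return, for each centroid,
    the index of the closest frame (assumption: one keyframe per cluster).
    Fewer than k indices may come back if two centroids share a nearest frame.
    """
    n = len(features)
    if k <= 0 or n == 0:
        return []
    k = min(k, n)
    rng = random.Random(seed)
    # Initialise centroids from k distinct, randomly chosen frames.
    centroids = [list(features[i]) for i in rng.sample(range(n), k)]
    for _ in range(iters):
        # Assignment step: attach every frame to its nearest centroid.
        clusters: List[List[int]] = [[] for _ in range(k)]
        for idx, feat in enumerate(features):
            nearest = min(range(k), key=lambda c: _sq_dist(feat, centroids[c]))
            clusters[nearest].append(idx)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c, members in enumerate(clusters):
            if not members:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
                continue
            dim = len(centroids[c])
            new_centroids.append(
                [sum(features[i][d] for i in members) / len(members)
                 for d in range(dim)]
            )
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    keyframes = {
        min(range(n), key=lambda i: _sq_dist(features[i], centroids[c]))
        for c in range(k)
    }
    return sorted(keyframes)
```

For instance, `uniform_sample(10, 2)` returns `[2, 7]`, the middle frames of the two halves of a ten-frame clip. The k-means variant trades that fixed spacing for content awareness: visually similar frames fall into one cluster and contribute a single keyframe, which is why clustering baselines are commonly compared against sampling baselines.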