Dense Video Captioning Using Graph-Based Sentence Summarization

TMM Volume 23 | 2021

TMM Featured Articles

By:

Zhiwang Zhang; Dong Xu; Wanli Ouyang; Luping Zhou

Dense video captioning, which detects and captions all events in a long untrimmed video, has recently made notable progress. Although promising results have been achieved, most existing methods do not sufficiently explore the scene evolution within an event temporal proposal for captioning, and therefore perform less satisfactorily when the scenes and objects change over a relatively long proposal. To address this problem, we propose a graph-based partition-and-summarization (GPaS) framework for dense video captioning that works in two stages. In the “partition” stage, a whole event proposal is split into short video segments for captioning at a finer level. In the “summarization” stage, the generated sentences, which carry rich descriptive information for each segment, are summarized into one sentence that describes the whole event. We particularly focus on the “summarization” stage and propose a framework that effectively exploits the relationships between semantic words for summarization. We achieve this goal by treating semantic words as the nodes in a graph and learning their interactions by coupling a Graph Convolutional Network (GCN) and Long Short-Term Memory (LSTM), with the aid of visual cues. Two schemes of GCN-LSTM Interaction (GLI) modules are proposed for seamless integration of GCN and LSTM. The effectiveness of our approach is demonstrated via an extensive comparison with state-of-the-art methods on two benchmarks, the ActivityNet Captions dataset and the YouCook II dataset.
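To make the two stages more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It shows one plausible way to split an event proposal into short segments, treat the words of the per-segment sentences as graph nodes, and apply a single graph-convolution step over them. Everything specific here is an assumption made for illustration: the function names, the fixed segment length, the co-occurrence rule for edges, the one-hot node features, and the identity weight matrix. The LSTM decoder, the GLI modules, and the visual cues described in the abstract are omitted.

```python
from itertools import combinations


def partition_proposal(start, end, segment_len):
    """Split the proposal [start, end) into consecutive segments of at most
    segment_len seconds (a fixed length is an assumption of this sketch)."""
    segments = []
    t = start
    while t < end:
        segments.append((t, min(t + segment_len, end)))
        t += segment_len
    return segments


def build_word_graph(sentences):
    """Use the unique words as nodes and link two words when they co-occur
    in a sentence (a hypothetical edge rule, chosen for simplicity)."""
    words = sorted({w for s in sentences for w in s.split()})
    index = {w: i for i, w in enumerate(words)}
    n = len(words)
    # Start from self-loops so every node keeps its own feature.
    adj = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for s in sentences:
        for a, b in combinations(set(s.split()), 2):
            adj[index[a]][index[b]] = 1.0
            adj[index[b]][index[a]] = 1.0
    return words, adj


def gcn_layer(adj, feats, weight):
    """One graph convolution, ReLU(D^-1 A X W), using a row-normalized
    adjacency matrix A, node features X, and weight matrix W."""
    n = len(adj)
    dim_in = len(feats[0])
    dim_out = len(weight[0])
    out = []
    for i in range(n):
        deg = sum(adj[i])  # at least 1 because of the self-loop
        # Average the features of node i and its neighbours.
        agg = [sum(adj[i][j] * feats[j][k] for j in range(n)) / deg
               for k in range(dim_in)]
        # Linear projection followed by ReLU.
        out.append([max(0.0, sum(agg[k] * weight[k][m] for k in range(dim_in)))
                    for m in range(dim_out)])
    return out


# "Partition": a 10-second proposal split into segments of up to 4 seconds.
segments = partition_proposal(0.0, 10.0, 4.0)

# Hypothetical per-segment captions standing in for a captioning model's output.
sentences = ["a man slices onions", "a man fries onions"]

# "Summarization" input: word graph plus one round of message passing.
words, adj = build_word_graph(sentences)
identity = [[1.0 if i == j else 0.0 for j in range(len(words))]
            for i in range(len(words))]
node_states = gcn_layer(adj, identity, identity)

print(segments)
print(words)
```

In this toy setup, "slices" and "fries" never appear in the same sentence, so they share no edge and exchange no information in a single layer, while words such as "man" and "onions" connect to both. In a full system along the lines of the abstract, node states like these would feed a sentence decoder rather than being printed.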

Tags:

IEEE TMM Article
