Image-Text Parsing for Understanding Social and Political News Events
Rapidly changing technologies of multi-modal communication – from the global reach of international satellite TV and the proliferation of Internet news outlets to YouTube – are transforming the news industry. In parallel, “citizen journalism” is on the rise, enabled by smartphones, social networks, and blogs. The Internet is becoming a vast information ecosystem driven by mediated events – elections, social movements, natural disasters, disease epidemics – with rich heterogeneous data: text, image, and video. Meanwhile, the tools and methodologies available to users and researchers are not keeping pace: it remains prohibitively labor-intensive to systematically access and study the vast amount of emerging news data.
We propose to develop a new computational paradigm for analyzing massive datasets of social and political news events:
- (i) Studying joint image-text parsing to categorize news by topics and events, and analyzing selection and presentation biases across networks and media spheres in a statistical and quantitative manner never before possible;
- (ii) Studying joint image-text mining to reason about persuasion intent, and modeling the techniques of verbal and visual persuasion;
- (iii) Discovering spatio-temporal patterns in the interactions of multiple mediated events, and analyzing agenda setting patterns;
- (iv) Developing an interactive multi-perspective news interface, vrNewsScape, for visualizing and interacting with our computational and statistical results.
This interdisciplinary project will make innovative contributions to three disciplines.

Transforming social science research: The project develops a data-driven paradigm for transforming communication research in the social sciences. By enabling quantitative studies of massive visual datasets, we will be able to identify and characterize large-scale patterns of news mediation and persuasion currently inaccessible to researchers due to the prohibitive cost of manual analysis.

Paradigm-shifting innovations in computer vision: We will go beyond traditional object detection, segmentation, and recognition by studying framing and persuasion techniques in images, an untouched topic in computer vision. We will learn the semantic associations and meanings of object and scene categories in their social context. We will study image parsing to bridge the semantic gap – a long-standing technical barrier in image retrieval – and will generate narrative text descriptions from the parse trees so that they can be fused with the input text and closed captioning for topic mining.

Disruptive innovations in text mining: This research will go beyond conventional topic mining from text to perform integrative text-image mining, bias detection, and pattern discovery in the spatio-temporal evolution of mediated news events. Our proposed research will detect and summarize controversy and mine user-generated content for analyzing communicative intent and persuasive effects.
The proposed vrNewsScape will be made publicly available to researchers and graduate students. Because the news media report on events in many different expert domains – including congressional and presidential politics, international relations, war and public uprisings, natural disasters and humanitarian aid missions, disease epidemics and health initiatives, criminal activity and court cases, celebrities and cultural events – the analytical tools we propose to develop are not limited to a particular research domain in the social, political, or computer sciences, but permit, for the first time, a systematic and quantitative examination of the massive datasets required to understand today’s mediated society.
This work is supported by the National Science Foundation under grant CNS 1028381 (Cyber-enabled Discovery Initiative).
Program manager: Dr. Anita J. La Salle