
Lecture Description
We’ll try to define computational journalism as the application of computer science to four areas: data-driven reporting, story presentation, information filtering, and effect tracking. But first we have to figure out how to represent the outside world as data. We do this using the feature vector representation, in which each object is described by a list of numbers. One of the most useful things we can do with such vectors is compute the distance between any two of them. We can also visualize the entire vector space, but to do this we have to project the high-dimensional space down to the two dimensions of the screen.
Topics: The definition of computational journalism, encoding the world as feature vectors, distance metrics, clustering algorithms, and visualization using multi-dimensional scaling.
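As a rough illustration of the feature vector and distance ideas above, here is a minimal sketch in Python. It assumes a simple word-count encoding of short texts; the sample documents, the function names, and the choice of Euclidean and cosine distance are illustrative assumptions, not material taken from the lecture.

```python
import math
from collections import Counter

def feature_vector(text, vocabulary):
    """Encode a text as word counts, one dimension per vocabulary word."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 minus the cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # convention chosen here for empty vectors
    return 1.0 - dot / (norm_a * norm_b)

# Hypothetical example documents (invented for this sketch).
documents = {
    "budget": "council votes on city budget",
    "budget_followup": "city council delays budget vote",
    "football": "local team wins football final",
}

# The vocabulary fixes the meaning of each dimension.
vocabulary = sorted({w for text in documents.values() for w in text.lower().split()})
vectors = {name: feature_vector(text, vocabulary) for name, text in documents.items()}

near = euclidean_distance(vectors["budget"], vectors["budget_followup"])
far = euclidean_distance(vectors["budget"], vectors["football"])
print(near, far)
```

The two budget stories share several words, so they end up closer together than either is to the football story. Clustering and multi-dimensional scaling both start from pairwise distances of this kind.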
Course blog at jmsc.hku.hk/courses/jmsc6041spring2013/
Instructor: Jonathan Stray
Course Index
- Basics of Computational Journalism: Feature Vectors, Clustering, Projections
- Text Analysis: Tokenization, TF-IDF, Topic Modeling
- Algorithmic Filters: Information Overload
- Social and Hybrid Filters: Collaborative Filtering
- Social Network Analysis: Centrality Algorithms
- Knowledge Representation: Structured data & Linked open data
- Drawing Conclusions from Data
- Security, Surveillance, and Privacy
Course Description
Computational Journalism is a course given at JMSC during the Spring 2013 semester. It covers, in great detail, some of the most advanced techniques used by journalists to understand digital information, and communicate it to users. We will focus on unstructured text information in large quantities, and also cover related topics such as how to draw conclusions from data without fooling yourself, social network analysis, and online security for journalists. These are the algorithms used by search engines and intelligence agencies and everyone in between.