Hi there!

I am Johannes Knittel, a Wojcicki Troper HDSI Postdoctoral Fellow and member of the Visual Computing Group at Harvard. I am passionate about researching new ways of combining visualization and machine learning to gain more comprehensive insights into textual and multivariate datasets, as well as to better explain and understand black-box AI models. I am also keen on developing data structures, algorithms, and visualizations that scale to larger, real-world datasets, both offline and online.

Google Scholar GitHub LinkedIn Twitter

Selected Papers

Real-Time Visual Analysis of High-Volume Social Media Posts
Johannes Knittel, Steffen Koch, Tan Tang, Wei Chen, Yingcai Wu, Shixia Liu, and Thomas Ertl
2022 Honorable Mention IEEE Transactions on Visualization and Computer Graphics 28, no. 1 (2022): 879–89

This paper tackles the challenge of monitoring and analyzing a large number of social media posts in real time. It proposes an efficient and explainable dynamic clustering algorithm that provides a continuous landscape of topics, as well as a diverse yet digestible stream of posts. Analysts can gradually increase the resolution to dive deeper into particular topics.

Visual Neural Decomposition to Explain Multivariate Data Sets
Johannes Knittel, Andres Lalama, Steffen Koch, and Thomas Ertl
2021 IEEE Transactions on Visualization and Computer Graphics 27, no. 2 (2021): 1374–84

Given a multivariate dataset, investigating which combinations of variables relate to a particular dependent variable (and how) is an important task in many fields. This paper proposes a novel and scalable method based on neural networks for extracting potentially nonlinear multivariate relationships with more than three variables. It further presents stacked histograms and smartly sorted parallel coordinates for visualizing the extracted patterns.

PyramidTags: Context-, Time- And Word Order-Aware Tag Maps to Explore Large Document Collections
Johannes Knittel, Steffen Koch, and Thomas Ertl
2021 IEEE Transactions on Visualization and Computer Graphics 27, no. 12 (2021): 4455–68

PyramidTags extracts words and short phrases from document collections and places them onto an interactive map such that analysts can infer in which time range particular tags appear the most in the collection. Since it also places related tags nearby and tries to respect the reading order of the tags, analysts can visually explore large sets of documents (e.g., news reports) without hard topics.

Projects

ESP-kMeans

ESP-kMeans is a fast and easy-to-use library for clustering high-dimensional and potentially sparse data with k-Means++ or Spherical k-Means. The goal of this library is to cluster large datasets efficiently even if the number of clusters k is high. It has a highly parallel implementation and applies several optimizations to reduce the number of comparisons. For instance, the Spherical k-Means implementation achieves sublinear scaling with respect to the number of clusters if applied to sparse data (e.g., text documents).

ELSKE

ELSKE is a library that extracts relevant keywords and keyphrases from text sources. It aims to provide an efficient implementation for extracting keyphrases not only from individual documents but also from (micro-)document collections such as tweets, as well as from sentences. The library is very lightweight and makes few assumptions about the data, so it can also be used in an interactive streaming setting where keyphrases need to be extracted at regular intervals. It further supports extracting frequently appearing longer phrases that are often richer in context.

Education

2018-2022
Dr. rer. nat. (PhD), Informatik (Computer Science)
University of Stuttgart
Dissertation: Large-Scale Analysis of Textual and Multivariate Data Combining Machine Learning and Visualization
Summa cum laude

2012-2015
Master of Science, Informatik (Computer Science)
University of Stuttgart
Graduated with distinction

2009-2012
Bachelor of Science, Informatik (Computer Science)
University of Stuttgart

Papers

The Role of Interactive Visualization in Explaining (Large) NLP Models: from Data to Inference
Richard Brath, Daniel Keim, Johannes Knittel, Shimei Pan, Pia Sommerauer, and Hendrik Strobelt
2023 arXiv preprint

Real-Time Visual Analysis of High-Volume Social Media Posts
Johannes Knittel, Steffen Koch, Tan Tang, Wei Chen, Yingcai Wu, Shixia Liu, and Thomas Ertl
2022 Honorable Mention IEEE Transactions on Visualization and Computer Graphics 28, no. 1 (2022): 879–89

Visual Neural Decomposition to Explain Multivariate Data Sets
Johannes Knittel, Andres Lalama, Steffen Koch, and Thomas Ertl
2021 IEEE Transactions on Visualization and Computer Graphics 27, no. 2 (2021): 1374–84

PyramidTags: Context-, Time- And Word Order-Aware Tag Maps to Explore Large Document Collections
Johannes Knittel, Steffen Koch, and Thomas Ertl
2021 IEEE Transactions on Visualization and Computer Graphics 27, no. 12 (2021): 4455–68

Efficient Sparse Spherical K-Means for Document Clustering
Johannes Knittel, Steffen Koch, and Thomas Ertl
2021 In Proceedings of the 21st ACM Symposium on Document Engineering, DocEng 2021

ELSKE: Efficient Large-Scale Keyphrase Extraction
Johannes Knittel, Steffen Koch, and Thomas Ertl
2021 In Proceedings of the 21st ACM Symposium on Document Engineering, DocEng 2021

Online Study of Word-Sized Visualizations in Social Media
Franziska Huth, Miriam Awad-Mohammed, Johannes Knittel, Tanja Blascheck, and Petra Isenberg
2021 In Proceedings of the EuroVis 2021 Posters

PlotThread: Creating Expressive Storyline Visualizations Using Reinforcement Learning
Tan Tang, Renzhong Li, Xinke Wu, Shuhan Liu, Johannes Knittel, Steffen Koch, Thomas Ertl, Lingyun Yu, Peiran Ren, and Yingcai Wu
2021 IEEE Transactions on Visualization and Computer Graphics 27, no. 2 (2021): 294–303

Pattern-Based Semantic and Temporal Exploration of Social Media Messages
Johannes Knittel, Steffen Koch, and Thomas Ertl
2019 In Proceedings of the 2019 IEEE Conference on Visual Analytics Science and Technology, VAST 2019, 134–35

Interactive Hierarchical Quote Extraction for Content Insights
Johannes Knittel, Steffen Koch, and Thomas Ertl
2019 In Proceedings of the EuroVis 2019 Posters

Highlighting Text Regions of Interest with Character-Based LSTM Recurrent Networks
Johannes Knittel, Steffen Koch, and Thomas Ertl
2018 In Proceedings of the 2018 Postersession at the IEEE Conference on Visualization

Mining Subtitles for Real-Time Content Generation for Second-Screen Applications
Johannes Knittel and Tilman Dingler
2016 In Proceedings of the ACM International Conference on Interactive Experiences for TV and Online Video, 93–103, TVX ’16

Utilizing Contextual Information for Mobile Communication
Johannes Knittel, Alireza Sahami Shirazi, Niels Henze, and Albrecht Schmidt
2013 In CHI ’13 Extended Abstracts on Human Factors in Computing Systems, 1371–1376, CHI EA ’13

Contact

Harvard University
Science and Engineering Complex (SEC), room 2.419
150 Western Ave
Boston, MA 02134