Embedding Comparator: Visualizing Differences in Global Structure and Local Neighborhoods via Small Multiples
Angie Boggust*, Brandon Carter*, Arvind Satyanarayan
arXiv:1912.04853

The Embedding Comparator (left) facilitates comparison of embedding spaces via small multiples of Local Neighborhood Dominoes (right).

Abstract

Embeddings (mappings from high-dimensional discrete input to lower-dimensional continuous vector spaces) have been widely adopted in machine learning, linguistics, and computational biology, as they often surface interesting and unexpected domain semantics. Through semi-structured interviews with embedding model researchers and practitioners, we find that current tools poorly support a central concern: comparing different embeddings when developing fairer, more robust models. In response, we present the Embedding Comparator, an interactive system that balances gaining an overview of the embedding spaces with making fine-grained comparisons of local neighborhoods. For a pair of models, we compute the similarity of the k-nearest neighbors of every embedded object, and visualize the results as Local Neighborhood Dominoes: small multiples that facilitate rapid comparisons. Using case studies, we illustrate the types of insights the Embedding Comparator reveals, including how fine-tuning embeddings changes semantics, how language changes over time, and how training data differences affect two seemingly similar models.
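
The per-object neighborhood comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the distance metric or the similarity measure, so this sketch assumes cosine similarity for finding neighbors and Jaccard overlap for comparing the two neighbor sets. The function names (`cosine_similarity`, `nearest_neighbors`, `neighborhood_similarity`) and the toy vectors are hypothetical and are not taken from the Embedding Comparator code.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two non-zero vectors given as sequences of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbors(embedding, key, k, candidates):
    """Return the set of the k candidates most similar to `key` (assumed metric: cosine)."""
    # Sort by name first so that ties are broken deterministically.
    others = sorted(name for name in candidates if name != key)
    others.sort(
        key=lambda name: cosine_similarity(embedding[key], embedding[name]),
        reverse=True,
    )
    return set(others[:k])


def neighborhood_similarity(model_a, model_b, k):
    """Jaccard overlap (an assumed measure) of each object's k-NN sets in two models."""
    shared = model_a.keys() & model_b.keys()  # compare only objects present in both
    scores = {}
    for key in shared:
        nn_a = nearest_neighbors(model_a, key, k, shared)
        nn_b = nearest_neighbors(model_b, key, k, shared)
        union = nn_a | nn_b
        scores[key] = len(nn_a & nn_b) / len(union) if union else 1.0
    return scores


# Toy 2-D embeddings (made up for illustration): only "dog" moves between models.
model_a = {"cat": [1.0, 0.0], "dog": [0.9, 0.1], "car": [0.0, 1.0], "bus": [0.1, 0.9]}
model_b = {"cat": [1.0, 0.0], "dog": [0.2, 0.8], "car": [0.0, 1.0], "bus": [0.1, 0.9]}

scores = neighborhood_similarity(model_a, model_b, k=1)
least_similar = sorted(scores, key=scores.get)  # most-changed objects first
```

Sorting objects by their score, as in `least_similar`, gives one plausible way to surface the objects whose local neighborhoods differ most between the two models. The pairwise search here is quadratic in the vocabulary size, so a real system would likely rely on a vectorized or approximate nearest-neighbor search instead.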

Materials
BibTeX
@article{boggust2019embedding,
  title={Embedding Comparator: Visualizing Differences in Global Structure and Local Neighborhoods via Small Multiples},
  author={Boggust, Angie and Carter, Brandon and Satyanarayan, Arvind},
  journal={arXiv preprint arXiv:1912.04853},
  year={2019}
}