Abbreviation on Demand
A Crowdsourced Approach to Building an API for Abbreviating Long Words in Labels
Mariana Shimabukuro and Christopher Collins
This project is the product of my M.Sc. thesis, defended in 2017. The problem we set out to solve is that, in visualizations, text labels are sometimes too long to fit in the available space. Common solutions include reducing the font size, which can cause legibility problems, or applying techniques such as dropping vowels and/or truncating words to make them fit. The picture below shows a visual comparison of these known techniques with the Abbreviation on Demand technique.
The Abbreviation on Demand technique is essentially a recommendation algorithm that uses data on which letters are less important; consequently, the least important letters can be dropped automatically to create an abbreviation. The abbreviations created by this technique aim to make the word fit in a given space (screen space in pixels, taking the typeface configuration into account) or in a given number of characters, while keeping as many letters as possible.
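The letter-dropping idea described above can be illustrated with a minimal Python sketch. This is not the project's implementation: the weights in `ASSUMED_KEEP_WEIGHT`, the position bonus, and the function names `keep_score` and `abbreviate` are all hypothetical stand-ins for the crowdsourced letter-importance data, and the sketch only handles a character limit (a pixel limit would additionally need a text-measurement function for the chosen typeface).

```python
# Hypothetical letter weights (higher = more important to keep).
# Assumption for illustration only: vowels are more droppable than
# consonants. The real technique derives importance from study data.
ASSUMED_KEEP_WEIGHT = {c: 0.2 for c in "aeiou"}


def keep_score(word: str, index: int) -> float:
    """Heuristic importance of the letter at `index` in `word`."""
    if index == 0:
        return float("inf")  # assumption: never drop the first letter
    base = ASSUMED_KEEP_WEIGHT.get(word[index].lower(), 1.0)
    # Assumption: letters closer to the start are slightly more important.
    return base + 0.5 * (1 - index / len(word))


def abbreviate(word: str, max_chars: int) -> str:
    """Drop the least important letters until `word` fits in `max_chars`."""
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")
    if len(word) <= max_chars:
        return word  # already fits, keep every letter
    scores = [keep_score(word, i) for i in range(len(word))]
    # Least important positions first; on ties, drop later letters first.
    drop_order = sorted(range(len(word)), key=lambda i: (scores[i], -i))
    to_drop = set(drop_order[: len(word) - max_chars])
    return "".join(ch for i, ch in enumerate(word) if i not in to_drop)


if __name__ == "__main__":
    for limit in (13, 10, 8, 6):
        print(limit, abbreviate("visualization", limit))
```

Because letters are removed one importance rank at a time, the result always uses the full character budget, which mirrors the goal of keeping as many letters as possible.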
The data fed into the algorithm was collected in an adaptive crowdsourced study, which gathered abbreviations for given English words and original words for given abbreviations. By adaptive crowdsourced study, we mean that we had a pre-set list of words to be abbreviated, and we used a ranking algorithm to select a relevant abbreviation for a word when participants moved on to the task in which they had to guess the original word from an abbreviation. The rationale is that several abbreviations were created for each word, so in order to test the abbreviations without running two separate studies, we had to create a way to select the relevant ones (relevance was calculated based on the similarity of the abbreviation to the original word and some other factors for de-biasing the algorithm).
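As a rough illustration of the selection step, the sketch below ranks candidate abbreviations by their similarity to the original word. It rests on several assumptions: the similarity measure (`difflib.SequenceMatcher` from the Python standard library) is a stand-in for whatever measure the study used, higher similarity is taken to mean more relevant, the de-biasing factors mentioned above are not modeled, and the function names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Iterable, List


def similarity(word: str, abbreviation: str) -> float:
    """Similarity in [0, 1] between a word and a candidate abbreviation.

    Assumption: SequenceMatcher's ratio stands in for the study's measure.
    """
    return SequenceMatcher(None, word.lower(), abbreviation.lower()).ratio()


def rank_abbreviations(word: str, candidates: Iterable[str]) -> List[str]:
    """Return unique, non-empty candidates ordered from most to least similar."""
    unique = sorted({c for c in candidates if c})  # deterministic tie order
    return sorted(unique, key=lambda c: similarity(word, c), reverse=True)


if __name__ == "__main__":
    collected = ["info", "inf", "infrmtn", "info"]
    for candidate in rank_abbreviations("information", collected):
        print(candidate, round(similarity("information", candidate), 3))
```

In an adaptive study, a ranking like this would be recomputed as new abbreviations arrive, and a top-ranked candidate would be shown to the next participant in the decoding task.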
The adaptive study design is what allowed us to run both tasks in the same study; however, the same results could have been achieved if we had manually selected the abbreviations using the same criteria.
Finally, the performance of our Abbreviation on Demand API can be improved by feeding it more data about how abbreviations are created: the more data, the better the algorithm can abbreviate different words. Our next step is therefore to collect more abbreviations to feed into the Abbreviation on Demand API. To validate our approach, we also plan to run an evaluation comparing the performance of our technique with the previously mentioned techniques.
We can see below some other examples of the Abbreviation on Demand technique being applied to visualizations.
A known problem in information visualization labeling is that the text can be too long to fit in the label space. One commonly used technique for solving this problem is setting a very small font size; however, the font size is sometimes so small that the text becomes difficult to read. Wrapping sentences, dropping letters, and truncating text can also be used. However, there is no research on how these techniques affect the legibility and readability of the visualization. In other words, we don’t know whether applying these techniques is the best way to tackle this issue. This thesis describes the design and implementation of a crowdsourced study that uses a recommendation system to narrow down the abbreviations created by participants, allowing us to efficiently collect and test the data in the same session. The study design also aims to investigate the effect of semantic context on the abbreviations that participants create and on their ability to decode them. Finally, based on the analysis of the study data, we present a new technique that automatically makes words as short as they need to be while maintaining text legibility and readability.
The Abbreviation on Demand API
Based on this project, we implemented an API and made it available online so that other programmers can use our abbreviation algorithm in their web applications.
API available at: https://abbreviation.vialab.ca
GitHub project: https://github.com/vialab/Abbreviation-On-Demand-API
Demo and Supplemental Materials
For demos of our Abbreviation on Demand algorithm and visualizations of our study data, visit: http://vialab.science.uoit.ca/abbrVisualization/
An Adaptive Crowdsourced Investigation of Word Abbreviation Techniques for Text Visualizations. 2017 (Master’s Thesis).
Abbreviating Text Labels on Demand. 2017 (Poster Paper).
Abbreviating Text Labels on Demand. 2017 (Poster).