Machine learning, and in particular Natural Language Processing, has become a black box that most people just accept as a part of their lives.
"Hey Siri", "Hey Alexa", and "Ok Google" have all become nearly household phrases, but understanding the data underneath their responses is left to the experts.
...and communicating the importance of these data is even more difficult.
Data visualization methods for NLP do exist, but nothing much more robust than the brat rapid annotation tool is currently in production.
Further, web apps like this one try to visualize things like Twitter sentiment but fail to provide clear context for the content.
This is where Emory NLP comes in: providing a complete, dynamic, and interactive visualization system for researchers and the public alike.
I love graphic design and take a lot of inspiration from robust logo systems, like the ones below: they are adaptable and, although they look different with different data, remain inherently unified.
From there, I knew I needed an adaptable visual system, one that could take complex data, let users explore those data visually, and surface differences between parts of the data through animation, color, and scale.
Instead of showing word dependencies, parts of speech, and other metadata in a cluttered fashion all at once, I prototyped a depth-separated menu for every word that reveals its dependencies, part of speech, and coreferences.
Instead of being confined to traditional 2D scrolling when analyzing a long list of sentences, I created a 3D sentence-scrolling concept that lets users focus on any given sentence while still maintaining context from the surrounding content.
To help address a fundamental question about how people learn language and understand grammar, linguistics and NLP researchers often use dependency trees to visualize grammatical relationships between the words in a sentence. However, most diagrams are static and offer little context about where the relationships come from, or how they show up in the running text.
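For concreteness, here is a minimal sketch of how a dependency parse is commonly stored: a flat, CoNLL-style token list where each token points at its head. This is an illustrative example, not the exact structure used in the project.

```javascript
// Illustrative only: a dependency parse for "The cat sat", where each
// token records its surface form, part of speech, head token id, and
// dependency relation. A head of 0 marks the root of the tree.
const parse = [
  { id: 1, form: "The", pos: "DT", head: 2, deprel: "det" },
  { id: 2, form: "cat", pos: "NN", head: 3, deprel: "nsubj" },
  { id: 3, form: "sat", pos: "VBD", head: 0, deprel: "root" },
];

// Recover the children of a given head, which is all a renderer needs
// to draw the tree's arcs from this flat list.
function childrenOf(parse, headId) {
  return parse.filter((t) => t.head === headId).map((t) => t.form);
}

console.log(childrenOf(parse, 3)); // words attached to "sat"
```

Because the structure is just a list of tokens, the same data can be rendered either as running text or as a tree, which is what makes a smooth "Treeify"/"Textify" transition possible.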
I took this problem and created a small prototype in Principle that smoothly transitions a normal sentence into a dependency tree that allows you to explore deeper and focus on individual words. I will soon be implementing this with D3.js.
Perhaps my favorite part of this was the button I created that transitions between "Treeify" and "Textify": it visually embodies the transition between visualization states, something I consider very important in button and interaction design.
First, we have to look at how the data structures are organized once a sentence is passed through the sentiment analysis algorithm:
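I won't reproduce the exact schema here, but a simplified shape consistent with the description that follows (a sentence-level sentiment vector plus per-word weights) might look like this. The field names are hypothetical:

```javascript
// Hypothetical shape only: one sentence after sentiment analysis, with a
// sentence-level sentiment vector [negative, neutral, positive] that sums
// to 1, and a weight per word indicating its contribution to the score.
const analyzed = {
  sentence: "I really love this",
  sentiment: [0.03, 0.12, 0.85],
  words: [
    { form: "I", weight: 0.05 },
    { form: "really", weight: 0.25 },
    { form: "love", weight: 0.6 },
    { form: "this", weight: 0.1 },
  ],
};
```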
From here, we went to the drawing board and took this general structure to see how we could map these values into a digestible visualization.
We realized that the sentiment score of each sentence could be represented as a vector.
By multiplying the components of the sentiment score by 255, we can produce a corresponding RGB value.
Very negative is very red.
Very neutral is very green.
Very positive is very blue.
Then, using each word's weight, we can translate that into either opacity or relative scale.
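A minimal sketch of that mapping, assuming the sentiment score is a three-component vector [negative, neutral, positive] with each component in [0, 1] (the function names are mine):

```javascript
// Sketch of the sentence-color mapping described above: scale each
// component of the sentiment vector to [0, 255], so negative drives
// red, neutral drives green, and positive drives blue.
function sentimentToRGB([neg, neu, pos]) {
  return {
    r: Math.round(neg * 255),
    g: Math.round(neu * 255),
    b: Math.round(pos * 255),
  };
}

// Word-level weights can drive opacity (or relative scale) by
// normalizing each weight against the largest weight in the sentence.
function weightToOpacity(weight, maxWeight) {
  return maxWeight > 0 ? weight / maxWeight : 0;
}

console.log(sentimentToRGB([0.8, 0.15, 0.05])); // strongly negative: mostly red
```

One nice property of this mapping is that mixed sentiment produces mixed color automatically, so ambivalent sentences read visually as muddier, in-between hues.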
All of this put together gives us a dynamic, adaptable visualization system that produces some interesting results:
AWS Collaborates with Emory University to Develop Cloud-Based NLP Research Platform Using Apache MXNet
I ended up building a web-based interface for this in React and Redux.
Visit demo.elit.cloud for a live demo :)
In the near future, we will be conducting several user studies to explore the effectiveness of different visualization techniques on users' information comprehension.
The nature of my visualization algorithm allows it to produce some pretty interesting pieces of artwork. In particular:
Analyzing speeches without their words allows you to visualize the emotional polarization of large pieces of text in a beautiful, intuitive way.