Cabinet Tree: an orthogonal enclosure approach to visualizing and exploring big data
© Yang et al. 2015
Received: 28 February 2015
Accepted: 29 June 2015
Published: 22 July 2015
Treemaps are well-known for visualizing hierarchical data. Most related approaches have been focused on layout algorithms and paid little attention to other display properties and interactions. Furthermore, the structural information in conventional Treemaps is too implicit for viewers to perceive. This paper presents Cabinet Tree, an approach that: i) draws branches explicitly to show relational structures, ii) adapts a space-optimized layout for leaves and maximizes the space utilization, iii) uses coloring and labeling strategies to clearly reveal patterns and contrast different attributes intuitively. We also apply the continuous node selection and detail window techniques to support user interaction with different levels of the hierarchies. Our quantitative evaluations demonstrate that Cabinet Tree achieves good scalability for increased resolutions and big datasets.
Much of data we use today has a hierarchical structure. Examples of hierarchical structures include university-department structure, family tree, library catalogues and so on. Such structures not only play significant roles in their own right, but also provide means for representing a complex domain in a manageable form. Current GUI tools, such as traditional node-link diagrams or file browsers, are an effective means for users to locate information, however one major drawback of common node-link representations is that they do not use screen real estate very efficiently [1, 2].
In the real world, hierarchical structures are often very large with thousands or even millions of elements and relationships. Therefore, a capability of visualizing the entire structure while supporting deep exploration at different levels of granularity is urgently needed for effective knowledge discovery . Enclosure or space-filling visualization, such as Treemaps techniques [4, 5] propose an interesting approach to solve this problem. The Treemap algorithm ensures almost 100 % use of the space by dividing it into a nested sequence of rectangles whose areas correspond to an attribute of the dataset, effectively combining features of a Venn diagram and a pie chart . Originally designed to visualize files on a hard drive , Treemaps have been applied to a wide variety of areas ranging from financial analysis, sport reporting , image browsing  and software and file system analysis .
As an important application issue, scalability refers to the capability of effectively displaying large amounts of data . Pixel is the smallest addressable element in a display device, so screen resolutions become the limiting factor for scalable visualizations. Larger displays with higher resolutions are being developed for visualization  (e.g. the large wall at the AT&T Global Network Operations Center ). Therefore, scalability for high resolutions and large data sets become crucial for visualizing big data.
Much attention has been devoted in recent years to enhance the layout algorithm of Treemaps (e.g., [4–6, 14, 15]). Few studies, however, paid attention to the improvement of interaction techniques for navigating Treemaps or other display properties. Yet, Treemaps are not very convenient for exploring large hierarchies, especially when it is necessary to get access to details . It also requires extra cognitive effort for viewers to perceive and understand the relational structures that are implicit in the enclosure . Hence, the use of other display properties (e.g. color, label) is important for an intuitive visualization and efficient interaction techniques are necessary for navigating large Treemap to view details.
Interleaved Horizontal-Vertical and explicit drawing of branches and space-optimized layout for leaves, generating a highly compact and intuitive view;
A contrast-enhanced color strategy and color-coded sorting of leaves to reveal visual patterns;
Focus+context based interaction support at different levels of hierarchy;
Quantitative evaluation of scalability for big data (including hundreds of thousands of nodes) with increased resolutions.
Background and literature review
The design of an interactive visualization is often considered as two steps.The first step is to map the relational data into a geometrical plane. i.e. layout. The second step is interaction, i.e. changing views interactively to reach the desired information . However, display properties are also very helpful in providing insights in the hierarchical structure . We review related work on layout design, the use of display properties and interaction design.
Related work of layout
good space utilization and aspect ratios, but lost ordering
better spatial continuity
polygons are used to enhance the visual presentation
suitable for unit based inputs
Since the insight of structure information and the space utilization are both important for visualizing big hierarchies, yet it is very hard to achieve these two conflicting goals in a single visualization design, many tree visualization approaches make a trade-off between them.
In Treemap visualization, once the bounding box of a node is set, a variety of display properties determine how the node is drawn within it, such as color, texture, border, label, and etc.
Related work of display properties
blended colors, texture, shading
Although experimental studies show show that graphic displays reduce task completion time more significantly than tabular text displays, text labels remain critical in identifying elements of the display . Text labels are applied in Treemap right after it being created to help orient viewers . Excentric Labeling has been applied in Treemap to show labels in an interactive manner . However, the color and orientation of labels have not been comprehensively studied.
Conventional Treemaps allow the Zooming+Filtering interaction: clicking on a node or a subtree border zooms in and a click with the right mouse button zooms out . However, when zooming in/out a Treemap, the context is always lost. Treemap may apply the Focus+Context technique to provide a detailed view of a focused area while maintaining the global context. Keahey uses fisheye techniques to obtain seamless multilevel views in Treemap interaction . Fisheye and continuous semantic zooming is introduced to facilitate direct browsing of hierarchical content . Distortion technology emphasizes the focused area. However, the size of a node in Treemap usually encodes a quantitative attribute, so the distortion could confuse users.
Related work of interaction
Structure-aware multi-scale navigation
node selection and snap-zoom
Research design and methodology
Balanced trade-off between space utilization and clarity of relational structures;
Scalable for increased resolutions and data sizes;
Fast layout algorithm;
Proper mapping of display properties to data attributes;
Intuitive navigation and node selection at different hierarchical levels;
Views with focused details and the global context.
Previous user studies report that S&D Treemap has the best readability [6, 19], so our design follows the 2D orthogonal concept of S&D Treemap to enable a fast cognitive process. Cabinet Tree draws branches explicitly to reveal their hierarchical relationships. Moreover, a space-optimized layout algorithm is applied to leaves to obtain a compact view. Cabinet Tree is generally scalable for increased resolutions and data sizes as demonstrated by our experiments.
Comparison of attributes within a hierarchical structure is crucial for many applications, however, rectangles in S&D Treemap are hard to compare . In Cabinet Tree, a contract-enhanced color strategy is adopted to the mapping of attributes for viewers to compare them easily. We apply color-coded sorting to data items to reveal visual patterns among groups of related nodes. The color and orientation of labels are also studied in our work.
To overcome the node selection dilemma in Treemap , we support continuous node selection using mouse wheel. Furthermore, the focus window technique is implemented to show the details without distortion. The remaining part of this section addresses the design rationale and realization of Cabinet Tree.
Interleaved horizontal-vertical partitioning
The drawing starts from the bottom horizontal line that represents the root of the tree. Level-1 branches are drawn as vertical lines partitioning the space above the root. Level-2 branches partition the space horizontally between two neighboring level-1 lines, representing the children of the vertical line on the left. Similarly, level-3 branches partition the space vertically between two neighboring level-2 lines, the partitioning process continues down to the lowest level, which can be leaves or empty branches. Leaves occupy the remaining space within the surrounding level lines. In the following discussion, we will generally call a branch or a leaf as a node.
Cabinet Trees allow branches to have no leaves, as noted in Fig. 3 where G1, I1, J1, and K1 are leaf-less branches.
The layout algorithm is outlined in Algorithm 1, that is conceptually recursive but implemented iteratively:
The algorithm follows the partitioning concept of S&D Treemap for branches and adopts a space-optimized approach to allocate space for leaves. The time complexity of this algorithm is linear (O(n)) where n is the number of nodes, since it calculates the layout for each node only once. It is therefore suitable for real-time visualization of big hierarchical data .
Branches are orthogonally drawn with decreasing thicknesses from the root to the lowest level. The exact thickness of a branch is determined by the available space to the branch. Once the space for a branch is allocated, its thickness is calculated according to the space allocated for the branch with a minimal and maximum limits. All the branches are expanding upward vertically and rightward horizontally.
The space allocated for each leaf is calculated according to its weight and the space available from its branch. Each node is placed next to its siblings to achieve a high space utilization and also high proximity of sibling nodes .
Contrast-enhanced color strategy
We use HSI color space to present the color of each branch, the deeper of the level is, the lighter the intensity.
Labeling every item statically on a dense visualization is unrealistic , we therefore use the “Label-What-You-Can” technique” : label only the nodes with enough spaces. Each node is labeled with the text in the same orientation as the node’s orientation.
The color of a text label is crucial for the label to stand out from the background. We use YUV color space to generate the color for labels, because YUV works well for optimal foreground and background detection . Assuming that Y represents a node’s overall brightness or luminance, we compare Y with a threshold, and label it black or white depending on whether Y is less or larger than the threshold. This strategy ensures a high color contrast for labeling.
Continuous node selection
The interaction process of continuous node selection allows a user to go up and down in a branch of nodes. We designate mouse wheel as input device to support continuous node selection. This is because mouse wheel provides the audio and tactile feedback and fine control at short distances scrolling [29, 30] and is thus perfect for precise continuous navigation.
When the viewer moves mouse within Cabinet Tree, the smallest leaf at the mouse position is shown as the selected node and a forward (resp. backward) notch scroll drills down (resp. roll up) one level along the branch of this leaf.
Results and discussion
Binary files (in Moderate Pink) and source code files (in Soft Red),
Media files (in Soft Orange),
Small folders (bundled in Green) and files with unspecified types (in Vivid Yellow), and
Compressed files(in Light Yellow).
Cabinet Tree is able to handle big data with hundreds of thousands of nodes. For example, Fig. 2 presents a C drive contents on a personal computer with 455,940 files and 74,350 directories (including 3,520 empty directories).
Data sets and counts of nodes
# of files
# of directories
# of empty directories
C Drive 1
C Drive 2
To evaluate scalability with increased resolutions, we use two large datasets (D Drive and C Drive 1 in Table 4) on 7 wide screens (640 × 360, 960 × 540, 1,280 × 720, 1,600 × 900, 1,920 × 1,080, 2,560 × 1,440 and 3,840 × 2,160). In this evaluation, we wish to find the trend of the percentage of visible nodes (among all the nodes) and the layout time (excluding the file reading, rendering and displaying time) with increased resolutions.
In summary, Cabinet Tree is highly scalable with increased resolutions in both speed and percentage of visible nodes.
Number of visible nodes with different data sets
Total # of visible nodes
# of visible files
# of visible directories
C Drive 1
C Drive 2
Figure 20 shows that when the data sets get bigger, the space utilization for leaves by Cabinet Tree is decreasing, and drops dramatically from Office 2013 (5,064 total number) to Hadoop (13,519 total number).
In summary, Cabinet Tree is more scalable than S&D Treemap in all the cases we experimented. Due to its explicit visualization of branches, Cabinet Tree performs poorer than Squarified Treemap in scalability. However, given a space, Cabinet Tree can fill more leaves than S&D Treemap and Squarified Treemap since the average number of pixels per leaf is less than the latter two methods (Fig. 21).
This paper has presented a 2D approach for visualizing big hierarchical data, called Cabinet Tree. Using the enclosure and orthogonal drawing methods, Cabinet Tree performs a space-optimized layout for leaves and explicit branches with carefully designed color schemes for aesthetic and clear visualization. Color-coded sorting, contrast-enhanced color strategy and labeling techniques all make full use of display properties. Cabinet Tree also supports continuous node selection using the mouse wheel and Focus+Context view using the detail window.
Quantitative evaluations have indicated that Cabinet Tree is capable of visualizing huge datasets. It is anticipated that with higher screen resolutions, trees of hundreds of millions of nodes can be visualized on a single display. Being scalable for increased resolutions and data sizes and high layout speed, Cabinet Tree can be considered an effective tool for visualizing huge hierarchical structures in a wider range of applications.
- McGuffin MJ, Davison G, Balakrishnan R (2004) Expand-Ahead: A Space-Filling Strategy for Browsing Trees In: Proceedings of IEEE Symposium on Information Visualization, 119–126, Austin, TX.Google Scholar
- Blanch R, Lecolinet E (2007) Browsing Zoomable Treemaps: Structure-Aware Multi-Scale Navigation Techniques. IEEE Trans Vis Comput Graph 13(6): 1248–1253.View ArticleGoogle Scholar
- Huang W, Eades P, Hong SH, Lin CC (2013) Improving multiple aesthetics produces better graph drawings. J Vis Lang Comput 24(4): 262–272.View ArticleGoogle Scholar
- Johnson B, Shneiderman B (1991) Tree-maps: a space-filling approach to the visualization of hierarchical information structures In: Proceedings of IEEE Conference on Visualization, Visualization 91, 284–291, San Diego, CA.Google Scholar
- Bruls M, Van Wijk JJ, Van Wijk JJ, Huizing K (1999) Squarified Treemaps In: Proceedings of the Joint Eurographics and IEEE TCVG Symposium on Visualization, 33–42. doi:10.1007/978-3-7091-6783-0_4.
- Bederson BB, Shneiderman B, Wattenberg M (2002) Ordered and quantum treemaps: Making effective use of 2D space to display hierarchies. ACM Trans Graph 21(4): 833–854.View ArticleMATHGoogle Scholar
- Shneiderman B (1992) Tree visualization with tree-maps: 2-d space-filling approach. ACM Trans Graph 11(1): 92–99.View ArticleGoogle Scholar
- Jin L, Banks DG (1997) TennisViewer: a browser for competition trees. IEEE Comput Graph Appl 17(4): 63–65.View ArticleGoogle Scholar
- Bederson BB (2001) Quantum Treemaps and Bubblemaps for a Zoomable Image Browser In: Proceedings of User Interface Systems and Technology, 71–80, New York, NY, USA.Google Scholar
- Baker MJ, Eick SG (1995) Space-filling Software Visualization. Journal of Visual Languages & Computing 6(2): 119–133.View ArticleGoogle Scholar
- Eick SG, Karr AF (2002) Visual Scalability. J Comput Graph Stat 11(1): 22–43.MathSciNetView ArticleGoogle Scholar
- Yost B, North C (2006) The Perceptual Scalability of Visualization. IEEE Trans Vis Comput Graph 12(5): 837–844.View ArticleGoogle Scholar
- Wei B, Silva C, Koutsofios E, Krishnan S, North S (2000) Visualization research with large displays. IEEE Comput Graph Appl 20(4): 50–54.View ArticleGoogle Scholar
- Balzer M, Deussen O (2005) Voronoi treemaps In: Proceedings of IEEE Symposium on Information Visualization, 2001. INFOVIS 2001, 49–56, Minneapolis, MN.Google Scholar
- Zhao S, McGuffin MJ, Chignell MH (2005) Elastic hierarchies: combining treemaps and node-link diagrams In: Proceedings of IEEE Symposium on Information Visualization. INFOVIS 2005, 57–64. doi:10.1109/INFVIS.2005.1532129.
- Nguyen QV, Huang ML (2003) Space-optimized tree: a connection+enclosure approach for the visualization of large hierarchies. Inform Vis 2(1): 3–15.View ArticleGoogle Scholar
- Nguyen QV, Huang ML (2005) EncCon: an approach to constructing interactive visualization of large hierarchical data. Inform Vis 4(1): 1–21.View ArticleGoogle Scholar
- Van Wijk JJ, Van de Wetering H (1999) Cushion treemaps: visualization of hierarchical information In: Proceedings of 1999 IEEE Symposium on Information Visualization, INFOVIS 99, 73–78, San Francisco, CA.Google Scholar
- Tu Y, Shen HW (2007) Visualizing Changes of Hierarchical Data using Treemaps. IEEE Trans Vis Comput Graph 13(6): 1286–1293.View ArticleGoogle Scholar
- Lü H, Fogarty J (2008) Cascaded treemaps: examining the visibility and stability of structure in treemaps In: Proceedings of Graphics Interface, 259–266, Toronto, Ont. Canada.Google Scholar
- Fekete JD, Plaisant C (1999) Excentric labeling: dynamic neighborhood labeling for data visualization In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, CHI 99, 512–519, New York, N Y, USA.Google Scholar
- Turo D, Johnson B (1992) Improving the visualization of hierarchies with treemaps: design issues and experimentation In: Proceedings of the 3rd Conference on Visualization, VIS 92, 124–131. CA, USA.Google Scholar
- Fekete JD, Plaisant C (2002) Interactive information visualization of a million items In: IEEE Symposium on Information Visualization, INFOVIS 2002, 117–124. doi:10.1109/INFVIS.2002.1173156.
- Keahey TA (2001) Getting along: composition of visualization paradigms In: Proceedings of IEEE Symposium on Information Visualization, 2001. INFOVIS 2001, 37–40. doi:10.1109/INFVIS.2001.963278.
- Shi K, Irani P, Li B (2005) An evaluation of content browsing techniques for hierarchical spacefilling visualizations In: Proceedings of IEEE Symposium on Information Visualization. INFOVIS 2005, 81–88. doi:10.1109/INFVIS.2005.1532132.
- Yang Y, Dou N, Zhao S, Yang Z, Zhang K, Nguyen QV (2014) Visualizing large hierarchies with drawer trees In: Proceedings of the 29th Annual ACM Symposium on Applied Computing, SAC 2014, NY, USA.Google Scholar
- Liang J, Simoff S, Nguyen QV, Huang ML (2013) Visualizing large trees with divide & conquer partition In: Proceedings of the 6th International Symposium on Visual Information Communication and Interaction, VINCI 13, NY, USA.Google Scholar
- Fan J, Yau DKY, Elmagarmid AK, Aref WG (2001) Automatic image segmentation by integrating color-edge extraction and seeded region growing. IEEE Tran Image Process 10(10): 1454–1466.View ArticleGoogle Scholar
- Wherry E (2003) Scroll ring performance evaluation. Extended Abstracts on Human Factors in Computing Systems: 758. doi:10.1145/765891.765973.
- Moscovich T, Hughes JF (2004) Navigating documents with the virtual scroll ring In: Proceedings of 17th Annual ACM Symposium, 57, New York, USA.Google Scholar
- Bostock M, Ogievetsky V, Heer J (2011) D3: Data-Driven Documents. IEEE Trans Vis Comput Graph 17(12): 2301–2309.View ArticleGoogle Scholar
This is an Open Access article distributed under the terms of the Creative Commons Attribution License(http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.