Download as pdf or txt
Download as pdf or txt
You are on page 1of 6

Multimodal analogs to infer humanities visualization requirements

Richard Brath
Uncharted Software Inc.

ABSTRACT overview across the entire range of goods


Gaps and requirements for multi-modal interfaces throughout the entire store from front to back [2].
for humanities can be explored by observing the Initial view and initial movement in the space
configuration of real-world environments and the provide overviewing [3] of the detail goods and
tasks of visitors within them compared to digital reveals structure (e.g. gender is left and right,
environments. Examples include stores, museums, new/old are front/back, basics/specialties are
galleries, and stages with tasks similar to side/center). Object positioning (orientation and
visualization tasks such as overview, zoom and location) and lighting (brightness) direct attention
detail; multi-dimensional reduction; collaboration; to aid tasks such as navigation, exploration and
and comparison; with real-world environments planning. Guidance may be provided by creating a
offering much richer interactions. Some of these hierarchy of pathways thereby directing
capabilities exist with the technology and movement (e.g. Ikea). Zoom and filter is the simple
visualization research, but not routinely available motion of walking to the proximity of a group of
in implementations. goods, with details on demand either by direct
investigation of the artifact, or by conversation
Keywords: Multimodal visualization, high- with the real-world agent (i.e. the staff). Note that
dimensional data visualization, real-world interfaces. the details are myriad, including close visual
inspection, texture and form, meta-data (e.g. label
Index Terms: K.6.1 [Management of Computing and indicating material, care, country of origin, price,
Information Systems]: Project and People and so on).
Management—Life Cycle; K.7.m [The Computing
Profession]: Miscellaneous—Ethics
1 INTRODUCTION
Multimodal visualization for humanities subjects
may seem to be a challenge due to the limitations
of modalities – such as touch, smell and taste –
however there are additional experiential aspects
of cultural interpretation that are also lost in
digitization, data transformation and
computational representation. By observing some
of these experiential aspects in the existing analog
environments we can both confirm existing
visualization interaction patterns and determine
potential additional requirements for interactive
representations of data and properties associated Figure 1: Fashion retail: Overview on entry, zoom by walking,
with multimodal cultural artifacts. details by inspection or verbal query.

The contribution shows opportunities for Note that the digital equivalent of this experience
extensions to visualization techniques, such as (e.g. Amazon) provides interaction with the meta-
improved overview and clustering (Section 2); data (filters) but no haptic interaction (What does
cues and collaboration from other visitors (3), and the material feel-like? How will the material
broader comparison tasks (4). These are discussed drape? Does it crease? Does it stretch?); nor
and situated to prior and future research (5). derived attributes (How does this look in
sunlight?). Furthermore, post-Covid evidence
2 SOME CONFIRMATIONS indicates people returning to real-world retail
Some real-world techniques match patterns experiences, presumably indicating a desire for
promoted in visualization design. different experience in analog environments [4,5].
Arguably, current digital retail experiences provide
2.1 Overview, Zoom, Details a poor overview: consider the number of goods
visible upon entry in Figure 1 vs. the online
The heavily cited Shneiderman visualization experience.
mantra is, “Overview first, zoom and filter, then
details-on-demand,” [1]. This pattern occurs in 2.2 Multi-dimensional reduction
many real-world environments, such as the
fashion retailer in Figure 1. which has a physical Clustering, and more broadly multi-dimensional
3D arrangement organized to create a visual reduction of high dimensional spaces into 2 or 3
dimensions, is a popular topic in visualization (e.g. 3 SOCIAL CUES AND COLLABORATION
[6,7,8]). It is also a commonly solved problem in In many real-world analogs there are strong
real-world analogs. Figure 2, shows a series of collaborative interactions.
bottles with a wide variety of characteristics, e.g.
brand, alcohol content, source grain, frequency of 3.1 Serendipitous Eavesdropping or
use, etc., organized so that highly similar contents Collaborative Interpretation
are closest together [9]. In this real-world analog, a
human curator has organized the content to suit Figure 4 shows an art gallery with many visitors
the needs (presumably facilitating fast visual [10]. Movement through the gallery requires
search, quick retrieval of most popular items, and recognition of other visitors, to avoid collision, and
overviewing the breadth of beverages available). more generally provides additional information,
such as which art works are more popular
(number of people at a work), how to interact with
a particular piece (are people appreciating it from
afar or up close?), what content within an artwork
is being engaged with (where is the visitor’s
attention with regards to a work?), what might
people be saying about a particular work (how are
individuals commenting about a particular piece?),
and so on. As opposed to serendipity across
cultural artifacts [11]; this is serendipity across
viewers, visible to the information flaneur [12].

Figure 2: Bottles organized by muiltivariate similarity. Figure 4: Crowded galleries for art appreciation and serendipitous
This clustering curation is central to organizing interpretation.
analog collections of artifacts such as museums Although galleries can be viewed with less people
and art galleries. Figure 3 shows a museum (e.g., early morning, or in the last few days of a
floorplan with rooms labeled across orthogonal show), the rich interaction between humans is
topics such as Korea, dinosaurs, toy soldiers, important—private galleries are particularly busy
biodiversity and florals, yet a visitor would have a at show openings: people want to be with other
high degree of confidence regarding the items they people in addition to the art, e.g. [13].
would find in each of these rooms.
3.2 Collaborative Selection
Content engagement may be directed and
encouraged by trusted relationships, such as
friends or staff, as shown in Figure 5 [14,15].

Figure 3: Musuem collection curation. Rooms are disparate, such


as, Korea, Dinosaurs, Toy Soldiers, Biodiversity, and Florals.

While visual analytic systems may use


unsupervised clustering, there are many unsolved
problems, such as determining the number of
relevant clusters to the task, generating
appropriate labels, and creating interpretable
visual layouts. Figure 5: Collaborative selection.
The visitor may be hesitant but may be otherwise direct comparison, such as paired artworks in
encouraged to make and commit to a selection Figure 7 [21].
with the additional input of a peer. This is a form of
synchronous collaborative user guidance e.g. [16].
3.3 Collaborative Sorting
Content can be directly engaged with, and visitors
may reorganize content, if permitted. Figure 6
shows a woman selecting and organizing fruit [17].
With each selection, various criteria are evaluated,
such as size, color, texture, firmness, surface
anomalies, shape, smell and so forth. This is not
simple filtering—a confluence of metadata is
evaluated, and a single piece of fruit could be
accepted or rejected from a combination of criteria
rather than thresholding on values. Over time,
successive rejections by successive visitors will
cause items to become sorted—one area in the bin Figure 7: The curator has paired art works facilitating comparison.
will likely contain highly rejected fruit (e.g. rotten), In Figure 8, the curator has deliberately placed
whereas another portion of the bin may contain mid-century indigenous art by Karoo Ashevak
less desirable fruit (e.g. blemished). This fine-grain (foreground sculpture) juxtaposed with
sorting is completely unavailable in digital contemporary mainstream art (paintings on wall)
counterparts. [22]. This uncommon pairing by the curator forces
criticality onto the viewer to consider indigenous
art in the context of its broader culture, not merely
in comparison to other indigenous art.

Figure 6: Collaborative sorting. Figure 8: Sculptures by indigenous artists curated together with
contemporary mainstream art forcing the viewer to compare
4 COMPARISON AND COORDINATION the relation between the two.
Comparison is a frequent task in visualization, with Symmetry and repetition invite comparison, such
subjects required to assess the relative sizes, as choreographed routines, as in Figure 9 [23].
positions, and so forth between marks within a While choreographed attributes could be directly
visualization e.g., [18,19,20]. The richness, visualized and compared, e.g., such as positions
complexity, and nuances of cultural objects invites and angles and deviations among performers; the
many comparisons and are thus frequently placed viewer may also attend to other comparisons, such
in proximity to facilitate comparison – although as performers’ expression, lighting, hairstyle,
not necessarily comparison of quantitative data. complexion, and so forth.
4.1 Curated comparison
A museum curator or shopping display may
deliberately place items adjacent to each other to
world analog can literally be dragging objects
overtop each other.

Figure 9: Comparison could be between choreographed attributes


such as pose and body angles, or features such as
expression, lighting, hair style, and so on.

4.2 Ad hoc comparison


The participant, however, may wish to do their Figure 11: The non-curator has assembled to disparate objects to
own comparisons. In real-world analogues, these assess potential compositions.
comparisons can be done easily in situations, such
as retail stores, by selecting and dragging objects 5 DISCUSSION, PRIOR AND FUTURE WORK
of interest into proximity to each other, such as Aspects such as serendipity, generosity, criticality
Figure 10 left [24]. In addition to side-by-side and guidance have been discussed in [28].
comparison, the person can view objects from Generous interfaces by Whitelaw [29] describes
many angles, lighting conditions, and so on. rich interactive interfaces with many similarities
to the examples above, such as “show everything”;
multivariate overviews, samples and context, and
high-quality primary content.
Multi-media and labels. Prior work in multimedia
indicates improved task performance when using
multiple channels. For example, learning and
problem-solving tasks are improved when textual
instructions are directly integrated into diagrams
[30]. Other studies in multimedia have shown
improved performance when information is
provided through both text and images with close
spatial positioning and/or visual linkages between
the two [31].
At a minimum, prior multi-media work indicates
the importance of textual labels which can be
utilized at the level of individual items or across
clusters. Whitelaw’s examples use simple term
frequency to extract single word labels for clusters,
Figure 10: Object comparisons, e.g., against each other in bright but this is an ongoing technical challenge.
light; in relation to subject. Techniques such as Latent Dirichlet Allocation
This comparison can be between subject and (LDA) return ranked word lists not singular
object, such as viewing the object superimposed on descriptive words. Summarization techniques may
the subject in a mirror to assess how the object be challenged to reduce to one-word labels such as
appears in relation to the subject’s skin tone, hair Figure 3. Even reduced to singular words,
color, body shape and so on (Figure 10 right [25]). challenges remain for integrating labels directly
Dudley [26] notes the importance of direct into the visualization while maintaining the
experience with the materiality of museum legibility of labels and not obscuring the data
artifacts, e.g. understanding the artifact in relation visualization marks.
to self. Game-Quality Rendering. Visualizations typically
4.3 Coordination and Composition transform data into a small palette of visual
attributes, such as size, color, brightness and
The task may be the assembly of dissimilar objects position – e.g., libraries such as D3.js, and end-user
into a composition. This may be challenging in tools such as PowerBI or Tableau. Advanced
digital environments and visualizations as the properties such as lighting, shading and materials
metadata may be quite different, such as jackets, are not used.
shirts, ties and belts in Figure 11 [27]. The real-
However, attributes of cultural artifacts, such as Comparison and Workspaces: Comparison is a
material or motion properties important in their frequent task in visualization, but is typically
analog interactions and comparisons (e.g., Figure highly constrained to objects within a singular
9,10). These could be retained and depicted visualization components. With many visualization
realistically within modern high-performance systems made of multiple panels and coordinated
game engines, even if abstracted from their interactions, the notion of dragging an object in
original objects into swatches and visualization one panel to another panel is impossible. In a
layouts. For example, the myriad of cloth museum or gallery, the curator has similarly
properties associated with historic clothing could imposed non-comparison, but the retail shopping
directly represented as a virtual physical example indicate visitors require agency to
visualization using an underlying game engine. assemble and compare disparate artifacts.
Some VR/AR visualizations use game engines, and Furthermore, the visitor may wish to compare
as advanced effects are built-in to these engines, aspects of items across stores, not simply solved in
features such as flames, particle systems, glow and digital environments. Some workspaces allow for
shadows are used, e.g. [32,33]; and some content to move between panes (e.g. cut and
humanities visualizations create more sensuous paste), although it is non-obvious how this might
effects such as animated sorting of piles of coin work in visualization systems and this seems to be
images [34]. an area for future research. There are
opportunities for visual interfaces to create
Gaming engines with high quality 3D, motion, unexpected pairings to induce correlations for the
material properties, lighting effects, and so forth observers [40], helping to create a critical context.
might be dismissed as unnecessary 3D [35] and
distracting for low dimensional data; yet, the 6 CONCLUSION AND NEXT STEPS
highly multivariate textural and motion qualities The purpose of this paper is to raise a breadth of
associated dance, music, costumes and other issues in the gap between real-world analog
objects could be directly depicted, interpreted and environments and digital analytical visualization
compared. environments; largely through observation of
Augmented Reality: Willet et al [36] outline tasks in real-world tasks and goals as a means for
superpowers as a source of inspiration for prompting potential interactions and
interactive visualization techniques. Some of these requirements. Some of these issues are already
interactions are highly relevant to digital solved in digital environments and prior research,
humanities visualizations. For example, although not necessarily used in routine
humanities artifacts may not be directly accessible visualization. To bring these capabilities routinely
for close (touch) inspection, but could be into visualization, is perhaps more of an issue of
investigated using AR x-ray vision. ease of implementation (visualization toolsets do
not have these features); as well as cultural
Or, RealitySketch [37], an AR system which expectations – e.g. art websites don’t necessarily
generates predictive overlays, such as paths have an expectation to reveal visitors to other
associated with physics (e.g. a pendulum), could be visitors particularly in a period of increasing
extended to other physical and temporal oriented privacy legislation.
humanities applications, e.g. dance prediction, to
show expected and un-expected movements.
Communication: Engaging with cultural artifacts REFERENCES
is highly enriched in the presence of other [1] Shneiderman, B. "The eyes have it: A task by data type taxonomy
individuals, changing our collective understanding for information visualizations." The craft of information
and response to these artifacts. Understanding visualization. Morgan Kaufmann, 2003. 364-371.
other individuals’ engagement with artifacts
becomes important, ranging from visible voyeur to [2] Claire, R. Clothing Store Interior. Pexels, 2020.
direct simultaneous engagement with the pexels.com/photo/clothing-store-interior-5531541
individuals and the artifacts.
[3] Hornbæk, K., & Morten H. "The notion of overview in information
There are many existing systems that provide visualization." International Journal of Human-Computer Studies
some capabilities, e.g. Disqus, a comment service 69.7-8 (2011): 509-525.
for web pages, allowing users to read and post
commentary; or Discord, allowing real-time [4] Howland, D. Amazon’s e-commerce retail sales fall 3% in Q1 as
instant messaging service enabling discussions consumers return to stores. Retail Dive, April 29, 2022.
across a topic; or Unity, a game engine with a retaildive.com/news/amazons-e-commerce-retail-sales-fall-3-
service for real-time cross-platform in-q1-as-consumers-return-to-store/622920/
communication. In all cases, the artifact being
discussed is separate from the conversation [5] Daleo, J. Amazon cancels or delays plans for at least 16
requiring descriptions to cross reference. Direct warehouses this year. FreightWaves, July 3, 2022.
freightwaves.com/news/amazon-cancels-or-delays-plans-for-at-
collaborative markup of the visualization allows
more directed conversations to particular aspects least-16-warehouses-this-year
of the content, e.g. [38,39]. [6] Van der Maaten, L., & Hinton, G. (2008). Visualizing data using t-
SNE. Journal of machine learning research, 9(11).
[7] Saeed, N., Nam, H., Haq, M. I. U., & Muhammad Saqib, D. B. (2018). [25] Nilov, M. A woman filling a brown long sleeves shirt looking
A survey on multidimensional scaling. ACM Computing Surveys through a mirror. Pexels, 2021. pexels.com/photo/a-woman-
(CSUR), 51(3), 1-25. fitting-a-brown-long-sleeves-shirt-looking-through-a-mirror-
6969910/
[8] Liu, Y., Jun, E., Li, Q., & Heer, J. (2019, June). Latent space
cartography: Visual analysis of vector space embeddings. In [26] Dudley, S. Materiality Matters: Experiencing the displayed object.
Computer Graphics Forum (Vol. 38, No. 3, pp. 67-78). University of Michigan Working Papers in Museum Studies, 8: 1–
9. 2012. https://hdl.handle.net/2027.42/102520
[9] Kowalievska, EVG. Wine bottle on table. Pexels, 2018.
pexels.com/photo/wine-bottle-on-table-1128257 [27] Author. Photo of clothing coordination. 2022.

[10] Nery, J. Paintings in Side Room. Pexels, 2019. [28] Windhager, F., Federico, P., Schreder, G., Glinka, K., Dörk, M.,
pexels.com/photo/paintings-in-side-room-1839919 Miksch, S., & Mayr, E. Visualization of cultural heritage collection
data: State of the art and future challenges. IEEE transactions on
[11] Thudt, A., Hinrichs, U., & Carpendale, S. The bohemian bookshelf: visualization and computer graphics, 25(6), 2018. 2311-2330.
supporting serendipitous book discoveries through information
visualization. In Proceedings of the SIGCHI conference on human [29] Whitelaw, M. Generous Interfaces for Digital Cultural Collections.
factors in computing systems. 2012. pp. 1461-1470. In: DHQ: Digital Humanities Quarterly. Vol. 9, No. 1. 2015.

[12] Dörk, M., Carpendale, S., & Williamson, C. The information [30] Chandler, P. and Sweller, J. Cognitive Load Theory and the
flaneur: fresh look at information seeking. In Proceedings of the Format of Instruction. In Cognition and Instruction: 8(4) 1991,
SIGCHI conference on human factors in computing systems. 293-332.
2011. pp. 1215-1224).
[31] Narayanan, H. and Hegarty, M. Multimedia design for
[13] Whitehead, F. Why art students should go to gallery openings. communication of dynamic information. In International Journal
The Guardian, 2013. of Human Computer Interaction Studies,57, pp. 279-315. 2002.
theguardian.com/education/2013/jun/26/why-art-students-go-
gallery-openings [32] Sicat, R., Li, J., Choi, J., Cordeil, M., Jeong, W. K., Bach, B., & Pfister,
H. DXR: A toolkit for building immersive data visualizations.
[14] Thirdman. A woman fitting a black blazer. Pexels, 2021. IEEE transactions on visualization and computer graphics, 25(1),
pexels.com/photo/a-w0oman-fitting-a-black-blazer-8485730/ 2018. 715-725.

[15] Lion, S. Happy Asian women trying on hat. Pexels, 2020. [33] Kloiber, S., Settgast, V., Schinko, C., Weinzerl, M., Fritz, J., Schreck,
pexels.com/photo/happy-asian-women-trying-on-hat-5710454/ T., & Preiner, R. . Immersive analysis of user motion in VR
applications. The Visual Computer, 2020. 36(10), 1937-1949.
[16] Brodlie, K. W., Duce, D. A., Gallop, J. R., Walton, J. P., & Wood, J. D.
Distributed and collaborative visualization. In Computer graphics [34] Gortana, F., von Tenspolde, F., Guhlmann, D., & Dörk, M. (2018).
forum. 2004. Vol. 23, No. 2, pp. 223-251. Off the grid: Visualizing a numismatic collection as dynamic piles
and streams. Open Library of Humanities, 4(2), 1-25.
[17] Sparrow, J. Woman buying apples. Pexels, 2020.
pexels.com/photo/woman-buying-apples-4198223/ [35] Munzner, T. (2014). Visualization analysis and design. CRC press.

[18] Amar, R, Eagan, J. and Stasko, J. Low-Level Components of [36] Willett, W., Aseniero, B. A., Carpendale, S., Dragicevic, P., Jansen,
Analytic Activity in Information Visualization. InIEEE Y., Oehlberg, L., & Isenberg, P. Superpowers as inspiration for
Symposium on Information Visualization (2005), IEEE, pp. 111– visualization. IEEE TVCG, 2021.
117
[37] Suzuki, R., Kazi, R. H., Wei, L. Y., DiVerdi, S., Li, W., & Leithinger, D.
[19] Schulz, H. J., Nocke, T., Heitzler, M., & Schumann, H. (2013). A Realitysketch: Embedding responsive graphics and
design space of visualization tasks. IEEE Transactions on visualizations in AR through dynamic sketching. In Proceedings
Visualization and Computer Graphics, 19(12), 2366-2375. of the 33rd Annual ACM Symposium on User Interface Software
and Technology, 2020. pp. 166-181.
[20] Brehmer, M. and Munzner, T. A multi-level typology of abstract
visualization tasks. IEEE Transactions on Visualization and [38] Heer, J., Viégas, F. B., & Wattenberg, M. Voyagers and voyeurs:
Computer Graphics (Proceedings of InfoVis) 19, 12 (2013), Supporting asynchronous collaborative visualization.
2376–2385. Communications of the ACM, 2009, 52(1), 87-97.

[21] Bertelli, M. Photographs in art gallery exhibition. Pexels, 2022. [39] Willett, W., Heer, J., Hellerstein, J., & Agrawala, M.
pexels.com/photo/photographs-in-art-gallery-exhibition- CommentSpace: structured support for collaborative visual
11489982 analysis. In Proceedings of the SIGCHI conference on Human
Factors in Computing Systems, 2011. pp. 3131-3140.
[22] Author. Photo of mid-century indigenous sculptures in room
with corresponding contemporary paintings. 2022. [40] Drucker, J. Humanities Approaches To Interface Theory. Culture
Machine, 2011. Vol. 12.
[23] Gouw, T. Five women in white dress dancing under grey sky
during sunset. Pexels, 2016. pexels.com/photo/5-women-in-
white-dress-dancing-under-gray-sky-during-sunset-175658/

[24] Summer, L. Anonymous woman choosing outfit in store. Pexels,


2020. pexels.com/photo/anonymous-woman-choosing-outfit-in-
store-6347546/

You might also like