Professional Documents
Culture Documents
Article 504 2
Article 504 2
www.emeraldinsight.com/1741-0398.htm
Retrieving
Retrieving relevant information: relevant
traditional file systems versus information
tagging
79
Thomas W. Jackson and Stephen Smith
Department of Information Science, Loughborough University, Loughborough, Received 21 January 2011
UK Revised 21 March 2011
10 May 2011
Accepted 10 May 2011
Abstract
Purpose – The aim is to determine, in a business context, if tagging is a more effective method of
discovering relevant information when compared to traditional hierarchical filing systems.
Design/methodology/approach – A five-step interpretive hybrid approach of using both a focus
group, questionnaires and SWOT analysis was used to test the proof of concept of tagging files
compared to a traditional hierarchical filing system. The approach taken was chosen because of the
difficulties and tradeoffs that had to be made between the number of champions and people available
to take part in the research; the time that they could allow; and because transcription or recording of
the participants was not permitted. The participants were encouraged to use the questionnaires and
the SWOT analysis to record their thoughts anonymously whilst the focus groups allowed elaboration
and discussion to help understand the true feelings and thoughts of the group collaboratively.
Findings – Traditional hierarchical filing systems can lead to the retrieval of irrelevant information,
or to none at all, even though the information exists. The study shows that tagging could provide a
cost-effective solution by providing a better structured filing system that can help reduce duplication
and the retrieval of irrelevant information.
Research limitations/implications – One limitation of the study was the limited number of
participants from just one organisation. Thus, generalisation of the results of this study to the wider
population must be done with great care.
Practical implications – Organisations should evaluate the functionality of their chosen operating
system and information store software in light of the potential benefits offered by tagging, and costly
limitations of traditional file stores.
Originality/value – The paper contributes to the information retrieval and information overload
literature by studying the effect tagging files has on an organisation. It provides an insight to the
future of filing systems for management and triggers future empirical work into reducing information
overload in the workplace.
Keywords Tagging, Information overload, Information storage, Information retrieval, Semantics,
Information management, Classification schemes
Paper type Research paper
1. Introduction
Searching for relevant files within an organisation has been a source of problems for
employees for a number of years (Kelly et al., 2010; Kobayashi et al., 2006). For this
reason many companies invest substantial amounts of money in searchable portals Journal of Enterprise Information
and document storage facilities. Ineffective searches and wasting time looking for Management
Vol. 25 No. 1, 2012
information has been said to cost up to 10 per cent of an employee’s time (Dubie, 2006). pp. 79-93
Many organisations employ highly sophisticated search engines to search the full text q Emerald Group Publishing Limited
1741-0398
of all of the files stored within their systems (Smith, 2010). However, many search DOI 10.1108/17410391211192170
JEIM systems cannot retrieve documents that have not already been indexed by that search
25,1 system such as those within a user’s corporate network profile. In addition, web-based
search systems, such as Google or even a corporate intranet search engine do not help
users locate files stored on their local computers. The user can resort to using their
operating systems search facility but this suffers from the same problems as the web
based search systems (Smith, 2010). Using such search engines also requires users to
80 visit the search page and wait for results to be returned. Further to this, if the user
wishes to browse for a file whilst continuously narrowing down their search as they go,
they are restricted to a traditional hierarchical file structure. Ultimately there are a
number of barriers for users to overcome when attempting to find information using
traditional search approaches (Smith, 2010). This increases the difficulty in discovering
relevant information and can lead to an inability to find what the user is looking for,
which can also be compounded by task uncertainty (Nunez and Giachetti, 2009).
To illustrate the severity of this problem Nelson (1994) references Naisbitt, arguing
that: “Inundated with technical data, some scientists claim it takes less time to do an
experiment than to find out whether or not it has been done before.” It is not surprising
that in a world that is increasingly dominated by computer-mediated communication
systems, it appears that the volume and pace of information can become overwhelming
as shown by even earlier studies in the 1980s (Hiltz and Turoff, 1985; Kerr and Hiltz,
1982). Further studies have gone on to define information overload and the impact it
has. For example in the 1990s Meglio and Kleiner looked at ways of managing
information overload. They looked at time management, communication and the idea
that individuals can reduce information overload as they are part of the problem
(Meglio and Kleiner, 1990). The research showed that users of information all
contribute towards information overload and if these issues can be realised a conscious
effort can be made to reduce the overload, thus improving the effectiveness of
communication. A decade later both Farhoomand and Drury showed that information
overload is still a problem and it affects managers in organisations on a daily basis and
can have dramatic effects within an organisation (Farhoomand and Drury, 2002; Kirsh,
2000). Continuing the journey on the timeline, a decade later in 2010, Soucek and Moser
reported that information overload causes problems in email communication within
the workplace and that the issue of information overload still needs to be addressed
and training might provide a potential solution (Soucek and Moser, 2010). Other
research has taken more of a holistic view in identifying the factors surrounding
information overload, and in particular the main factors contributing to
technology-based productivity losses through information overload, communication
overload, and system feature overload (Karr-Wisniewski and Lu, 2010). To help
visualise the sources of information and potential overload an employee might
encounter, the authors have created a conceptual map of the sources of information
that an employee can access to retrieve information and knowledge as shown by
Figure 1. This study investigates a solution to reducing the information overload
problem in organisations by focusing on improving the filing and retrieving of
electronic documents (denoted by the red).
In the problem domain of information storage and retrieval, the main research
question of this research is, can tagging files save employees time retrieving
information compared to a traditional hierarchical file system. An enterprising
innovation to aid in the storage and retrieval of business information is critical in
Retrieving
relevant
information
81
Figure 1.
How an employee locates
information/knowledge
sources
3. Methods
An interpretive philosophy was chosen as the underlying framework and the research
adopted a systems development methodology that recognises three main stages:
concept development, system building, and system evaluation (Burstein, 2002). In
terms of system evaluation a system-centred approach was taken. Wang and
Forgionne’s (2008) comprehensive approach was considered but rejected due to the
time constraints of the research project.
The authors developed a proof of concept, called TagDav, to enable users to tag the
files they use. Details of the system are described in the “Proof of Concept – TagDav”
section. A number of different methods were used to test TagDav at SoftwareCo, which
are detailed later in this section. SoftwareCo, is one of the largest software
organisations in the world and is in the top ten of all of the major software rankings
including the Forbes 2000 and Research Foundation’s top 100. The organisation
employs over 50,000 people in over 50 countries. The company develops a range of
software solutions in house and its products are used globally. The SoftwareCo
department that participated in this work was one of the rapid development and value
prototyping divisions. The department’s employees are highly skilled within their
respective field and the department has a very unique structure. The department has
attempted to create an environment specifically suited to rapid application
development, testing and deployment. The department aims to have only a limited
hierarchical structure, with all members of the department seen as equals and
interacting with each other to take advantage of their respective skills. The nine
participants for this research study were selected by one of the champions within the
organisation, who selected on the criteria of someone who would give valuable input,
and their knowledge of tagging (experienced and inexperienced). The nine participants
were also chosen to include the most varied views including those who would be
expected to be advocates of such a system and those expected to be against it.
Figure 2 shows the approaches used to determine the effectiveness of a tag-based
filing system:
Figure 2.
The assessment of tag
based file storage
JEIM .
Step one – an eight question questionnaire was given to the nine participants to
25,1 assess the barriers to tagging, before they were given a demonstration of
TagDav.
.
Step two – a demonstration of TagDav was given to the employees at
SoftwareCo within the focus group environment.
.
Step three – the second questionnaire contained 16 questions and focused on the
84 way that users currently store and retrieve files and questions about the TagDav
system. For example, how easy it is to use and whether they felt that training
would be necessary in order to use the system effectively. The aim of these
questions was to establish if the participants had a common system, or method of
working, or if they all used different filing schemes.
.
Step four – the focus group consisted of nine members from the SoftwareCo. A
focus group was chosen to enable an open discussion to enable participants to
cover a large range of aspects that are relevant to them as a group.
.
Step five – participants were provided with a matrix on which they could record
their thoughts within four key areas, strengths, weaknesses, opportunities and
potential risks that might be introduced by the use of the tag based approach to
document retrieval.
The study took a hybrid approach using both focus groups and questionnaires, which
is similar to, but did not follow, a Delphi style approach. The approach taken was
chosen because of the difficulties and tradeoffs that had to be made:
.
between the number of champions and people available to take part in the
research;
.
between the time that they could allow; and
.
because transcription or recording of the participants was not permitted by
SoftwareCo.
The participants were allowed to use the questionnaires and the SWOT analysis to
record their thoughts anonymously whilst the focus groups allowed elaboration and
discussion to help understand the true feelings and thoughts of the group
collaboratively. Although transcription or recording was not allowed by the
organisation a number of notes and “sound bites” were taken that were approved
by the participants.
Name Tags
86
Figure 3.
TagDav with close folders
issue if there were thousands of tags entered into the system this would not be the case.
The tags would be ordered alphabetically and could quickly be found. In many
operating systems simply beginning to type the directory name would highlight that
directory and in this case the tag. Once one tag had been chosen, the list of remaining
tags would be significantly reduced. The process could continue until the desired file
was found. This root folder also has the option to show all of the files that exist in the
system but for simplicity it is not shown in Figure 3. The tags from Table I can be seen
as directories within the file system that contain the files.
It is then possible to expand and navigate within the folders to see the files that are
tagged by that content. Figure 4 shows the tag “tag_both” expanded as a folder
Figure 4.
TagDav with one folder or
tag expanded
showing the two files within this folder and also the two tags that those files are also Retrieving
tagged with.
The tags may be combined to further refine the search within the folders. Figure 5
relevant
shows the full expansion of the example files and their folders. information
Although in Figure 5 there are nine files listed there only actually three physical
files stored within the system. Each file is only stored within the system once and each
of the files shown in Figure 5 is actually a virtual reference to a file that will be 87
retrieved at run time should the user wish to actually download or open this file. In
order to work with this file, the user simply has to click the file as they would if the file
were in any traditional file system.
Figure 5.
Fully expanded view
JEIM users (40 per cent) stating that they used them, three (60 per cent) stating that they did
25,1 not and the other using them and some not. Only one participant (20 per cent) used
synonyms of a word when creating tags in order to help them to retrieve the content in
future.
During the focus group participants said that they were not aware of the issues
associated with tagging. Many of the participants had only thought of tagging as being
88 of use for them personally and not considered that other people would be using these
tags to retrieve content. The participants also commented that many different systems
ask users to tag differently or do not give any guidelines as to how tags should be
entered into the system. All participants agreed that with the correct training and if all
participants were to make use of the same tagging scheme, then the benefits of tagging
would be far greater.
The questionnaire helped to identify the way that users were currently tagging and
through discussions during the focus group it was possible to expand upon the
potential issues with tagging. One of the group members stated “although there are
barriers towards tagging if they can be overcome then this could actually lead to
increased consistency, ease of finding docs/data” with another user stating that “after
training tagging might present a powerful option but that they were not aware of all of
the potential problems”. These issues highlight the need for training but also for a
system to remind users of the benefits of considering these options when tagging
systems are in place.
The practical implications of this research are that information rich enterprises, similar
to the one studied in this research, should evaluate the functionality of their chosen
operating system and information store software in light of the potential benefits
offered by tagging, and costly limitations of traditional file stores. The evaluation will
provide information rich enterprises with the “best practice” when it comes to
information storage and retrieval. It can be concluded that the TagDav approach
provides a mechanism to aid in the storage and retrieval of business information which
can lead to generating greater cost effectiveness and increased performance of
information rich enterprises.
A limitation of the study is TagDav was a proof of concept and the employees were
not able to use it on a daily basis to see if it really made a difference to their information
capabilities. A further limitation is the number of participants and any further research
in this area would benefit from a larger group of participants. Overall, the study has
JEIM shown that traditional filing systems should be reviewed in the light of the potential
25,1 benefits offered by tagging and the costly limitations of traditional file stores.
References
Burstein, F. (2002), “System development in information systems research”, in Williamson, K. (Ed.),
Research Methods for Students, Academics and Professionals: Information Management and
92 Systems, 2nd ed., Centre for Information Studies, Charles Sturt University, Wagga Wagga,
pp. 147-58.
De Kunder, M. (2011), “The size of the world wide web”, available at: www.WorldWideWebSize.
com (accessed 10 May 2011).
Dubie, D. (2006), “Time spent searching cuts into company productivity”, Network World,
available at: http://www.networkworld.com/news/2006/102006-search-cuts-productivity.
html (accessed 10 May 2010).
Farhoomand, A.F. and Drury, D.H. (2002), “Managerial information overload”, Communications
of the ACM, Vol. 45 No. 10, p. 127.
Flickr (2006), available at:www.flickr.com/photos/tags/ (accessed April 2006)
Golder, S.A. and Huberman, B.A. (2006), “Usage patterns of collaborative tagging systems”,
Journal of Information Science, Vol. 32 No. 2, p. 198.
Hayman, S. and Lothian, N. (2007), “Taxonomy directed folksonomies”, available at: http://
archive.ifla.org/IV/ifla73/papers/157-Hayman_Lothian-en.pdf (accessed January 2011).
Hiltz, S.R. and Turoff, M. (1985), “Structuring computer-mediated communication systems to
avoid information overload”, Communications of the ACM, Vol. 28 No. 7, pp. 680-9.
Jackson, T.W. and Culjak, G. (2006), “Can seminar and computer-based training improve the
effectiveness of electronic mail communication within the workplace?”, in Spencer, S. and
Jenkins, A. (Eds), Proceedings of the 17th Australasian Conference on Information Systems.
Karr-Wisniewski, P. and Lu, Y. (2010), “When more is too much: operationalizing technology
overload and exploring its impact on knowledge worker productivity”, Computers in
Human Behavior, Vol. 26 No. 5, pp. 1061-72.
Keller, P. (2008), “deli.ckoma.net Del.icio.us stats”, available at: http://deli.ckoma.net/stats#tab_
tags (accessed December 2008).
Kelly, D., Fu, X. and Shah, C. (2010), “Effects of position and number of relevant documents
retrieved on users evaluations of system performance”, ACM Transactions on Information
Systems (TOIS), Vol. 28 No. 2, pp. 1-29.
Kerr, E. and Hiltz, S. (1982), Computer-mediated Communication Systems: Status and Evaluation,
Academic Press, New York, NY.
Kirsh, D. (2000), “A few thoughts on cognitive overload”, Intellectica, Vol. 1 No. 30, pp. 19-51.
Kobayashi, T., Misue, K., Shizuki, B. and Tanaka, J. (2006), “Information gathering support
interface by the overview presentation of web search results”, Proceedings of the Asia
Pacific Symposium on Information Visualisation, Vol. 60, pp. 103-8.
Lux, M. (2010), “Flickr uploads per minute – mean uploads per hour”, available at: www.
semanticmetadata.net/2010/03/12/flickr-uploads-per-minute-mean-uploads-per-hour/
(accessed 10 April 2011).
Mathes, A. (2004), “Folksonomies-cooperative classification and communication through shared
metadata”, paper presented at Computer Mediated Communication, LIS590CMC (Doctoral
Seminar), Graduate School of Library and Information Science, University of Illinois
Urbana-Champaign, Urbana, IL, December.
Meglio, C. and Kleiner, B. (1990), “Managing information overload”, Industrial Management Retrieving
& Data Systems, Vol. 90 No. 1, pp. 23-5.
Nelson, M.R. (1994), “We have the information you want, but getting it will cost you!: held
relevant
hostage by information overload”, Crossroads, Vol. 1 No. 1, pp. 11-15. information
Nunez, A.N. and Giachetti, R.E. (2009), “Quantifying coordination work as a function of the task
uncertainty and interdependence”, Journal of Enterprise Information Management, Vol. 22
No. 3, pp. 361-76. 93
Rosacker, K. and Rosacker, R. (2010), “Information technology project management within public
sector organizations”, Journal of Enterprise Information Management, Vol. 23 No. 5,
pp. 587-94.
Smith, S. (2010), “Reducing information overload by optimising information retrieval
approaches”, PhD thesis, Loughborough University, Loughborough.
Soucek, R. and Moser, K. (2010), “Coping with information overload in email communication:
evaluation of a training intervention”, Computers in Human Behavior, Vol. 26 No. 6,
pp. 1458-66.
Wang, Y. and Forgionne, G. (2008), “Testing a decision-theoretic approach to the evaluation of
information retrieval systems”, Journal of Information Science, Vol. 34 No. 6, pp. 861-76.
Corresponding author
Thomas W. Jackson can be contacted at: t.w.jackson@lboro.ac.uk