The Augmented Web: Rationales, Opportunities, and Challenges On Browser-Side Transcoding

OSCAR DÍAZ and CRISTÓBAL ARELLANO, ONEKIN Research Group,
University of the Basque Country (UPV/EHU)
Today’s web personalization technologies use approaches like user categorization, configuration, and cus-
tomization but do not fully support individualized requirements. As a significant portion of our social and
working interactions are migrating to the web, we can expect an increase in these kinds of minority
requirements. Browser-side transcoding holds the promise of facilitating this aim by opening personalization to
third parties through web augmentation (WA), realized in terms of extensions and userscripts. WA is to the
web what augmented reality is to the physical world: to layer relevant content/layout/navigation over the
existing web to improve the user experience. From this perspective, WA is not as powerful as web personal-
ization since its scope is limited to the surface of the web. However, it permits this surface to be tuned by
developers other than the sites’ webmasters. This opens up the web to third parties who might come up with
imaginative ways of adapting the web surface for their own purposes. Its success is backed up by millions
of downloads. This work looks at this phenomenon, delving into the “what,” the “why,” and the “what for”
of WA, and surveys the challenges ahead for WA to thrive. To this end, we appraise the 45 most
downloaded WA extensions for Mozilla Firefox and Google Chrome and conduct a systematic literature review
to identify what quality issues received the most attention in the literature. The aim is to raise awareness
about WA as a key enabler of the personal web and point out research directions.
Categories and Subject Descriptors: D.2 [Software]: Software Engineering
General Terms: Standardization, Design
Additional Key Words and Phrases: Personalization, adaptation, transcoding, JavaScript
ACM Reference Format:
Oscar Díaz and Cristóbal Arellano. 2015. The augmented web: Rationales, opportunities, and challenges on
browser-side transcoding. ACM Trans. Web 9, 2, Article 8 (May 2015), 30 pages.
DOI: http://dx.doi.org/10.1145/2735633
1. INTRODUCTION
Web personalization refers to making a website more responsive to the unique and
individual needs of each user [Cingil et al. 2000]. Similar to other software efforts,
traditional personalization scenarios prioritize the most demanded personalization
requirements while minority requests are pushed aside. However, as a significant por-
tion of our social and working interactions are migrating to the web, we can expect an
increase in minority personalization petitions. Unfortunately, today’s personalization
technologies (e.g., user categorization, configuration, and customization) do not fully
support individualized requirements [Ng et al. 2010]. This lack of individualization
is increasingly becoming an issue as users push for a user model for the web that is
aware of the user’s specific needs. This discrepancy between what application
developers can build and what individual end-users might need could be alleviated by opening
personalization.
This work is cosupported by the Spanish Ministry of Education, and the European Social Fund under contract
TIN2011-23839 (Scriptongue).
Authors’ addresses: O. Díaz and C. Arellano, Facultad de Informática, Paseo M. Lardizabal, 1 - 20.018 San
Sebastián (Spain); emails: {oscar.diaz, cristobal.arellano}@ehu.es.
Traditionally, the website master (the “who”) decides the personalization rules (the
“how”), normally at the inception of the website (the “when”), preferentially using a
server-centric approach (the “where”). By contrast, opening personalization implies
empowering third parties (the “who”) to tune websites once in operation (the “when”)
by acting on the rendering code (the “how”) at the client side (the “where”). This article
surveys how this is being achieved using web augmentation (WA).
WA sits within the general category of transcoding techniques, that is, transforming
content or a program on the fly to other formats. Here, the code to be transformed
generally refers to HTML code (better said, its runtime manifestation: the DOM tree).
Traditionally, transcoding takes place in a dedicated server or a proxy that intercepts
the HTTP requests. More recently, this transformation process is being moved to the
client, where the DOM is transformed on the fly [Bouvin 2002]. Initially, these tech-
niques were mainly used to improve the accessibility (e.g., Accessmonkey [Bigham
and Ladner 2007; Richards and Hanson 2004]) or multidevice support of web appli-
cations (see Asakawa and Takagi [2008] for an overview). More recently, a new crop
of augmenters have striven to address users’ demands for more sophisticated ways of
controlling the web experience, leading to the so-called personal web [Whitney 2009].
In this personal web, users will have the freedom to adapt their daily websites, and WA
might well become the ultimate personalization technique. Userscripts and browser
extensions are the two most common mechanisms to support WA. As of January 2015,1
and just focusing on Mozilla Firefox, over 4 billion downloads of extensions evidence
the vitality of this movement.
Despite these figures, WA is still in its infancy, being largely conducted in an ad hoc
way and largely overlooked by the academic community. Interestingly enough, no major
web conference lists WA as a topic of interest!2 This article seeks to increase awareness
of the rationales, opportunities, and challenges brought by WA. To this end, the article
is articulated around four main questions as presented in the next paragraphs.3
What is WA and what is it not? (Section 2). The first obstacle is the terminology
itself. Web augmentation, browser augmentation, client-side transcoding, and browser
refactoring are all terms being used to refer to a similar phenomenon. Besides syn-
onyms, we also have to face homonyms, that is, same word but different meanings. In
this case, some definitions tend to blur the difference between web augmentation and
related approaches such as web mashups, web personalization, or web appropriation.
This section characterizes WA in terms of both the means and the ends.
Why WA? (Section 3). WA stems from the impossibility/reluctance of web owners
to anticipate/support some user needs. Accessibility requirements are a case in point.
Despite W3C guidelines, the fact is that it is not possible to specify one-page rendering
that will be easily readable by anyone with visual impairment. In addition, website
owners are not certain about the economic incentive of this additional development.
Here, WA can help by “outsourcing” this effort to third parties (where the payoff comes
from an economy of scale), nonprofit organizations (spurred by the public good), or,
simply, generous and skilful hobby programmers. But to what extent is this situation
real? Are users enhancing websites’ rendering with third-party augmenters? If so, this
would sustain WA as a down-to-earth approach. To this end, the article analyzes the
45 most-used WA-related browser extensions according to Mozilla Firefox and Google
Chrome registries and collects their numbers as of January 2013.
1 https://blog.mozilla.org/theden/2015/01/19/firefox-add-ons-hit-4-billion-downloads/.
2 The revision was conducted for the 2011–2013 editions of the following conferences: WWW, ICWE, WISE,
Hypertext, and SAC. The closest topics include “User Driven and Automatic Adaptation of User Interfaces,”
“Personalization, Adaptivity and Context-Awareness,” and “Narrative on the Web and in Social Media.”
3 This article is based on a keynote given at the ComposableWeb ’12 conference [Díaz 2012].
What is WA being used for? (Section 4). WA changes the perception of browsers:
from mere means for web rendering to platforms for web development. This shift in per-
ception fuels companies and individuals alike to rush to satisfy their own requirements,
in some cases, behind the web owner’s back. But what for? What are the motivations
for augmenting? To answer this question, we develop a classification grid to pigeonhole
the 45 analyzed extensions. Three major purposes were identified, namely,
refactoring (i.e., improving nonfunctional attributes), customization (i.e., catering for
the functional requirements of a minority within the website’s domain), and mod-
ding (i.e., addressing functions not originally conceived or intended by the website’s
designer).
What WA issues are most commonly cited in the literature? (Section 5). Pre-
vious questions sustain WA in terms of both the number of beneficiaries and range
of potential applications. This grounds the interest in addressing WA hurdles. To this
end, we conduct a systematic literature review with the following research questions:
What are the quality issues that received the most attention on WA, to what extent are
they covered, and how is coverage evolving? The survey identifies 42 relevant contri-
butions that were classified based on the ISO 25010 quality characteristics that were
mentioned as concerns in the contribution. The article ends with a discussion on the
main issues ahead (Section 6). We start by defining WA.
2. WHAT WA IS AND IS NOT

WA can be characterized in terms of both the means and the ends. As the means,
WA can be regarded as client-side web transcoding, that is, transforming content or a
program on the fly when loaded on the client.
WA comes in two flavors: extensions (a.k.a. “add-ons” in Internet Explorer and Firefox
parlance) and userscripts. Extensions are packaged sets of files whose logic is
written in JavaScript with access to the DOM API and browser-specific APIs. This
makes extensions browser specific. By contrast, userscripts are single self-contained
files of browser-agnostic JavaScript that, besides the DOM API, resort to a weaver API to
abstract from browser specifics. Weavers (realized as browser extensions themselves,
e.g., Firefox’s Greasemonkey, or natively supported, e.g., Google’s Chrome) shelter
userscripts from browser specifics at the expense of a more limited access to the browser
core.
No matter whether extensions or userscripts are used (hereafter, jointly referred to
as “augmenters”), both perceive HTML pages as DOM documents. Augmenters can be
triggered by user interface events (UI events) on this DOM document (e.g., on load, on
click). Event payloads hold the data that feed the augmenter handlers, which, in turn,
update the DOM document and, in so doing, achieve the augmentation as such.
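The event-to-handler-to-update cycle just described can be sketched as follows. This is a minimal illustration rather than the mechanism of any particular extension or weaver: `PageNode`, `UiEvent`, `AugmenterHandler`, `annotateHeadings`, and `dispatch` are hypothetical names, and a plain in-memory tree stands in for the browser's DOM so that the sketch runs outside a browser.

```typescript
// Minimal stand-in for a DOM node; a real augmenter would use the browser's DOM API.
interface PageNode {
  tag: string;
  text: string;
  children: PageNode[];
}

// A UI event as seen by the augmenter: its type plus a payload (here, the target node).
interface UiEvent {
  type: "load" | "click";
  target: PageNode;
}

// An augmenter handler reads the event payload and updates the document in place.
type AugmenterHandler = (event: UiEvent, root: PageNode) => void;

// Hypothetical augmenter: on load, layer a small note under every <h1> heading.
const annotateHeadings: AugmenterHandler = (event, root) => {
  if (event.type !== "load") return; // this augmenter ignores other events
  const visit = (node: PageNode): void => {
    for (const child of node.children) visit(child);
    if (node.tag === "h1") {
      node.children.push({ tag: "small", text: "augmented", children: [] });
    }
  };
  visit(root);
};

// Deliver an event to every registered handler, much as a browser notifies listeners.
function dispatch(event: UiEvent, root: PageNode, handlers: AugmenterHandler[]): void {
  for (const handler of handlers) handler(event, root);
}

const page: PageNode = {
  tag: "body",
  text: "",
  children: [{ tag: "h1", text: "Books", children: [] }],
};
dispatch({ type: "load", target: page }, page, [annotateHeadings]);
```

In a browser, the same handler would be registered as an event listener and would act on real DOM nodes; the separation between event payload and document update stays the same.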
Based on the means, any client-side script could be regarded as an example of WA.
This is technically true in the sense that augmenter scripts transform the DOM
document. However, the essence of WA rests on what this transformation is about. Broadly,
we regard WA as pursuing goals similar to those of web personalization: improving the
user experience by departing from a one-size-fits-all characterization. It shares the end,
but augmentation differs in the where, who, how, and what of personalization
(see Table I), namely:
—The where refers to the place where the adaptation takes place (server, proxy, or
client). We focus on client-side WA.
—The who. WA puts the focus on the website consumer rather than the website
producer. By consumer we mean not only the final users (e.g., hobby programmers) but
also third parties. WA opens personalization to consumers and, in so doing, departs
from more traditional personalization scenarios where the website itself either caters
to the adaptation (i.e., adaptive web) or provides the means for registered users to
configure their web experience (i.e., adaptable web) [Brusilovsky and Maybury 2002].
—The how for traditional personalization is based on a user model generated from
implicit (adaptive approach) or explicit (adaptable approach) user input. The system
can make automatic predictions about the interests of a user by collecting
preference information from many users (a.k.a. collaborative filtering). WA differs in user
involvement and available information. First, personalization cannot be totally
implicit since the user has to explicitly participate by installing the augmenter. Second,
though not a technical mandate, WA tends to rest on locally stored information.
—The what for WA is confined to transformations of the DOM document. This
principally implies layering additional content, navigation, or functionality and,
in so doing, might also need to change the layout or the rendering. That said,
the augmentation space can still accommodate a large number of scenarios (see
Section 4).
Though sharing similar motivations (i.e., improving the user experience),
WA departs from web personalization in that it might be conducted by different
developers from those of the host, following different life cycles, targeting different
audiences, and adopting distinct business models from those of the host (see
later).

Table I. Web Personalization Versus Web Augmentation
        Web Personalization                           Web Augmentation
Where   Server/proxy/client                           Client
Who     Webmaster                                     Webmaster & third parties & hobby programmers
How     User involvement might not be required        Users need to explicitly download the augmenter
        Records available for the website community   Records kept locally for the user installation
What    Rendering/navigation/layout of a single       Limited to the rendering of potentially
        website                                       different websites

Table II. Characterizing Augmenters Based on Matter and Scope
(Numbers Refer to Augmenter Identifiers in Appendix A)
                   Single Site                Domain-Based Cluster                        Concept-Based Cluster
Content            [21],[39],[43]             [AVG],[24],[44],[45]                        [3],[14],[16],[19],[22],[27],[29],[33],[36],[37]
Layout             [28]                       -                                           [1],[15]
Navigation         [BetterRTM]                -                                           [Skype],[7]
Functionality      [11],[12],[20],[23],[35]   [8],[13],[17]                               [26],[34],[42]
Multiple Aspects   [31],[40]                  [2],[6],[9],[10],[18],[25],[38],[41]        [4],[5],[30],[32]

To illustrate WA, we resort to the most-used WA-related browser extensions
according to the registries of Mozilla Firefox and Google Chrome (see Appendix A).
A possible way to classify these examples is along matter and scope (see Table II).
Matter refers to the aspect of the rendering that is being impacted. This includes
content augmenters (e.g., add price history chart into an online shop as achieved by
The Camelizer [24]), layout augmenters (e.g., change the appearance of Tumblr dash-
board as supported by the Nice Tumblr [28]), navigation augmenters (e.g., add links to
Amazon products into online shops as realized by the Amazon for Chrome [6]), and func-
tionality augmenters (e.g., add an Auto Replay button to Youtube as obtained through
the Auto Replay [12]). Scope refers to the portion of the websphere
affected by the augmenter (hereafter referred to as the host). Some extensions
target the whole web. For instance, Linkification4 turns any URL-like text into a
hyperlink no matter what website holds the URL. Likewise, SkypeButton5 converts
any phone number found in a web page into a button that launches Skype to call that
number. However, the more complex the augmenter is, the more difficult its reuse
across different websites. Narrower options include a single website (e.g., BookBurro6 and
BetterRTM7 are single-site extensions that focus on Amazon and Remember-the-Milk
as hosts, respectively), a domain-based cluster of websites (e.g., sites for online book
shopping that share similar functionality and concepts), and a concept-based cluster of
websites, where the affected websphere portion is characterized by exhibiting a concept
in common (e.g., sites rendering book data, regardless of whether they sell books (e.g.,
Amazon) or not (e.g., Goodreads)).
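As an illustration of a whole-web content augmenter, the following sketch turns URL-like text into hyperlinks, in the spirit of the behavior described for Linkification. It is not that extension's code: the function names `escapeHtml` and `linkify` and the deliberately simple URL pattern are assumptions made for this example, and the function works on a text string rather than on live DOM text nodes.

```typescript
// Escape characters that are special in HTML so that plain text stays plain text.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Turn every URL-like substring of a piece of page text into an anchor element.
// Assumption: a URL is an http(s) scheme followed by characters other than
// whitespace, angle brackets, or double quotes. Trailing punctuation is not trimmed.
function linkify(text: string): string {
  const pattern = /https?:\/\/[^\s<>"]+/g; // fresh regex per call: no shared lastIndex
  let result = "";
  let last = 0;
  let match: RegExpExecArray | null = pattern.exec(text);
  while (match !== null) {
    const url = match[0];
    // Copy the text before the URL, then the URL wrapped in a hyperlink.
    result += escapeHtml(text.slice(last, match.index));
    result += `<a href="${escapeHtml(url)}">${escapeHtml(url)}</a>`;
    last = match.index + url.length;
    match = pattern.exec(text);
  }
  // Copy whatever follows the last URL.
  return result + escapeHtml(text.slice(last));
}
```

Because the transformation depends only on the text itself and not on the structure of any particular site, it applies to any host, which is what places such an augmenter at the whole-web end of the scope dimension.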
Our vision is for WA to be another approach to web personalization where actors,
architecture, and means change w.r.t. traditional web personalization, but the end
remains: being more responsive to the unique and individual needs of each user. As
complementary to personalization-in-the-large, WA promotes personalization-in-the-
small.
2.1. What WA Is Not

WA shares approaches and aims with other areas, namely, web scraping, web mashups,
and web appropriation. This subsection clarifies differences and similarities to better
set the boundaries.
Web scraping (a.k.a. web harvesting or web data extraction) is a technique of
extracting information from websites [Wikipedia 2014]. Though WA might extract
data from the underlying websites (i.e., the host), the final purpose is quite different.
Scraping tends to see websites as mere sources of data. By contrast, WA does not aim
to depart from the host, rather, the opposite: WA sticks to the host by leveraging it
to better account for the user specifics. In some scenarios, WA can be a useful ally for
webmasters (see Section 4).
Web mashups are web applications that take information from distinct sources and
present it in a new way or with a unique layout [Daniel and Matera 2014]. Similar to
mashups, WA also taps into someone else’s resources (i.e., the host). However, mashups
are fully fledged web applications, while WAs are not applications as such (i.e., they
are not autonomous) but extensions upon existing applications. To highlight this point,
let’s compare BookBump,8 a mashup application, with BookBurro, an augmenter, both
acting on top of Amazon. BookBump permits registered users to manage booklists.
Users provide basic details about the book of interest, and the site automatically collects
book-specific information for them (e.g., reviews, author quotations, price comparisons,
etc.) by resorting to open APIs such as that of Amazon. The benefits go to providers
and consumers alike: API providers (e.g., Amazon) might find new revenue channels,
while consumers (e.g., BookBump) can easily leverage their offerings by turning their
websites into hubs for an audience niche (e.g., booklist lovers). BookBump is a full-
fledged autonomous web application, even to the point that the target audience of this
mashup might or might not overlap with that of the resource provider (e.g., Amazon).
By contrast, BookBurro sticks to Amazon: the same website but different experience.
BookBurro augments Amazon; it does not substitute for Amazon.
4 https://addons.mozilla.org/firefox/addon/linkification.
5 http://www.skype.com/en/download-skype/click-to-call.
6 https://addons.mozilla.org/firefox/addon/book-burro.
7 https://addons.mozilla.org/firefox/addon/a-bit-better-rtm.
8 http://www.bookbump.com/.
Appropriation has been defined as “the way in which technologies are adopted,
adapted and incorporated into working practice. This might involve customisation in
the traditional sense (that is, the explicit reconfiguration of the technology in order
to suit local needs), but it might also simply involve making use of the technology
for purposes beyond those for which it was originally designed, or to serve new ends”
[Dourish 2003]. Appropriation proponents claim that it is through appropriation pro-
cesses that a user completes the work of designers by making interactive systems
functional within the scope of their activity [Belin and Prié 2012]. This resembles our
notion of personalization-in-the-small. There exists, however, a subtle difference be-
tween appropriation and WA as introduced in this work. In his seminal paper, Dourish
stated as research questions “what features of technological design support appropri-
ation? And so, how could systems be designed in order to accommodate, support and
encourage the process of appropriation?” That is, appropriation is somehow built into
the system. A system can then be qualified as appropriable or not. By contrast, WA
looks at the system (i.e., the host) as the reality: it exists as such. Websites are not
designed for appropriation. Augmentation can be achieved whether or not the system
was designed to be augmentable.
Now that it has been clarified what WA is and is not, the next section moves on to
assess the success of WA and its causes.
3. WHY WA
Broadly, WA originates in the impossibility/reluctance of webmasters to anticipate/
support some user requirements. In this setting, the term “user” encompasses not only
stakeholders (i.e., those participating in the design of the website) but also third parties
and final users of the website. In a setting where Internet usage has skyrocketed 566%
since 2000 [Miniwatts Marketing Group 2013], idiosyncratic needs are likely to pop
up time and again. WA allows us to cope with this growth by opening personalization
to the communities where the need arises. But, have such needs really emerged? And
how successful has WA been in addressing those needs? This section looks for hints
along three aspects: downloads, usages, and market impact.
Downloads. Fortunately, browser companies keep extension stores/repositories. We
analyzed the 45 most-downloaded WA-related browser extensions in Mozilla Firefox
and Google Chrome and collected their numbers as of January 2013. According to
these registries, the number of downloads ranges from 5,535 for the extension with
the lowest numbers (i.e., Manga One [45]) to 2,033,853 for Stylish [1], the
most-downloaded extension (see Appendix A for the numbers). If we look at the other
means to support augmentation (i.e., userscripts), to date (June 2012), and just fo-
cusing on Mozilla Firefox, over 1 billion downloads are recorded for userscripts kept
at http://www.userscripts.org (or its companion http://userscripts-mirror.org/). How-
ever, download figures don’t necessarily reflect the actual number of people using an
extension. Next, we look at the actual usages.
Usages. Besides downloads, we attempt to provide some insights on the evolution
of current users (i.e., people with the extension installed, not just downloaded). Un-
fortunately, this turned out not to be an easy task since these evolution figures are
available only for some extensions in Mozilla Firefox. Based on this subset,9 Figure 1
depicts the evolution from September 2009 to September 2012. The continuous line
stands for the number of users of the selected extensions. The trend is an
(almost) steady, gentle increase. Though the growth is limited, these numbers
should be weighed against the 9% decrease in browser share that Mozilla Firefox has
experienced (dotted line). Nevertheless, the tendency, though subtle, is for an increase
in the number of browser extensions installed. The important point is that this
smallish act of installing an extension entails a power shift from web producers to web
consumers, who are now empowered to fine-tune their web experience. This might have
a large market impact.
9 Numbers were available for the following extensions: Stylish [1], Flashblock [3], Text Link [7], SkipScreen
[8], gTranslate [16], AutoPager [17], and The Camelizer [24].
Fig. 1. Evolution of the number of users for the analyzed Mozilla Firefox’s augmenters (continuous line) in
comparison with the browser share of Mozilla Firefox (dotted line).
Market impact. The ultimate test of the strength of WA would be whether this
power shift is beginning to challenge any web markets. And this is likely the case for
two areas: online advertisements and online gaming. The former refers to augmenters
that prevent advertisements from popping up (a.k.a. ad blocking). A 2013 report on the
state of ad blocking shows that “up to 30% of web visitors are blocking ads, and that
the number of ad-blocking users is growing at an astonishing 43% per year” [PageFair
2013b]. The same report states that 84% of the top 100 websites rely on advertising to
generate revenue, and that extensions such as Adblock are causing increasing losses.
Though advertisements can be a nuisance, they are what pays the bills. Specifically,
estimates are that ad blocking cost Google Adwords $887 million in 2012 [PageFair
2013a]. So it comes as no surprise that Google has just removed the popular Adblock
Plus extension from Google Play [Bilton 2013]. Though this is just an example, it
illustrates that empowering consumers (i.e., freedom of installing whatever extensions
they want) might change the rules of the game. New deals need to be established
between web consumers and web producers. In this respect, one initiative
eloquently states the goal: “create a more sustainable advertising ecosystem, one
in which publishers can focus on loyalty and engagement instead of traffic and clicks,
and make money without depleting their audience’s goodwill” [PageFair 2014b].
Another area impacted by WA is online gaming. Here, tech-savvy users might get an
unfair advantage by tweaking the website (e.g., sniping hats, auto-entering raffles, etc.).
This might imply an additional effort for webmasters to prevent these unfair practices.
Alternatively, a wiser approach is expressed by one of these administrators as part of
a discussion thread at the Bazaar.tf forum: “as long as scripts don’t give you an advan-
tage, that’s fine. If you want to make a script that moves buttons around, that’s fine,
however I’d prefer you push UI ideas forward so we can make them instead. The TF2
Price Check integration is one of those features that I turned from a user’s userscript
into an actual feature” [Bazaar.tf 2013]. Notice the new deal: webmasters “paving the
cowpaths” that their audiences have laboriously trodden by implementing their
own augmenters. Webmasters become attentive observers of their audience’s practices,
which might end up being integrated as part of a website’s offerings. WA brings consumer
empowerment, and this, in turn, is changing the status quo between web providers and
their audiences. This is ultimately why WA is important. The next section assesses WA
in terms of the wide range of scenarios in which augmenters are being used.
4. WHAT WA IS BEING USED FOR
Unlike traditional software architectures, web architecture facilitates tuning after
manufacturing. In this way, users can install augmenters and, in so doing, customize
the rendering manufactured by someone else. But what for? What are the spurs for
augmenting? Back in 1999, Bouvin indicated the purpose of WA as “to help users
structuralise their Web work” [Bouvin 1999], where the reviewed tools were classified as
annotation/discussion support, link creation and traversal, guided tours, and structur-
ing helpers. This reflects the origins of WA within the hypertext community. Ten years
later, Chilton et al. indicated that “users customize the web for some of the same rea-
sons users usually customize: to better suit their needs by making tasks less repetitive
or aspects of browsing less annoying. . . . Lastly, an important reason to customize the
web is to ‘bend the rules’ - circumvent the legal restrictions or the intentions of the
site” [Chilton et al. 2010]. This section revisits these insights in light of the 45 analyzed
extensions (see Appendix A) along three main spurs for WA (see Figure 2):
—Refactoring: restructuring existing code to improve nonfunctional attributes
(e.g., usability)
—Customization: leveraging existing code to fit the requirements of a minority
within the same functional domain of the host (e.g., shortcuts to ease repetitive tasks)
—Modding: modifying existing code to perform a function not originally conceived
or intended by the host designer (e.g., supporting intersite tasks or outwitting legal
restrictions)10
What makes modding depart from customization is not the implementation means but
the ends. Modding does not aim to adapt host functionality. Rather, it introduces new
twists. For instance, legal restrictions prevent Wikipedia from housing copyrighted con-
tent. The Reflect augmenter mods Wikipedia by detecting chemical compounds within
articles and, next, inlaying drawings and formulae (some subject to copyright) [Pafilis
et al. 2009]. Reflect supplements the Wikipedia description but for personal use, hence
circumventing Wikipedia’s legal limitation. Notice that Reflect’s functionality will never
be offered by Wikipedia, and hence qualifying Reflect as a customization would have
been misleading. Rather, we prefer to borrow the term “modding” to characterize these
scenarios that might complement or even conflict with, but rarely extend, the host’s
functionality. By distinguishing “customization” from “modding,” we attempt to high-
light a distinctive aspect of WA; that is, users are now empowered not just to align with
the host (i.e., customization) but also to conflict with the host (i.e., modding). Modding
is not an evolution of traditional server-side customization but might well be a
revolution in how users relate to the web. The next subsections delve into each category,
using examples from our sample extensions.
4.1. Refactoring
Refactoring is the process of restructuring existing code without changing
its external behavior. “Existing code” refers to the HTML document returned as a
result of an HTTP request. “External behavior” is understood as keeping the content
and offerings of the website unchanged. However, other aspects such as aesthetics
or layouts can be changed for the sake of improving nonfunctional attributes. This
category can be further refined based on the nonfunctional attribute being addressed,
namely, accessibility, usability, and performance.
10 We borrow this term from video games [Wikipedia 2014].
Fig. 2. Examples of WA usage.
Accessibility (i.e., removing barriers that prevent access to websites by people with
disabilities). This area was among the very first to investigate the use of WA. Talking
browsers (i.e., allowing users to hear the content of a web page read aloud) go back
to the 1990s [Takagi et al. 2008]. In our sample, this classification is represented
by NoSquint, an extension for adjusting zoom levels and colors on a per-page basis.
Other popular examples include setting a minimum font size for all web content (No
Small Text), resizing/scaling any image by dragging (Image Resizer/Scaler), enabling
browsing entirely with the keyboard (Mouseless Browsing), and, finally, making sound
responses while events happen (Noise).
Usability (i.e., enhancing the ease of use and learnability of websites). Examples
include letting URL text found in web pages be loaded with a double click (TextLink),
introducing new themes and skins (Stylish), browsing multipage articles in one seam-
less view (Clearly), automatically loading the next pages when you reach the end of a
page (AutoPager), ad blocking (Updated Ad Blocker), and, finally, dynamically resizing
Adobe Flash content (FlashResizer).
Performance (i.e., improving the speed at which pages are loaded). Examples in-
clude blocking Flash content for good (FlashControl), introducing click-aware place-
holders for on-demand retrieval of Flash content (Flashblock), and setting Flash objects
to “Low Quality” (Low Quality Flash) to speed up loading.
4.2. Customization
For traditional software, customization mechanisms include inheritance and wrapping.
By contrast, browser-based customization is achieved through transcoding, is decoupled
from the host code, and might well be conducted by different teams following different
life cycles than those of the host. Rationales for this decoupling/externalization
include webmasters’ inability to foresee some requirements, legal matters, budget
shortages, or lack of business interest on the part of the host webmaster.
Browser-based customization has to do with personalization-in-the-small and tends
to be audience driven; that is, the audience is set first. Next, fringe functionality follows
from this audience. Hence, we distinguish three kinds of target audiences: shadow
stakeholders, secondary stakeholders, and trailblazers.
Shadow stakeholders. Requirements from different stakeholders might be diver-
gent or even conflicting. When hidden or conflicting goals exist, studies report that
people will resist articulating them [Ackerman 2000]. Under such circumstances, it
is a challenge to converge on a consistent set of requirements and deliver a working
application at all. In this setting, business units might deliver their own solutions on
top of the host infrastructure. This is not a new problem. Handel and Poltrock [2011]
coined the term “shadow applications” to denote “applications introduced by business
units to satisfy requirements not met by official applications.” Shadow applications are
then characterized by their purpose (i.e., working around the limitations of an offi-
cial application) and their ownership (i.e., owned by the business unit rather than the
whole organization). For web-based applications, WA becomes a conduit for lightweight
shadow requirements (i.e., those that can be supported in the client). We face this
problem at our own university. Currently, the university’s intranet provides basic contact
information about the members of the university. Our department wanted this information
to be extended with a link that leads directly to the member’s LinkedIn account. Due to
disagreements or delays, the department itself developed a userscript that augments
the rendering of the university’s intranet with this link. Figure 3 shows the page
with and without the augmentation.
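The kind of augmentation described for the intranet can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: a real userscript would be written in JavaScript against the browser DOM, whereas here Python's xml.etree.ElementTree stands in for the DOM, and the page fragment, the member names, the "member" class, and the profile table are all hypothetical placeholders rather than the department's actual script.

```python
import xml.etree.ElementTree as ET

# Hypothetical intranet fragment and profile table; neither comes from the article.
INTRANET_PAGE = """
<table>
  <tr><td class="member">Jane Doe</td><td>jane.doe@example.org</td></tr>
  <tr><td class="member">John Roe</td><td>john.roe@example.org</td></tr>
</table>
"""
PROFILES = {"Jane Doe": "https://www.linkedin.com/in/jane-doe-example"}


def add_linkedin_links(page, profiles):
    """Append a LinkedIn link to every member cell for which a profile URL is known."""
    # Materialize the cells first so that the tree is not modified while iterating.
    for cell in list(page.iter("td")):
        name = (cell.text or "").strip()
        if cell.get("class") == "member" and name in profiles:
            link = ET.SubElement(cell, "a", {"href": profiles[name]})
            link.text = "LinkedIn"
    return page


augmented = add_linkedin_links(ET.fromstring(INTRANET_PAGE.strip()), PROFILES)
links = [a.get("href") for a in augmented.iter("a")]
```

The host page is left untouched on the server; only the rendered copy gains the extra link, which is what makes this a client-side shadow requirement.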
Secondary stakeholders. Even if no conflict exists, prioritizing requirements is not
easy. Traditionally, user-centered design distinguishes between primary stakeholders
and secondary stakeholders to order user petitions. Secondary stakeholders are not
the main target of the design, but their needs should be met and problems solved if
possible. Current personalization techniques focus on either variations on content or
low-level aspects of user performance. However, this might not be enough for some
web applications. Web-based tools are a case in point [Toker et al. 2012]. The need
for personalized tools has long been recognized [Gwizdka and Chignell 2007]. Now,
these tools are being migrated to the web, and hence, tool personalization becomes a
web issue. Rather than a one-size-fits-all solution, WA might serve to accommodate
behavioral differences or support changes in behavior over time. WA emerges as
a technique to tackle the different workflows and approaches that distinct audiences
follow to conduct the very same activity. An example is provided by Remember-the-Milk,11
a tool for to-do list management. This website accommodates the primary stakeholders’
concerns. However, minority usage patterns might not be worth considering in the general
release of a website but even so might be catered to as WA extensions to be deployed on
an individual basis. Indeed, A Bit Better Remember-the-Milk (RTM) is a WA extension
that adds a side navigation bar as a shortcut to access some to-do tasks (see Figure 3).
Most users will be satisfied by the RTM website, but this does not prevent secondary
requirements from being served by augmenting the RTM website. In our sample set,
this approach targets HD-minded people by telling YouTube to load
videos in HD straight away (Auto HD) or by adding an HD option to “YouTube.com/watch”
URLs (YouTube HD), and it targets bloggers by recommending images, links,
articles, and tags while they write in WordPress, Tumblr, or Drupal (Content by Zemanta).
11 https://www.rememberthemilk.com/.
Fig. 3. Augmenter samples taken from a department website (as a separate business unit), RTM (as the
host company), and Mr. Cosmic Shovel (as an end-user). Augmented outputs have dotted lines.
Trailblazers. So far, the scenarios have tackled situations foreseen by the host. In contrast,
trailblazers anticipate needs or niches within the host’s domain that have not yet
been considered. Trailblazers embody the notion of appropriation as “a way in which
technologies are adopted, adapted and incorporated into working practice” [Dourish
2003]. From this perspective, WA can be regarded as an appropriation technique. Notice,
however (as highlighted in Section 2.1), that Dourish understands appropriation
as a characteristic of the tool being appropriated, whereas WA does not imply any complicity
on the part of the host being augmented. For instance, The Camelizer depicts price evolution
for Amazon items (see Figure 3). This extension, which is in line with Amazon’s business,
foresees shoppers’ interest in knowing price evolution. Should this tendency strengthen,
Amazon might include it as part of its website.
4.3. Modding
When modding, transcoding aims to perform a function not originally conceived or
intended by the designer or to achieve a bespoke specification [Wikipedia 2014]. Unlike
customization, modding tends to be function driven; that is, the function is set at the
start. Next, the audience might follow. While both refactoring and customization benefit
the host (i.e., improving the web experience of its customer base), the impact of modding
varies. We further characterize WA modding based on its impact on the host, specifically,
threats, allies, and neutral.
Threats. Here, augmenters somehow jeopardize the business model of the host.
This can be realized in different ways: blocking banners (AdBlock), preventing countdowns
for nonregistered users so that they are not spurred to upgrade their account
(SkipScreen), potentially diverting customers to competitors’ websites (e.g., price comparison
augmenters like Amazon Chrome, InvisibleHand, Preisspion, TheBestPrice),
or downloading the host’s content (SaveFrom.net, Download Youtube, TubeEnhancer).
This is likely to raise legal matters (more in Section 6.3.2).
Allies. Here, augmenters add value and endorse the host’s business model. If the
host is a search engine (e.g., Google, Yahoo, Bing), then additional information
that complements the search ranking permits the host’s customers to make a better-informed
decision. This additional information can be about URLs’ safety (LinkScanner12)
or reputation (WebOfTrust). If the host is Wikipedia, then value is added
by detecting chemical compounds within articles and offering drawings and formulae
that complement the Wikipedia description (Reflect [Pafilis et al. 2009]).
Neutral. Here, augmenters use the host as a mere conduit to contextualize services.
If this service is Skype, theSkypeButton augmenter turns any phone number found in
a web page into a button that launches Skype to call that number. This button invites
users to request Skype services with no effect on the underlying website (see Figure 3).
These scenarios showcase the opportunity for WA to become a main enabler of the
vision described in Raman [2009] where the web is evolving toward a cloud of customiz-
able applications and data. This section provides a first outline of the driving forces
behind this vision in terms of refactoring, customization, and modding. The rest of the
article looks into the stumbling blocks.
5. WHAT THE MOST COMMONLY CITED WA ISSUES IN THE LITERATURE ARE
WA is far from being a mature practice. It started off as a grassroots approach where
practitioners were the first to sense its potential. We wanted to pinpoint the difficulties
faced by WA developers, specifically those concerns that set WA apart from other web
developments. Unfortunately, userscripts/extensions, though serving us well to assess
the WA impact in terms of usage, tend to be poorly documented, with almost no record
of the design decisions taken or difficulties encountered. We therefore turn to the academic
literature. Specifically, this section discusses our systematic literature review (SLR).
An SLR is a means of identifying, evaluating, and interpreting all available research
relevant to a particular research question or topic area [Kitchenham and Charters
2007]. The analysis is focused on answering specific questions, usually related to the
identification and coverage of certain areas. We are interested in detecting what quality
issues received the most attention in the WA literature. The aim is to raise awareness
and point to research directions.
12 http://linkscanner.avg.com/ww.sals-how-it-works.html.
Fig. 4. Pie chart representing the distribution of research contributions excluded from the study.
5.1. Research Method
This section follows the common SLR steps, that is, setting the research scope, the
search process, the definition of the classification scheme, the mapping of contributions
according to the classification scheme, and the data analysis. We cover these aspects
in the next paragraphs (for a more detailed description of the SLR process, refer to
Kitchenham and Charters [2007]).
Research scope. The aim of this SLR is to identify issues faced in achieving quality
augmenters. That is:
What are the quality issues that received the most attention in the WA literature, to
what extent are they covered, and how is coverage evolving?
Search process. We focus on the search engines of the three major computer science
publishing houses: Springer (SpringerLink), IEEE (IEEE Xplore), and ACM (ACM
Digital Library). The following search string was used as the query criterion:
“web augmentation” or “web transcoding” or “web automation” or “browser augmentation”
or “client-side refactoring”
All searches were conducted on June 25, 2014, with the following number of hits: 107 for
ACM, 88 for IEEE, and 103 for Springer. This initial set was next refined by introducing
some inclusion and exclusion criteria. First, papers were included based on mentioning
any of the query terms (298 papers). Query term appearance was not limited to the title
or the abstract but extended to the entire text. This explains the need for an SLR as
opposed to a mapping study. Unlike a mapping study, an SLR does not stop at the title,
abstract, and keywords. It might require scanning the whole paper. This is needed since,
so far, WA tends to be an ancillary issue, subordinated to an end (e.g., accessibility or
customization). Papers’ titles or abstracts tend to focus on the end, whereas the quality
issues encountered during WA are hidden in the text.
Next, the exclusion criteria were applied. Figure 4 depicts the different criteria:
contributions that underwent no competitive peer-review process (4%), contributions whose
full text was not available (4%),13 duplicates (2%), delta contributions that were later
upgraded in a second publication (5%),14 out-of-scope contributions (36%), proposals whose
architecture was not purely browser based (39%),15 and contributions that addressed
production but made no reference to maintenance or any other product quality
characteristic (10%).16 Neither the date nor the venue was an excluding factor.
13 This excludes PhD and master theses, technical reports, and the like that did not go through a competitive
peer-review process.
The screening process was performed separately by each of the two authors. Thirteen
contributions needed discussion, of which four were excluded and nine included in the
final set. In this way, the initial set was reduced to 42 relevant research contributions
(see Appendix B). Since the initial set consists of the whole number of hits returned by
the search engines, no additional search was conducted (e.g., the transitive closure of
the initial contributions using their references).
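The counts reported for the search and screening steps can be cross-checked with simple arithmetic. The following Python sketch uses only figures stated in the text (hits per library, the 42 retained contributions, and the exclusion shares of Figure 4). The per-criterion counts it derives are approximations, since the published percentages are rounded, and they rest on the assumption that the shares refer to the excluded papers; the variable names are illustrative.

```python
# Hits per digital library, as reported for the searches of June 25, 2014.
hits = {"ACM": 107, "IEEE": 88, "Springer": 103}
initial_set = sum(hits.values())

# Contributions retained after applying the exclusion criteria.
included = 42
excluded = initial_set - included

# Exclusion shares (in %) as read from Figure 4.
exclusion_share = {
    "no competitive peer review": 4,
    "full text not available": 4,
    "duplicated": 2,
    "delta contribution": 5,
    "out of scope": 36,
    "not purely browser based": 39,
    "no quality characteristic addressed": 10,
}

# Approximate number of papers per criterion (assumption: shares refer to `excluded`).
approx_counts = {
    criterion: round(excluded * pct / 100)
    for criterion, pct in exclusion_share.items()
}

print(initial_set, excluded, sum(exclusion_share.values()))
```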
Definition of the classification scheme. The resulting publications are next clas-
sified along a schema of significance for assessing the research question, in this case, the
ISO 25010 quality characteristics, specifically the internal measures of software qual-
ity: efficiency, security, compatibility, portability, installability, and maintainability.
Mapping of contributions to the classification scheme. This was a lengthy process
since quality characteristics were ancillary considerations arising while achieving
a main goal (e.g., accessibility). Two researchers analyzed the abstract, introduction,
and conclusion/future work of each publication returned and, based on the inclusion
and exclusion criteria, selected or discarded each paper for a more thorough
analysis. Twenty-one contributions required discussion, in most cases due to a fuzzy
use of the ISO terminology.
Threats to validity. The main threats to SLR validity include bias in the selection
of valid contributions and errors in properly classifying contributions. The former
was addressed by referring to the main publishing houses in computer science
(i.e., ACM, IEEE, and Springer) and filtering all hits w.r.t. exclusion criteria based on
suggested practical issues [Kitchenham and Charters 2007]. As for faults in the classification,
the process was independently conducted by each of the authors, and next,
mismatches were resolved. Finally, the classification schema itself might be improved.
Specifically, it would have been interesting to arrange papers along a two-dimensional
space: quality characteristics by what-for options (as introduced in Section 4). However,
the limited number of papers and the lack of public code would have limited the usefulness
of the outcome. Nevertheless, should WA mature and academic interest spread, this
study is worth conducting. Specifically, knowing the impact of each quality characteristic
for each what-for option would help allocate resources according to the
importance of that quality characteristic for the development at hand.
5.2. Data Analysis
From the accumulated results shown in Figure 5, we observe that “maintainability”
(48%) and “installability” (24%) are the most commonly arising issues. Other characteristics
are covered significantly less, with “efficiency” and “security” being mentioned
in one and two papers, respectively. A possible explanation is that WA has so far been
used for lightweight developments that imply neither a significant consumption of
resources (CPU, memory) nor security breaches. Remarkably, portability accounts
for only 12%, even though its absence severely limits the usage of most augmenters to a
single browser. From the stacked bar chart, we see mixed contribution types from 2007,
with a peak in 2013.
14 How are delta papers detected? We first normalized and listed the papers’ authors ordered by frequency.
Name normalization proceeded as follows: remove all the diacritics, convert all the characters to ASCII, and,
finally, normalize to the format “Surname FirstInitialName.” Authors’ names were checked for misspellings.
For authors with more than one publication, all of their papers were clustered by addressed issue and
contribution, and all but the longest/newest one in each cluster were excluded; only that one was included
in the study.
15 This mainly refers to proxy-based transcoding architectures, for which a large number of works exist on
mobile resizing.
16 This includes mechanisms to abstract the way in which augmenters are developed, such as JavaScript
APIs (e.g., Chickenfoot [Bolin et al. 2005]); domain-specific languages such as those used in Marmite [Hong
and Wong 2006], CoScripter Tables [Cypher 2012], or Procedures [Firmenich et al. 2013]; or graphical user
interfaces such as Platypus [Turner 2005].
Fig. 5. Bar chart representing the quality characteristic concern over time: from least (efficiency) to broadest
coverage (maintainability).
Fig. 6. Pie chart representing the distribution of research contributions over publication venues.
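The name normalization described in footnote 14 can be sketched as follows. This is a minimal Python illustration, not the script used in the study: it assumes that the last token of a name is the surname and the first token is the given name, and the function names are hypothetical.

```python
import unicodedata
from collections import Counter


def normalize_author(full_name):
    """Return 'Surname FirstInitial' with diacritics removed and ASCII characters only."""
    # Decompose accented characters (e.g., 'í' becomes 'i' plus a combining accent),
    # then drop everything that cannot be encoded as ASCII.
    decomposed = unicodedata.normalize("NFKD", full_name)
    ascii_name = decomposed.encode("ascii", "ignore").decode("ascii")
    parts = ascii_name.split()
    if not parts:
        return ""
    if len(parts) == 1:
        return parts[0]
    # Assumption: last token is the surname, first token is the given name.
    return f"{parts[-1]} {parts[0][0].upper()}"


def authors_by_frequency(papers):
    """List normalized author names with their paper counts, most frequent first."""
    counts = Counter(normalize_author(name) for authors in papers for name in authors)
    return counts.most_common()


ranking = authors_by_frequency([
    ["Óscar Díaz", "Cristóbal Arellano"],
    ["Oscar Diaz"],
])
```

Authors who appear more than once in such a ranking are the candidates whose papers would then be clustered and inspected by hand for delta contributions.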
As a byproduct, we also obtained the publication venues in which WA research has been
published. From Figure 6, we observe that the International Conference on Web
Engineering (ICWE) and the International Cross-Disciplinary Conference on Web Accessibility
(W4A) account for 10% each, the latter due to its focus on accessibility, certainly
the refactoring aspect where WA has been most abundantly proposed. Almost
half of the research contributions (19) were the only ones published in their venue,
indicated as “Others” in Figure 6. This piecemeal distribution confirms the diverse
settings where WA is being used, as pointed out in Section 2.
Table III. Impact of Browser Releases on WA Extension (a.k.a. Add-On) Maintenance, Reflected
in Terms of #comments Referring to This Issue in Both “Producer Comments” (i.e., WA
Developers) and “Consumer Comments” out of the Total Number of Comments
Extension            Extension-ID                     Producer Comments   Consumer Comments
Stylish              stylish                          17/41               6/104
WOT                  wot-safe-browsing-tool           7/50                1/306
Flashblock           flashblock                       10/26               21/109
Flagfox              flagfox                          16/100              0/107
Text Link            text-link                        7/38                1/19
SkipScreen           skipscreen-incredible-rapidsha   3/26                5/19
InvisibleHand        invisiblehand                    0/21                0/21
Download Youtube     download-youtube                 6/61                3/113
SaveFrom.net helper  savefromnet-helper               6/40                7/63
NoSquint             nosquint                         11/17               10/84
6. DISCUSSION
This section delves into the previous SLR by substantiating the specifics brought by
WA in meeting ISO quality characteristics. For each of these characteristics, we provide
the ISO definition. Next, we discuss what makes it important for WA. And finally, we
present an outline about the alternatives and conclusiveness of the proposed solutions.
Quotes from the analyzed contributions in the SLR are used to back our statements.
6.1. Maintainability
Maintainability refers to the degree of effectiveness and efficiency with which the
product (i.e., augmenters) can be modified. Here, we focus on adaptive maintenance
(i.e., modifying the system to cope with changes in the software environment). For
augmenters, this environment includes two moving targets: the browser and the host.
6.1.1. Coping with Browser Upgrades. Browsers tend to undergo frequent releases. For instance,
from January 2010 to December 2013, Mozilla Firefox and Google Chrome
underwent 89 and 29 stable releases, respectively (counting only desktop versions).
This fact might impact augmenters. As anecdotal evidence, Table III provides some
numbers for the Mozilla Firefox community. For the interval between January 2013
and July 2014, we collected the #comments referring to browser upgrade issues in both
“Producer Comments” (i.e., WA developers) and “Consumer Comments” out of the total
number of comments.17 Except for InvisibleHand, the augmenters hold
a significant number of comments about keeping pace with Firefox’s new releases.
It is fair to say that this nuisance is common to any web development. The
problem is, however, exacerbated for WA due to the limited resources available. Unlike
mission-critical websites, augmenters tend to be developed by small teams, even a
single person. As anecdotal evidence, 11 out of the 20 analyzed extensions indicating
their authors are single authored: Download YouTube, NoSquint, Updated Ad Blocker,
ShowIP, Youtube HD, Low Quality Flash, Quick Translator, Tube Enhancer Plus, Puzzle,
and FlashResizer. As a result, it is not rare to find augmenter authors who gave
up, unable to keep pace with browser releases (see, for instance, comments concerning
the following extensions: adbar,18 Wikalong,19 KeywordSelection20). We believe this is
a main hurdle for WA to thrive.
17 Producer comments and consumer comments were obtained from addons.mozilla.org/firefox/addon/
{extension-ID}/versions/ and addons.mozilla.org/firefox/addon/{extension-ID}/reviews, respectively, where
{extension-ID} stands for the WA identifier as reflected in the second column of the table. A sample of the kind
of comments collected follows: “Great add-on, but having issues after FF13 update” for Stylish concerning a
Firefox upgrade.
6.1.2. Coping with Host Upgrades. Websites are known to be modified frequently. The
challenge of developing resilient locators (i.e., objects that select host elements) is
well known in the fields of web automation, web data extraction, and metasearch
[Kordomatis et al. 2013]. This problem is exacerbated for WA. Augmenters not only
extract data but also might provide functionality to be seamlessly integrated into the
host. This makes augmenters even more sensitive to host upgrades since synchrony
should be sought not only in the data but also in the interactive narrative of the host.
That is, augmenters can be disturbed even if the data (and its rendering) are kept
untouched but the navigation logic changes. This problem is general to WA but has been
especially highlighted by the annotation community. Here, host upgrades might cause
web annotations to become misplaced or even orphaned [Kawase et al. 2010]. Indeed, this
issue is considered one of the most challenging problems for web annotation
[Cockburn and McKenzie 2001; Bouvin et al. 2002; Hupp and Miller 2007].
There are multiple methods to address content inside a web page. Traditionally, three
mechanisms have been used, based on page structure, pixel position, or image recognition.
Structure-based solutions see web pages as DOM trees where XPath expressions are
used to pinpoint the right node. Pixel approaches process the page as an image composed
of pixels, where (x, y) coordinates are used to store the position inside the image.
Finally, image recognition uses images as locators and image similarity algorithms to
determine the presence of the locator within the image counterpart of the page at hand.
These approaches offer different trade-offs in terms of maintenance [Leotta et al. 2014].
Location maintenance is addressed through both preventive and curative approaches
(see Grace et al. [2011] for an overview). The former attempt to avoid locator failure
by providing more robust XPath expressions [Kowalkiewicz et al. 2006; Paz and Díaz
2010]. As an alternative to using XPath (i.e., the DOM structure), Chickenfoot uses
the surrounding text as a clue to anchor the content [Bolin et al. 2005]. Chickenfoot
hides complex heuristics about how to extract some data based on nearby
text. For instance, retrieving the ISBN from Amazon pages is expressed as “after(text
isbn-10)”. In this example, “text” is an HTML type, whereas “isbn-10” is a literal.
Functions “after” and “before” are available to retrieve the content of the node that
follows/precedes the node identified by this expression. This shelters Chickenfoot
scripts from changes in the structure of the page, though not from changes in the
page’s content.
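The difference between a structural locator and a text-anchored one can be illustrated with a small sketch. Python's xml.etree.ElementTree stands in for the browser DOM here; the two page versions, the function names, and the way “after” is approximated (the next element in document order) are assumptions made for illustration and do not reproduce Chickenfoot's actual heuristics.

```python
import xml.etree.ElementTree as ET

# Two hypothetical versions of the same product page. The second one wraps the
# details in an extra <div> and reorders the items, as a host upgrade might do.
PAGE_V1 = """
<html><body>
  <ul>
    <li><b>Publisher:</b> <span>ACM</span></li>
    <li><b>ISBN-10:</b> <span>0123456789</span></li>
  </ul>
</body></html>
"""
PAGE_V2 = """
<html><body>
  <div class="details">
    <p>Now with free shipping</p>
    <ul>
      <li><b>ISBN-10:</b> <span>0123456789</span></li>
      <li><b>Publisher:</b> <span>ACM</span></li>
    </ul>
  </div>
</body></html>
"""


def structural_locator(root):
    """Positional path: <span> in the second <li> of the <ul> directly under <body>."""
    node = root.find("./body/ul/li[2]/span")
    return node.text if node is not None else None


def text_anchored_locator(root, label):
    """Return the text of the element that follows the one whose text starts with `label`."""
    elements = list(root.iter())  # elements in document order
    for position, element in enumerate(elements):
        text = (element.text or "").strip().lower()
        if text.startswith(label.lower()) and position + 1 < len(elements):
            return elements[position + 1].text
    return None


v1 = ET.fromstring(PAGE_V1.strip())
v2 = ET.fromstring(PAGE_V2.strip())
```

In this sketch the structural locator works on the first version only, whereas the text-anchored one survives the restructuring. Renaming the label itself (say, from “ISBN-10:” to “ISBN:”) would still break it, which mirrors the remark that such locators are not sheltered from changes in the page's content.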
Alternatively, curative approaches monitor the host for changes and automatically
attempt to reconstruct broken locators. Approaches include schema-guided mainte-
nance [Meng et al. 2003] and automatic adaptation [Raposo et al. 2007; Ferrara and
Baumgartner 2011]. The former is based on the assumption that changes in web pages
often preserve syntactic features (e.g., data patterns, string lengths, etc.), hyperlinks,
and annotations (i.e., metadata about the semantic meaning of an HTML element us-
ing, e.g., CSS classes or microformats). On these grounds, the augmenter includes a
module, the maintainer, which checks for any potential extraction issue and provides an
automatic repairing protocol based on the aforementioned assumptions. The repairing
protocol might be successful, and in that case, the data extraction continues. Otherwise,
warnings and notifications are raised. On the other hand, automatic adaptation relies on
the augmenter keeping helpful structural hints (a.k.a. “signatures” in Lixto [Ferrara
and Baumgartner 2011]). At enactment time, the augmenter first checks whether the
current DOM is compliant with these hints and, if not, uses these hints to reconstruct
the pattern. All in all, no conclusive solution exists yet.
18 https://addons.mozilla.org/firefox/addon/adbar.
19 https://addons.mozilla.org/firefox/addon/wikalong.
20 https://addons.mozilla.org/firefox/addon/keywordselection.
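A curative locator of the kind outlined above can be sketched as follows. This is a simplified illustration under stated assumptions, not the maintainer of Meng et al. or the Lixto mechanism: the only hint kept is a syntactic feature of the data (an ISBN-10 pattern), Python's xml.etree.ElementTree stands in for the DOM, and the class and function names are hypothetical.

```python
import re
import warnings
import xml.etree.ElementTree as ET

# Signature: a syntactic feature (ISBN-10 data pattern) assumed to survive host upgrades.
ISBN10 = re.compile(r"^\d{9}[\dX]$")


def path_to(root, target):
    """Build a positional path from `root` to `target` (ElementTree keeps no parent links)."""
    if target is root:
        return "."
    parents = {child: parent for parent in root.iter() for child in parent}
    steps, node = [], target
    while node is not root:
        parent = parents[node]
        same_tag = [child for child in parent if child.tag == node.tag]
        steps.append(f"{node.tag}[{same_tag.index(node) + 1}]")
        node = parent
    return "./" + "/".join(reversed(steps))


class SelfRepairingLocator:
    def __init__(self, path, signature):
        self.path = path            # current structural locator
        self.signature = signature  # hint used to check and repair it

    def _complies(self, node):
        return node is not None and bool(self.signature.match((node.text or "").strip()))

    def locate(self, root):
        node = root.find(self.path)
        if not self._complies(node):
            # Locator broken: look for the single node that still matches the signature.
            candidates = [element for element in root.iter() if self._complies(element)]
            if len(candidates) != 1:
                warnings.warn("locator broken and not repairable automatically")
                return None
            node = candidates[0]
            self.path = path_to(root, node)  # keep the repaired locator for later visits
        return node.text.strip()


old_page = ET.fromstring(
    "<html><body><ul><li><span>ACM</span></li><li><span>0123456789</span></li></ul></body></html>")
new_page = ET.fromstring(
    "<html><body><div><ul><li><span>0123456789</span></li><li><span>ACM</span></li></ul></div></body></html>")
locator = SelfRepairingLocator("./body/ul/li[2]/span", ISBN10)
```

When zero or several nodes match the signature, the sketch gives up and emits a warning, which corresponds to the case where the repairing protocol fails and notifications are raised.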
6.2. Transferability
Transferability refers to the degree to which a system or component can be effectively
and efficiently transferred from one hardware, software, or other operational or usage
environment to another. ISO includes here aspects such as portability and installability.
6.2.1. Portability. Augmenters are hindered by browser differences in extension support.21
That is, an augmenter developed for Mozilla Firefox will require some adaptation
before being deployed in Internet Explorer. Portability is important due to the loyalty of
users to their browsers. Indeed, an augmenter that is not available for the user’s browser
imposes additional time and mental effort, which will most likely lead the user to reject
the augmenter in the first place.
Browser differences stem from (1) the APIs available for extension programming,
(2) the extension model, (3) the package internal structure, or (4) the package format. A
detailed description of these differences for Mozilla Firefox, Google Chrome, and Opera
can be found in Parashuram [2011]. A standard for extensions holds the promise of
easier development for programmers, as they would not have to delve into the specifics
of each browser. Encouragingly, Opera’s chief technology officer stated his belief
that extensions “are ripe for standardization” [Shankland 2010].
The lightweight alternative to extensions (i.e., userscripts) presents far fewer portability
problems. Broadly, userscripts are limited to JavaScript and the DOM API, both
being standardized technologies (JavaScript as ECMAScript by Ecma International, the
DOM by the W3C). HTML5 is leveraging these technologies,
reducing the dependency on the browser agent and, hence, increasing the number
of augmenters that can be realized as userscripts. However, most browsers do not yet
fully support HTML5 (refer to http://caniuse.com/ for the current state of practice).
This situation is described in Bigham and Ladner [2007].
It will take some time for HTML5 to be deployed in users’ browsers. For the time
being, developers can resort to browser polyfills [Sharp 2010]. Polyfills are libraries
that offer a compatibility layer on top of the browser API. Browser holes (i.e., HTML5
functions not yet supported) are filled up through polyfills so that one can code without
coping with what the browser lacks. When the browser is upgraded, one only needs to
remove the polyfills from his or her code, which then invokes the browser APIs directly
(as a matter of fact, polyfills inhibit themselves once the browser is upgraded).
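The self-inhibiting behavior of polyfills can be illustrated with a small sketch. Real polyfills are JavaScript libraries that test for the presence of a browser feature before defining it; the Python code below only mimics that pattern, and the Browser class, the “includes” feature, and the function names are hypothetical stand-ins.

```python
class Browser:
    """Hypothetical stand-in for a browser's global object (e.g., `window` in JavaScript)."""

    def __init__(self, native_api=None):
        self.api = dict(native_api or {})


def install_polyfill(browser, name, fallback):
    """Fill a browser hole: register `fallback` only if `name` is not natively supported."""
    if name in browser.api:
        return False  # native support present, so the polyfill inhibits itself
    browser.api[name] = fallback
    return True


def includes_fallback(items, value):
    """Fallback implementation used when the browser lacks a native `includes`."""
    return any(item == value for item in items)


def augmenter(browser, prices):
    # Augmenter code always calls through the same name, native or polyfilled.
    return browser.api["includes"](prices, 10)


old_browser = Browser()  # lacks the feature
new_browser = Browser({"includes": lambda items, value: value in items})
installed_on_old = install_polyfill(old_browser, "includes", includes_fallback)
installed_on_new = install_polyfill(new_browser, "includes", includes_fallback)
```

The augmenter code is identical for both browsers; only the old one ends up running the fallback.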
In addition to facing portability via standardization, other approaches offer interim
solutions. Kango22 provides a driver-like solution. Developers create extensions on top
of Kango’s proprietary API. Kango “drivers” are then available for Google Chrome, Safari,
Mozilla Firefox, Internet Explorer, and Opera. Alternatively, TOMODO23 uses
a proxy-based solution to bypass portability. Browser-agnostic programming takes
place in TOMODO, where the augmenter is addressed by a TOMODO-generated
URL. Requesting this URL from any browser returns the host page
already augmented at the proxy. Notice that this leads to the real web and the augmented
web being addressed through different URLs (e.g., http://xkcd.tomodo.me versus
http://xkcd.com).
21 We do not consider here portability among different devices. WA has also been proposed for context-aware
web content adaptation [Laakko 2008]. Compared with proxy-based transcoding, WA benefits from direct
access to information on the device’s capabilities.
22 http://kangoextensions.com/.
23 http://tomodo.com. TOMODO is in the process of moving to a new domain: https://convertifire.com/.
6.2.2. Installability. Installability has to do with the ease with which something can
be successfully installed in its production environment. When talking about WA, we
frequently refer to concepts such as the personal web, end-user empowerment, do it
yourself, and the like. In this setting, where users are on their own, installability
becomes the main issue. The challenges involved in identifying, installing, using, or
administering augmenters have been identified by some authors as the main WA down-
side: “browser extensions are not without their downsides. The largest of these is the
requirement that users download and install some software. For experienced computer
users, this may not be a problem. For inexperienced users, however, even seemingly
simple downloads and installations can be daunting” [Hanson et al. 2008]. Notice that
this is in contrast with traditional personalization techniques that require almost no
user involvement.
When users are on their own, identifying the right augmenter becomes a paramount
issue. Similar to mobile apps, augmenter identification is achieved via WA repositories.24
These repositories provide search facilities to locate the desired artifact as well
as comments and rankings to help identify the most appropriate one. However, as observed
in Jackson et al. [2011], download counts and ratings are not available until after a program
has been downloaded and/or rated, and few programs are ever rated by these repositories’
users. Scaffidi et al. [2010] observed that a typical repository “might contain thousands
of these programs, but as few as 10% of these might be of value to the community at
large.” This leads the authors to propose a machine-learning model for distinguishing
between high-value and low-value scripts [Jackson et al. 2011]. Departing from the
push model with users directly accessing augmenter repositories, a pull approach is
proposed in Krulewitz and Vold [2009]. Here, users are made aware of userscripts
available for the current page. Awareness is achieved by highlighting the Greasemonkey
icon. Users can right-click this icon to see and install the scripts available for the
current page. Once the script is installed, maintenance begins. The relatively small
size of augmenters makes fix packs unnecessary. Rather, maintenance
implies reinstalling the upgraded augmenter. By default, this is automatically achieved
through weavers/browsers.
6.3. Compatibility
Compatibility refers to the degree to which two or more systems or components can
exchange information and/or perform their required functions while sharing the same
hardware or software environment. For WA, the environment to be shared is the DOM
tree, and it is shared with the companion augmenters and the host.
6.3.1. Companion-Augmenter Compatibility. This refers to augmenters targeting the same
host. The very same web page can be subject to different augmenters. This problem
was identified as early as 1999: “many of the tools modify Web pages, and most of the
implementations of this functionality would have a hard time interoperating with each
other, as they in turn would modify pages and links, quite possibly corrupting each
other’s data” [Bouvin 1999]. As an example, consider Amazon. To date, 268 augmenter
scripts are reported to be available for Amazon at userscripts.org. A regular Amazon
visitor might have several augmenters installed. These augmenters will be enacted
simultaneously when one visits Amazon. It is important to notice that augmenter exe-
cution is not in parallel but in sequence; that is, augmenters are launched in the order
in which they were installed. This implies that the first augmenter acts on the original
DOM tree, the second augmenter sees the DOM tree as already updated by the first
augmenter, and so on. The problem is that programmers develop augmenters for the
original DOM, being unaware of changes conducted by other companion augmenters.
This can turn into a real nightmare when code developed by different authors with
different aims is mixed together with unforeseen results. Even worse, the final DOM
tree can depend on the order in which augmenters are enacted! This problem is
reported in Garrido et al. [2013].
24 Browsers keep their own extension repositories (e.g., Mozilla Add-ons, Chrome Web Store). For userscripts,
specific community websites are created: https://www.userscripts.org/, https://openuserjs.org/,
https://greasyfork.org/ or https://monkeyguts.com/.
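The order dependence can be illustrated with a minimal sketch. It assumes a page reduced to a flat list of nodes and two hypothetical augmenters (the names addBanner and emphasizeTitle are invented here); no real DOM or weaver is involved.

```javascript
// Simplified stand-in for a page: a flat list of nodes instead of a DOM tree.
function makePage() {
  return [
    { tag: "h1", text: "Some Book", bold: false },
    { tag: "p", text: "Price: $20", bold: false },
  ];
}

// Hypothetical augmenter A: inserts a banner at the top of the page.
function addBanner(page) {
  return [{ tag: "div", text: "Cheaper elsewhere", bold: false }, ...page];
}

// Hypothetical augmenter B: emphasizes the first node. Its author assumed
// the first node is the title, which only holds for the original page.
function emphasizeTitle(page) {
  return page.map((node, i) => (i === 0 ? { ...node, bold: true } : node));
}

// Augmenters run in sequence: each one receives the output of the previous one.
function runInOrder(augmenters) {
  return augmenters.reduce((page, augment) => augment(page), makePage());
}

const boldTexts = (page) => page.filter((n) => n.bold).map((n) => n.text);

const aThenB = runInOrder([addBanner, emphasizeTitle]);
const bThenA = runInOrder([emphasizeTitle, addBanner]);

console.log(boldTexts(aThenB)); // the banner ends up emphasized
console.log(boldTexts(bThenA)); // the title ends up emphasized
```

Both runs apply the same two augmenters to the same page, yet the resulting pages differ because emphasizeTitle was written against the original structure.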
A first strategy to shelter from changes in the environment is the use of mediators.
We explored this strategy by introducing “scripting interfaces” [Díaz et al. 2010].
Scripting interfaces abstract the host in terms of their main entities (e.g., Amazon
pages’ content is described in terms of books, authors, comments, etc.), while entity
manipulation is conducted through “conceptual events” (e.g., addBook). The interface
hides how conceptual events are mapped into low-level DOM events. By being event
based, scripting interfaces ensure a smooth transition for JavaScript programmers.
That is, scripts subscribe to or notify of conceptual events in the same way they did
before for DOM events. There is no change in the programming model. However, scripts
are now sheltered from changes in the host. Ideally, the impact of host upgrades should
be limited to the scripting interface. This work also suggests that the scripting interface
be provided by the host itself as metalinks in the pages’ header [Díaz et al. 2008]. This
raises the question of how websites can facilitate third-party augmenter development.
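The idea of a scripting interface can be sketched as follows. This is an illustration under stated assumptions, not the implementation of the cited work: the class ScriptingInterface, the node shapes, and the two mapping functions are hypothetical, and only the conceptual event name addBook comes from the text.

```javascript
// Hypothetical scripting interface. Only this object knows how a conceptual
// event is derived from the host's low-level page structure.
class ScriptingInterface {
  constructor(nodeToBook) {
    this.nodeToBook = nodeToBook; // host-specific mapping, rewritten on host upgrades
    this.listeners = new Map();
  }

  // Scripts subscribe to conceptual events as they would to DOM events.
  addEventListener(type, handler) {
    if (!this.listeners.has(type)) this.listeners.set(type, []);
    this.listeners.get(type).push(handler);
  }

  // Called with a low-level node whenever the host adds one to the page.
  onNodeInserted(node) {
    const book = this.nodeToBook(node);
    if (book === null) return; // not a book: no conceptual event is raised
    for (const handler of this.listeners.get("addBook") || []) handler(book);
  }
}

// Mapping for the current host layout (assumed for illustration).
const v1Mapping = (node) =>
  node.cls === "bookRow" ? { title: node.children[0], author: node.children[1] } : null;

// Mapping after a hypothetical host upgrade: only this function changes.
const v2Mapping = (node) =>
  node.cls === "item-book" ? { title: node.attrs.title, author: node.attrs.author } : null;

// The augmenter is written once, against the conceptual event only.
function installAugmenter(iface, seen) {
  iface.addEventListener("addBook", (book) => seen.push(`${book.title} by ${book.author}`));
}

const seenV1 = [];
const ifaceV1 = new ScriptingInterface(v1Mapping);
installAugmenter(ifaceV1, seenV1);
ifaceV1.onNodeInserted({ cls: "bookRow", children: ["Dune", "F. Herbert"] });

const seenV2 = [];
const ifaceV2 = new ScriptingInterface(v2Mapping);
installAugmenter(ifaceV2, seenV2);
ifaceV2.onNodeInserted({ cls: "item-book", attrs: { title: "Dune", author: "F. Herbert" } });

console.log(seenV1, seenV2);
```

The augmenter code is identical for both host versions; the impact of the upgrade is confined to the mapping held by the interface.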
6.3.2. Host Compatibility. Most web designers are mainly concerned with how content
is presented on screen, rather than its structure and meaning. The target audience is
readers. However, WA introduces another kind of audience for HTML pages: augmenter
developers. If usability is about caring for your readers, “augmentability” is about
cherishing your augmenter developers. This section addresses the question of how to
improve site augmentability.
First, augmentability can be easily improved by using consistent CSS classes. For
instance, in Harper et al. [2006], the visual rendering and code structure of the DOM
document and CSS are used to derive an annotation based on the author’s ontology. This
annotation is later used for WA, specifically to reorganize the website’s content. A step
forward is when websites themselves provide the annotation. Annotation techniques
such as those pioneered by microformats [Khare and Çelik 2006] and, next, leveraged
by the semantic web [Berners-Lee et al. 2001] are facilitating the consumption of web
data by programs, augmenters being a case in point. The use of standardized tags/terms
to describe what the content stands for (e.g., a date, an ISBN, an address) certainly
eases accessing this data. In the same vein, the use of HTML5 instead of Adobe
Flash also favors WA [Wikipedia 2013]. This standard includes the <video>, <audio>,
and <canvas> elements, as well as the integration of scalable vector graphics (SVG)
content, and, in so doing, handles multimedia and graphical content on the web without
having to resort to proprietary plugins and APIs.
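A small sketch may clarify why consistent, meaningful class names help augmenters. It models page elements as plain objects; the class names (title, isbn, promo, wrapper) and both layouts are assumptions made for illustration and do not correspond to any particular microformat.

```javascript
// Minimal tree node helper standing in for DOM elements.
const el = (cls, text, children = []) => ({ cls, text, children });

// Depth-first search for the first node carrying a given class.
function findByClass(node, cls) {
  if (node.cls === cls) return node;
  for (const child of node.children) {
    const hit = findByClass(child, cls);
    if (hit) return hit;
  }
  return null;
}

// Positional lookup, comparable to an absolute path of child indexes.
function findByPath(node, path) {
  return path.reduce((cur, i) => (cur ? cur.children[i] : undefined), node) || null;
}

// Original layout: the ISBN is the second child of the root.
const v1 = el("root", "", [el("title", "Dune"), el("isbn", "978-0441172719")]);

// Redesigned layout: a promo block was added and the content was wrapped.
const v2 = el("root", "", [
  el("promo", "Sale!"),
  el("wrapper", "", [el("title", "Dune"), el("isbn", "978-0441172719")]),
]);

console.log(findByPath(v1, [1]).cls); // positional lookup works on the old layout
console.log(findByPath(v2, [1]).cls); // and silently returns the wrong node on the new one
console.log(findByClass(v1, "isbn").text === findByClass(v2, "isbn").text);
```

The class-based lookup survives the redesign because it depends on what the content stands for rather than on where it sits in the tree.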
However, augmentability (i.e., the degree to which a host is amenable to WA) is not
so much a technical aspect but a “state of mind.” Though the existence of standards
can eventually facilitate WA development, the fact of making a website augmentable
very much depends on how website owners perceive augmentation: as an opportunity
or as a threat. As an opportunity, WA offers a means for customer cocreation [Witell
et al. 2011]. From this perspective, WA can be regarded as a twist of the Web 2.0
mandate: design for hackability and remixability. Current society is characterized by
user participation. Companies aware of this shift can tap their customers to develop
new products and services. For instance, game companies develop tools for enabling
game mod development by their user communities. The Bazaar.tf example in Section 3
illustrates this kind of win-win relationship that can be established between WA developers
and website owners. A pioneer case is that of GMail. As an alternative to HTML
scraping, GMail promotes WA through an API that provides accessor methods for
getting common screen elements. In this way, augmenter functions can be enacted
through this API’s callbacks when specific events occur [Parparita 2012], and hence,
augmenters no longer need to resort to XPath to get screen elements.
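The following sketch shows the general shape of such a host-provided API: accessor methods for screen elements plus callbacks fired on host events. All names (HostApi, getElement, onViewChange, switchView) are hypothetical and chosen for illustration; they are not GMail's actual API.

```javascript
// Hypothetical host-side API; the names are illustrative only.
class HostApi {
  constructor() {
    this.elements = { navPane: { items: ["Inbox", "Sent"] } };
    this.viewCallbacks = [];
    this.activeView = "threadList";
  }

  // Accessor for a common screen element, so augmenters need no XPath.
  getElement(name) {
    return this.elements[name];
  }

  // Augmenters register a callback instead of polling the page.
  onViewChange(callback) {
    this.viewCallbacks.push(callback);
  }

  // Invoked by the host itself whenever the user navigates.
  switchView(view) {
    this.activeView = view;
    this.viewCallbacks.forEach((callback) => callback(view));
  }
}

const api = new HostApi();

// Augmenter: adds an entry to the navigation pane when a conversation opens.
api.onViewChange((view) => {
  const nav = api.getElement("navPane");
  if (view === "conversation" && !nav.items.includes("Related docs")) {
    nav.items.push("Related docs");
  }
});

api.switchView("conversation");
api.switchView("threadList");
api.switchView("conversation");

console.log(api.getElement("navPane").items);
```

Because the augmenter only depends on the accessor and the callback, the host remains free to change its markup without breaking it.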
The latter raises the question of whether augmentation achieved with the help of the
host is still WA. In its comparison with augmented reality, the term “augmentation”
seems to suggest that the addition is achieved without the involvement of the subject
being augmented, whether it be the reality or the existing web. Websites are unaware of
being augmented. Otherwise, we are moving toward “web appropriation,” where web-
sites are designed to accommodate, support, and encourage users to tune the website
within the scope of their situated activity. This also looks like a good idea. But both the
challenges and the actors are different. In web appropriation, the burden falls on the
side of the website designer to make “an architecture of participation” that encourages
users to add value to the website in terms of augmenters [Arellano et al. 2012].
But a less friendly face also exists. WA can also be a threat to websites’ business
models. Filman wrote an editorial note stating that “Greasemonkey and DVRs both
have the potential to distort the economic model underlying the way the technology
is used. . . . Faced with the Greasemonkey threat of removing the parts of Web pages
that pay the bills, we can expect similar responses from Web page creators. . . . Tech-
nologically, we’ll have more complex presentation mechanisms (for example, mixing
the advertising with the context in executable form, or Web pages that check to see
that they haven’t been modified before they’ll present their information) . . ., frequent
changes of Web page formats to catch naïve scripts, and more complex intermixing
of desired information with the economically sustaining information” [Filman 2006].
This arms race has already begun, at least for ad blocking. The first counterattack
is awareness, that is, sensitizing users about the importance of online advertisement
to sustain a free web. An experiment was conducted to measure its effectiveness: an
appeal was shown to ad-block users, asking them to re-enable the ads or to make a small
donation [PageFair 2014a]. The experiment ran 576 appeals on 220 different websites.
Only 0.33% of the ad-block users who were shown an appeal added an ad-block exception. Only
three users per million who were given the option to make a donation did so! It comes
as no surprise that some websites resort to a second strategy: refusing to work until the
user disables ad-blocking extensions. Outside the advertising industry, WIRED reported
the case of an augmenter that inlays a button into Amazon music and video pages,
linking to the website The Pirate Bay, where the merchandise can be downloaded for
free [Kravets 2008]. The site hosting the extension was reportedly taken down after
being sued. Nevertheless, WIRED concluded with an open ending: “while the website
hosting the add-on is down, the add-on is still available via TorrentFreak.” The ques-
tion still remains: how can existing business models harness the user empowerment
brought by WA?
6.4. Security
Security refers to the degree of protection of information and data so that unauthorized
persons or systems cannot read or modify them. A distinctive aspect of WA is that it
opens development to third parties. Third parties can be entrenched companies but also
amateur programmers. Amateurism is good since it provides the flexibility, willingness,
and awareness to satisfy minority needs that would otherwise fall outside the interest
of more business-oriented teams. But amateurism, or rather the consumption of
amateur-provided solutions, is not without risk. First, amateur programmers tend to focus
on functionality, leaving out other aspects like input validation. This means that if
a malicious website identifies the existence of a vulnerable extension, it can exploit
this vulnerability to bypass the security of the browser. Second, amateur
programmers might be approached by malicious entities to take over their widely
consumed augmenters so that they can inject malware into subsequent augmenter
upgrades. The next paragraphs delve into these issues.
Security leaks. These partly stem from extensions having the same privileges as
those of the browser (i.e., access to user preferences, user passwords, filesystem, etc.)
while such extensive privileges are hardly required. Different studies corroborate this
[Barth et al. 2010; Liverani and Freeman 2009]. Specifically, Barth et al. [2010], after
reviewing 25 popular Mozilla Firefox extensions, found that 88% of these extensions
need less than the full set of available privileges. These objections might also apply
to augmenters. The only clear example we found that required additional privileges
was the Skype extension that accesses the Skype local application.25 This limited
use of browser privileges argues against granting full privileges by default. It has led
to proposals for new browser extension systems that improve security through least
privilege, privilege separation, and strong isolation [Felt 2010]. We believe future
browsers will reduce the severity of extension exploits by narrowing the privilege gap
between what is needed and what is granted. This trend is already illustrated by Google
Chrome’s static privilege system, where extensions have to enumerate the privileges
they require rather than being given full access by default.26
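The notion of a privilege gap can be made concrete with a short sketch. The privilege names and the two grant models below are assumptions for illustration; they are not an actual browser's permission list.

```javascript
// Privileges that a full-trust extension model grants implicitly
// (an illustrative list, not any browser's real one).
const FULL_PRIVILEGES = ["dom", "history", "passwords", "filesystem", "network"];

// Privilege gap: what is granted but never needed.
function privilegeGap(granted, needed) {
  const neededSet = new Set(needed);
  return granted.filter((privilege) => !neededSet.has(privilege));
}

// Under an enumerated model, a request outside the declared list is refused.
function isAllowed(declared, privilege) {
  return declared.includes(privilege);
}

// A hypothetical augmenter that only rewrites the page and fetches prices.
const needed = ["dom", "network"];
const declared = ["dom", "network"]; // enumerated up front by the developer

console.log(privilegeGap(FULL_PRIVILEGES, needed)); // gap under full trust
console.log(privilegeGap(declared, needed));        // gap under enumeration
console.log(isAllowed(declared, "passwords"));      // exploit attempt is refused
```

Under full trust, a compromised augmenter inherits every privilege in the gap; under enumeration, the damage is bounded by what was declared.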
Malicious augmenters. In January 2014, the Wall Street Journal reported how
owners of popular extensions had been offered money to incorporate adware code into
their augmenters [Winkler 2014]. The issue is that even for augmenters downloaded
from browser stores, there is no guarantee that an initially “safe” augmenter won’t become
malicious in subsequent upgrades. Unwary users might update their augmenters
(even worse, updates might happen automatically) without being aware that now their
browsing habits might start being tracked. How can these abuses be countered? First, the
community can flag extensions that are malicious (e.g., Chrome’s “Report Abuse” button).
This might prevent additional brand-new downloads, but these warnings might
go unnoticed for existing installations. Second, browser vendors should demand
more transparency on multiple fronts: the information being collected,27 the
developers’ credentials (so far, no more than a support page and an email are required),
and even ownership transfers. As recently as February 2015, Firefox followed
Chrome’s steps in requiring most extensions to be submitted for signing to check
compliance with Firefox guidelines.28 A few days later, 310 responses heavily contested
this measure as penalizing a free Internet. Third, users themselves should be made aware
of the risks and be offered affordable means for extension management. In this vein,
Chrome’s “Reset Browser Settings” button permits users to easily get settings back to
default in case malicious software has managed to hijack the browser. All in all, this
recent upheaval about extension malware is also a symptom of maturity in terms of
the number of offerings and beneficiaries reached by browser extensions. These figures
are becoming large enough to deserve the attention of the bad guys.
25 We can also envisage this need for augmenters that might benefit from accessing the browsing history.
For instance, an augmenter for an online newspaper can tap into the browsing history to highlight those
sections previously visited, perhaps offering additional links to related stories in other newspapers. We are
not aware of any augmenter providing this functionality.
26 In addition, the Google Chrome extension model forces extensions to be split into two components, the
content scripts and the parent extension (http://developer.chrome.com/extensions/content_scripts.html). Content
scripts can access the DOM but have no access to the browser internal API. For parent extensions, it works
the other way around. These two components communicate by exchanging messages. This architecture makes
it harder to bypass security: the malware needs to deceive the content script to reach the parent extension
and use its privileges.
27 https://developer.chrome.com/webstore/program_policies.
28 https://blog.mozilla.org/addons/2015/02/10/extension-signing-safer-experience/.
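The split between content script and parent extension described in footnote 26 can be sketched as below. This is a simulation in plain JavaScript under stated assumptions: the classes, the message format (type "wasVisited"), and the channel function are hypothetical and do not use the real Chrome extension API. The browsing-history scenario follows the one envisaged in footnote 25.

```javascript
// Parent extension: holds the privilege (browsing history) but never sees the page.
class ParentExtension {
  constructor(history) {
    this.history = history;
  }

  // Single entry point; each message is validated before any privilege is used.
  handleMessage(msg) {
    if (msg && msg.type === "wasVisited" && typeof msg.url === "string") {
      return { ok: true, visited: this.history.includes(msg.url) };
    }
    return { ok: false }; // unknown or malformed requests are refused
  }
}

// Content script: reads the page but reaches privileges only through messages.
class ContentScript {
  constructor(send) {
    this.send = send;
  }

  // Marks which links of the page were visited before.
  annotate(links) {
    return links.map((url) => {
      const reply = this.send({ type: "wasVisited", url });
      return { url, visited: reply.ok && reply.visited };
    });
  }
}

const parent = new ParentExtension(["https://news.example/politics"]);

// Messages are copied, not shared, so neither side can reach into the other.
const channel = (msg) => parent.handleMessage(JSON.parse(JSON.stringify(msg)));

const script = new ContentScript(channel);
const annotated = script.annotate([
  "https://news.example/politics",
  "https://news.example/sports",
]);
console.log(annotated);

// A compromised page script cannot simply ask for the whole history.
console.log(channel({ type: "dumpHistory" }));
```

The narrow, validated message interface is what limits the reach of an attacker who controls only the page side.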
6.5. Efficiency
Efficiency refers to the relationship between the level of performance of the software
and the amount of resources used, under stated conditions. This includes aspects such
as time behavior and resource utilization. WA, understood as a client-side technique,
should carefully reflect on the consumption of both time and memory. This concern has
been mainly highlighted by web annotation systems that are consumption intensive
[Salomoni et al. 2008]. First, annotations solely kept on the client without server backup
might face storage scalability problems. Second, in-context placement of annotation
within the current page might be time consuming for heavily annotated pages. This
serves to illustrate the limitations of WA due to its client-side nature. Neither
processing- nor storage-demanding functionalities are good candidates for being turned into
augmenters.
7. CONCLUSIONS
This article characterizes web augmentation as a personalization technique but now
conducted on the browser side by third parties. By elaborating on the most popular
45 augmenters, we feature WA in terms of the “what,” the “why,” and the “what for.”
Specifically, two facts make WA depart from traditional web development: (1) WA does
not start in a vacuum but is framed within an existing website, and (2) WA tends to
be conducted by third parties. Looking at the impact of those specifics, we conduct a
systematic literature review to get the state of affairs in achieving quality augmenters.
Though the opportunities brought by WA are large, so are the challenges ahead: website
upgrade resilience, interbrowser portability, smooth augmenter coexistence, improving
security and trustworthiness, ways for websites to harness the user empowerment
brought by WA, and so forth. By shedding light on those issues, we hope to move WA
onto the research agenda.
We believe this research is important for WA to become a main enabler of the personal
web and, hence, to fulfill one of the tenets of the web: giving the agent consumer final
control over visual presentation and user interaction. Even at the expense of some
cases of malware, we believe WA has the potential to benefit all: website owners might
tap hobby programmers to fulfill some minority requirements; third-party companies
can benefit from the WA marketplace that popular websites can create; and, most
important, end-users can find in WA the conduit toward “personalization on demand,”
making the daily work experience of millions of web users more rewarding.
APPENDIX
A. AUGMENTATION-BASED EXTENSIONS
Table IV. Most-Used Augmentation-Based Extensions as of January 2013. Column “Users” Stands for Users
Currently Having the Extension Installed. The Figure Aggregates the Numbers from Mozilla Firefox plus Google
Chrome, should the Extension Be Available for Both. Column “Devs” Represents the Number of Developers as
Obtained from the install.rdf File. Translators Are Excluded.
ID  Name                     Users  Devs  Description
1   Stylish              2,033,853  4     Applies themes to multiple sites
2   WOT - Safe Surfing   1,473,364  N/A   Shows website trust information
3   Flashblock           1,195,181  8     Blocks Flash
4   Flagfox              1,180,880  2     Displays a country flag of the website’s, etc.
5   Clearly                653,229  N/A   Cleans blogs and articles, easing the reading
6   Amazon for Chrome      525,647  N/A   Shows Amazon information of current product
7   Text Link              438,572  2     Allows URI texts to be loaded by double click
8   SkipScreen             375,476  4     Skips unnecessary pages on download sites
9   InvisibleHand          315,889  N/A   Shows the lowest price of current product
10  ShopAtHome.com         294,636  N/A   Alerts about coupons at ShopAtHome.com
11  Download YouTube       284,392  1     Adds download links to Youtube videos
12  Auto Replay            261,948  N/A   Adds an AutoReplay button to Youtube player
13  SaveFrom.net helper    240,491  N/A   Helps in the download process of many sites
14  FlashControl           228,782  N/A   Removes Flash content from pages
15  NoSquint               211,266  1     Allows to set zoom and color per site or global
16  gTranslate             187,269  4     Allows to translate selection with two clicks
17  AutoPager              185,842  7     Loads next page when reaching the page’s end
18  Updated Ad Blocker     153,639  1     Blocks ads (AdSense, DoubleClick)
19  ShowIP                 116,596  1     Shows the IP address of the website
20  4chan Plus             115,481  N/A   Adds “hover image to enlarge” and more to 4chan
21  AdBlock Pirate Bay      95,059  N/A   Removes popup ads on The Pirate Bay
22  Image Properties        83,438  N/A   Shows the image’s properties with two clicks
23  Youtube HD              80,383  1     Changes default play quality to HD in Youtube
24  The Camelizer           77,224  3     Adds price history charts to shops
25  Preisspion              73,664  1     Notifies when product is cheaper in other shops
26  Low Quality Flash       66,586  1     Sets Flash objects to “Low Quality”
27  Quick Translator        64,921  1     Allows to translate selection with two clicks
28  Nice Tumblr             53,978  N/A   Applies a custom theme to Tumblr
29  WorldIP                 38,201  3     Adds information about the domain site
30  Puzzle for Chrome       36,129  N/A   Converts an image into a puzzle
31  Tube Enhancer Plus      35,885  1     Adds “download videos” and more to Youtube
32  Puzzle                  35,668  1     Converts an image into a puzzle
33  Mustachio               31,192  N/A   Adds moustaches to the faces of the images
34  FlashResizer            29,474  1     Makes Flash objects resizable
35  Auto HD                 25,790  N/A   Changes default play quality to HD in Youtube
36  EXIF Viewer             23,725  N/A   Shows EXIF information of an image
37  XJZ Survey Remover      21,652  N/A   Removes “XJZ Survey” ads
38  The Best Price          20,525  N/A   Shows price comparison the products
39  Oogle                   19,693  N/A   Highlights selected image at Google Images
40  Fabulous                13,187  N/A   Adds sound notifications to Facebook
41  Content by Zemanta      12,576  N/A   Adds Zemanta content to blog posts
42  Refresh Monkey          11,974  N/A   Refreshes a page at specified intervals
43  Carbon Footprint         9,841  N/A   Adds CO2 emissions for a route at GoogleMaps
44  SwiftPreview             9,264  N/A   Adds link preview at multiple sites
45  MangaOne                 5,535  N/A   Shows all pages of a manga in a single page
B. REVIEWED PAPERS
Table V. Relevant Research Contributions as a Result of Systematic Literature Review on Web Augmentation
(The Papers Are Classified Along ISO 25010 Quality Characteristics)
Issue            Paper
Maintainability  Kordomatis et al. [2013], Bigham and Ladner [2007], Borodin et al. [2010], Amershi
                 et al. [2013], Harper et al. [2006], Bolin and Miller [2005], Casteleyn et al. [2010],
                 Bouvin et al. [2002], Hupp and Miller [2007], Burel and Cano [2010], Kranzdorf
                 et al. [2012], Sellers [2011], Furche et al. [2011], Garrido et al. [2013], Khiem et al.
                 [2007], Li et al. [2011], Montoto et al. [2008], Losada et al. [2012], Montoto et al.
                 [2009a], Montoto et al. [2009b], Furche et al. [2013b], Akpinar and Yesilada [2013],
                 Furche et al. [2012], Furche et al. [2013a]
Portability      Bigham and Ladner [2007], Yesilada et al. [2007], Borodin et al. [2010], Hernández
                 [2009], Lunn et al. [2011], Hanson et al. [2008]
Installability   Trusty and Truong [2011], Hanson [2009], Hupp and Miller [2007], Puzis et al.
                 [2011], Parmanto et al. [2005], Díaz et al. [2013], Díaz [2012], Jackson et al. [2011],
                 Nguyen and Schumann [2013], Hanson et al. [2008], Nebeling and Norrie [2011],
                 Arellano and Díaz [2013]
Compatibility    Bouvin [1999], Borodin et al. [2010], Díaz et al. [2013], Lunn et al. [2008], Garrido
                 et al. [2013]
Security         Díaz et al. [2013], Arellano et al. [2010]
Efficiency       Salomoni et al. [2008]
ACKNOWLEDGMENTS
Thanks are due to Sergio Firmenich for his comments on an early version of this article.
REFERENCES
Mark S. Ackerman. 2000. The intellectual challenge of CSCW: The gap between social requirements and
technical feasibility. Human-Computer Interaction 15, 2 (Sept. 2000), 179–203. DOI:http://dx.doi.org/
10.1207/S15327051HCI1523_5
M. Elgin Akpinar and Yeliz Yesilada. 2013. Vision based page segmentation algorithm: Extended and per-
ceived success. In Proceedings of the 13th International Conference on Web Engineering (Workshops).
Springer, Berlin, 238–252. DOI:http://dx.doi.org/10.1007/978-3-319-04244-2_22
Saleema Amershi, Jalal Mahmud, Jeffrey Nichols, Tessa Lau, and German Attanasio Ruiz. 2013. LiveAction:
Automating web task model generation. ACM Transactions on Interactive Intelligent Systems 3, 3,
Article 14 (Oct. 2013), 23 pages. DOI:http://dx.doi.org/10.1145/2533670.2533672
Cristóbal Arellano and Oscar Díaz. 2013. Lightweight end-user software sharing. In Proceedings of
the 4th International Symposium on End-User Development (IS-EUD’13). Springer, Berlin, 241–246.
DOI:http://dx.doi.org/10.1007/978-3-642-38706-7_20
Cristóbal Arellano, Oscar Díaz, and Jon Iturrioz. 2010. Crowdsourced web augmentation: A security model.
In Proceedings of the 11th International Conference on Web Information Systems Engineering (WISE’10).
Cristóbal Arellano, Oscar Díaz, and Jon Iturrioz. 2012. Opening personalization to partners: An architecture
of participation for websites. In Proceedings of the 12th International Conference on Web Engineering
(ICWE’12). Springer, Berlin, 91–105. DOI:http://dx.doi.org/10.1007/978-3-642-31753-8_7
Chieko Asakawa and Hironobu Takagi. 2008. Transcoding. In Web Accessibility - A Foundation for Research.
Adam Barth, Adrienne Porter Felt, Prateek Saxena, and Aaron Boodman. 2010. Protecting browsers from ex-
tension vulnerabilities. In Proceedings of the 17th Network and Distributed System Security Symposium
(NDSS’10). The Internet Society.
Bazaar.tf. 2013. Greasemonkey scripts allowed or not? (April 2013). Retrieved April 2013 from http://bazaar.
tf/thread/1447.
Amaury Belin and Yannick Prié. 2012. DIAM: Towards a model for describing appropriation processes
through the evolution of digital artifacts. In Proceedings of 2012 ACM Conference on Designing Interac-
tive Systems (DIS’12). ACM, New York, NY, 645–654. DOI:http://dx.doi.org/10.1145/2317956.2318053
Tim Berners-Lee, James Hendler, and Ora Lassila. 2001. The semantic web. Scientific American Magazine
284, 5 (2001), 29–37.
Jeffrey P. Bigham and Richard E. Ladner. 2007. Accessmonkey: A collaborative scripting framework for web
users and developers. In Proceedings of the 2007 International Cross-Disciplinary Conference on Web
Accessibility (W4A’07). ACM, New York, NY, 25–34. DOI:http://dx.doi.org/10.1145/1243441.1243452
Ricardo Bilton. 2013. Google yanks Adblock Plus from Google Play, surprising nobody. (March 2013). Re-
trieved June 2014 from http://venturebeat.com/2013/03/13/adblock-plus-removed-google-play-store/.
Michael Bolin and Robert C. Miller. 2005. Naming page elements in end-user web automation. In Proceed-
ings of the 1st Workshop on End-user Software Engineering (WEUSE’05). ACM, New York, NY, 1–5.
DOI:http://dx.doi.org/10.1145/1082983.1083233
Michael Bolin, Matthew Webber, Philip Rha, Tom Wilson, and Robert C. Miller. 2005. Automation and
customization of rendered web pages. In Proceedings of the 18th Annual ACM Symposium on User Inter-
face Software and Technology (UIST’05). ACM, New York, NY, 163–172. DOI:http://dx.doi.org/10.1145/
1095034.1095062
Yevgen Borodin, Jeffrey P. Bigham, Glenn Dausch, and I. V. Ramakrishnan. 2010. More than meets the
eye: A survey of screen-reader browsing strategies. In Proceedings of the 7th International Cross-
Disciplinary Conference on Web Accessibility (W4A’10). ACM, New York, NY, Article 13, 10 pages.
Niels Olof Bouvin. 1999. Unifying strategies for web augmentation. In Proceedings of the 10th
ACM Conference on Hypertext and Hypermedia (HYPERTEXT’99). ACM, New York, NY, 91–100.
Niels Olof Bouvin. 2002. Augmenting the web through open hypermedia. The New Review of Hypermedia
and Multimedia 8 (July 2002), 3–25. DOI:http://dx.doi.org/10.1080/13614560208914733
Niels Olof Bouvin, Polle Zellweger, Kaj Grønbæk, and Jock D. Mackinlay. 2002. Fluid annotations through
open hypermedia: Using and extending emerging web standards. In Proceedings of the 11th International
World Wide Web Conference (WWW’02). ACM, New York, NY, 160–171. DOI:http://dx.doi.org/10.1145/
511446.511468
Peter Brusilovsky and Mark T. Maybury. 2002. From adaptive hypermedia to the adaptive web. Communi-
cations of the ACM 45, 5 (May 2002), 30–33. DOI:http://dx.doi.org/10.1145/506218.506239
Grégoire Burel and Amparo Elizabeth Cano. 2010. Understanding web documents using semantic overlays.
In Proceedings of the 15th International Conference on Intelligent User Interfaces (IUI’10). ACM, New
York, NY, 405–406. DOI:http://dx.doi.org/10.1145/1719970.1720044
Sven Casteleyn, William Van Woensel, and Olga De Troyer. 2010. Assisting mobile web users: Client-side
injection of context-sensitive cues into websites. In Proceedings of the 12th International Conference
on Information Integration and Web-Based Applications and Service (iiWAS’10). ACM, New York, NY,
443–450. DOI:http://dx.doi.org/10.1145/1967486.1967555
Lydia B. Chilton, Robert C. Miller, Greg Little, and Chen-Hsiang Yu. 2010. Why we customize the web. In No
Code Required: Giving Users Tools to Transform the Web. Morgan Kaufmann Publishers, San Francisco,
CA, 39–152.
Ibrahim Cingil, Asuman Dogac, and Ayca Azgin. 2000. A broader approach to personalization. Communica-
tions of the ACM 43, 8 (Aug. 2000), 136–141. DOI:http://dx.doi.org/10.1145/345124.345168
Andy Cockburn and Bruce J. McKenzie. 2001. What do web users do? An empirical analysis of web use.
International Journal of Human-Computer Studies 54, 6 (March 2001), 903–922. DOI:http://dx.doi.org/
10.1006/ijhc.2001.0459
Allen Cypher. 2012. Automating data entry for end users. In Proceedings of the 2012 IEEE Sympo-
sium on Visual Languages and Human-Centric Computing (VL/HCC’12). 23–30. DOI:http://dx.doi.org/
10.1109/VLHCC.2012.6344474
Florian Daniel and Maristella Matera. 2014. Mashups: Concepts, Models and Architectures. Springer, Berlin.
DOI:http://dx.doi.org/10.1007/978-3-642-55049-2
Oscar Díaz. 2012. Understanding web augmentation. In Proceedings of the 4th International Workshop
on Lightweight Integration on the Web (ComposableWeb’12). Springer, Berlin, 79–80. DOI:http://dx.doi.
org/10.1007/978-3-642-35623-0_8 Keynote
Oscar Díaz, Cristóbal Arellano, and Maider Azanza. 2013. A language for end-user web augmentation:
Caring for producers and consumers alike. ACM Transactions on the Web 7, 2, Article 9 (May 2013), 51
pages. DOI:http://dx.doi.org/10.1145/2460383.2460388
Oscar Díaz, Cristóbal Arellano, and Jon Iturrioz. 2008. Layman tuning of websites: Facing change resilience.
In Proceedings of the 17th International Conference on World Wide Web (WWW’08). ACM, New York, NY,
1127–1128. DOI:http://dx.doi.org/10.1145/1367497.1367689
Oscar Díaz, Cristóbal Arellano, and Jon Iturrioz. 2010. Interfaces for scripting: Making greasemonkey scripts
resilient to website upgrades. In Proceedings of the 10th International Conference on Web Engineering
Paul Dourish. 2003. The appropriation of interactive technologies: Some lessons from placeless docu-
ments. Computer Supported Cooperative Work 12, 4 (Sept. 2003), 465–490. DOI:http://dx.doi.org/10.1023/
A:1026149119426
Adrienne Porter Felt. 2010. Least Privilege for Browser Extensions. Master’s thesis. University of California,
Berkeley.
Emilio Ferrara and Robert Baumgartner. 2011. Intelligent self-repairable web wrappers. In Proceedings
of the 12th Conference of the Italian Association for Artificial Intelligence (AI*IA’11). Springer, Berlin,
274–285. DOI:http://dx.doi.org/10.1007/978-3-642-23954-0_26
Robert E. Filman. 2006. From the editor in chief: Taking back the web. IEEE Internet Computing 10, 1
(2006), 3–5. DOI:http://dx.doi.org/10.1109/MIC.2006.6
Sergio Firmenich, Gustavo Rossi, and Marco Winckler. 2013. A domain specific language for orchestrating
user tasks whilst navigation web sites. In Proceedings of the 13th International Conference on Web
Engineering (ICWE’13). Springer, Berlin, 224–232. DOI:http://dx.doi.org/10.1007/978-3-642-39200-9_20
Tim Furche, Georg Gottlob, Giovanni Grasso, Xiaonan Guo, Giorgio Orsi, and Christian Schallhart. 2011.
Real understanding of real estate forms. In Proceedings of the 1st International Conference on Web Intelli-
gence, Mining and Semantics (WIMS’11). ACM, New York, NY, Article 13, 12 pages. DOI:http://dx.doi.org/
10.1145/1988688.1988704
Tim Furche, Georg Gottlob, Giovanni Grasso, Xiaonan Guo, Giorgio Orsi, and Christian Schallhart. 2013a.
The ontological key: Automatically understanding and integrating forms to access the deep Web. VLDB
Journal 22, 5 (2013), 615–640. DOI:http://dx.doi.org/10.1007/s00778-013-0323-0
Tim Furche, Georg Gottlob, Giovanni Grasso, Christian Schallhart, and Andrew Jon Sellers. 2013b. OXPath:
A language for scalable data extraction, automation, and crawling on the deep web. VLDB Journal 22,
1 (2013), 47–72. DOI:http://dx.doi.org/10.1007/s00778-012-0286-6
Tim Furche, Giovanni Grasso, Andrey Kravchenko, and Christian Schallhart. 2012. Turn the page: Au-
tomated traversal of paginated websites. In Proceedings of the 12th International Conference on Web
Engineering (ICWE’12). Springer, Berlin, 332–346. DOI:http://dx.doi.org/10.1007/978-3-642-31753-8_27
Alejandra Garrido, Sergio Firmenich, Gustavo Rossi, Julián Grigera, Nuria Medina-Medina, and Ivana
Harari. 2013. Personalized web accessibility using client-side refactoring. IEEE Internet Computing 17,
4 (2013), 58–66. DOI:http://dx.doi.org/10.1109/MIC.2012.143
L. K. Joshila Grace, V. Maheswari, and Dhinaharan Nagamalai. 2011. Analysis of web logs and web user in
web mining. International Journal of Network Security & Its Applications 3, 1 (Jan. 2011), 99–110.
Jacek Gwizdka and Mark H. Chignell. 2007. Individual differences in personal information management. In
Personal Information Management. University of Washington Press, 206–220.
Mark J. Handel and Steven Poltrock. 2011. Working around official applications: Experiences from a large
engineering project. In Proceedings of the 2011 ACM Conference on Computer Supported Cooperative
Work (CSCW’11). ACM, New York, NY, 309–312.
Vicki L. Hanson. 2009. Age and web access: The next generation. In Proceedings of the 6th International
Cross-Disciplinary Conference on Web Accessibility (W4A’09). ACM, New York, NY, 7–15. DOI:http://
dx.doi.org/10.1145/1535654.1535658
Vicki L. Hanson, John T. Richards, and Calvin Swart. 2008. Browser augmentation. In Web Accessibility - A
Foundation for Research. Springer, Berlin, 215–229. DOI:http://dx.doi.org/10.1007/978-1-84800-050-6_13
Simon Harper, Sean Bechhofer, and Darren Lunn. 2006. SADIe: Transcoding based on CSS. In Proceedings
of the 8th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS’06).
ACM, New York, NY, 259–260. DOI:http://dx.doi.org/10.1145/1168987.1169044
Imma Hernández. 2009. Intelligent web navigation. In Proceedings of the 3rd BCS-IRSG Conference on
Future Directions in Information Access (FDIA’09). British Computer Society, Swinton, UK, 117–124.
Jason I. Hong and Jeffrey Wong. 2006. Marmite: End-user programming for the web. In Proceedings of the
2006 Conference on Human Factors in Computing Systems (CHI’06). ACM, New York, NY, 1541–1546.
Darris Hupp and Robert C. Miller. 2007. Smart bookmarks: Automatic retroactive macro recording on the
web. In Proceedings of the 20th Annual ACM Symposium on User Interface Software and Technology
(UIST’07). ACM, New York, NY, 81–90. DOI:http://dx.doi.org/10.1145/1294211.1294226
Jarrod Jackson, Christopher Scaffidi, and Kathryn T. Stolee. 2011. Digging for diamonds: Identifying valu-
able web automation programs in repositories. In Proceedings of the 2011 International Conference
on Information Science and Applications (ICISA’11). IEEE Computer Society, Los Alamitos, CA, 1–10.
DOI:http://dx.doi.org/10.1109/ICISA.2011.5772326
Ricardo Kawase, Eelco Herder, George Papadakis, and Wolfgang Nejdl. 2010. In-context annotations for re-
finding and sharing. In Proceedings of the 6th International Conference on Web Information Systems and
Technologies (WEBIST’10). Springer, Berlin, 85–100. DOI:http://dx.doi.org/10.1007/978-3-642-22810-0_7
Rohit Khare and Tantek Çelik. 2006. Microformats: A pragmatic path to the semantic web. In Proceedings
of the 15th International Conference on World Wide Web (WWW’06). ACM, New York, NY, 865–866.
Vu Hong Khiem, Kibong Kang, and Keung Hae Lee. 2007. Miniwap: Navigating WAP with minimo. In
Proceedings of the 31st International Computer Software and Applications Conference (COMPSAC’07).
IEEE Computer Society, Los Alamitos, CA, 63–68. DOI:http://dx.doi.org/10.1109/COMPSAC.2007.150
Barbara Kitchenham and Stuart Charters. 2007. Guidelines for Performing Systematic Literature Reviews
in Software Engineering. Technical Report EBSE 2007-001. Keele University and Durham University
Joint Report.
Iraklis Kordomatis, Christoph Herzog, Ruslan R. Fayzrakhmanov, Bernhard Krüpl-Sypien, Wolfgang
Holzinger, and Robert Baumgartner. 2013. Web object identification for web automation and meta-search.
In Proceedings of the 3rd International Conference on Web Intelligence, Mining and Semantics (WIMS’13). ACM, New
York, NY, Article 13, 12 pages. DOI:http://dx.doi.org/10.1145/2479787.2479798
Marek Kowalkiewicz, Tomasz Kaczmarek, and Witold Abramowicz. 2006. myPortal: Robust extraction and
aggregation of web content. In Proceedings of the 32nd International Conference on Very Large Data
Bases (VLDB’06). VLDB Endowment, 1219–1222.
Jochen Kranzdorf, Andrew Jon Sellers, Giovanni Grasso, Christian Schallhart, and Tim Furche. 2012. Visual
OXPath: Robust wrapping by example. In Proceedings of the 21st International Conference on World Wide
Web (WWW’12). ACM, New York, NY, 369–372. DOI:http://dx.doi.org/10.1145/2187980.2188051
David Kravets. 2008. Amazon.com tossed into Pirate Bay jungle. (April 2008). Retrieved March 2013 from
http://www.wired.com/2008/12/amazoncom-tosse/.
Steve Krulewitz and Erik Vold. 2009. Greasefire. (January 2009). Retrieved July 2014 from https://addons.
mozilla.org/firefox/addon/greasefire/.
Timo Laakko. 2008. Context-aware web content adaptation for mobile user agents. In Evolution of the Web in
Artificial Intelligence Environments. Studies in Computational Intelligence, Vol. 130. Springer, Berlin,
69–99. DOI:http://dx.doi.org/10.1007/978-3-540-79140-9_4
Maurizio Leotta, Diego Clerissi, Filippo Ricca, and Paolo Tonella. 2014. Visual vs. DOM-based web locators:
An empirical study. In Proceedings of the 14th International Conference on Web Engineering (ICWE’14).
Qingcheng Li, Zhan-Ying Zhang, Jie Ma, and Jin Zhang. 2011. Web page layout adaptation based on
webkit for e-paper device. In Proceedings of the 14th IEEE International Conference on Compu-
tational Science and Engineering (CSE’11). IEEE Computer Society, Los Alamitos, CA, 495–502.
DOI:http://dx.doi.org/10.1109/CSE.2011.90
Roberto Suggi Liverani and Nick Freeman. 2009. Abusing Firefox extensions. Defcon (2009).
José Losada, Juan Raposo, Alberto Pan, and Paula Montoto. 2012. Efficient execution of web navigation
sequences. In Proceedings of the 13th International Conference on Web Information Systems Engineering
(WISE’12). Springer, Berlin, 340–353. DOI:http://dx.doi.org/10.1007/978-3-642-35063-4_25
Darren Lunn, Sean Bechhofer, and Simon Harper. 2008. A user evaluation of the SADIe transcoder. In
Proceedings of the 10th International ACM SIGACCESS Conference on Computers and Accessibility
(ASSETS’08). ACM, New York, NY, 137–144. DOI:http://dx.doi.org/10.1145/1414471.1414498
Darren Lunn, Simon Harper, and Sean Bechhofer. 2011. Identifying behavioral strategies of visually impaired
users to improve access to web content. ACM Transactions on Accessible Computing 3, 4, Article 13 (April
2011), 35 pages. DOI:http://dx.doi.org/10.1145/1952388.1952390
Xiaofeng Meng, Dongdong Hu, and Chen Li. 2003. Schema-guided wrapper maintenance for web-data extraction.
In Proceedings of the 5th ACM International Workshop on Web Information and Data Management
(WIDM’03). ACM, New York, NY, 1–8. DOI:http://dx.doi.org/10.1145/956699.956701
Miniwatts Marketing Group. 2013. Internet usage statistics: The Internet big picture world Internet users
and population stats. (2013). Retrieved June 2014 from http://www.internetworldstats.com/stats.htm.
Paula Montoto, Alberto Pan, Juan Raposo, Fernando Bellas, and Javier López. 2009a. Automating navigation
sequences in AJAX websites. In Proceedings of the 9th International Conference on Web Engineering (ICWE’09).
Paula Montoto, Alberto Pan, Juan Raposo, Fernando Bellas, and Javier López. 2009b. Web navigation se-
quences automation in modern websites. In Proceedings of the 20th International Conference on Database
and Expert Systems Applications (DEXA’09). Springer, Berlin, 302–316. DOI:http://dx.doi.org/10.1007/
978-3-642-03573-9_25
Paula Montoto, Alberto Pan, Juan Raposo, José Losada, Fernando Bellas, and Javier López. 2008. A workflow-
based approach for creating complex web wrappers. In Proceedings of the 9th International Conference
on Web Information Systems Engineering (WISE’08). Springer, Berlin, 396–409. DOI:http://dx.doi.org/
10.1007/978-3-540-85481-4_30
Michael Nebeling and Moira C. Norrie. 2011. Tools and architectural support for crowdsourced adaptation
of web interfaces. In Proceedings of the 11th International Conference on Web Engineering (ICWE’11).
Joanna W. Ng, Mark H. Chignell, James R. Cordy, and Yelena Yesha. 2010. Motivation. In The Smart Internet.
Dinh-Quyen Nguyen and Heidrun Schumann. 2013. Visualization to support augmented web browsing. In
Proceedings of the 2013 IEEE/WIC/ACM International Conferences on Web Intelligence (WI’13). IEEE
Computer Society, Los Alamitos, CA, 535–541. DOI:http://dx.doi.org/10.1109/WI-IAT.2013.75
Evangelos Pafilis, Seán I. O’Donoghue, Lars J. Jensen, Heiko Horn, Michael Kuhn, Nigel P. Brown, and
Reinhard Schneider. 2009. Reflect: Augmented browsing for the life scientist. Nature Biotechnology 27,
6 (2009), 508–510. DOI:http://dx.doi.org/10.1038/nbt0609-508
PageFair. 2013a. Acceptable Ads Soothe Google Pain. Retrieved June 2014 from http://blog.pagefair.com/
2013/acceptable-ads-soothe-google-pain/.
PageFair. 2013b. The Rise of Adblocking. Retrieved June 2014 from http://blog.pagefair.com/2013/the-rise-
of-adblocking/.
PageFair. 2014a. Introducing PageFair Ads. Retrieved June 2014 from http://blog.pagefair.com/2014/
introducing-pagefair-ads/.
PageFair. 2014b. We help Websites Survive the Rise of Adblock. Retrieved June 2014 from http://pagefair.com/
about/.
N. Parashuram. 2011. Writing browser extensions - Comparing Firefox, Chrome and Opera. (October
2011). Retrieved March 2013 from http://blog.nparashuram.com/2011/10/writing-browser-extensions-
comparing.html.
Bambang Parmanto, Reza Ferrydiansyah, Andi Saptono, Lijing Song, I. Wayan Sugiantara, and Stephanie
Hackett. 2005. AcceSS: Accessibility through simplification & summarization. In Proceedings of the
2nd International Cross-Disciplinary Workshop on Web Accessibility. ACM, New York, NY, 18–25.
Mihai Parparita. 2012. Gmail Greasemonkey API 1.0. (December 2012). Retrieved March 2013 from
https://github.com/mihaip/gmail-greasemonkey/wiki/Gmail-Greasemonkey-API-1.0.
Iñaki Paz and Oscar Díaz. 2010. Providing resilient XPaths for external adaptation engines. In Proceedings
of the 21st ACM Conference on Hypertext and Hypermedia (HT’10). ACM, New York, NY, 67–76.
Yury Puzis, Eugene Borodin, Faisal Ahmed, Valentyn Melnyk, and I. V. Ramakrishnan. 2011. Guide-
lines for an accessible web automation interface. In Proceedings of the 13th International ACM
SIGACCESS Conference on Computers and Accessibility (ASSETS’11). ACM, New York, NY, 249–250.
T. V. Raman. 2009. Toward 2W, beyond web 2.0. Communications of the ACM 52, 2 (Feb. 2009), 52–59.
Juan Raposo, Alberto Pan, Manuel Álvarez, and Justo Hidalgo. 2007. Automatically maintaining wrappers
for semi-structured web sources. Data & Knowledge Engineering 61, 2 (July 2007), 331–358.
DOI:http://dx.doi.org/10.1016/j.datak.2006.06.006
John T. Richards and Vicki L. Hanson. 2004. Web accessibility: A broader view. In Proceedings of the 13th In-
ternational Conference on World Wide Web (WWW’04). ACM, New York, NY, 72–79. DOI:http://dx.doi.org/
10.1145/988672.988683
Paola Salomoni, Silvia Mirri, Stefano Ferretti, and Marco Roccetti. 2008. A multimedia broker to support
accessible and mobile learning through learning objects adaptation. ACM Transactions on Internet
Technology 8, 2, Article 4 (2008), 23 pages. DOI:http://dx.doi.org/10.1145/1323651.1323655
Christopher Scaffidi, Christopher Bogart, Margaret M. Burnett, Allen Cypher, Brad A. Myers, and Mary
Shaw. 2010. Using traits of web macro scripts to predict reuse. Journal of Visual Languages & Computing
21, 5 (August 2010), 277–291. DOI:http://dx.doi.org/10.1016/j.jvlc.2010.08.003
Andrew Jon Sellers. 2011. The OXPath to success in the deep web. In Proceedings of the 20th Interna-
tional Conference on World Wide Web (WWW’11). ACM, New York, NY, 409–414. DOI:http://dx.doi.org/
10.1145/1963192.1963352
Stephen Shankland. 2010. Opera calls for browser extension standard. (October 2010). Retrieved March
2013 from http://news.cnet.com/8301-30685_3-20019579-264.html.
Remy Sharp. 2010. What is a polyfill? (October 2010). Retrieved June 2014 from http://remysharp.com/
2010/10/08/what-is-a-polyfill/.
Hironobu Takagi, Shinya Kawanaka, Masatomo Kobayashi, Takashi Itoh, and Chieko Asakawa. 2008. Social
accessibility: Achieving accessibility through collaborative metadata authoring. In Proceedings of the
10th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS’08). ACM,
New York, NY, 193–200. DOI:http://dx.doi.org/10.1145/1414471.1414507
Dereck Toker, Cristina Conati, Giuseppe Carenini, and Mona Haraty. 2012. Towards adaptive information
visualization: On the influence of user characteristics. In Proceedings of the 20th International
Conference on User Modeling, Adaptation, and Personalization (UMAP’12). Springer, Berlin, 274–285.
DOI:http://dx.doi.org/10.1007/978-3-642-31454-4_23
Andrew Trusty and Khai N. Truong. 2011. Augmenting the web for second language vocabulary learning.
In Proceedings of the 29th International Conference on Human Factors in Computing Systems (CHI’11).
ACM, New York, NY, 3179–3188. DOI:http://dx.doi.org/10.1145/1978942.1979414
Scott R. Turner. 2005. Platypus. (2005). Retrieved March 2013 from http://platypus.mozdev.org/.
Lance Whitney. 2009. Average net user now online 13 hours per week. (December 2009). Retrieved June
2014 from http://www.cnet.com/news/average-net-user-now-online-13-hours-per-week/.
Wikipedia. 2013. Comparison of HTML5 and Flash. (2013). Retrieved March 2013 from http://en.wikipedia.
org/wiki/Comparison_of_HTML5_and_Flash.
Wikipedia. 2014. Modding. (2014). Retrieved June 2014 from https://en.wikipedia.org/wiki/Modding.
Wikipedia. 2014. Web scraping. (2014). Retrieved June 2014 from http://en.wikipedia.org/wiki/Web_scraping.
Rolfe Winkler. 2014. Google removes two Chrome extensions amid ad uproar. (January 2014). Retrieved
February 2015 from http://blogs.wsj.com/digits/2014/01/19/google-removes-two-chrome-extensions-
amid-ad-uproar/.
Lars Witell, Per Kristensson, Anders Gustafsson, and Martin Lofgren. 2011. Idea generation: Customer
co-creation versus traditional market research techniques. Journal of Service Management 22, 2 (2011),
140–159.
Yeliz Yesilada, Robert Stevens, Simon Harper, and Carole A. Goble. 2007. Evaluating DANTE: Semantic
transcoding for visually disabled users. ACM Transactions on Computer-Human Interaction 14, 3, Article
14 (Sep. 2007). DOI:http://dx.doi.org/10.1145/1279700.1279704
Received October 2013; revised December 2014; accepted February 2015
