Problems with P3P 1.0 and Proposals for P3P 2.0

Position paper for "Future of P3P" workshop, Dulles, Virginia, USA, 12-13 November 2002.

Giles Hogben, Joint Research Centre


Contact: giles.hogben@jrc.it, cc to: marc.wilikens@jrc.it

This paper represents the views of the Joint Research Centre of the European Commission and has been broadly endorsed by
the Internet Task Force of the Article 29 working group of the European Commission.

Table of Contents

1. Introduction
2. Background
3. Short Term Issues
3.1 Compliance with DPD principles
3.1.1. Cookie Management
3.1.2. Point of provision of purpose specification
3.1.3. Consent issues
3.1.4. Geographical Semantics
3.2 User interface issues
3.3 Short term vocabulary issues
4. Long Term Issues Summary
Annexe 1 Background
Annexe 2 Long Term Issues
5. Long Term Issues
5.1 Long term vocabulary issues
5.2. Expression of security measures
5.3 Identity Management and management of data flows
5.4 APPEL issues
5.5 Repudiatability of policies
5.6 Non-enforceability - automated audit trail systems
REFERENCES
P3P technology and policy issues from a EU perspective

1. Introduction
This paper discusses problem areas in P3P 1.0 and proposes possible solutions. For completeness, it lists all of the problem
areas we regard as most important: compliance with EU data protection principles, user interface issues, vocabulary issues,
management of data flows and identities, APPEL issues and non-repudiatability.
In keeping with the theme of the workshop, we focus on the issues which lend themselves to a more immediate solution. The
long term issues are described in full in an annexe, and we plan to present them in detail at a workshop planned for spring
2003 in Europe. We nevertheless include a short summary of each long term issue in the main section of the paper, because
awareness of these issues is crucial even when planning how to address the short term issues.

2. Short Term Issues


2.1 Compliance with DPD principles.

2.1.1. Cookie Management


Problem:
There are two crucial data transfer events in the lifetime of a cookie. The first occurs when the cookie is set by a remote
server using a "Set-Cookie" HTTP header. At this point the data which the data controller wishes to make persistent is stored
on the user's computer. The second event, called "replay", occurs when the information stored in the cookie is sent back to
the server. In P3P, policy evaluation may be applied just before either of these two events takes place. According to P3P,
when a server stores information on a user's own computer, no data transfer event has yet occurred.
However, according to the EU evaluation group [3], setting a cookie already constitutes data processing, and the rules of the
directive must therefore be applied at set time. Storing data, even on the user's own computer, is an act on behalf of the
controller which constitutes data processing. According to EU policy, therefore, cookie policies should be evaluated at the
time cookies are set, not the time they are replayed.
This requirement considerably complicates the issue. Satisfying it is not a simple matter of applying P3P at cookie set time,
because there are also reasons for NOT requiring P3P to be applied at that point. These relate to the fact that cookies can be
set at a domain level by multiple sub-hosts. For example, geocities.com controls thousands of subhosts (x.geocities.com), each
of which may set a cookie that could be replayed to every host in the geocities domain. If P3P policies for cookies must be
applied at set time, the setting host must be responsible for whatever is done by any of the thousands of hosts in the
geocities domain. If, however, policies may be applied at send time only, then each host is responsible for what it does with
the cookie data.
It is worth mentioning here that no current P3P user agent implementation except the JRC proxy implements a cookie management
feature using full P3P policies (MS IE6 uses compact policies). Nor do the other implementations base their decisions in any
way on the content of the purpose or recipient flags within a compact policy. This is a serious omission for implementations
of a specification which must be able to operate in an environment that aims to protect users according to the statutes of
European law. Services contravening European data protection law may be physically outside European jurisdiction but are
still subject to European law if requested by users within the EC. It is also an obstacle to the progress of P3P
implementation, as cookies are one of the areas where the full weight of P3P should be applied. We suggest that steps be taken
in future to ensure that implementations give at least the possibility of imposing some purpose and recipient criteria on
cookies.
Proposed Solution:
Short of altering the cookie concept itself, there is no way around the fact that cookies may be replayed to hosts within the
same domain other than those which set them.
In fact the P3P 1.0 specification already contains the sentence:
"Any host to which the cookie may be replayed MUST be able to honor all the policies associated with the cookie, regardless of
whether that host declares a policy for that cookie."
This covers the problem. However, the specification still allows P3P to be applied at cookie send time, which according to the
EU working groups contravenes the rights of the data subject. Despite the restrictions this puts on companies which operate
large sets of hosts, the only solution appears to be to specify that P3P must be applied at cookie set time, with the above
caveat maintained in the specification. This is not as draconian as it might appear. If a controller does not want a cookie to
apply to an entire domain, it can easily restrict its scope. If it does, and the cookie contains sensitive information, then
the company should be prepared for the consequences of this information getting into the wrong hands.
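
The set-time rule proposed above can be illustrated with a minimal Python sketch. It is not part of P3P or of any existing user agent: the function names, the example.com host names and the dictionary recording which hosts honor the cookie's policy are all hypothetical, and domain matching is reduced to a simple suffix test.

```python
def replay_hosts(cookie_domain, known_hosts):
    """Return the hosts to which a cookie with this Domain attribute could be replayed.

    Simplified suffix matching; a real agent would follow the full cookie rules.
    """
    suffix = cookie_domain.lstrip(".").lower()
    return [
        host for host in known_hosts
        if host.lower() == suffix or host.lower().endswith("." + suffix)
    ]


def acceptable_at_set_time(cookie_domain, known_hosts, honors_policy):
    """Set-time check: every host that may later receive the cookie must honor its policy.

    honors_policy is a hypothetical mapping host -> bool; unknown hosts count as False.
    """
    return all(
        honors_policy.get(host, False)
        for host in replay_hosts(cookie_domain, known_hosts)
    )


# Hypothetical hosts sharing one domain, plus an unrelated host.
hosts = ["a.example.com", "b.example.com", "example.com", "other.org"]
all_honor = {"a.example.com": True, "b.example.com": True, "example.com": True}
one_fails = dict(all_honor, **{"b.example.com": False})

print(replay_hosts(".example.com", hosts))
print(acceptable_at_set_time(".example.com", hosts, all_honor))
print(acceptable_at_set_time(".example.com", hosts, one_fails))
```

The sketch makes the cost of the rule visible: a single host in the domain that cannot honor the policy is enough to make the cookie unacceptable at set time, unless the controller narrows the cookie's scope.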

2.1.2. Point of provision of purpose information to users


Problem:
[It is required] "to mention clearly the existence of automatic data collection procedures, before using such a method to
collect any data." [1]

Human readable information on data processing purposes should be conveyed to users prior to any data collection act. P3P 1.0
does not provide mechanisms which adequately satisfy EU requirements in this respect. Although P3P policies can be used to
pre-inform the user of privacy practices, the default implementation (for example in IE6), and the sense of the specification,
is for the user to be informed ex post facto.
Proposed solution:
The solution to this problem is to emphasize somewhere in the P3P specification, and in any implementation guides, that in
order to satisfy European law for acts of data collection involving PII, display of human readable elements must not be ex post
facto. This involves user interface issues, because it is potentially very invasive to present all the information on data
collection purpose prior to every collection event. We believe however that these issues can be solved and we present a
potential solution in section 5 below.

2.1.3. Geographical and Jurisdictional Semantics


Problem Description:
" Where it is anticipated that the data will be transmitted by the controller to countries outside the European Union, to indicate
whether or not that country provides adequate protection of individuals with regard to the processing of their personal data
within the meaning of Article 25 of Directive 95/46/EC. In that case, specific information must be provided on the identity and
address of the recipients (physical and/or electronic address)"[1]
But P3P does not allow controllers to inform data subjects about envisaged transfers to third countries because there is not
any taxonomy of geographical or jurisdictional information, which can be attached to recipients.
Proposed solution:
Pragmatically speaking, what is required is the ability for a user agent to distinguish whether a recipient complies with the
level of privacy protection provided within a particular jurisdiction, or whether instead a transaction should be blocked
because the data is taken outside a "safe harbour". In order to achieve this for the purposes of EU law, a simple binary
attribute such as EUleveljurisdiction="Yes/No", indicating whether the recipient operates under a jurisdiction which protects
the user's data to the same level as or above the level of European law, would suffice.
Another quick solution is to use the provision existing in P3P 1.0 for text addresses within the ENTITY element. For example:

<ENTITY>
<DATA-GROUP>
<DATA ref="#business.name">CatalogExample</DATA>
<DATA ref="#business.contact-info.postal.country">USA</DATA>
</DATA-GROUP>
</ENTITY>

In conjunction with a list of countries and information about their legal systems, this could allow the agent to determine
whether the controller was outside EU jurisdiction. If the same semantics could be included within <RECIPIENTS/> elements,
this would offer a quick fix for the problem. It is clearly not ideal, because it requires, but does not provide, a
standardization of the CDATA content of the DATA elements. Nor could it tell us, for example, whether a country was within a
supranational area of jurisdiction such as the EU.
With this solution it remains difficult to determine jurisdiction, although it can be deduced to a certain extent from any
country information contained. It would not, for example, be able to express the fact that a piece of information had passed
within a US "safe harbour" zone. We suggest therefore that in the longer term, P3P should move to an RDF [5] + OWL [6]
ontology solution. As mentioned below, regional and jurisdictional ontologies could then be plugged in or, as may be
necessary in the case of the jurisdictional ontology, created from scratch. These would provide the semantic richness to
make this kind of distinction, as well as other useful distinctions which would follow from such information. For example, a
taxonomy of jurisdictions and a dynamic ranking system, which can be consulted to determine the relative levels of privacy
protection, would address the unsolved problems in expressing the EU directives. In the longer term, a specification should
not show such obvious regional bias as the "EUleveljurisdiction" attribute mentioned above. Therefore a more neutral means of
expressing the same thing must be developed.
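
The quick fix based on the country field of the ENTITY element can be sketched as follows. This is a minimal illustration in Python, not an existing agent feature: the function names are hypothetical, and the country set is placeholder data chosen for the example, not a statement about which jurisdictions offer adequate protection.

```python
import xml.etree.ElementTree as ET

# Placeholder data for illustration only; a real agent would need an
# authoritative, maintained list of jurisdictions and their legal status.
EU_LEVEL_COUNTRIES = {"Italy", "France", "Germany"}

ENTITY_XML = """<ENTITY>
<DATA-GROUP>
<DATA ref="#business.name">CatalogExample</DATA>
<DATA ref="#business.contact-info.postal.country">USA</DATA>
</DATA-GROUP>
</ENTITY>"""

COUNTRY_REF = "#business.contact-info.postal.country"


def entity_country(entity_xml):
    """Return the free-text country declared in an ENTITY element, or None."""
    root = ET.fromstring(entity_xml)
    for data in root.iter("DATA"):
        if data.get("ref") == COUNTRY_REF:
            return (data.text or "").strip()
    return None


def within_eu_level_jurisdiction(entity_xml, adequate=EU_LEVEL_COUNTRIES):
    """True only if a country is declared and it appears in the given list."""
    country = entity_country(entity_xml)
    return country is not None and country in adequate


print(entity_country(ENTITY_XML))
print(within_eu_level_jurisdiction(ENTITY_XML))
```

Because the country is free text, the comparison depends on exact spelling ("USA" as opposed to "United States", for instance), which is the lack of standardization noted above.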

2.2 User interface issues


The user interface is a very important issue to solve in the immediate future. P3P provides a wealth of information which,
since its purpose is to improve the transparency of data processing practices, should somehow be available to users. It also
provides a framework for specifying privacy preferences in a very granular way. However, studies have shown that most users
are willing to sacrifice very little time in order to protect their privacy [7]. This poses serious challenges to anyone
developing an agent implementation for P3P user preference specification and checking.

P3P 1.0 makes no mention of user interface issues, neither does the P3P implementation guide. There are good reasons for
this, as it is generally outside of the scope of a specification to specify how it will be implemented. However, there are
considerable problems to be overcome in this area and there is considerable interplay between the specification and its
implementations. For example, P3P 1.0 provides several attributes to enable a user interface to display a human readable
distillation of a policy. It is therefore equally crucial for subsequent versions of P3P to make concessions to the user interface,
which take into account likely solutions. To this end, we would make the general remark that it is important for the
specification group to work alongside implementers and researchers on such issues.

We now describe some of the specific challenges to be overcome and some possible solutions:

2.2.1. How to create meaningful preference interfaces.


In matching P3P policies, there is a huge range of options which an agent can look for. The W3C note, APPEL [8], gives
a suggested specification for a matching algorithm and an interchangeable XML rule format, which is in fact the only existing
interoperable format for preference files. However, a user interface to the full range of possibilities within APPEL is
extremely complex, and in fact only one such utility exists, designed as part of the JRC P3P project.
The experience of designing this interface has suggested several points which are relevant to further development:

• An interchangeable format for preference rules is very important, as it allows data protection professionals to
disseminate minimum guidelines and default privacy protection levels to users who have neither the time nor the knowledge to
create them themselves.
• The user interface for this would be much improved by a move of P3P to a formalized ontology. If P3P were expressed within
a formal ontology, tools for visualizing it could be used within user interfaces, and interfaces which are natural to users
could be translated into RDF based query languages for matching purposes. Ontology interfaces such as OILED [9] allow natural
representations of conceptual frameworks to be made.
• Interfaces should be sharply divided into a section for expert users and a section for users who choose between a small
number of predetermined preference sets. However, it should be possible for the latter to make more advanced choices if they
wish.

2.2.2. How to create meaningful feedback systems.


P3P 1.0 provides several different types of information, which can be fed back to the user during or after the P3P evaluation
process. These types are:

• Specific human readable sections in policies.
• Translations of the semantics of policies.
• Policy and PRF evaluation feedback from rule matching engines.

P3P allows for a range of levels of feedback into the user experience, from a simple red light system as in Internet Explorer
6.0 to a full page explanation of what happened during the evaluation of the resource. Users should be given a choice between
the levels of information they want to receive. This is achieved to a certain extent in existing implementations; however, we
feel that it could be improved upon if P3P were incorporated into a "tab" system such as is used to provide history
information in IE. A "tab" system is a collapsible frame on the left hand side of every page, which can provide real-time
information relating to the page displayed beside it, or other information, independent of but simultaneous with the page
being browsed. Netscape now makes a lot of use of a system of "tabs" to provide information such as bookmarks, news and
history. This system is not currently used in any P3P system but would seem to be an ideal mode of providing feedback. The
difference between this and the way IE and Netscape currently provide information is that, if the user wishes, the tab can be
a source of instantly available, fresh information on the resources being accessed, simultaneous with page viewing. Such a
tab could be configured, rather like the IE search tab, to provide different levels of information.

The JRC's proxy implementation makes use of a similar system, which was implemented after experimentation with other
systems. It uses an expandable floating privacy tab, which expands on mouseover and provides information on pages which
have been evaluated in the current session. As it is part of a proxy server implementation, it cannot alter the browser
interface and is inserted as a DHTML tab. However, it has many of the advantages of a tab.

2.2.3. Specification Issues for User Interfaces.


From the point of view of the specification, we would suggest two points for improvement which are relevant to the user
interface:
1. A formalized ontology would also help in the presentation of information to the user. It would allow implementers to
leverage, with minimal effort, conceptual visualization techniques which have been developed within these systems. A clearer
class structure, with a transparent relationship to natural language, would also help.
2. The specification could be designed so that the semantics are clearly divided into two levels of detail, one meant for
quick summaries and the other for detailed presentations. For example, there could be a system of policy meta-data, which
would provide key descriptive terms according to a metadata schema. Another example is to provide an XSL stylesheet for
summarizing policies. The current compact policy specification has gone some way towards this.

2.3 Short term vocabulary issues


Apart from user interfaces, the issue of a semantically transparent taxonomy is perhaps the greatest challenge for P3P and is
inseparable from many of the issues above. Despite incompleteness in certain respects, P3P 1.0 provides a sound foundation
for such a taxonomy. Although P3P is not expressed in standard ontology syntax, it represents, through the W3C processes,
which have underpinned it, a five year process for a data protection vocabulary. We feel however that the taxonomy
represented by the existing version of P3P could be improved in the following ways.
2.3.1. An issue of urgency is P3P's schema [16] for specifying categories of personal data, which currently has the
following problems.

• It is defined in a very cumbersome way (by a P3P specific pointer mechanism within a flat definition scheme). This is
highly problematic for implementations and could easily be rectified by a move to a formal ontology, or at least to a
standard XML schema syntax using REFIDs for multiple subclassing.
• It has not been subjected to rigorous use case scenarios. For example, the following category description from the base
data schema is the closest we can get to describing the http header information.

http | Navigation and Click-stream Data, Computer Information | httpinfo | HTTP Protocol Information

However, http header information cannot be described by any of the terms in the phrase "Navigation and Click-stream Data,
Computer Information".
• It does not have a well-defined semantics. It is not clear whether the items in the schema refer to categories or to data
objects. For example, in discussions with the working group, we have seen the data schema term "email" described as referring
to an email address, whilst we would argue that it refers to a class into which may be placed all data objects which can be
described as "email". There is an important difference, because the term "user" clearly does not refer to a user, but to his
data. This needs to be rectified within the context of a more formal ontology development process, as described below.
• Although it does provide a P3P-specific customization route, the non-standard syntax means that applications cannot
leverage existing ontological frameworks and thus widen the descriptive power.
2.3.2. As mentioned above, the recipient taxonomy stands at the central point of any privacy taxonomy, yet it is not
sufficiently descriptive in P3P 1.0. Specifically, it needs to be altered to be able to express the requirements of the
European directive. Furthermore, as has been noted elsewhere in this paper, there are more fundamental requirements which
need to be addressed in order for the recipient taxonomy to be adequate. These include the ability to attach security and
jurisdictional taxonomies as sub-trees of recipient instances.
2.3.3. The purpose specification taxonomy needs to be subjected to more rigorous use case scenarios. The evaluation
group [3] felt that, given that the purpose specification of data collection is one of the most important elements in the
taxonomy, the 12 cases provided did not cover what is required.
Consider the following sentence from Doubleclick's human readable policy.

"DoubleClick does use information about your browser and Web surfing to determine which ads to show your browser."

P3P would cover the preceding sentence with the element <customization/> and possibly <individual-decision/> and
<tailoring/>. However, none of these makes clear, and P3P cannot express, that the purpose is advertising third party
products. This would be of concern to many users.

2.4. APPEL issues


Problems:
As we noted above, a preference exchange language is a very necessary part of P3P. However, there are various problems
with the preference expression language. Constructing the logic of matching patterns is very complex, and involves various
inherent contradictions. For example, the following rule looks for any information which is not the user's IP address or user
agent string and blocks resources which ask for it.

<appel:RULE behavior="block"><p3p:POLICY>
<p3p:STATEMENT><p3p:DATA-GROUP appel:connective="non-and">
<p3p:DATA ref="#dynamic.clickstream.clientip.fullip"/>
<p3p:DATA ref="#dynamic.http.useragent"/>
</p3p:DATA-GROUP></p3p:STATEMENT>
</p3p:POLICY></appel:RULE>

This RULE will cause a block behavior for the following web site policy (only relevant parts quoted),

<POLICY>
<STATEMENT>
<DATA-GROUP appel:connective="and">
<DATA ref="#dynamic.clickstream.clientip.fullip"/>
<DATA ref="#dynamic.http.useragent"/>
</DATA-GROUP>
</STATEMENT>
</POLICY>

but not for this one

<POLICY>
<STATEMENT>
<DATA-GROUP>
<DATA ref="#dynamic.clickstream.clientip.fullip"/>
</DATA-GROUP>
</STATEMENT>
<STATEMENT>
<DATA-GROUP>
<DATA ref="#dynamic.http.useragent"/>
</DATA-GROUP>
</STATEMENT>
</POLICY>

Note the presence of the "non-and" connective, which means - "only if not all sub-elements in the rule are present in the sub-
elements of the matched element in the policy". This is true for the first policy snippet but not the second, which given that
they have the same meaning is clearly unacceptable. We will look at solutions which address this problem below.
Proposed solutions:
We have already noted the benefits of moving APPEL to a version based on an OWL ontology of P3P, namely improvements in
visualization techniques and reasoning. Given the work involved, this may however be considered a long term objective.

A more immediate solution would be to start by using a standard query language for the condition matching part of APPEL.
Instead of using APPEL's somewhat quirky connective system and recursive matching algorithm, the rule condition could be
specified by an XPath [12] query (or, by the time it becomes relevant, XPath 2.0 [11]). These query languages are designed to
match arbitrary node sets with high efficiency. They have the advantage that developers are familiar with them and that
efficient algorithms exist to execute the queries. As it has become very clear that APPEL will not be written by anyone other
than developers, or ordinary users working through a GUI, this is clearly the best approach.

E.g. a rule in this format, which would solve the above ambiguity problem would be:

<appel:RULE behavior="block" prompt="yes" promptmsg="Rule found policy using your home info beyond current purpose">
<appel:MATCHQUERY query=
"//DATA[not(contains(@ref,'dynamic.clickstream.clientip.fullip') or contains(@ref,'dynamic.http.useragent'))]"
querylanguage="XPATH"/>
</appel:RULE>

It should be noted that the recent issue of the XPATH 2.0 [11] specification, which provides an even more powerful matching
language, makes this an even more compelling solution.
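
To illustrate why a node-level query removes the ambiguity, the following minimal Python sketch applies the same condition to both policy layouts shown earlier. It is an illustration only: the function name and the sample policies are hypothetical, and because the XPath subset in Python's standard library does not support not() or contains(), the query's logic is mirrored with plain iteration over DATA elements rather than executed as XPath.

```python
import xml.etree.ElementTree as ET

# References the rule tolerates; any other DATA reference triggers a block.
ALLOWED_REFS = ("dynamic.clickstream.clientip.fullip", "dynamic.http.useragent")


def should_block(policy_xml):
    """Block if any DATA element refers to something outside ALLOWED_REFS.

    Mirrors the query //DATA[not(contains(@ref, A) or contains(@ref, B))].
    """
    root = ET.fromstring(policy_xml)
    for data in root.iter("DATA"):
        ref = data.get("ref", "")
        if not any(allowed in ref for allowed in ALLOWED_REFS):
            return True
    return False


# Both references in one DATA-GROUP.
GROUPED = """<POLICY><STATEMENT><DATA-GROUP>
<DATA ref="#dynamic.clickstream.clientip.fullip"/>
<DATA ref="#dynamic.http.useragent"/>
</DATA-GROUP></STATEMENT></POLICY>"""

# The same references split across two STATEMENT elements.
SPLIT = """<POLICY>
<STATEMENT><DATA-GROUP><DATA ref="#dynamic.clickstream.clientip.fullip"/></DATA-GROUP></STATEMENT>
<STATEMENT><DATA-GROUP><DATA ref="#dynamic.http.useragent"/></DATA-GROUP></STATEMENT>
</POLICY>"""

# A policy that also requests other data (hypothetical example).
EXTRA = """<POLICY><STATEMENT><DATA-GROUP>
<DATA ref="#dynamic.http.useragent"/>
<DATA ref="#user.home-info"/>
</DATA-GROUP></STATEMENT></POLICY>"""

print(should_block(GROUPED), should_block(SPLIT), should_block(EXTRA))
```

Since the condition is evaluated against each DATA element wherever it appears, the grouping of elements into statements no longer affects the outcome: the two equivalent policies receive the same decision, and only the policy requesting additional data is blocked.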

3. Long Term issues Summary.

1. Long term vocabulary issues.


We suggest that P3P should be based on an OWL ontology, which has been developed according to a formally documented
ontology capture process. We give motivations and the many advantages of this approach.
2. Consent Issues.
We suggest that P3P is the ideal specification in which to incorporate a mechanism for obtaining consent (signed or otherwise)
from users for data processing. We suggest using an http header mechanism to achieve this.
3. Non-repudiatability issues.
We discuss the issues around the use of XML digitally signed policies in order to increase consumer trust and provide a
watertight route of legal recourse.
4. Expression of security measures.
We outline reasons for the inclusion of a security measures taxonomy within P3P, most notably that this is required for policies
to be able to express compliance with EU law. We stress the link between this and a later discussion of audit trails.
5. Identity Management and Data flows.
We explain the need for granular management of data flow within P3P or APPEL and the need to link to the syntax of XForms.
6. Audit Trails
We discuss the idea of addressing the problem that P3P provides no enforceability by outlining in detail a proposal for using
P3P and the proposed P3P ontology to provide an interoperable system for privacy audit trails. This is to address the question:
“That’s what they say – but what if they don’t do what they say?”

Annexe 1. Background
The findings of this paper are based on the following:
1. Research into P3P and full participation in the development process of the standard, including the development of a
reference implementation (see http://p3p.jrc.it), which was the first (and until now the only) implementation to comply fully
with the P3P 1.0 April 2002 specification. It consists of an open source Java User Agent and a model e-commerce site. The
agent was built specifically with the intention of demonstrating and evaluating P3P from a research perspective. The
architecture was designed with the following objectives:
• Ease of extension
• Open source and modular to allow others to experiment with it.
• Quick development time
• Browser independence.
2. A meeting of privacy experts held on May 27th 2002. The meeting included a demonstration and covered concerns and
issues, many of which are relevant to this paper. A report [3] was published with the findings of this meeting.
3. Research reported in a peer-reviewed paper, "A Fully Compliant Research Implementation of the P3P Standard for
Privacy Protection", to be published at the European Symposium on Research in Computer Security (ESORICS), which outlines
our most important findings in detail. [2]
4. A special meeting of the Internet Task Force of the Art 29 WG on September 23rd 2002, DG Markt, Brussels.

Annexe 2. Long Term Issues


5. Long Term Issues
5.1 Long term vocabulary issues.
1. The P3P taxonomy should be given a formally documented consensus process, which traces its approval by stakeholders
and includes methods for eliciting expert knowledge from stakeholders who are not necessarily technical experts. The JRC is
developing such a process in conjunction with Aberdeen University and has published a description of it.
2. It should be expressed in formal ontological syntax according to the latest OWL specification of W3C. This has the obvious
advantage of integrating it into a well grounded semantic model and allows developers to leverage a large corpus of work done
on ontology visualization, RDF query languages and rule based matching systems. It also allows the taxonomy to hook into
related ontologies, such as a full geographical ontology, which, as mentioned above, is necessary for compliance with
European law.
3. Clear understanding of terms allows ease of translation between alternative ontologies. For example, if two competing data
protection ontologies are developed but use the same standard, then a translation service between them can easily be set up.
4. Clear separation of vocabulary and syntax allows the same vocabulary to be plugged into different data protection systems.

5.2. Consent issues.


Problem description
The Article 29 working group has stated: "Internet users must have a real possibility of objecting …on-line by clicking a
box"[1]. Any collection of personal data must have a specific opt-in mechanism; in other words, consent must be explicitly
expressed.
Although P3P is able to check what a P3P policy states about consent, using the opt-in and opt-out attributes in the policy,
it is not able to check that there is actually a mechanism in place for doing this. More specifically, the following should
be provided:
• An integration with XForms [4] to extract the semantics of consent boxes and validate claims of opt-in mechanisms.
• Methods for expressing (possibly signed) consent. Although the directive does not stipulate this, the specific requirement
that users must have the option of explicit objection to data collection effectively requires that businesses can prove, in
certain cases, that consent was given. If some way of expressing signed consent were built into P3P, it would be a
considerable aid to both parties and especially to businesses wishing to protect themselves against the consequences of
disputes.
It may be argued that it is not within the remit of P3P to deal with the issue of consent, and that this should be addressed
perhaps by the XForms group. However, consent for using personal data is an issue which relates specifically to data
privacy, and is independent of whether that data is transmitted through forms or through, for example, http headers.
Therefore P3P is the ideal specification to include a mechanism for expressing consent.

Proposed Solution:
Here we outline a sketch of how such mechanisms might work. Full details would be a matter for the specification group.
5.2.1. Checking for an opt-in/out mechanism

a. There could be a specific attribute, published within a namespace approved by the P3P specification but mentioned within
the XForms specification (alongside other proposed attributes such as the policy reference declaration), which expresses in a
machine readable way the fact that a check box or other form field is for expressing consent.

E.g. <xform:checkbox ref="YesIDo" P3P:consentfield="yes">

This would have the important advantage of providing a standard syntax useable by all form systems for expressing consent.
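As a rough illustration of how a user agent might use such an attribute, the sketch below scans form markup for controls flagged as consent fields. It is only a sketch under stated assumptions: both namespace URIs and the function names are hypothetical, since neither the P3P nor the XForms specification defines a consentfield attribute.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical namespace URIs: placeholders for this sketch only.
XFORM_NS = "http://example.org/hypothetical/xforms"
P3P_NS = "http://example.org/hypothetical/p3p-consent"

FORM = f"""
<form xmlns:xform="{XFORM_NS}" xmlns:P3P="{P3P_NS}">
  <xform:checkbox ref="YesIDo" P3P:consentfield="yes"/>
  <xform:checkbox ref="Newsletter"/>
  <xform:textbox ref="Email"/>
</form>
"""


def consent_fields(form_xml: str) -> List[str]:
    """Return the ref of every form control marked as a consent field."""
    root = ET.fromstring(form_xml)
    attr = f"{{{P3P_NS}}}consentfield"  # ElementTree's {namespace}name form
    return [el.get("ref") for el in root.iter() if el.get(attr) == "yes"]


def has_opt_in_mechanism(form_xml: str) -> bool:
    """True if the form offers at least one machine-readable consent field."""
    return bool(consent_fields(form_xml))


print(consent_fields(FORM))
```

An agent could compare the result of has_opt_in_mechanism with the opt-in claims made in the policy and warn the user when a policy claims opt-in but the form offers no consent field.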
5.2.2. Requesting signed consent

We considered the possibility of a mechanism for expressing consent by including, within a policy, an element specifying the
name and various other specifications for a hidden form field to be added to a POST operation, containing a signed statement,
as specified in the element.
However, this mechanism has several disadvantages:
− It is attached to the form, and not to the processing application. There is therefore no authoritative relationship to the
application which processes the data. For example, many web shops use third parties to process their forms. These third
party processors might find it difficult to control expressions of consent if those expressions were managed by another party.


− It is not ideal for the client to have to add POST fields to a form, as there is considerable opportunity for ambiguity. Also, it is
generally harder for a third party software vendor to alter the operation of an application (e.g. an API) to do this than, for
example, to alter the HTTP headers sent.
We therefore suggest that a mechanism could be provided for requesting and providing consent using HTTP headers, which
would also provide the option of asking for a signed consent.
In this case, an element would be added to the P3P policy similar to the following:

<DATA ref="user.home-info">
<CONSENTREQUEST method="httpheader" headername="consent1">
<DATAREQUIRED certificate="X.509" algorithmtype="RSA" minkeylength="128">I agree that
my data in this form will be published on the internet.
</DATAREQUIRED>
</CONSENTREQUEST>
</DATA>

CONSENTREQUEST could be inserted within a DATA element to state that the collection of this type of data requires the
consent of the user, and how this consent should be sent.
− "method" specifies how the consent should be sent, in this case using an HTTP header.
− "headername" specifies the name of the header which should contain the signature data.
− The DATAREQUIRED element contains the statement which must be signed to express consent. Its attributes carry the
requirements on the signature (certificate type, algorithm, minimum key length), to allow for flexibility in signature types.
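To make the flow concrete, the following sketch shows how a user agent might read such a consent request and build the corresponding HTTP header. It is a minimal sketch, not a definitive implementation: the element and attribute names follow the example above, the function names are hypothetical, and the signing step is a labelled placeholder because real RSA signing with an X.509 certificate needs a cryptographic library that is outside the scope of this illustration.

```python
import base64
import xml.etree.ElementTree as ET
from typing import Callable, Dict

POLICY_FRAGMENT = """
<DATA ref="user.home-info">
  <CONSENTREQUEST method="httpheader" headername="consent1">
    <DATAREQUIRED certificate="X.509" algorithmtype="RSA" minkeylength="128">I agree that
my data in this form will be published on the internet.</DATAREQUIRED>
  </CONSENTREQUEST>
</DATA>
"""


def build_consent_header(fragment: str, sign: Callable[[bytes], bytes]) -> Dict[str, str]:
    """Return the HTTP header a user agent would add to express consent."""
    request = ET.fromstring(fragment).find("CONSENTREQUEST")
    if request is None or request.get("method") != "httpheader":
        return {}
    required = request.find("DATAREQUIRED")
    if required is None or not required.text:
        return {}
    # Collapse whitespace so that both sides sign the same byte sequence.
    statement = " ".join(required.text.split())
    signature = sign(statement.encode("utf-8"))
    return {request.get("headername"): base64.b64encode(signature).decode("ascii")}


def placeholder_sign(data: bytes) -> bytes:
    """Stand-in only, NOT a signature: a real agent would sign with the user's key."""
    return b"UNSIGNED:" + data


headers = build_consent_header(POLICY_FRAGMENT, placeholder_sign)
```

Passing the signing routine in as a parameter keeps the policy parsing independent of whichever certificate and algorithm the DATAREQUIRED attributes ask for.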
5.2.3. Structure of message.

To be of any use, consent messages need to be stored in a structured way in the "back office" of the service provider. The
most important requirement for the "back office" is that, in the case of a dispute, the message can be linked to the data for
which it provides consent. This requirement, however, needs to be set against the possible loss of privacy involved should the
message be linked with a unique identifier.
Because of this latter consideration, it should be left up to the service provider to link the consent message with a unique
identifier binding it to the information, such that the possible privacy losses contained in such an identifier are appropriate to
the situation. For example, if the subject is willing for their entire information to be retained indefinitely, then a hash of all
or part of the information may be used. However, if they are not, then this is not appropriate, because such a hash could later
be used to perform data mining operations on sensitive information. In this case, a hash of some form of session id might be
more appropriate. Another solution is some form of escrowed key system, which could be used by a legal authority requiring
proof of consent to unlock the identifiers. This is overkill for most situations. In either case, the date of the consent
may be taken from the HTTP request headers.
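The two linking strategies described above can be sketched as follows. This is an illustration under assumptions: the record layout, the function names and the choice of SHA-256 are ours, not part of any P3P proposal.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsentRecord:
    link_id: str    # identifier binding the consent message to the data
    statement: str  # the consent statement that was agreed to
    date: str       # taken from the HTTP request headers


def link_id_from_data(data: bytes) -> str:
    """Hash of the submitted data: only suitable if the data may be retained."""
    return hashlib.sha256(data).hexdigest()


def link_id_from_session(session_id: str) -> str:
    """Hash of a session id: reveals nothing about the submitted data itself."""
    return hashlib.sha256(session_id.encode("utf-8")).hexdigest()


def make_record(session_id: str, statement: str, date: str) -> ConsentRecord:
    """Build a back-office record using the more privacy-preserving link."""
    return ConsentRecord(link_id_from_session(session_id), statement, date)
```

In a dispute, the service provider would recompute the hash from its own session records and match it against the stored link_id.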
One possibility for structuring of the messages themselves however is to express them according to the proposed OWL P3P
semantic model. For example, RDF statements could be constructed to formally express statements such as
"I am a data subject and I agree that the data objects transferred in this request may be transferred to third parties."
(ontological terms underlined)
If such a consent statement were expressed using RDF statements, it would carry more legal weight through its unambiguous
and transparent semantics, and would make management of different consent statements easier by making them easily
processable by software agents.

5.3. Repudiatability of policies.


Problem:
A principal problem for P3P is that if a company’s practices contravene its stated privacy policy, there is little technical
framework to prove that the company made the statements which may have existed on its server at a given time. In other
words, it is too easy for a company to repudiate its policy.
While P3P does increase the level of trust felt by consumers by providing more transparent and unambiguous information, it
does not provide any assurance as to the authenticity and integrity of this information.
Proposed solution:
XML signatures [13] offer an ideal solution to the problem of making a policy at a given URI non-repudiatable. XML signatures
provide the opportunity to introduce assertions such as "X assures the content of this document" into the semantics of signed
material. Also since P3P is entirely expressed in XML, it is pragmatic to use the XML version of asymmetric digital signatures to
provide this assurance. The following section defines in detail how this might be achieved.

Joseph Reagle of W3C has already gone some way towards outlining the detail of this solution. We examine and build upon
the proposals of Reagle [14] for the inclusion of XML digitally signed [13] policies within P3P. As Reagle has already set out
most of the mechanisms for achieving this, we make only three minor additions to the technical specification. Our main aim is
to look at possible technical problems with the use of the XML signature extension, and their solutions.

XML Digitally Signed Policies.



P3P-enabled servers could provide an XML digital signature as part of their policy, or as a separate
document referenced within the policy. This is easily accomplished provided that the correct syntax is incorporated into the
P3P specification, as shown by Reagle. Reagle's example should, however, be modified in the following ways.
a) Add an X.509 certificate bag to provide full non-repudiatability.
b) Include a time stamp to comply with EU regulations.
c) Require an additional signature over the PRF, which details which resources the policy applies to. Any signature that does
not assure this information loses much of its legal significance. Note also that this signature cannot be bundled in with the
policy signature, because several PRFs may refer to the same policy. Furthermore, the person responsible for producing policy
signatures may not even know the location of the PRFs referring to the policy (in the case of a standard issue policy used by
third parties). We suggest the addition of a "DISPUTES" element to the PRF, identical to the DISPUTES element in the P3P policy,
which allows the specification of a signature URI using the validation attribute.

The P3P process has two main components on the server: an XML policy and an XML PRF, which binds policies to resources.
Semantically, therefore, a P3P statement is a combination of the policy and the PRF, which binds a policy to a resource at a
given time. The PRF, like the policy, has a validity timestamp.
However, Reagle’s P3P extension does not include any binding signature for the P3P PRF. This is an oversight, because
omitting a signature that binds the policy to the resource it applies to negates the non-repudiatability of the statements being
made. The importance of the PRF cannot be ignored. We therefore suggest that a signature also be provided for any PRFs
used. The example signature in Annexe 3 shows the extensions needed for a signature to be bound to a PRF. It is also
worth mentioning the possibility of an additional signature over the human readable policy, which could be achieved by the
same mechanism. There has been some discussion of the fact that there may be discrepancies between the human readable
and the XML versions of privacy policies. Such a signature would ensure a commitment to consistency between both versions.
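The point that the policy and the PRF each need their own digest can be sketched as follows. The sketch assumes SHA-1 with base64 encoding, as used by the DigestValue elements of the sample signature in Annexe 3; it deliberately skips XML canonicalization, which a real XML Signature implementation must apply before digesting, and the function names are hypothetical.

```python
import base64
import hashlib
from typing import Dict


def digest_value(document: bytes) -> str:
    """Base64-encoded SHA-1 digest, the form used in a DigestValue element."""
    return base64.b64encode(hashlib.sha1(document).digest()).decode("ascii")


def references(documents: Dict[str, bytes]) -> Dict[str, str]:
    """Map each signed URI (policy, PRF, ...) to its digest value."""
    return {uri: digest_value(body) for uri, body in documents.items()}


def unchanged(uri: str, body: bytes, signed: Dict[str, str]) -> bool:
    """Check a retrieved document against the digest recorded at signing time."""
    return signed.get(uri) == digest_value(body)
```

If only the policy were covered, a site could later alter its PRF to claim that the policy never applied to the disputed resource, and the check above would not detect the change.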

5.4. Expression of security measures


Problem description:
The European directive specifies that adequate security measures should be taken to protect data (95/46/EC Article 17).
However, there is no means within P3P to express the level of security around personal data. This may also be seen as an
issue of vocabulary, but we have included it in this section because it specifically relates to the EU directives. The
reasons for the omission are clear. State-of-the-art security measures are constantly changing. Furthermore, it is very difficult to
define security measures in any meaningful way. It might, for example, be stated that a database is password protected, but
the password might in reality be "abc".
Proposed Solution:
There are several candidate schemas already in existence for classifying and describing security measures. It may be that
these are able to provide some solution to this problem if incorporated within the P3P taxonomy. However, as mentioned
above, any security taxonomy will either be too general to be useful, or will be out of date within a short space of time.
An additional solution, which may solve this problem, is therefore to provide the opportunity for third party security seals
within policies. P3P already provides a placeholder for data protection seals within the DISPUTES element. However, these do
not relate to security measures, only to data practices. The specific provision of a security seal placeholder would allow for
validation by a third party, which would not constrain expressiveness to a taxonomy of security based around a changing and
meaningless set of parameters. Instead, it would provide proof of a flexible and intelligent audit carried out by reputable
individuals. It may also include a datestamp as an indication of the "freshness" of the seal. The expense of providing a
meaningful seal may be a problem in itself.
Finally, we suggest that the incorporation of a framework for machine understandable audit trails (see section…) may also
provide some solution to this problem, as it would provide the possibility for rapid and accurate assessment of security
policies.

5.5. Identity Management and management of data flows


Problem:
P3P 1.0 makes no recommendations on how to link privacy policies to data transfer events, or on how to make decisions around
such events. The W3C APPEL note, which we look at later in this document, makes recommendations on how to make such
decisions. However, these recommendations are limited to only three basic behaviors. What is needed to make P3P into a
really powerful tool within e-business is the ability to release data selectively based on privacy policies and the agent's level of
trust in them.
One of the main problems existing on the Internet today is the amount of time and effort it takes to assess which bits of
one's identity to release and which to withhold. It would therefore be useful to extend P3P so that it could automate such
decisions in a granular way.

To look at a specific example, the mobile device community has expressed interest in linking P3P with CC/PP (Composite
Capabilities/Preference Profiles). In this case, it would be extremely powerful if a P3P-enabled agent with a rule base were able to
reveal only selected device capabilities, basing its decision on the privacy policy and on which capabilities the service might need
to know. Most client applications would benefit from such a capability, if it were made easy and robust. When filling in forms,
users generally reveal only what is necessary, and if they do not trust the entity with the information which it claims to
require, they will not go ahead with the data transfer at all.


It should be mentioned that the ability to selectively release data is strongly connected with identity management, and
therefore any developments in this area should be linked to research in that field.
Proposed solution:
As this is an area where extensive further research is required, rather than describing a detailed solution, we will just outline
the technical requirements of such a system, and briefly suggest their likely solution.

Technical requirements:
1. An ontology expressive enough to capture the various data types which might make up a composite identity (selective
release of PII). This has already been discussed elsewhere in the paper.
2. Ways of linking that ontology to the data requests made through client-side specifications such as XForms and CC/PP [10]. In
other words, there should be a common ontology between these specifications and P3P, or an effective translation between them.
3. A rule language and user interface expressive enough to allow selective release of information. This would most likely
involve the definition of identities, in other words groups of information types, using a visual representation of a PII ontology,
and their linking to certain patterns recognized in policies. The identities would therefore become ad hoc classes within the
ontology.
4. A clear specification, for each page, of what kind of information is being requested, and which items are optional and which
required. Without this, the engine cannot decide what information to release in a particular case. As it stands, it would be very
difficult for P3P to perform this function alone, because P3P policies are necessarily generalized between different resources and
semantically they do not give any information about what is required on a particular page.
The underlying semantic structure of P3P policies is:

"whatever the resource this policy is applied to, if you give us information x, we will do y"

and not:

"please send us information x for resource y"

What is needed in this new scenario is both of the above semantics. We suggest therefore that if the second semantic is
provided by (e.g.) the XForm and linked in a granular way with P3P policies, this provides enough information for an agent to
make a decision. For example, a particular XForm might be able to express the semantic:

"I require your email - this email address will be processed according to P3P policy Policy1, which can be found by means x."

Policy1 will then express the semantic:

"Email addresses will be given to 3rd parties for marketing purposes."

On the part of P3P, this would simply require the capability to associate P3P policies at a more granular level than that of
resources. In particular, in the case of XForms, it requires P3P policies to be associated with individual form fields. If a more
general specification could be produced allowing association of policies with more diverse entities, this would open up the way
for the application of P3P in other similar settings such as CC/PP, IRC (chat), etc.
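Once each form field carries both semantics (what is requested, and what will be done with it), an agent's release decision can be sketched as below. This is a minimal sketch under assumptions: the data structure, the function name and the purpose labels are illustrative only and are not drawn from the P3P vocabulary or from APPEL.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Set


@dataclass(frozen=True)
class FieldRequest:
    name: str                 # data type requested by the form field
    required: bool            # whether the form marks the field as required
    purposes: FrozenSet[str]  # purposes stated by the policy linked to the field


def decide_release(requests: List[FieldRequest],
                   allowed: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return the fields to release, or None if the transfer should be abandoned.

    `allowed` maps each data type to the purposes the user accepts for it.
    """
    released = []
    for req in requests:
        acceptable = req.name in allowed and req.purposes <= allowed[req.name]
        if acceptable:
            released.append(req.name)
        elif req.required:
            # A required field has unacceptable terms: do not go ahead at all.
            return None
    return released


prefs = {"email": {"current"}, "phone": {"current"}}
form = [
    FieldRequest("email", True, frozenset({"current"})),
    FieldRequest("phone", False, frozenset({"marketing"})),
]
result = decide_release(form, prefs)
```

The rule mirrors the behavior of users described earlier: optional fields with unacceptable terms are simply withheld, while an unacceptable required field stops the whole transfer.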

5.6. Non-enforceability - automated audit trail systems.


Problem:
P3P has sometimes been presented as an aid to the enforcement of data protection principles. However, discussion among
data protection professionals brings up the obvious obstacle that P3P in its present form can only provide statements of
companies’ intentions about their data practices. These statements have no necessary connection with actual practices, which
effectively makes P3P powerless as an enforcement tool. In other words, P3P cannot guarantee that the promise matches the
practice.

There is of course the inverse problem that companies may abide by the law to the letter, and yet not publish a P3P policy.
Therefore P3P can guarantee neither that a company is within the law, nor that it is not.
Solution:
One solution to the first problem of non-enforceability, which is still somewhat within the realms of fantasy but is not
inconceivable, is to use the taxonomy of P3P, perhaps somewhat extended, within a system for automated audit trails.
This solution can be compared to the one adopted by restaurants which wish to make clients trust their hygiene practices:
they put the kitchen in full view of the customers. In the same way, given a sufficiently standardized system, provided
perhaps within P3P, servers could record their data processing events and security related events in such a way that
authorized auditing agents could assess them in a measurable way against the regulatory standards, and perhaps against the
additional standards of trust seals. The full details of such a system are beyond the scope of this paper. However, we present
below a scenario which helps to view this set of extensions in a concrete way, and from there to extract some requirements for P3P
2.0.


Audit Trail Scenario:


1. User U submits their email address to company x1. This event is logged as: "Data submission event, data type
emailaddress: stored in database D"
2. Query of database D1 by software agent x3.
At this point, the data can take one of 2 paths, which must be clearly distinguished:
A. Data viewed by a moral agent, a human individual with certain legal responsibilities and perhaps risks, outlined in his
profile, p.
   1. Profile Pm is of a legal moral agent M who has been allowed to view the data.
   2. Pm may contain information such as links to signed NDAs, commitments made by M, perhaps a trust profile, etc.
   3. Trail records that fields x4, x5, x6 for subject x7 are displayed to M.
B. Data passed to another application.
   1. Agent profile Pa contains
      1.1. A set of commitments entered into by that agent as described in a P3P policy.
      1.2. A pointer to how to find the audit trail left by that agent (anonymized versions may even be publicly available).

An important feature of such a system should be that any agent system A1 passing information to another agent system A2
should have a way of knowing whether A2 is also committed to saving audit trail information, and where and under what
circumstances, this information could be accessed.

The crucial feature of such a system is that it must allow an information trail to be stipulated and subsequently followed, in
order to track real privacy practices rather than privacy promises such as those contained in P3P policies.
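The scenario can be made more tangible with a small sketch of what structured audit events might look like, and of two queries an auditing agent could run over them. Everything here is an assumption made for illustration: the event kinds, field names and functions are hypothetical and do not correspond to any existing audit trail standard.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class AuditEvent:
    kind: str                      # "submission", "query", "view" or "transfer"
    actor: str                     # company, human viewer or software agent
    data_types: Tuple[str, ...]    # e.g. ("emailaddress",)
    subject: str                   # the data subject concerned
    profile: Optional[str] = None  # pointer to the profile of the viewer or agent


def trail_for(events: List[AuditEvent], subject: str) -> List[AuditEvent]:
    """Every recorded event touching one data subject, in logged order."""
    return [e for e in events if e.subject == subject]


def untraceable_transfers(events: List[AuditEvent]) -> List[AuditEvent]:
    """Transfers to agents with no profile, so no known audit trail commitment."""
    return [e for e in events if e.kind == "transfer" and e.profile is None]


log = [
    AuditEvent("submission", "x1", ("emailaddress",), "U"),
    AuditEvent("view", "M", ("emailaddress",), "U", profile="Pm"),
    AuditEvent("transfer", "A2", ("emailaddress",), "U"),
    AuditEvent("transfer", "A3", ("emailaddress",), "V", profile="Pa"),
]
```

A transfer without a profile is exactly the case the text warns about: agent A1 has passed data to an agent whose audit commitments it cannot know.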

What this requires of P3P is

a. A placeholder for description of the audit trail:
   - commitments
   - access conditions and locations
   - seals


b. An improved recipient taxonomy to allow expression of privilege profiles p as in A.2.

c. A taxonomy for creating audit trail logs (for example, whether data was passed in encrypted form or not, and whether it was
placed in a secure environment).
d. If competing systems exist, a taxonomy for distinguishing between them. For example, a system should be able to understand
the meaning of "if you release this information, it will be passed from an environment which uses audit trail system x8 to a
system which uses audit trail system x9".

Annexe 3. Sample XML Digital Signature


Sample XML signature of P3P policy. Note that a signature for the PRF would be identical except that the node marked with ***'s would refer to a policy
reference file.

<Signature Id="Signature1" xmlns="http://www.w3.org/2000/09/xmldsig#">


<SignedInfo>
<CanonicalizationMethod Algorithm="http://www.w3.org/TR/2000/WD-xml-c14n-20000907"/>
<SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#dsa-sha1"/>
<Reference URI="http://www.example.org/p3p.xml">
<Transforms>
<Transform Algorithm="http://www.w3.org/TR/2000/WD-xml-c14n-20000907"/>
</Transforms>
<DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
<DigestValue>j6lwx3rvEPO0vKtMup4NbeVu8nk=</DigestValue>
</Reference>
<Reference URI="#Assurance1" Type="http://www.w3.org/2000/09/xmldsig#SignatureProperties">
<DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
<DigestValue>1342=-0KKAASIC!=123Adxdf</DigestValue>
</Reference>
<!-- Reference over signature policy *******or policy reference file if for prf **********-->
<Reference URI="http://www.example.org/signaturePolicy.xml">
<DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
<DigestValue>1234x3rvEPO0vKtMup4NbeVu8nk=</DigestValue>
</Reference>
<!-- Reference over Time Stamp to comply with EU directive -->
<Reference URI="#TimeStamp1" Type="http://www.w3.org/2000/09/xmldsig#SignatureProperties">
<DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
<DigestValue>k3453rvEPO0vKtMup4NbeVu8nk=</DigestValue>
</Reference>
</SignedInfo>
<SignatureValue>MC0CFFrVLtRlk=...</SignatureValue>
<KeyInfo>
<X509Data>
<X509IssuerSerial>
<X509IssuerName>CN=Smith John, OU=TRL, O=IBM, L=Example, ST=Ex-ample,
C=</X509IssuerName>
<X509SerialNumber>
12345678
</X509SerialNumber>
</X509IssuerSerial>
<X509SKI>
31d97bd7
</X509SKI>
</X509Data>

<X509Data><!-- single pointer to certificate-B -->

<X509SubjectName>Subject of Certificate B</X509SubjectName>

</X509Data>

<X509Data> <!-- certificate chain -->


<X509Certificate>MIICXTCCA..</X509Certificate>

<X509Certificate>MIICPzCCA...</X509Certificate>

<X509Certificate>MIICSTCCA...</X509Certificate>

</X509Data>

</KeyInfo>
<Object>
<SignatureProperties>
<SignatureProperty Id="Assurance1" Target="#Signature1"
xmlns="http://www.w3.org/2000/09/xmldsig#">
<Assures Policy="http://www.example.org/p3p.xml" xmlns="http://www.w3.org/2001/02/xmldsig-p3p-profile"/>
</SignatureProperty>
<SignatureProperty Id="TimeStamp1" Target="#Signature1">
<timestamp xmlns="http://www.ietf.org/rfcXXXX.txt">
<date>19990908</date>
<time>14:34:34:34</time>
</timestamp>
</SignatureProperty>
</SignatureProperties>
</Object>
</Signature>


REFERENCES
[1] Art 29 – Data Protection Working party: Recommendation 2/2001 on certain minimum requirements for collecting personal
data on-line in the European Union; Opinion on P3P, 16 June 1998.

[2] "A fully compliant research implementation of the P3P standard for privacy protection: experiences and
recommendations", Giles Hogben, Tom Jackson, Marc Wilikens.
ESORICS 2002, Zurich, 14-16 October 2002. Springer Verlag.

[3] JRC P3P demonstrator project: Evaluation meeting report. See http://p3p.jrc.it/presentations/P3Pminutes.pdf
for evaluation report and attendees.

[4] Xforms W3C specification http://www.w3.org/MarkUp/Forms/

[5] Resource Description Framework, W3C specification, http://www.w3.org/RDF/

[6] Web Ontology Language, W3C specification http://www.w3.org/2001/sw/WebOnt/

[7] Ackerman, M. S., Cranor, L., and Reagle, J. (1999). Privacy in E-Commerce: Examining User Scenarios and Privacy
Preferences. Proceedings of the ACM Conference in Electronic Commerce : 1-8. New York: ACM Press.

[8] A P3P Preference Exchange Language 1.0 (APPEL1.0), W3C Working Draft 15 April 2002,
http://www.w3.org/TR/P3P-preferences/

[9] Ontology editor see http://oiled.man.ac.uk/

[10] W3C specification, see http://www.w3.org/Mobile/CCPP/

[11] W3C specification, see http://www.w3.org/TR/2002/WD-xpath20-20020816/

[12] W3C specification, see http://www.w3.org/TR/xpath

[13] W3C specification, see http://www.w3.org/Signature/

[14] A P3P Assurance Signature Profile, W3C Note 2 February 2001,
http://www.w3.org/TR/xmldsig-p3p-profile/

[15] Information Technology – Code of practice for information security management. BS ISO/IEC 7799-1:2000. British
Standards Institution. 2000. ISBN 0 580 36958 7

[16] The P3P Base Data Schema, See http://www.w3.org/TR/P3P/#basedataxml
