Problems With P3P 1.0 and Proposals For P3P 2.0
Position paper for "Future of P3P" workshop, Dulles, Virginia, USA, 12-13 November 2002.
This paper represents the views of the Joint Research Center of the European Commission and has been broadly endorsed by
the Internet Task Force of the Article 29 working group of the European Commission.
Table of Contents
1. Introduction
2. Background
3. Short Term Issues
3.1 Compliance with DPD principles
3.1.1. Cookie Management
3.1.2. Point of provision of purpose specification
3.1.3. Consent issues
3.1.4. Geographical Semantics
3.2 User interface issues
3.3 Short term vocabulary issues
4. Long Term Issues Summary
Annexe 1 Background
1. Introduction
This paper discusses problem areas in P3P 1.0 and proposes possible solutions. For completeness, it contains a full list of what
we regard as the most important problem areas. It covers: compliance with EU data protection principles, user interface
issues and vocabulary issues, management of data flows and identities, APPEL issues and non-repudiatability.
In keeping with the theme of the workshop, we have focused on the issues which lend themselves to a more immediate
solution. Since awareness of the long term issues is crucial to the planning process for short term issues, we include a
summary of longer term issues with a full description in an annexe. We plan to present these in detail at a workshop planned
for spring 2003 in Europe.
The long term issues are described in full in an annexe. We have nevertheless included a short summary of each issue in the
main section of the paper, because many of these issues need to be taken into account even when planning work on the short
term issues.
P3P technology and policy issues from a EU perspective
Human readable information on data processing purposes should be conveyed to users prior to any act of data collection. P3P 1.0
does not provide mechanisms which adequately satisfy EU requirements in this respect. Although P3P policies can be used to
inform the user of privacy practices in advance, the default implementation (for example in IE6), and the sense of the
specification, is for the user to be informed ex post facto.
Proposed solution:
The solution to this problem is to emphasize in the P3P specification, and in any implementation guides, that in order to
satisfy European law for acts of data collection involving PII, display of human readable elements must not be ex post
facto. This raises user interface issues, because it is potentially very invasive to present all the information on data collection
purpose prior to every collection event. We believe, however, that these issues can be solved and we present a potential
solution in section 5 below.
<ENTITY>
<DATA-GROUP>
<DATA ref="#business.name">CatalogExample</DATA>
<DATA ref="#business.contact-info.postal.country">USA</DATA>
</DATA-GROUP>
</ENTITY>
In conjunction with a list of countries and information about their legal systems, this could allow the agent to determine
whether the controller was outside EU jurisdiction. If the same semantics could be included within <RECIPIENTS/> elements,
then this would offer a quick fix for this problem. It is clearly not ideal, because it requires, but does not provide, a
standardization of the CDATA content of the DATA elements. It would also not be able to tell us, for example, whether a
country was within a supranational area of jurisdiction such as the EU.
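As a rough illustration of the quick fix described above, the following Python sketch reads the country value from an ENTITY
element and looks it up in a table of jurisdictions. The table EU_COUNTRY_NAMES, the function names and the free-text
matching are assumptions made for illustration only. P3P 1.0 does not standardize the country value, which is exactly the
weakness noted above.

```python
import xml.etree.ElementTree as ET

# Hypothetical and deliberately incomplete lookup table. A real agent would
# need a maintained list plus information about each country's legal system.
EU_COUNTRY_NAMES = {"france", "germany", "italy", "spain"}

ENTITY_XML = """
<ENTITY>
  <DATA-GROUP>
    <DATA ref="#business.name">CatalogExample</DATA>
    <DATA ref="#business.contact-info.postal.country">USA</DATA>
  </DATA-GROUP>
</ENTITY>
"""

COUNTRY_REF = "#business.contact-info.postal.country"


def controller_country(entity_xml):
    """Return the free-text country of the data controller, or None."""
    root = ET.fromstring(entity_xml)
    for data in root.iter("DATA"):
        if data.get("ref") == COUNTRY_REF:
            return (data.text or "").strip()
    return None


def outside_eu(entity_xml):
    """Return True or False when a country is declared, None when it is not."""
    country = controller_country(entity_xml)
    if not country:
        return None
    return country.lower() not in EU_COUNTRY_NAMES


print(controller_country(ENTITY_XML), outside_eu(ENTITY_XML))
```

Because the lookup relies on an exact match of free text, a policy declaring "United States" or "US" would need its own
table entry, which shows why a standardized vocabulary is required.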
This solution, however, makes it difficult to determine jurisdiction, although it can be deduced to a certain extent from any
country information contained. It would not, for example, be able to express the fact that a piece of information had passed
within a US "safe harbour" zone. We suggest therefore that in the longer term, P3P should move to an RDF [5] + OWL [6]
ontology solution. As mentioned below, regional and jurisdictional ontologies could then be plugged in or, as may be
necessary in the case of the jurisdictional ontology, created from scratch. These would then provide the semantic richness to
make this kind of distinction, as well as other useful distinctions which would follow from such information. For example a
taxonomy of jurisdictions and a dynamic ranking system which can be referred to in order to determine the relative levels of
privacy protection would address the unsolved problems in expressing the EU directives. In the longer term, a specification
should not show such obvious regional bias as the "EUleveljurisdiction" attribute mentioned above. Therefore a more neutral
means of expressing the same thing must be developed.
P3P 1.0 makes no mention of user interface issues, neither does the P3P implementation guide. There are good reasons for
this, as it is generally outside of the scope of a specification to specify how it will be implemented. However, there are
considerable problems to be overcome in this area and there is considerable interplay between the specification and its
implementations. For example, P3P 1.0 provides several attributes to enable a user interface to display a human readable
distillation of a policy. It is therefore equally crucial for subsequent versions of P3P to make concessions to the user interface,
which take into account likely solutions. To this end, we would make the general remark that it is important for the
specification group to work alongside implementers and researchers on such issues.
We now describe some of the specific challenges to be overcome and some possible solutions:
An interchangeable format for preference rules is very important, as it allows data protection professionals to disseminate
minimum guidelines and default privacy protection levels to users who have neither the time nor the knowledge to create
them themselves.
The user interface for this would be much improved by a move of P3P to a formalized ontology. If P3P were expressed within a
formal ontology, tools for visualizing this could be used within user interfaces, and interfaces which are natural to users
could be translated into RDF-based query languages for matching purposes. Ontology interfaces such as OILED [9] allow
natural representations of conceptual frameworks to be made.
Interfaces should be sharply divided into sections for expert users and users who can choose between a small number of
predetermined preference sets. However, it should be possible for users to make more advanced choices if they wish.
P3P allows for a range of levels of feedback into the user experience - from a simple red light system as in Internet Explorer
6.0 - to a full page explanation of what happened in the evaluation in the resource. Users should be given a choice between
levels of information they want to receive. This is achieved to a certain extent in existing implementations, however we feel
that it could be improved upon if P3P were incorporated into a "tab" system such as is used to provide history information in
IE. A “tab” system is a collapsible frame on the left hand side of every page, which can provide real-time information relating
to pages displayed opposite, or other information, independent but simultaneous to that of the page being browsed. Netscape
now makes a lot of use of a system of "tabs" to provide information such as bookmarks, news, history. This system is not
currently used in any P3P system but would seem to be an ideal mode of providing feedback. The difference between this and
providing information as IE and Netscape currently do is that if the user wishes, the tab can be a source of information which
provides instantly available fresh information on the resources being accessed, simultaneously with page viewing. Such a tab
could be configured, rather like the IE search tab, to provide different levels of information.
The JRC's proxy implementation makes use of a similar system, which was implemented after experimentation with other
systems. It uses an expandable floating privacy tab, which expands on mouseover and provides information on pages which
have been evaluated in the current session. As it is part of a proxy server implementation, it cannot alter the browser
interface and is inserted as a DHTML tab. However, it has many of the advantages of a tab
It is defined in a very cumbersome way (by a P3P specific pointer mechanism within a flat definition scheme) - this is highly
problematic for implementations and could easily be rectified by a move to a formal ontology, or at least to a standard XML
schema syntax using REFID's for multiple subclassing.
It has not been subjected to rigorous use case scenarios. For example, the following category description from the base data
schema is the closest we can get to describing the http header information.
Element: http | Category: Navigation and Click-stream Data, Computer Information | Structure: httpinfo | Short display name: HTTP Protocol Information
However, http header information cannot be described by any of the terms in the phrase "Navigation and Click-stream
Data, Computer Information".
It does not have a well-defined semantics. It is not clear whether the items in the schema refer to categories, or to data
objects. For example in discussions with the working group, we have seen the data schema term "email" described as
referring to an email address, whilst we would argue that it refers to a class into which may be placed all data objects, which
can be described as "email". There is an important difference because the term, "user" clearly does not refer to a user, but to
his data. This needs to be rectified within the context of a more formal ontology development process, as described below.
Although it does provide a P3P-specific customization route, because of the non-standard syntax, this does not allow
applications to leverage existing ontological frameworks and thus widen the descriptive power.
3.3.2. As mentioned above, the recipient taxonomy stands at the central point of any privacy taxonomy and as such it is
not sufficiently descriptive in P3P 1.0. Specifically, it needs to be altered to be able to express the requirements of the
European directive. Furthermore, as has been noted elsewhere in this paper, there are more fundamental requirements, which
need to be addressed in order for the recipient taxonomy to be adequate. These include the ability to attach security and
jurisdictional taxonomies as sub-trees of recipient instances.
3.3.3. The purpose specification taxonomy needs to be subjected to more rigorous use case scenarios. The evaluation
group [3] felt that given the purpose specification of data collection is one of the most important elements in the taxonomy,
the 12 cases provided did not cover what is required.
Consider the following sentence from Doubleclick's human readable policy.
"DoubleClick does use information about your browser and Web surfing to determine which ads to show your browser."
P3P would cover the preceding sentence with the Element <customization/> and possibly <individual-decision/> and
<tailoring/>- however it is not clear from any of these, and it cannot be expressed, that it is for the purposes of advertising
third party products. This would however be something of concern to many users.
<appel:RULE behavior="block"><p3p:POLICY>
<p3p:STATEMENT><p3p:DATA-GROUP appel:connective="non-and">
<p3p:DATA ref="#dynamic.clickstream.clientip.fullip"/>
<p3p:DATA ref="#dynamic.http.useragent"/>
</p3p:DATA-GROUP></p3p:STATEMENT>
</p3p:POLICY></appel:RULE>
This RULE will cause a block behavior for the following web site policy (only relevant parts quoted),
<POLICY>
<STATEMENT>
<DATA-GROUP appel:connective="and">
<DATA ref="#dynamic.clickstream.clientip.fullip"/>
<DATA ref="#dynamic.http.useragent"/>
</DATA-GROUP>
</STATEMENT>
</POLICY>
but not for the following policy, which has the same meaning:
<POLICY>
<STATEMENT>
<DATA-GROUP>
<DATA ref="#dynamic.clickstream.clientip.fullip"/>
</DATA-GROUP>
</STATEMENT>
<STATEMENT>
<DATA-GROUP>
<DATA ref="#dynamic.http.useragent"/>
</DATA-GROUP>
</STATEMENT>
</POLICY>
Note the presence of the "non-and" connective, which means - "only if not all sub-elements in the rule are present in the sub-
elements of the matched element in the policy". This is true for the first policy snippet but not the second, which given that
they have the same meaning is clearly unacceptable. We will look at solutions which address this problem below.
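The structural sensitivity described above can be reproduced with a much simplified check. The Python sketch below is not
the APPEL matching algorithm. It only asks, for each of the two policy shapes, whether every DATA reference named in the
rule occurs inside a single DATA-GROUP, and then whether it occurs anywhere in the policy. All function names are
illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# The two DATA references named in the rule above.
RULE_REFS = {
    "#dynamic.clickstream.clientip.fullip",
    "#dynamic.http.useragent",
}

POLICY_ONE_GROUP = """
<POLICY><STATEMENT><DATA-GROUP>
  <DATA ref="#dynamic.clickstream.clientip.fullip"/>
  <DATA ref="#dynamic.http.useragent"/>
</DATA-GROUP></STATEMENT></POLICY>
"""

POLICY_TWO_STATEMENTS = """
<POLICY>
  <STATEMENT><DATA-GROUP>
    <DATA ref="#dynamic.clickstream.clientip.fullip"/>
  </DATA-GROUP></STATEMENT>
  <STATEMENT><DATA-GROUP>
    <DATA ref="#dynamic.http.useragent"/>
  </DATA-GROUP></STATEMENT>
</POLICY>
"""


def refs_per_group(policy_xml):
    """Return one set of DATA refs for each DATA-GROUP in the policy."""
    root = ET.fromstring(policy_xml)
    return [{d.get("ref") for d in group.iter("DATA")}
            for group in root.iter("DATA-GROUP")]


def all_in_one_group(policy_xml):
    """Structural test: does a single DATA-GROUP hold every rule ref?"""
    return any(RULE_REFS <= refs for refs in refs_per_group(policy_xml))


def all_in_policy(policy_xml):
    """Flattened test: does the policy as a whole hold every rule ref?"""
    return RULE_REFS <= set().union(*refs_per_group(policy_xml))


for name, xml_text in [("one group", POLICY_ONE_GROUP),
                       ("two statements", POLICY_TWO_STATEMENTS)]:
    print(name, all_in_one_group(xml_text), all_in_policy(xml_text))
```

The per-group test gives different answers for the two policies, while the flattened test treats them alike. Any connective
evaluated per element inherits the first behaviour.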
Proposed solutions:
We have already noted the benefits of moving APPEL to a version based on an OWL ontology of P3P, namely
improvements in visualization techniques and reasoning. Given the work involved, this may however be considered a long
term objective.
A more immediate solution would be to use a standard query language for the condition matching part of APPEL.
Instead of using APPEL's somewhat quirky connective system and recursive matching algorithm, the rule condition could
be specified by an XPath [12] query (or, by the time it becomes relevant, XPath 2.0 [11]). These query languages are
designed to match arbitrary node sets with high efficiency. They have the advantage that developers are familiar with them
and efficient algorithms exist to execute the queries. As it has become very clear that APPEL will not be written by anyone
other than developers, or ordinary users working through a GUI, this is clearly the best approach.
E.g. a rule in this format, which would solve the above ambiguity problem, would be:
<appel:RULE behavior="block" prompt="yes" promptmsg="Rule found policy using your home info beyond current purpose">
<appel:MATCHQUERY query=
"//DATA[not(contains(@ref,'dynamic.clickstream.clientip.fullip') or contains(@ref,'dynamic.http.useragent'))]"
querylanguage="XPATH"/>
</appel:RULE>
It should be noted that the recent issue of the XPATH 2.0 [11] specification, which provides an even more powerful matching
language, makes this an even more compelling solution.
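To illustrate why a path-based condition is insensitive to how a policy groups its DATA elements, the following Python sketch
evaluates two path queries against both policy shapes discussed earlier. It is a sketch under stated assumptions: the Python
standard library implements only a subset of XPath 1.0, without functions such as contains(), so exact-match predicates are
used, and the condition tested is the simple positive one (both references present) rather than the negated query shown above.
The names QUERIES and query_matches are hypothetical.

```python
import xml.etree.ElementTree as ET

# Exact-match predicates stand in for substring tests, because ElementTree
# supports only a limited subset of XPath 1.0.
QUERIES = (
    ".//DATA[@ref='#dynamic.clickstream.clientip.fullip']",
    ".//DATA[@ref='#dynamic.http.useragent']",
)

GROUPED = (
    "<POLICY><STATEMENT><DATA-GROUP>"
    "<DATA ref='#dynamic.clickstream.clientip.fullip'/>"
    "<DATA ref='#dynamic.http.useragent'/>"
    "</DATA-GROUP></STATEMENT></POLICY>"
)

SPLIT = (
    "<POLICY>"
    "<STATEMENT><DATA-GROUP>"
    "<DATA ref='#dynamic.clickstream.clientip.fullip'/>"
    "</DATA-GROUP></STATEMENT>"
    "<STATEMENT><DATA-GROUP>"
    "<DATA ref='#dynamic.http.useragent'/>"
    "</DATA-GROUP></STATEMENT>"
    "</POLICY>"
)


def query_matches(policy_xml, queries=QUERIES):
    """Return True when every query selects at least one node in the policy."""
    root = ET.fromstring(policy_xml)
    return all(root.find(query) is not None for query in queries)


print(query_matches(GROUPED), query_matches(SPLIT))
```

Both policy shapes give the same answer, because the queries address DATA elements wherever they occur in the tree.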
Annexe 1. Background
The findings of this paper are based on the following:
1. Research into P3P and a full participation in the development process of the standard, including the development of a
reference implementation (see http://p3p.jrc.it), which was the first (and until now the only) implementation, which fully
complies with the P3P1.0 April 2002 specification. It consists of an open source Java User Agent and a model e-commerce
site. The agent was built specifically with the intention of demonstrating and evaluating P3P from a research perspective. The
architecture was designed with the following objectives:
• Ease of extension
• Open source and modular to allow others to experiment with it.
• Quick development time
• Browser independence.
2. A meeting of privacy experts held on May 27th 2002. The meeting included a demonstration and covered concerns and
issues, many of which are relevant to this paper. A report [3] was published with the findings of this meeting.
3. Research published in a peer-reviewed publication, "A Fully Compliant Research Implementation of the P3P Standard for
Privacy Protection". This paper will be published at the European Symposium on Research in Computer Security (ESORICS)
outlining our most important findings to a high level of detail. [2]
4. A special meeting of the Internet Task Force of the Art 29 WG on September 23rd 2002, DG Markt, Brussels.
Proposed Solution:
Here we outline a sketch of how such mechanisms might work. Full details would be a matter for the specification group.
5.2.1. Checking for an opt-in/out mechanism
a. There could be a specific attribute published within a namespace approved by the P3P specification, but mentioned within
the XForms specification (alongside other proposed attributes such as the policy reference declaration), which expresses in a
machine readable way the fact that a check box or other form field is for expressing consent.
This would have the important advantage of providing a standard syntax useable by all form systems for expressing consent.
5.2.2. Requesting signed consent
We considered the possibility of a mechanism for expressing consent by including, within a policy, an element specifying the
name and various other specifications for a hidden form field to be added to a POST operation, containing a signed statement,
as specified in the element.
However, this mechanism has several disadvantages:
− It is attached to the form, and not the processing application. There is therefore no authoritative relationship to the
application which processes the data. For example, many web shops use third parties to process their forms. These third
party processors might find it difficult to control expressions of consent if they were managed by third parties.
− It is not ideal for the client to have to add POST fields to a form, as there is considerable opportunity for ambiguity. Also, it is
generally harder for a third party software vendor to alter the operation of an application (e.g. API) to do this than, for
example, to alter the http headers sent.
We therefore suggest that a mechanism could be provided for requesting and providing consent using http headers, which
would also provide the option of asking for a signed consent.
In this case, an element would be added to the P3P policy similar to the following:
<DATA ref="#user.home-info">
<CONSENTREQUEST method="httpheader" headername="consent1">
<DATAREQUIRED certificate="X.509" algorithmtype="RSA" minkeylength="128">I agree that
my data in this form will be published on the internet.
</DATAREQUIRED>
</CONSENTREQUEST>
</DATA>
CONSENTREQUEST could be inserted within a DATA element to state that the collection of this type of data requires the
consent of the user, and how this consent should be sent.
The "method" attribute specifies how the consent should be sent, in this case in an HTTP header. The "headername" attribute
specifies the name of the header which should contain the signature data. The DATAREQUIRED element contains the statement
which is required to be signed to express consent. Its attributes contain various requirements to allow for flexibility in the
requirements for signature types.
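The client side of this exchange might look roughly as follows. This Python sketch assumes the element and attribute names
of the proposal above with their spelling normalized (DATAREQUIRED, algorithmtype), and it deliberately does not implement
an RSA signature: placeholder_sign is a stand-in that returns a SHA-256 digest, and a real agent would substitute a signing
routine that honours the certificate, algorithm and key length requirements. All function names are hypothetical.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET

CONSENT_XML = """
<DATA ref="#user.home-info">
  <CONSENTREQUEST method="httpheader" headername="consent1">
    <DATAREQUIRED certificate="X.509" algorithmtype="RSA" minkeylength="128">I agree that
my data in this form will be published on the internet.</DATAREQUIRED>
  </CONSENTREQUEST>
</DATA>
"""


def placeholder_sign(statement):
    """Stand-in for a real signature: returns only a SHA-256 digest."""
    return hashlib.sha256(statement.encode("utf-8")).digest()


def consent_header(data_xml, sign=placeholder_sign):
    """Return (header name, header value) for a consent request, or None."""
    request = ET.fromstring(data_xml).find("CONSENTREQUEST")
    if request is None or request.get("method") != "httpheader":
        return None
    required = request.find("DATAREQUIRED")
    if required is None:
        return None
    # Collapse whitespace so the signed text does not depend on line breaks.
    statement = " ".join((required.text or "").split())
    value = base64.b64encode(sign(statement)).decode("ascii")
    return request.get("headername"), value


print(consent_header(CONSENT_XML))
```

Passing the signing routine as a parameter keeps the header construction independent of whichever signature scheme the
specification group eventually settles on.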
5.2.3. Structure of message.
To be of any use, consent messages need to be stored in a structured way in the "back office" of the service provider. The
most important requirement for the "back office" is that the message can be linked to the data for which it provides consent in
the case of a dispute. This requirement, however, needs to be set against the possible loss of privacy involved should the
message be linked with a unique identifier.
Because of this latter consideration, it should be left up to the service provider to link the consent message with a unique
identifier binding it to the information, such that the possible privacy losses contained in such an identifier are appropriate to
the situation. For example, if the subject is willing for their entire information to be retained indefinitely, then a hash of all
or part of the information may be used. However, if they are not, then this is not appropriate, because such a hash could later
be used to perform data mining operations on sensitive information. In this case, a hash of some form of session id might be
more appropriate. Another solution is some form of escrowed key system which could be used to unlock the identifiers by a
legal authority requiring the proof of consent. This is overkill for most situations. In either situation, the date of the consent
may be taken from the http request headers.
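The two identifier choices discussed above can be sketched in a few lines of Python. The choice of SHA-256, the
canonicalization of the form data and the record layout are assumptions made for illustration, and the escrowed key variant
is not shown.

```python
import hashlib


def data_bound_id(form_data):
    """Identifier derived from the submitted data itself.

    Suitable only when the subject accepts indefinite retention, since the
    hash can later be matched against the same data.
    """
    canonical = "&".join(f"{key}={form_data[key]}" for key in sorted(form_data))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def session_bound_id(session_id):
    """Identifier derived only from a session id, not from the data."""
    return hashlib.sha256(session_id.encode("utf-8")).hexdigest()


def consent_record(statement, identifier, request_date):
    """Bundle a consent message with its identifier and the request date."""
    return {"statement": statement, "id": identifier, "date": request_date}


record = consent_record(
    "I agree that my data in this form will be published on the internet.",
    session_bound_id("example-session-id"),
    "Tue, 12 Nov 2002 10:00:00 GMT",  # would be taken from the http request headers
)
print(record["id"])
```

Sorting the field names before hashing makes the data-bound identifier independent of the order in which the form fields
were submitted.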
One possibility for structuring of the messages themselves however is to express them according to the proposed OWL P3P
semantic model. For example, RDF statements could be constructed to formally express statements such as
"I am a data subject and I agree that the data objects transferred in this request may be transferred to third parties."
(ontological terms underlined)
If such a consent statement were expressed using RDF statements it would carry more legal weight through this unambiguous
and transparent semantics and would make management of different consent statements easier by making them easily
processable by software agents.
Joseph Reagle of W3C has already gone some way towards outlining the detail of this solution. We examine and build upon
the proposals of Reagle [14] for the inclusion of XML digitally signed [13] policies within P3P. As Reagle has already set out
most of the mechanisms for achieving this, we make only three minor additions to the technical specification. Our main aim is
to look at possible technical problems with the use of the XML signature extension, and their solutions.
P3P enabled servers could have the possibility of providing an XML digital signature as part of their policy, or as a separate
document referenced within the policy. This is easily accomplished provided that the correct syntax is incorporated into the
P3P specification, as shown by Reagle. Reagle's example should however be modified in the following ways.
a) Add an X.509 certificate bag to provide full non-repudiatability.
b) Include a time stamp to comply with EU regulations.
c) Require an additional signature over the PRF, which details which resources the policy applies to. Any signature that does
not assure this information loses much of its legal significance. Note also that this signature cannot be bundled in with the
policy signature because several PRF's may refer to the same policy. Furthermore, the person responsible for producing policy
signatures may not even know the location of PRF's referring to the policy (in the case of a standard issue policy used by third
parties.) We suggest the addition of a "DISPUTES" element to the PRF identical to the DISPUTES element in the P3P policy
which allows the specification of a signature URI using the validation attribute.
The P3P process has 2 main components on the server; an XML policy and an XML PRF, which binds policies to resources.
Semantically therefore, a P3P statement is a combination of the policy and the PRF, which binds a policy to a resource at a
given time. The PRF, like the policy has a validity timestamp.
However, Reagle's P3P extension does not include any binding signature for the P3P PRF. This is an oversight, because
omission of a signature binding the policy to the resource it applies to negates the non-repudiatability of the statements being
made. The importance of the PRF cannot be ignored. We therefore suggest that a signature also be provided for any PRF's
used. We show in the example signature the necessary extensions for a signature to be bound to a PRF. It is also
worth mentioning the possibility of an additional signature over the human readable policy, which could be achieved by the
same mechanism. There has been some discussion of the fact that there may be discrepancies between the human readable
and the XML version of privacy policies. This would ensure a commitment to consistency between both versions.
To look at a specific example, the mobile device community has expressed interest in linking P3P with CC/PP (Composite
Capability/Preference Profiles). In this case, it would be extremely powerful if a P3P enabled agent + Rule base were able to
reveal only selected device capabilities, basing its decision on the privacy policy and which capabilities the service might need
to know. Most client applications would benefit from such a capability, if it were made easy and robust. When filling in forms,
users generally reveal only what is necessary and if the users do not trust the entity with the information which it claims to
require, they will not go ahead with the data transfer at all.
It should be mentioned that the ability to selectively release data is strongly connected with identity management, and
therefore any developments here should be linked to research in that field.
Proposed solution:
As this is an area where extensive further research is required, rather than describing a detailed solution, we will just outline
the technical requirements of such a system, and briefly suggest their likely solution.
Technical requirements:
1. An ontology expressive enough to capture the various data types which might make up a composite identity (selective
release of PII). This has already been discussed elsewhere in the paper.
2. Ways of linking that ontology to the data requests by client applications such as Xforms and CC/PP [10]. In other words
there should be a common ontology between these specifications and P3P, or an effective translation between them.
3. A rule language and User interface expressive enough to allow selective release of information. This would most likely
involve the definition of identities, in other words groups of information types, using a visual representation of a PII ontology
and their linking to certain patterns recognized in policies. The identities would therefore become ad hoc classes within the
ontology.
4. A clear specification, for each page, of what kind of information is being requested, and of which items are optional and which are required.
Without this, the engine cannot decide what information to release in a particular case. As it stands, it would be very difficult
for P3P to perform this function alone, because P3P policies are necessarily generalized between different resources and
semantically they do not give any information about what is required on a particular page.
The underlying semantic structure of P3P policies is:
"whatever the resource this policy is applied to, if you give us information x, we will do y"
and not:
"please send us information x for resource y"
What is needed in this new scenario is both of the above semantics. We suggest therefore that if the second semantic is
provided by (e.g.) the XForm and linked in a granular way with P3P policies, this provides enough information for an agent to
make a decision. For example, a particular XForm might be able to express the semantic
"I require your email - this email address will be processed according to P3P policy Policy1, which can be found by means x."
On the part of P3P, it would simply require the capability to associate P3P policies to a more granular level than that of
resources. In particular, in the case of Xforms, it requires P3P policies to be associated with individual form fields. If a more
general specification could be produced allowing association of policies with more diverse entities, this would open up the way
for the application of P3P in other similar settings such as CC/PP, IRC (chat), etc.
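A minimal sketch of how an agent might use such a field-level association is given below in Python. The FieldRequest
structure, the policy identifiers and the release strategy (send only required fields, and abandon the transfer if a required
field falls under an unacceptable policy) are all assumptions made for illustration. Neither XForms nor P3P 1.0 defines them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRequest:
    name: str       # form field name, e.g. "email"
    required: bool  # whether the page declares the field mandatory
    policy: str     # identifier of the P3P policy covering this field


def fields_to_release(requests, acceptable_policies):
    """Return the field names to send, or None to abandon the transfer."""
    release = []
    for request in requests:
        if not request.required:
            continue  # optional data is withheld by default
        if request.policy not in acceptable_policies:
            return None  # a required field falls under an unacceptable policy
        release.append(request.name)
    return release


form = [
    FieldRequest("email", required=True, policy="Policy1"),
    FieldRequest("phone", required=False, policy="Policy2"),
]
print(fields_to_release(form, {"Policy1"}))
```

The decision needs both pieces of information named in the text: what the page requires, and which policy applies to each
item requested.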
There is of course the inverse problem that companies may abide by the law to the letter, and yet not publish a P3P policy.
Therefore P3P can guarantee neither that a company is within the law, nor that it is not.
Solution:
One solution to the first problem of non-enforceability, which is still somewhat within the realms of fantasy, but is however not
inconceivable, is to use the taxonomy of P3P, perhaps somewhat extended, within a system for automated audit trails.
This solution can be compared to the solution adopted by restaurants who wish to make clients trust their hygiene practices.
They put the kitchen in full view of the customers. In the same way, given a sufficiently standardized system, provided,
perhaps within P3P, servers could record their data processing events and security related events in such a way that
authorized auditing agents could assess them in a measurable way against the regulatory standards and perhaps against the
additional standards of trust seals. The full details of such a system are beyond the scope of this paper. However, we present below a
scenario which helps to view this set of extensions in a concrete way, and from there to extract some requirements for P3P
2.0.
Audit Trail
An important feature of such a system should be that any agent system A1 passing information to another agent system A2
should have a way of knowing whether A2 is also committed to saving audit trail information, and where and under what
circumstances, this information could be accessed.
The crucial feature of such a system is that it must allow an information trail to be stipulated and subsequently followed, in
order to track real privacy practices rather than privacy promises such as those contained in P3P policies.
</X509Data>
<X509Certificate>MIICXTCCA..</X509Certificate>
<X509Certificate>MIICPzCCA...</X509Certificate>
<X509Certificate>MIICSTCCA...</X509Certificate>
</X509Data>
</KeyInfo>
<Object>
<SignatureProperties>
<SignatureProperty Id="Assurance1" Target="#Signature1"
xmlns="http://www.w3.org/2000/09/xmldsig#">
<Assures Policy="http://www.example.org/p3p.xml" xmlns="http://www.w3.org/2001/02/xmldsig-p3p-profile"/>
</SignatureProperty>
<SignatureProperty Id="TimeStamp1" Target="#MySecondSignature"> <timestamp
xmlns="http://www.ietf.org/rfcXXXX.txt"> <date>19990908</date> <time>14:34:34:34</time>
</timestamp> </SignatureProperty>
</SignatureProperties>
</Object>
</Signature>
REFERENCES
[1] Art 29 – Data Protection Working party: Recommendation 2/2001 on certain minimum requirements for collecting personal
data on-line in the European Union; Opinion on P3P, 16 June 1998.
[2] "A fully compliant research implementation of the P3P standard for privacy protection: experiences and
recommendations", Giles Hogben, Tom Jackson, Marc Wilikens.
ESORICS 2002, Zurich, 14-16 October 2002. Springer Verlag.
[3] JRC P3P demonstrator project: Evaluation meeting report. See http://p3p.jrc.it/presentations/P3Pminutes.pdf
for evaluation report and attendees.
[7] Ackerman, M. S., Cranor, L., and Reagle, J. (1999). Privacy in E-Commerce: Examining User Scenarios and Privacy
Preferences. Proceedings of the ACM Conference in Electronic Commerce : 1-8. New York: ACM Press.
[15] Information Technology – Code of practice for information security management. BS ISO/IEC 7799-1:2000. British
Standards Institution. 2000. ISBN 0 580 36958 7