Professional Documents
Culture Documents
Security Protocols: Helping Alice and Bob To Share Secrets (COMP - SEC.220) Coursework II
Security Protocols: Helping Alice and Bob To Share Secrets (COMP - SEC.220) Coursework II
Coursework II
Antonis Michalas
antonios.michalas@tuni.fi
August 18, 2021
Important!
1. In case you do not get at least 40% in this coursework, you immediately fail
the course. There will be no other chance to re-submit another coursework.
This is a programming coursework that contains two different but similar assignments and counts to-
wards 40% of your overall course marka . You should choose one of the two available options.
The first assignment is personal and you are required to develop a Symmetric Searchable Encryption
(SSE) scheme using the information/specifications provided.
The second assignment is a team work. You can create a team of up to five (5) students and work together.
Your task will be to implement four (4) different SSE schemes and compare their performance based on
a set of predefined experiments provided.
Some Rules
• You should only choose one out of the two assignments.
• Programming Language: For both assignments, you can use any programming language you pre-
fer.
• Dataset: To test your implementation(s), you should use the dataset provided
on https://zenodo.org/record/3360392
• Marking in the team project: The team project is a team work. Therefore, all students of the team
will get the same mark. Hence, I will not accept any marks-related complaints from any of the team
members involved. It is up to you to ensure that the effort in the team is equally distributed and
properly managed to have a successful outcome.
A SSIGNMENTS
A SSIGNMENT 1 – I NDIVIDUAL P ROJECT (100 MARKS )
Your task is to implement the Symmetric Searchable Encryption scheme illustrated in the paper given below. The
goal is to recreate the "Experimental results" section of the paper, using exactly the same dataset.
Searching in the Dark: A Multi Client Symmetric Searchable Encryption Scheme
with Forward Privacy
time (m)
TABLE 1: Size of Datasets and Unique Keywords
300
250
setup time for each one of the sub-datasets shown 200
in table 1. Each process was run ten times on each 150
machine and the average time for the completion of 100
the entire process on each machine was measured. 50
Figure 1 illustrates the time needed for indexing and 10
encrypting text files ranging from 184MB to 1.7GB 0.1 0.2 0.4 0.6 0.8 1 1.2 1.3
that resulted to a set of more than 12 million unique Number of Unique Extracted Keywords·107
keywords. As can be seen from figure 1, the desktop
machine needed the less time to complete the setup Figure 1: Indexing and Encrypting Files
phase while the tablet took significantly more time not
only compared to the desktop but also in comparison Unique Keywords (w, id) Pairs
to the time needed by the laptop. However, in all three 1,370,023 5,387,216
cases it is evident that the scheme can be considered
1,999,520 10,036,252
as practical and can even run in typical users’ de-
vices. This is an encouraging result and we hope that 2,688,552 19,258,625
will motivate researchers to design and implement 7,453,612 28,781,567
even better and more efficient SSE schemes but most 12,124,904 39,747,904
importantly we hope that will inspire key industrial
players in the field of cloud computing to create and TABLE 2: Keywords and Filenames pairs
launch modern cloud-services based on the promising
concept of Symmetric Searchable Encryption. Table 3
of resources and computational power). To this end,
summarizes our measurements from this phase of the
in our experiments we measured the time to generate
epxeriments. As can be since, to index and encrypt
the search token on the laptop and the tablet (i.e.
text files that contained 1,370,023 distinct keywords
typical users’ machines) while the actual search time
the average processing time was 8.49m, 22.48min and
was measured using the desktop machine described
68.75m for the desktop, laptop and tablet accordingly
earlier. On average the time needed to generate the
while for a set of files that resulted in 12,124,904
search token on the Surface Book laptop was 9µs
distinct keywords the average processing time was
while on the Surface tablet the time for token gener-
68.44m, 203.28m and 545.28. Based on the fact that
ation slightly increased to 13µs. Regarding the actual
this phase is the most demanding one in an SSE
search that is taking place on the CSP it needs to
scheme the time needed to index and encrypt such
be noted that the actual process is just a series of
a large number of files is considered as acceptable
SELECT and UPDATE queries to the database. More
not only based on the size of the selected dataset but
precisely, searching for a specific keyword over a set
also based on the results of other schemes that do not
of 12,124,904 distinct keywords and 39,747,904 ad-
offer forward privacy [11] as well as on the fact that
dresses required 1.328sec on average while searching
we ran our experiments on commodity machines and
for a specific keyword over a set of 1,999,520 distinct
not only on a powerful server.
keywords and 10,036,252 addresses took 0.131sec.
Apart from the generation of the indexer that
contains the unique keywords, the incorporated SSE Delete: To measure the performance of the delete
scheme also creates an indexer that maintains a map- process, we randomly selected 100 different files, per-
ping between a keyword (w) and the filename (id) formed the delete operation and measured the average
that w can be found at. The total number of the gen- processing time for the delete process to be com-
erated pairs in relation to the size of the underlying
datasets is shown in table 2. ```
```Testbed Tablet Laptop Desktop
Dataset ```
Search: In this part of the experiments we measured
184MB 68.75m 22.48m 8.49m
the time needed to complete a search over encrypted
357MB 109.36m 40.00m 13.51m
data. In our implementation, the search time is cal- 670 195.09m 86.43m 29.51m
culated as the sum of the time needed to generate 1GB 367.75m 141.60m 48.99m
a search token and the time required to find the 1.7GB 545.28m 203.28m 68.44m
corresponding matches at the database. It is worth
TABLE 3: Setup time (in minutes) of SID using our dataset and
mentioning that the main part of this process will be different machines
running on the CSP (i.e. a machine with a large pool
pleted. We performed the delete queries in our largest Architectural Support for Security and Privacy,
dataset containing 12,124,904 distinct keywords and pp. 1–8, 2018.
39,747,904 addresses. The average time to delete a [4] AMD, “AMD SEV-SNP: Strengthening VM
single file and update all the relevant addresses in the Isolation with Integrity Protection and more,”
database was 1.19min. Even though this time might tech. rep., 01 2020.
be considered as high for just deleting a single file [5] S. Kamara and K. Lauter, “Cryptographic
it is important to mention that this process will be cloud storage,” in Financial Cryptography and
running on a CSP with a large pool of resources Data Security (R. Sion, R. Curtmola, S. Di-
and computational power (e.g. Amazon EC2). Hence, etrich, A. Kiayias, J. M. Miret, K. Sako, and
this time is expected to drop significantly on such a F. Sebé, eds.), (Berlin, Heidelberg), pp. 136–
computer where more cores will be also utilized. Fur- 149, Springer Berlin Heidelberg, 2010.
thermore, it is important to mention that to properly [6] A. Michalas, “The lord of the shares: Com-
test the performance of our delete function we need bining attribute-based encryption and searchable
to conduct further experiments where we will also encryption for flexible data sharing,” in Proceed-
consider the number of keywords contained in the file ings of the 34th ACM/SIGAPP Symposium on
to be deleted as well as the number of other files that Applied Computing, SAC ’19, (New York, NY,
each keyword can be also found at. This is important USA), pp. 146–155, ACM, 2019.
because it is expected that these factors will heavily [7] A. Bakas and A. Michalas, “Modern family: A
affect the performance of the delete function. We plan revocable hybrid encryption scheme based on
to conduct further and more detailed experiments on attribute-based encryption, symmetric search-
the delete function in our future works. able encryption and sgx.” Cryptology ePrint
Archive, Report 2019/682, 2019. https://eprint.
References iacr.org/2019/682.
[8] “PyCrypto – the Python cryptography toolkit,”
[1] M. Naveed, “The fallacy of composition of 2013.
oblivious ram and searchable encryption.,” IACR [9] “Project gutenberg,” 1971.
Cryptology ePrint Archive, vol. 2015, p. 668, [10] J.-B. Michel, Y. K. Shen, A. P. Aiden, A. Veres,
2015. M. K. Gray, T. G. B. Team, J. P. Pickett,
[2] N. Paladi, C. Gehrmann, and A. Michalas, “Pro- D. Hoiberg, D. Clancy, P. Norvig, J. Orwant,
viding user security guarantees in public infras- S. Pinker, M. A. Nowak, and E. L. Aiden,
tructure clouds,” IEEE Transactions on Cloud “Quantitative analysis of culture using millions
Computing, vol. 5, pp. 405–419, July 2017. of digitized books,” Science, vol. 331, no. 6014,
[3] S. Mofrad, F. Zhang, S. Lu, and W. Shi, “A pp. 176–182, 2011.
comparison study of intel sgx and amd memory [11] R. Dowsley, A. Michalas, M. Nagel, and N. Pal-
encryption technology,” in Proceedings of the adi, “A survey on design and implementation of
7th International Workshop on Hardware and protected searchable data in the cloud,” Com-
puter Science Review, 2017.
Note: Implementing the SEV-enabled entities is not required. However, doing so will provide you with a 10%
bonus.
Before implementing the schemes, you should read carefully the section "Experimental Results" of Assignment 1.
The goal of your team is to recreate this section for each one of the four schemes that you will choose. For the
four schemes that your team will choose, you are expected to provide detailed graphs where the schemes will be
compared with each other.