Evidence, Securing Online Evidence
CREATE
COLLECTION
MONITORING
RETAIN
LEGALIZING
INDEXING
ARCHIVING
MANAGE
DISPOSE
RECORDS RETENTION
LONG-TERM PRESERVATION
LIVE REPLAY
ADVANCED SEARCH
DATA EXPORT
RETENTION SCHEDULING
LONG-TERM PRESERVATION
LEGAL HOLD
LET’S CONNECT!
EDISCOVERY IN THE MODERN ENTERPRISE
While regulations are a major reason why enterprises need to keep records of online data, it is by no means the
only reason. eDiscovery is another crucial factor that companies need to consider.
Website, social media, and enterprise collaboration content is increasingly being used in litigation related to
employment, intellectual property, contract issues, defamation, insurance fraud, etc. Consequently, companies
need to collect and store this unstructured data in a form that is easy to search and process during
eDiscovery and litigation preparation.
These days, around 80% of enterprise data is unstructured, and 70% of enterprises are unsure how to manage
and protect this data. When it comes to the complex processes of data collection and preparation, this inability to
adequately deal with unstructured data can result in extremely high litigation costs. The solution is to collect and
store data in a way that makes it easily digestible by modern eDiscovery platforms.
Examples of unstructured data include anything from PDF contracts to Word documents, health records, and
even the unstructured text in the body of an email. Recently, however, online data (including metadata) from
websites, social media channels, enterprise collaboration platforms, and mobile text messages has
become an important component of unstructured data, as it is often needed for compliance and litigation
purposes.
While keeping records of official emails and discrete electronic documents is one thing, capturing modern web
content is quite another. Enterprises are expected to maintain records of:
Doing this isn’t easy since content is constantly evolving—every passing minute brings more comments, replies,
likes, and shares—and they all result in new electronic records.
Mix of Content
Message boards, forums, blogs, enterprise collaboration platforms, and social media platforms don’t necessarily
consist of one simple stream of content—they have timelines, pages, direct messages, images, videos, comments,
etc. This makes online content particularly prone to recordkeeping errors. It’s all too easy for a post to be edited
or a comment deleted before an accurate record is created.
Real-Time Activity
Thousands of comments, likes, and shares can happen in an hour, and with each new interaction a new record is
generated. In other words, a single post with lots of engagement can result in the creation of thousands of records
in a very short space of time. This never-ending real-time activity poses a tremendous challenge, since a record
can be outdated almost the moment that it’s created.
Evolving Platforms
Since a manual process like screenshotting is labor-intensive, can lead to incomplete records, and is unlikely to
result in records that’ll stand up in court, many organizations resort to some form of recordkeeping that collects
social media data automatically. While this is a good approach, it’s worth keeping in mind that social media
platforms are always evolving, so whatever solution an organization opts for, it needs to be able to adapt to
platform changes. Otherwise, every platform change will result in lengthy downtimes and record gaps.
Integration Requirements
In order to ensure that social media content is always collected in real-time, that archives are of evidentiary quality,
and that any changes to a platform will not impact the ability to archive data, it's necessary to leverage platforms'
APIs. Gaining access to these APIs and building the necessary integrations isn't always easy, but it's undoubtedly
the best way to ensure accurate records.
What is Metadata?
Metadata is hidden data typically not visible to a user, or only visible in a limited capacity. If you examine
the metadata associated with a social media post, for example, it contains:
THE EDRM AND THE INFORMATION GOVERNANCE REFERENCE MODEL
In order to help organizations better understand and manage the eDiscovery process, the Electronic Discovery
Reference Model (EDRM) was created in 2006.
• Identify
• Preserve
• Collect
• Process
• Review
• Analyze
• Produce
• Present
But it does not only consider the steps of eDiscovery. On the left, the EDRM also attempts to address what’s
needed in order to properly manage electronically stored information (ESI) for eDiscovery through the Information
Governance Reference Model (IGRM).
Although this model can be immensely useful in managing ESI, there are very specific information governance
considerations when it comes to online data like website and social media content. With this in mind, Pagefreezer
has expanded on the IGRM to provide enterprises with a comprehensive step-by-step guide to managing online
records. This model breaks the stages of the IGRM down into 10 distinct steps that look like this:
To understand how an information governance framework like the IGRM can be adapted and applied specifically
to online data, let’s zoom into the four stages.
CREATE
COLLECTION
Electronic recordkeeping starts with the collection of data from sources such as websites, blogs, social media
networks, and enterprise collaboration platforms. As mentioned, the collection of online content is complicated by
the inherent nature of the data — the mix of content, constantly-evolving platforms, and real-time activity.
MONITORING
The second component of the Create stage is monitoring. Due to the real-time nature of social media networks
and enterprise collaboration platforms especially, it’s important for organizations to reduce risk by monitoring
content in real-time. It should be done for two reasons: (i) preventing data loss and (ii) ensuring compliant,
appropriate use of these platforms.
Policy Compliance
For both external social media channels like Facebook and Twitter and internal chat platforms like Workplace,
organizations should have a detailed policy in place that governs their use. Combined with this should be some
form of monitoring solution that allows the organization to be alerted when something is posted that does not
comply with the policy—if, for instance, someone makes a threat of physical violence or uses profanity on a
Facebook page.
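As a rough illustration of the alerting idea described above, the sketch below checks a post against a small set of policy rules and reports which ones it violates. The rule names, the keyword patterns, and the check_post function are all hypothetical; a real monitoring solution would rely on far more nuanced detection than a short keyword list.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyRule:
    """One rule from a (hypothetical) acceptable-use policy."""
    name: str
    pattern: re.Pattern


# Illustrative rules only; these patterns are assumptions, not a real policy.
RULES = [
    PolicyRule("threat_of_violence",
               re.compile(r"\b(hurt|attack|kill)\s+you\b", re.IGNORECASE)),
    PolicyRule("profanity",
               re.compile(r"\b(damn|hell)\b", re.IGNORECASE)),
]


def check_post(text, rules=RULES):
    """Return the names of every rule the post violates, in rule order."""
    return [rule.name for rule in rules if rule.pattern.search(text)]


if __name__ == "__main__":
    for post in ["Great service!", "Damn, I will attack you"]:
        violations = check_post(post)
        if violations:
            print(f"ALERT {violations}: {post!r}")
```

An empty result means nothing was flagged; any non-empty result would be the trigger for notifying whoever is responsible for the channel.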
RETAIN
The second stage of the Pagefreezer framework is Retain. Crucial to this stage within the realm of online
content are the legalization, indexing, and archiving of data.
LEGALIZING
This process relates to the capturing of data in a way that will make it defensible in a court of law. As explained
earlier in this document, this means gathering associated metadata of all electronic records and furnishing them
with a timestamp and digital signature that proves data integrity and authenticity.
Collecting and storing online data is important, but it must be done in a way that results in records that
would be admissible in a court of law. Simple screenshots are therefore not adequate, since they lack the
metadata and digital signatures needed for litigation.
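To make the idea of a timestamp plus a signature more concrete, here is a minimal sketch of sealing a captured record and later verifying that it has not changed. It is an illustration under stated assumptions, not a description of how any particular archiving product works: the function names and the SIGNING_KEY are hypothetical, and the HMAC used here is a simple stand-in for the asymmetric digital signatures or trusted timestamping services that an evidentiary-grade archive would normally use.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone

# Hypothetical shared secret, for illustration only. A real archive would
# use an asymmetric key pair or a trusted timestamping authority instead.
SIGNING_KEY = b"example-key-for-illustration-only"


def _canonical_bytes(payload):
    # Sorted keys and fixed separators give the same bytes for the same data.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def seal_record(content, metadata, captured_at):
    """Bundle content and metadata with a UTC timestamp, digest, and signature."""
    payload = {
        "content": content,
        "metadata": metadata,
        "captured_at": captured_at.astimezone(timezone.utc).isoformat(),
    }
    data = _canonical_bytes(payload)
    return {
        **payload,
        "sha256": hashlib.sha256(data).hexdigest(),
        "signature": hmac.new(SIGNING_KEY, data, hashlib.sha256).hexdigest(),
    }


def verify_record(record):
    """Return True only if the record still matches its digest and signature."""
    payload = {key: record[key] for key in ("content", "metadata", "captured_at")}
    data = _canonical_bytes(payload)
    digest_ok = hmac.compare_digest(
        hashlib.sha256(data).hexdigest(), record["sha256"])
    signature_ok = hmac.compare_digest(
        hmac.new(SIGNING_KEY, data, hashlib.sha256).hexdigest(), record["signature"])
    return digest_ok and signature_ok


if __name__ == "__main__":
    sealed = seal_record("Our offer ends Friday.", {"author": "brand_account"},
                         datetime.now(timezone.utc))
    print(verify_record(sealed))                          # unchanged record
    print(verify_record({**sealed, "content": "edited"}))  # altered record
```

Any change to the content, the metadata, or the timestamp after sealing alters the canonical bytes, so the stored digest and signature no longer match.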
INDEXING
What differentiates an archive of electronic records from a basic back-up of data is the fact that properly archived
records are indexed, meaning that the content is compiled in a way that makes it easy to search. So when a
specific record needs to be found, all that’s required is a simple search and not a labor-intensive trawl through
thousands of files. Properly indexed data also maintains relationships between data and users (allowing for the
posts and comments of a specific user to easily be identified), and even allows metadata to be searched.
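The difference between a searchable index and a plain back-up can be sketched with a small in-memory inverted index. The RecordIndex class and its method names are hypothetical and chosen for illustration; a production archive would use a dedicated search engine, but the principle of mapping terms, metadata values, and users to record IDs is the same.

```python
import re
from collections import defaultdict


class RecordIndex:
    """Tiny in-memory inverted index over record text, authors, and metadata."""

    def __init__(self):
        self.records = {}
        self.by_term = defaultdict(set)    # term   -> record IDs
        self.by_author = defaultdict(set)  # author -> record IDs

    @staticmethod
    def _terms(text):
        # Lowercase alphanumeric tokens; deliberately simplistic.
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def add(self, record_id, author, text, metadata):
        self.records[record_id] = {"author": author, "text": text, "metadata": metadata}
        self.by_author[author].add(record_id)
        # Index body text and metadata values alike so both are searchable.
        searchable = " ".join([text, *metadata.values()])
        for term in self._terms(searchable):
            self.by_term[term].add(record_id)

    def search(self, query):
        """Return IDs of records that contain every term in the query."""
        terms = self._terms(query)
        if not terms:
            return set()
        hits = [self.by_term.get(term, set()) for term in terms]
        return set.intersection(*hits)

    def by_user(self, author):
        """Return IDs of all records posted by the given user."""
        return set(self.by_author.get(author, set()))


if __name__ == "__main__":
    index = RecordIndex()
    index.add("r1", "alice", "Refund policy update", {"platform": "facebook"})
    index.add("r2", "bob", "Refund denied", {"platform": "twitter"})
    print(index.search("refund facebook"))
    print(index.by_user("bob"))
```

Because lookups go through the term and author maps rather than scanning every stored file, finding a record is a matter of set intersection instead of a trawl through the whole archive.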
ARCHIVING
Once information has been captured, part of the maintenance process is placing that data in an archive. As stated
above, this isn’t simply a back-up of online data, but is instead a database that is indexed and fully searchable.
Of course, while an archive is not merely a back-up of data, it is important to create back-ups of the archive
itself. The data should ideally be replicated three times, saved to WORM (Write Once, Read Many) storage, and
backed up remotely in the event of a disaster.
Another crucial component to consider when it comes to the archiving of data is security. In order to show
compliance and successfully use data during litigation, the accuracy and integrity of the information should be
beyond question. This will only be the case if the data is being archived in a secure way. Enterprises should aim
to make use of an archiving vendor that is ISO 27001 and SOC 2 certified.
MANAGE
What is WARC?
The ability to place a legal hold is another important consideration. Data doesn’t stay in an archive forever.
Organizations can be expected to retain official records for anything from three to 10 years, and once that
retention period is reached, information is typically deleted. However, if the data is needed for legal purposes,
this should be overridden to ensure that evidence isn’t lost. An archive solution should, therefore, enable the
organization to easily place a post or comment on legal hold to preserve it for litigation.
DISPOSE
RECORDS RETENTION
As touched on in the previous section, data doesn’t remain in the archive permanently. All archived content has
a disposition status, and unless something is on legal hold, that status is usually temporary. So as soon as it
falls outside the period during which an organization is obligated to keep the information, the data may safely
be deleted. Ideally, this process should be automated to ensure that data is never being kept if it’s not needed,
while also reducing the workload that would come with manually deleting content on a daily basis. Lastly, an
organization should make sure that any archiving vendor being considered offers a grace period with regards to
deleted content, just in case deleted data needs to be recovered.
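The automated disposal described above, including the legal hold override and a grace period, could look roughly like the following sketch. The ArchivedRecord fields, the 30-day GRACE_PERIOD, and the apply_retention function are assumptions made for illustration; actual retention periods and grace periods depend on the organization's obligations and its vendor.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class ArchivedRecord:
    record_id: str
    captured_on: date
    retention_days: int
    legal_hold: bool = False
    deleted_on: Optional[date] = None  # set when the record is soft-deleted


GRACE_PERIOD = timedelta(days=30)  # hypothetical vendor grace period


def apply_retention(records, today):
    """Soft-delete expired records and return IDs that may be purged for good."""
    purgeable = []
    for rec in records:
        if rec.legal_hold:
            # A legal hold overrides the schedule; in this sketch it also
            # cancels a pending soft deletion (a design assumption).
            rec.deleted_on = None
            continue
        expires_on = rec.captured_on + timedelta(days=rec.retention_days)
        if rec.deleted_on is None and today >= expires_on:
            rec.deleted_on = today  # recoverable until the grace period ends
        if rec.deleted_on is not None and today >= rec.deleted_on + GRACE_PERIOD:
            purgeable.append(rec.record_id)
    return purgeable


if __name__ == "__main__":
    archive = [
        ArchivedRecord("post-1", date(2020, 1, 1), retention_days=365),
        ArchivedRecord("post-2", date(2020, 1, 1), retention_days=365, legal_hold=True),
    ]
    print(apply_retention(archive, date(2024, 6, 1)))  # first run: soft delete only
    print(apply_retention(archive, date(2024, 7, 1)))  # after the grace period
```

Running such a job on a schedule keeps expired data from lingering, while the soft-delete step leaves a window in which a mistaken deletion can still be reversed.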
LONG-TERM PRESERVATION
It is common for information in large enterprises to be preserved long-term—timelines that can stretch to 100
years, or even beyond that. This means that once the data is removed from an organization’s archive, it is moved
to a central repository where it can be preserved long-term. When the information is transferred in this way, it
needs to be done in WARC format. So once again, it’s important that archive data be exportable in WARC.
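For readers unfamiliar with WARC, the sketch below builds a single minimal WARC/1.0 "resource" record by hand, following the general layout of the format: a version line, named header fields, a blank line, the payload, and two trailing line breaks. The function name and example values are hypothetical, and this is only meant to show the shape of a record; a real export would typically be produced with a dedicated WARC library and would carry additional fields such as payload digests.

```python
import uuid
from datetime import datetime, timezone


def build_warc_resource_record(target_uri, payload, content_type, captured_at):
    """Return the bytes of one minimal WARC/1.0 'resource' record."""
    headers = [
        ("WARC-Type", "resource"),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        # WARC dates are UTC timestamps in ISO 8601 form.
        ("WARC-Date", captured_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", content_type),
        ("Content-Length", str(len(payload))),
    ]
    head = "WARC/1.0\r\n" + "".join(f"{name}: {value}\r\n" for name, value in headers) + "\r\n"
    # Each record ends with two CRLF pairs after the payload block.
    return head.encode("utf-8") + payload + b"\r\n\r\n"


if __name__ == "__main__":
    record = build_warc_resource_record(
        "https://example.com/post/1",  # placeholder URI
        b"Archived post text",
        "text/plain",
        datetime.now(timezone.utc),
    )
    print(record.decode("utf-8"))
```

Because each record is self-describing, many records can be concatenated into one file and still be read back individually, which is part of what makes the format suitable for long-term repositories.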
DATA EXPORT
Archived content can be exported in PDF or WARC through the Pagefreezer dashboard. Specific social media
accounts, selections of messages, open records cases, or even a complete account archive can be exported.
The exports include all selected messages and conversation threads, as well as associated metadata.
• We’re proven and trusted by over 1,700 customers in a wide range of industries including finance, legal,
telecom, retail, utilities, government, and post-secondary education.
• We’re results focused — your success is our success. It’s our job to make your life easier. Up and running in
minutes with a Customer Success team supporting you every step of the journey.
• We offer a comprehensive solution — we provide solutions for all your archiving needs: website, social
media, corporate chat, and SMS/text messages.
• We’re affordable — we are reasonably priced and there are no hidden fees.