The Deep Web: 21493 Shyneil Dsouza 21496 Aditya Upadhyay 21502 Poornank Shirke

You might also like

Download as pptx, pdf, or txt
Download as pptx, pdf, or txt
You are on page 1of 24

The Deep

Web
21493 Shyneil Dsouza
21496 Aditya Upadhyay
21502 Poornank Shirke
Surface Web

 The surface Web is that portion of the World Wide Web that is indexable by conventional search
engines.

 It is also known as the Clearnet, the visible Web or indexable Web.

 Eighty-five percent of Web users use search engines to find needed information, but nearly as
high a percentage cite the inability to find desired information as one of their biggest frustrations.

 A traditional search engine sees only a small amount of the information that's available -- a
measly 0.03 % [source: OEDB].
Deep Web - Introduction
 The Deep Web is World Wide Web content that is not part of the Surface Web, which is
indexed by standard search engines.

 It is also called the Deepnet, Invisible Web or Hidden Web.

 Largest growing category of new information on the Internet.

 400-550X more public information than the Surface Web.

 Total quality 1000-2000X greater than the quality of the Surface Web.
History
 Jill Ellsworth used the term invisible Web in 1994 to refer to websites that were not
registered with any search engine.

 Mike Bergman cited a January 1996 article by Frank Garcia:


“It would be a site that's possibly reasonably designed, but they didn't bother to
register it with any of the search engines. So, no one can find them! You're hidden. I call that
the invisible Web”.
 Another early use of the term Invisible Web was by Bruce Mount and Matthew B. Koll of
Personal Library Software in 1996.

 The first use of the specific term Deep Web, now generally accepted, occurred in the
aforementioned 2001 Bergman study.
How search engines work
 Search engines construct a database of the Web by using programs called spiders or Web
crawlers that begin with a list of known Web pages.

 The spider gets a copy of each page and indexes it, storing useful information that will let
the page be quickly retrieved again later.

 Any hyperlinks to new pages are added to the list of pages to be crawled.

 Eventually all reachable pages are indexed, unless the spider runs out of time or disk space.

 The collection of reachable pages defines the Surface Web.


How search engines work
Contents
 Dynamic Content

 Unlinked content

 Private Web

 Contextual Web

 Limited access content

 Non-Scripted content

 Non-HTML/text content;
 Dynamic content
• Dynamic pages which are returned in response to a submitted query or accessed
only through a form
• especially if open-domain input elements (such as text fields) are used
• such fields are hard to navigate without domain knowledge

 Unlinked Content
• Pages which are not linked to by other pages
• Which may prevent web crawling programs from accessing the content
• This content is referred to as pages without backlinks (or inlinks).
 Private Web: sites that require registration and login (password-protected
resources).

 Contextual Web: pages with content varying for different access contexts (e.g.,
ranges of client IP addresses or previous navigation sequence).

 Limited access content: sites that limit access to their pages in a technical
way (e.g., using the Robots Exclusion Standard, CAPTCHAs, or no-cache Pragma
HTTP headers which prohibit search engines from browsing them and creating
cached copies.
 Scripted content
pages that are only accessible through links produced by JavaScript as well as
content dynamically downloaded from Web servers via Flash or Ajax solutions.

 Non-HTML/text content
textual content encoded in multimedia (image or video) files or specific file
formats not handled by search engines.
Deep Potential
 The deep Web is an endless repository for a mind-reeling amount of information.

 It's powerful. It unleashes human nature in all its forms, both good and bad.

 There are engineering databases, financial information of all kinds, medical papers, pictures,
illustrations ... the list goes on, basically, forever.

 For example, construction engineers could potentially search research papers at multiple universities in
order to find the latest and greatest in bridge-building materials.

 Doctors could swiftly locate the latest research on a specific disease.

 The potential is unlimited. The technical challenges are daunting. That's the draw of the deep Web.
Shadow Land
 The deep Web may be a shadow land of untapped potential.

 The bad stuff, as always, gets most of the headlines.

 You can find illegal goods and activities of all kinds through the dark Web.

 That includes illicit drugs, child pornography, stolen credit card numbers, human trafficking, weapons, exotic
animals, copyrighted media and anything else you can think of.

 Theoretically, you could even, say, hire a hit man to kill someone you don't like.

 But you won't find this information with a Google search.

 These kinds of Web sites require you to use special software, such as The Onion Router, more commonly known
as Tor.
The Onion Router (TOR)
 Tor is software that installs into your browser and sets up the specific connections you need to access
dark Web sites.

 Critically it is free software for enabling online anonymity and censorship resistance.

 Onion routing refers to the process of removing encryption layers from Internet communications,
similar to peeling back the layers of an onion.

 Using Tor makes it more difficult to trace Internet activity, including "visits to Web sites, online posts,
instant messages, and other communication forms", back to the user.

 It is intended to protect the personal privacy of users, as well as their freedom and ability to conduct
confidential business by keeping their internet activities from being monitored.
Cont…
 Instead of seeing domains that end in .com or .org, these hidden sites end in .onion.

 The most infamous of these onion sites was the now-defunct Silk Road, an online marketplace
where users could buy drugs, guns and all sorts of other illegal items.

 The FBI eventually captured Ross Ulbricht, who operated Silk Road, but copycat sites like Black
Market Reloaded are still readily available.

 Tor is the result of research done by the U.S. Naval Research Laboratory, which created Tor for
political dissidents and whistleblowers, allowing them to communicate without fear of reprisal.

 Tor was so effective in providing anonymity for these groups that it didn't take long for the
criminally-minded to start using it as well.
U.S. authorities shut down Silk after the
Silk Road Website alleged owner of the site Ross William Ulbricht
was arrested.
Money-related transactions
 You may wonder how any money-related transactions can happen when sellers and buyers can't
identify each other.

 That's where Bitcoin comes in.

 Bitcoin, it's basically an encrypted digital currency.

 Like regular cash, Bitcoin is good for transactions of all kinds, and notably, it also allows for
anonymity; no one can trace a purchase, illegal or otherwise.

 When paired properly with TOR, it's perhaps the closest thing to a foolproof way to buy and sell
on the web.
The Brighter Side of Darkness
 The deep Web is home to alternate search engines, e-mail services, file storage, file sharing,
social media, chat sites, news outlets and whistleblowing sites, as well as sites that provide a safer
meeting ground for political dissidents and anyone else who may find themselves on the fringes
of society.

 In an age where NSA-type surveillance is omnipresent and privacy seems like a thing of the past,
the dark Web offers some relief to people who prize their anonymity.

 Bitcoin may not be entirely stable, but it offers privacy, which is something your credit card
company most certainly does not.

 For citizens living in countries with violent or oppressive leaders, the dark Web offers a more
secure way to communicate with like-minded individuals.
Invisible Web Search Tools
• A List of Deep Web Search Engines – Purdue Owl’s Resources to Search the Invisible Web
• Art – Musie du Louvre
• Books Online – The Online Books Page
• Economic and Job Data – FreeLunch.com
• Finance and Investing – Bankrate.com
• General Research – GPO’s Catalog of US Government Publications
• Government Data – Copyright Records (LOCIS)
• International – International Data Base (IDB)
• Law and Politics – THOMAS (Library of Congress)
• Library of Congress – Library of Congress
• Medical and Health – PubMed
• Transportation – FAA Flight Delay Information
Future
 The lines between search engine content and the deep Web have begun to blur, as search
services start to provide access to part or all of once-restricted content.

 An increasing amount of deep Web content is opening up to free search as publishers and
libraries make agreements with large search engines.

 In the future, deep Web content may be defined less by opportunity for search than by access
fees or other types of authentication.
Conclusion
 The deep web will continue to perplex and fascinate everyone who uses the internet.

 It contains an enthralling amount of knowledge that could help us evolve technologically and as
a species when connected to other bits of information.

 And of course, its darker side will always be lurking, too, just as it always does in human nature.

 The deep web speaks to the fathomless, scattered potential of not only the internet, but the
human race, too.
References

 http://computer.howstuffworks.com/internet/basics/how-the-deep-web-works5.ht
m
 http://oedb.org/ilibrarian/invisible-web/
 http://en.wikipedia.org/wiki/Deep_Web
 http://money.cnn.com/infographic/technology/what-is-the-deep-web/?iid=EL
 http://en.wikipedia.org/wiki/Surface_Web
Thank You

You might also like