*Originally posted Jan 5, 2016. Updated for accuracy Nov 1, 2019.*
The Deep Web is Where All the Criminals Hang Out, Right?
Well, yes and no. To really understand what the deep web is, it helps to talk about what the Internet actually is.
A Very Basic Description of the Internet
The Internet, in its simplest form, is a group of computers talking to each other. Computer A asks Computer B for a piece of information, and Computer B sends it back. This piece of information could be a webpage, an advertisement, a calculation—pretty much anything.
Each computer uses a unique name during this communication. That name is an IP address (IP stands for Internet Protocol), which is a series of numbers formatted like this: 188.8.131.52. IP addresses aren’t very memorable, so people use web addresses (domain names) instead. For example, when you type “facebook.com,” a Domain Name System (DNS) server, often run by your Internet Service Provider, takes that request (from Computer A) and translates the domain name into the IP address of a Facebook server (Computer B). Facebook (Computer B) then gives Computer A the information it requested.
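To make the name-to-address step concrete, here is a minimal sketch in Python. It is only an illustration: the lookup table `TOY_DNS` and the functions `resolve` and `fetch` are hypothetical stand-ins, and the addresses come from a range reserved for documentation, not from any real server.

```python
import ipaddress

# Hypothetical lookup table standing in for the Domain Name System.
# The entries are invented; 192.0.2.0/24 is reserved for documentation.
TOY_DNS = {
    "example.com": "192.0.2.10",
    "example.org": "192.0.2.20",
}


def resolve(domain: str) -> ipaddress.IPv4Address:
    """Translate a memorable name into the numeric address computers use."""
    try:
        return ipaddress.IPv4Address(TOY_DNS[domain.lower()])
    except KeyError:
        raise LookupError(f"no address known for {domain!r}") from None


def fetch(domain: str) -> str:
    """Computer A asks for a name; the reply comes from the address behind it."""
    address = resolve(domain)
    return f"response from {address}"


print(fetch("example.com"))  # response from 192.0.2.10
```

A real lookup involves a network query rather than a dictionary, but the idea is the same: the name is for people, and the number is what the computers actually use.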
This communication method is used across the Internet, including the deep web and dark web. The difference between the deep web and the rest of the Internet is whether or not you can search for it.
The Deep Web vs. The Surface Web
Internet users can use Google to search for Facebook, CapitalOne, or TSN. The results of these searches are examples of indexed pages. Anything that can be discovered using conventional search engines is considered part of the surface web.
Users can’t, however, search for a conversation their teenager had on WhatsApp last week, nor can they search for the contents of a private email or banking page, or even some classifieds sites. Another example of unsearchable data is the mindless chatter that computers constantly spew out to check on the status, health, or performance of other computers, or other equipment in a system.
This data comprises the majority of the deep web. It's mostly boring information that is not useful for the vast majority of people. According to some sources, the searchable surface web only makes up about 10% of the Internet—the deep web contains the remaining 90%.
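One way to picture the split is a toy search-engine crawler. The sketch below is a simplification under stated assumptions: the miniature web in `PAGES`, its page names, and the `needs_login` flag are all invented, and real search engines are far more involved. It only shows the general idea that pages a crawler can reach and read end up indexed, while everything else stays in the deep web.

```python
from collections import deque

# Hypothetical miniature web: each page lists the pages it links to and
# whether a login is needed to view it. All names are invented.
PAGES = {
    "home":        {"links": ["news", "login"], "needs_login": False},
    "news":        {"links": ["home"],          "needs_login": False},
    "login":       {"links": ["inbox"],         "needs_login": False},
    "inbox":       {"links": [],                "needs_login": True},
    "unlisted-ad": {"links": [],                "needs_login": False},
}


def crawl(seed: str) -> set[str]:
    """Follow links from the seed, skipping pages a crawler cannot read."""
    indexed, queue = set(), deque([seed])
    while queue:
        page = queue.popleft()
        if page in indexed or PAGES[page]["needs_login"]:
            continue
        indexed.add(page)
        queue.extend(PAGES[page]["links"])
    return indexed


surface = crawl("home")          # what a search engine could index
deep = set(PAGES) - surface      # everything it never saw

print(sorted(surface))  # ['home', 'login', 'news']
print(sorted(deep))     # ['inbox', 'unlisted-ad']
```

Note that "unlisted-ad" is not protected at all. It is simply never linked from anywhere the crawler looks, so anyone with the direct address can still open it.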
The Deep Web vs. The Dark Web
The terms “deep web” and “dark web” (sometimes called the darknet) are often used interchangeably, but they are very different. The dark web forms a small part of the deep web, as exemplified by the infamous iceberg metaphor. Like the deep web, it contains unsearchable web pages—but is designed intentionally to create user anonymity, and requires special tools to access. User anonymity allows illegal activities to flourish, which is how the dark web gets its bad reputation.
These “dark” areas of the deep web started well before the Internet became mainstream. Early Internet users hung out in online chatrooms on Internet Relay Chat (IRC). Some criminal activity on the dark web today arose from these IRC communities.
The dark web isn’t exclusively used for selling drugs and having open discussions about neo-Nazism, however. It can be used by anyone seeking anonymity. This could include whistleblowers protecting their identity when releasing information, or users searching the web freely in a country where certain content might be censored or blocked.
As long as a user knows where they’re going (i.e., they have a link), they can easily access an unindexed deep web page. However, even if a user can find a link to a dark web site, they can’t access that page in a conventional browser such as Chrome or Firefox.
It’s worth mentioning that, while the iceberg metaphor helps differentiate surface, deep, and dark web sites, it’s not an accurate depiction of how they operate. In reality, they all function alongside each other rather than in compartmentalized sections of a digital space. In other words, deep and dark web sites are hidden, or secured, in plain view—security through obscurity.
There are a number of other tricks that can be used to hide something on the Internet, but most nefarious websites are hidden in plain sight.
Accessing the Dark Web: Tor
Silk Road was a notorious dark web marketplace that the FBI shut down in October 2013; its successor, Silk Road 2.0, was shut down in November 2014. What made Silk Road a dark website?
To access dark websites, users must use Tor, most commonly through the Tor Browser. It looks much like any other Internet browser, but gives users anonymity. It does this through a process called onion routing (the name "Tor" comes from "The Onion Router").
Tor forces a computer to run its communications through a series of other computers, called nodes (typically three), before they are directed to the final computer. Nodes, also called relays, can be just about any computer that has been set up with Tor software, which is free to download.
Traveling through multiple nodes means that by the time a communication gets to its destination, it is extremely difficult to determine its original location or IP address. This gives users a high degree of anonymity while browsing. The multiple nodes routing communications represent the multiple layers of "the onion" in Tor.
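The layering idea can be sketched in a few lines of Python. This is a toy model only, not Tor's actual protocol or cryptography: the relay names, the keys, the destination, and the XOR-based `xor_cipher` are all assumptions made for illustration, and the cipher is not secure. What it does show is the general shape of onion routing, where the sender wraps a message once per relay and each relay can remove only its own layer, learning nothing except where to forward the rest.

```python
import hashlib
import json


def keystream(key: bytes, length: int) -> bytes:
    """Stretch a key into `length` pseudo-random bytes (toy construction)."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR with the keystream. Applying it twice undoes it."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(message: str, destination: str, path: list[tuple[str, bytes]]) -> bytes:
    """Build the onion from the inside out: the exit relay's layer first."""
    next_hop, payload = destination, message.encode()
    for name, key in reversed(path):
        layer = json.dumps({"next": next_hop, "data": payload.hex()}).encode()
        payload = xor_cipher(key, layer)
        next_hop = name
    return payload


def peel(key: bytes, onion: bytes) -> tuple[str, bytes]:
    """Remove one layer; a relay sees only the next hop and an opaque blob."""
    layer = json.loads(xor_cipher(key, onion))
    return layer["next"], bytes.fromhex(layer["data"])


# Hypothetical three-relay path with made-up keys.
path = [("entry", b"key-1"), ("middle", b"key-2"), ("exit", b"key-3")]
onion = wrap("hello", "hidden-site.onion", path)

hops = []
for name, key in path:
    next_hop, onion = peel(key, onion)
    hops.append((name, next_hop))
    print(f"{name} forwards to {next_hop}")

print(onion.decode())  # hello
```

In this sketch the entry relay knows who sent the onion but not where it is going, and the exit relay knows the destination but not the sender. That separation is what makes tracing a communication back to its source so difficult.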
Onion routing also means that the browser operates very slowly. Users can view any site URL (even surface websites) in the Tor browser, but dark web site links (which have .onion as their top-level domain, as opposed to .com) must be viewed in Tor.
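As a small illustration of that distinction, the sketch below checks whether a link points at a .onion address. The function name `is_onion_address` and the example links are hypothetical, and the check only looks at how the address is written. It says nothing about whether the site exists or is safe.

```python
from urllib.parse import urlparse


def is_onion_address(url: str) -> bool:
    """Return True if the link's host name ends in the .onion top-level domain."""
    host = urlparse(url).hostname or ""
    return host.endswith(".onion")


# Made-up links for illustration only.
print(is_onion_address("http://exampleaddress.onion/market"))  # True
print(is_onion_address("https://example.com/onion"))            # False
```

A conventional browser handed the first link has no way to look it up, because .onion names are not part of the ordinary domain name system.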
Tor was designed to protect its users' identities, and it is extremely effective.
Who built Tor? Was it a secret group of hackers?
Actually, it was the US Government. The US Government developed and refined the technology behind Tor to protect its own anonymity and communication channels:
“The core principle behind Tor, namely, 'onion routing,' was originally funded by the US Office of Naval Research in 1995, and the development of the technology was helped by DARPA in 1997.” (Joseph Babatunde Fagoyinbo)
Tor was finally released to the public in 2002.
Can Tor Be Hacked? Is There a Way to “Break” the Anonymity?
Yes and no. It is not possible to hack the Tor algorithm, as far as we understand. In theory, a communication can be traced backwards through the maze of computers to its source, but this cumbersome process would take decades to complete.
Humans are fallible, which means it is much easier to "hack" Tor's human operators than to hack the system itself. For example, Silk Road had a nearly perfect system. If it weren’t for a series of fortunate tips, and mistakes made by the founder, it would probably still be running.
For a similarly interesting story, WWII's Enigma machine was cracked in part by exploiting the habits and mistakes of its human operators.
Tor also doesn’t protect users from downloading malware that broadcasts their location to would-be attackers. This means user identities, especially those of inexperienced users, can still be exposed on the dark web.
It’s also possible to find out if someone is using Tor by tracking exit nodes, although this may not work against very sophisticated adversaries. An exit node is the last computer that a person's traffic passes through before reaching a target site. Many Tor exit nodes are well known and can be mapped with reasonable certainty.
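A rough sketch of what exit-node tracking can look like from a website's point of view is shown below. The addresses in `KNOWN_EXIT_NODES` are placeholders from ranges reserved for documentation, and `came_through_tor` is a hypothetical helper. A real check would rely on a regularly updated list of exit nodes, which is assumed here rather than shown.

```python
import ipaddress

# Placeholder addresses standing in for a published list of exit nodes.
KNOWN_EXIT_NODES = {
    ipaddress.ip_address(address)
    for address in ("192.0.2.44", "198.51.100.7")
}


def came_through_tor(visitor_ip: str) -> bool:
    """Return True if the visitor's address matches a known exit node."""
    return ipaddress.ip_address(visitor_ip) in KNOWN_EXIT_NODES


print(came_through_tor("192.0.2.44"))   # True
print(came_through_tor("203.0.113.9"))  # False
```

A match only suggests that the traffic arrived over Tor. It does not reveal who the user is or where they are, since the exit node's address replaces their own.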
How is Data on the Deep Web and Dark Web Useful?
Given the user anonymity and lack of searchability on the deep web and dark web, it’s no surprise that these sites are valuable data sources for identifying bad actors involved in a variety of crimes. This data can be immensely useful across industries, from law enforcement to retail chains.
For example, the deep web hosts discussion forums that are used to spread hate speech, target individuals, organize physical threats, host precursory documents, or discuss illegal activities such as shoplifting and drug use or sales. Paste sites, such as Pastebin, which are not indexed, are a good place to find evidence of data breaches. There are also a variety of unindexed pages on classifieds sites, which often contain adult services linked to human trafficking, or stolen goods for sale.
The dark web takes illegal activities even further: while a lot of dark web sites are actually scams, there are also marketplaces selling drugs, breached data, child pornography, and a variety of other illegal goods and services. The dark web also contains a number of forums and news/commentary sites where users openly exchange harmful ideas and potential threats.
Because the deep web is unindexed, and the dark web is cumbersome and dangerous to navigate, public safety officials must use specialized tools, such as Beacon, to access relevant content safely and efficiently.
To Sum Up:
- The Internet is relatively simple.
- The deep web is HUGE. It is also pretty boring.
- The dark web is a small portion of the deep web that is designed for anonymity, and consequently harbours illegal activity.
- Dark things do happen on the deep web, but not nearly as much as the media would like us to think.
- Data on the deep web and dark web is immensely valuable for organizations (such as law enforcement) looking for potential threats, from data breaches to drug trafficking.
Do you want to efficiently search through content on the deep web and dark web, and accomplish this safely? Beacon allows you to extract key information from the dark web in just a few clicks. Book a demo to learn more.