The Deep Web is Where All the Criminals Hang Out, Right?
Well, yes and no. The deep web is essentially all of the non-indexed, non-searchable content on the Internet. To really understand this, we need to quickly talk about what the Internet IS.
A Very Basic Summary of What the Internet Really Is:
This is a simplification of what is really going on. If you already know the intimate details about any of the following: IP, FTP, DNS, or similar – jump down to the next section. This part is not for you.
The Internet, in its simplest form, is a group of computers talking to each other. Computer A asks Computer B for a piece of information, and Computer B sends it back. The piece of information could be a webpage, an advertisement, a calculation—pretty much anything.
Each computer uses a unique name during this communication. That name is its IP address (IP stands for Internet Protocol; an address is formatted like this: 126.96.36.199). Because IP addresses are hard for humans to remember, we make them look like something else: a web address. When you type “Facebook.com,” a DNS (Domain Name System) server, usually run by your Internet Service Provider (Comcast, Google Fiber, Time Warner, Shaw), takes the request sent from your computer (Computer A) and quietly translates that domain name into the IP address of a Facebook server (Computer B). Facebook (Computer B) then sends you (Computer A) the information that you want to look at, at home, on the go, or at work (I’m not judging).
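To make the name-to-address step concrete, here is a minimal sketch of a lookup. It is a toy, not how DNS is actually implemented: the table `TOY_DNS_TABLE`, the function `resolve`, and the addresses (taken from the ranges reserved for documentation) are all made up for illustration.

```python
import ipaddress

# Hypothetical lookup table. Real DNS is a distributed system of many
# servers; these names and addresses are placeholders for illustration.
TOY_DNS_TABLE = {
    "example.com": "192.0.2.10",
    "example.org": "192.0.2.20",
}


def resolve(domain: str) -> str:
    """Translate a human-friendly domain name into an IP address."""
    # Domain names are case-insensitive, so normalize before looking up.
    name = domain.strip().lower().rstrip(".")
    if name not in TOY_DNS_TABLE:
        raise LookupError(f"no address known for {name!r}")
    address = TOY_DNS_TABLE[name]
    # Confirm the stored value really is a well-formed IP address.
    ipaddress.ip_address(address)
    return address


if __name__ == "__main__":
    print(resolve("Example.com"))
```

On a machine with network access, Python's standard `socket.gethostbyname("example.com")` performs a real lookup instead of consulting a local table.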
This is the same for any communication carried out on the Internet, including the deep web. The difference between the deep web and the rest of the Internet is whether or not you can search it.
I can search for a news article or a public product page. I can’t, however, search for a conversation your teenager had on WhatsApp last week, nor can I search for data transmitted from your Internet-enabled fridge to your iPhone telling you, “your beer is now cold.” Another example of data that you can’t search is the mindless chatter that computers constantly spew out to check on the status, health, or performance of other computers, or other equipment, in the system. This kind of data makes up the majority of the deep web. It's mostly boring information that is not useful to the vast majority of people. According to some sources, the deep web is approximately 500 times bigger than the surface web. What is rarely addressed is how much of the deep web is human-readable versus computer-readable.
But, What About the Criminals and the “Silk Road”?
This is where a distinct difference comes into play: the deep web versus the dark web.
According to Wikipedia, the deep web is "content on the World Wide Web that is not indexed by standard search engines." This is what we've been discussing so far.
The dark web, on the other hand, is "World Wide Web content that exists on darknets, overlay networks which use the public Internet but which require specific software, configurations or authorization to access. The dark web forms a small part of the deep web..."
"Huh?" (That's what we said.)
To access the dark web, you have to know where to look, and you must have the right tools to access it.
These areas of the deep web started well before the Internet was mainstream and popular. Early Internet users hung out in online chatrooms on Internet Relay Chat (IRC). Some criminal activity on the dark web today arose from these IRC communities.
The deep web is made up of hosted web pages and other unsearchable sites. Sites are intentionally made unsearchable every day. For example, when an organization is working on a new website that isn’t quite ready to show the world, it will tell Google and other search engines not to read and index the website.
This is done through a plain-text file called “robots.txt.” You can look at most robots.txt files by simply adding “/robots.txt” to a web address.
For example, take a hypothetical Organization X. To take its site off the searchable Internet, all an IT person needs to do is add the line “Disallow: /” (under a “User-agent: *” line) to the robots.txt file. Google interprets this as: “I don’t want you to read anything on my site. It's not ready yet!”
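The effect of that rule can be checked with Python's standard `urllib.robotparser` module. This is a minimal sketch: the file contents, the helper name `is_crawlable`, and the example.com URL are assumptions chosen for illustration, and a real crawler would fetch the file from the site rather than from a string.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks every crawler to stay away from the whole site.
BLOCK_EVERYTHING = """\
User-agent: *
Disallow: /
"""

# An empty Disallow value means nothing is off limits.
ALLOW_EVERYTHING = """\
User-agent: *
Disallow:
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given rules let this crawler read the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/new-site/index.html"
    print(is_crawlable(BLOCK_EVERYTHING, page))
    print(is_crawlable(ALLOW_EVERYTHING, page))
```

Note that robots.txt is only a request. Well-behaved search engines honor it, but nothing in the file stops a person or program from visiting the page directly.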
NOTE!: The greatest thing you can do for website “SEO” (search engine optimization) is not to have “Disallow: /” in your robots.txt!
That doesn’t mean that you can't access the site if you know where to find it. It only means that you can’t search for it using Google, Bing, or another search engine.
Considering how many millions of sites and billions of possible addresses there are, you are probably not going to find a site that someone wants to hide if you aren’t told where to look.
In essence, this is how people, not just criminals, keep their websites hidden in plain view: security through obscurity.
There are a number of other tricks that can be used to hide something on the Internet, but most nefarious websites are hidden in plain sight.
Accessing the Dark Web: Tor
The Silk Road was a notorious drug trafficking website that was shut down by the FBI in October of 2013. (Its successor, Silk Road 2.0, was shut down in November of 2014.)
There were a couple of additional cyber security measures in place (other than being un-indexed) that prevented average users from accidentally stumbling across that site.
One such countermeasure is forcing users to connect to a site using Tor. Tor is anonymity software, most commonly used through the Tor Browser, which looks much like any other Internet browser but helps users stay anonymous. It does this through a process called onion routing (the name "Tor" comes from "the onion router").
Tor forces a computer to run all of its communication through a series of other computers (typically three), called nodes, before it is directed to the final computer. Nodes, also called relays, can be just about any computer that has been set up with Tor software (you can download it from https://www.torproject.org/). People might set up a node because they strongly believe in anonymous browsing.
Traveling through multiple nodes means that by the time a communication gets to its destination, it is very difficult to determine its original location or IP address. This gives users a high degree of anonymity while browsing. Each node removes one layer of encryption and learns only where to send the message next; these layers are the layers of "the onion" in Tor.
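The layering idea can be sketched in a few lines. This is a toy under loud assumptions: it uses a home-made XOR keystream built from SHA-256 rather than real cryptography, the relay keys are made up, and it leaves out everything else Tor does (key negotiation, fixed-size cells, addressing). It only shows that the sender wraps a message once per relay and that each relay peels off exactly one layer. Do not use it to protect anything.

```python
import hashlib
from itertools import count


def keystream(key: bytes, length: int) -> bytes:
    """Stretch a key into `length` pseudo-random bytes (toy construction)."""
    out = bytearray()
    for counter in count():
        if len(out) >= length:
            break
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
    return bytes(out[:length])


def xor_layer(data: bytes, key: bytes) -> bytes:
    """Add or remove one layer; XOR with the same keystream undoes itself."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(message: bytes, relay_keys: list) -> bytes:
    """Sender side: one layer per relay, the first relay's layer outermost."""
    for key in reversed(relay_keys):
        message = xor_layer(message, key)
    return message


def route(onion: bytes, relay_keys: list) -> bytes:
    """Each relay in turn peels off its own layer with its own key."""
    for key in relay_keys:
        onion = xor_layer(onion, key)
    return onion


if __name__ == "__main__":
    keys = [b"guard-key", b"middle-key", b"exit-key"]  # hypothetical keys
    onion = wrap(b"GET /index.html", keys)
    print(route(onion, keys))
```

After the first relay removes its layer, the message is still scrambled by the other two, which is why no single relay in the middle can read the content.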
Tor was designed to protect its users' identities, and it is extremely effective.
Who built Tor? Was it a secret group of hackers?
Actually, it was the US Government. The US Government developed and refined the technology behind Tor to protect its own anonymity and communication channels:
“The core principle behind Tor, namely, 'onion routing,' was originally funded by the US Office of Naval Research in 1995, and the development of the technology was helped by DARPA in 1997”
—Joseph Babatunde Fagoyinbo.
Tor was finally released to the public in 2002.
Is There a Way of Hacking Tor? If Someone Is on Tor, Can I Find Out?
Yes and no. It is not possible to hack the Tor algorithm, as far as we understand. In principle, a connection can be traced backward through the maze of computers to its source, but unfortunately, by the time you are finished with this cumbersome process, your great-great-great grandchildren will be very old.
Humans are fallible, which means it is much easier to "hack" Tor's human operators than to hack the system itself. For example, The Silk Road had a nearly perfect system. If it weren’t for a series of fortunate tips, and mistakes made by the founder, it would probably still be running.
For a similarly interesting story, World War II's Enigma machine was cracked in large part by exploiting the habits and mistakes of its human operators.
Tor doesn’t protect users from potentially downloading malware that broadcasts individual locations to would-be attackers.
You can, however, find out if someone is using Tor by tracking exit nodes (this does not work against very sophisticated adversaries). An exit node is the last computer that a person's traffic passes through before going to a target site. Many Tor exit nodes are well known and mapped. As a result, traffic arriving from Tor can be identified with reasonable certainty.
In summary: if someone wants to browse anonymously online, there is very little that can be done about it. It is possible, however, to know how many people are using Tor to access a website!
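Here is a minimal sketch of the exit-node check described above. It assumes you have already obtained a plain-text list of known exit addresses, one per line. The sample list, the visitor addresses (all from ranges reserved for documentation), and the function names are hypothetical.

```python
import ipaddress

# Hypothetical exit list; a real one would come from a published source.
SAMPLE_EXIT_LIST = """\
# known exit relays (made-up addresses)
192.0.2.1
198.51.100.7
"""


def parse_exit_list(text: str) -> set:
    """Turn a one-address-per-line list into a set for fast lookups."""
    exits = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        exits.add(ipaddress.ip_address(line))
    return exits


def is_tor_exit(visitor_ip: str, exits: set) -> bool:
    """True if the visitor's address matches a known exit node."""
    return ipaddress.ip_address(visitor_ip) in exits


def count_tor_visits(visitor_ips: list, exits: set) -> int:
    """Count how many visits in a log arrived through known exit nodes."""
    return sum(is_tor_exit(ip, exits) for ip in visitor_ips)


if __name__ == "__main__":
    exits = parse_exit_list(SAMPLE_EXIT_LIST)
    visitors = ["192.0.2.1", "203.0.113.5", "198.51.100.7"]
    print(count_tor_visits(visitors, exits))
```

A match tells you only that a visit came through Tor, not who the visitor is. An exit node that is not on your list will go unnoticed, which is one reason the check fails against sophisticated adversaries.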
To Wrap Up:
- The Internet is relatively simple.
- The deep web is HUGE. It is also pretty boring.
- Dark things do happen on the deep web, but not nearly as much as the media would like us to think.
- If you want to use Tor, you can download and install it from here: https://www.torproject.org/. Your browsing will be much harder to trace back to you. It will also be significantly slower! Remember, however, that Tor doesn’t necessarily protect you from bad actors or from seeing harmful content.