The deep Web’s elusive nature may evoke images of hackers engaging in shady activities, but this is not an accurate representation.
© aetb/iStock/Thinkstock
Indeed, what a complex web we have. Approximately 40 percent of the global population uses the Web for news, entertainment, communication, and countless other purposes [source: Internet World Stats]. Yet even as the number of users grows, each of them reaches only a sliver of the data stored online, because only a small portion of the World Wide Web is easily accessible.
The so-called surface Web, which most of us engage with regularly, contains information that search engines can index and present in response to search queries. However, much like an iceberg where only the tip is visible, traditional search engines can see only a tiny share of what’s out there: a mere 0.03 percent [source: OEDB].
What about the rest of it? Well, much of it remains concealed within the deep Web. Known by various names such as the undernet, invisible Web, and hidden Web, this part of the internet houses data that can’t be discovered through a simple Google search.
While the exact size of the deep Web remains unknown, it is estimated to be hundreds, if not thousands, of times larger than the surface Web. This data isn’t deliberately concealed; it’s just that current search engine technology struggles to locate and interpret it.
There's another, murkier side to the deep Web, a part often called the dark Web. In this space, users intentionally hide data. Reaching it typically requires specialized browser software that can navigate the onion-like layers concealing the dark Web.
This software protects the privacy of both the data's origin and destination, as well as the individuals accessing it. For political activists and criminals alike, the anonymity of the dark Web is a powerful tool, enabling the exchange of information, goods, and services, both legal and illegal, much to the dismay of authorities around the globe.
Just as search engines only skim the surface of the Web, we are still just beginning to explore its vastness. Continue reading to uncover the full complexity of our interconnected Web.
Hidden in Plain Site
The deep Web dwarfs the surface Web. The current Web includes over 555 million registered domains, each containing dozens, hundreds, or even thousands of sub-pages, many of which aren't indexed and therefore fall into the deep Web.
While it’s difficult to pinpoint exactly, the deep Web may be 400 to 500 times larger than the surface Web [source: BrightPlanet]. Both the surface and deep Web continue to expand every day.
To grasp why so much data is hidden from search engines, it helps to understand the technologies behind online searches. You can explore this further with How Internet Search Engines Work, but here’s a brief overview.
Search engines build an index of data by discovering information hosted on websites and other online platforms. This involves using automated spiders or crawlers, which search for domains and follow hyperlinks from one site to another, much like an arachnid traversing a web, effectively mapping out the internet.
This index or map is crucial for locating specific information quickly. When you perform a keyword search, results are delivered almost instantly, thanks to the index. Without it, a search engine would have to examine billions of pages from scratch each time, which would be both cumbersome and frustrating.
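The crawl-then-index process described above can be sketched in a few lines of Python. This is a toy model, not how any real search engine is built: the `PAGES` dictionary, the `.example` addresses, and the function names are all made up for illustration, and the "Web" here is just four pages held in memory.

```python
from collections import defaultdict, deque

# Hypothetical miniature Web: address -> (page text, outgoing hyperlinks).
PAGES = {
    "a.example": ("deep web overview", ["b.example", "c.example"]),
    "b.example": ("search engines index pages", ["c.example"]),
    "c.example": ("crawlers follow links", ["a.example"]),
    "d.example": ("unlinked archive page", []),  # no other page links here
}


def crawl(seed):
    """Breadth-first crawl: follow hyperlinks outward from a seed address."""
    seen = {seed}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        for link in PAGES[url][1]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


def build_index(urls):
    """Inverted index: each word maps to the set of pages containing it."""
    index = defaultdict(set)
    for url in urls:
        for word in PAGES[url][0].split():
            index[word].add(url)
    return index


reachable = crawl("a.example")
index = build_index(reachable)

# A keyword search is now a single dictionary lookup, not a scan of every page.
print(sorted(index.get("crawlers", set())))
# The unlinked page was never discovered, so its words are not in the index.
print("d.example" in reachable, "archive" in index)
```

In this sketch the page at `d.example` exists but is never found, because no hyperlink leads to it. That is the simplest way a page ends up outside a search engine's map.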
Search engines are unable to access data stored in the deep Web due to various technical issues and data incompatibilities. Many private websites require login credentials, such as passwords, to view their content. Crawlers also cannot access data behind keyword searches on specific websites, and timed-access sites restrict public access once a certain time period has passed.
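Two of these barriers, login walls and content that appears only after a typed search, can be illustrated with a small simulation. Everything here is an assumption made for the example: the `Page` class, its fields, and the paths in `SITE` are hypothetical, and timed-access pages are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    links: list = field(default_factory=list)           # plain hyperlinks
    search_results: list = field(default_factory=list)  # shown only after a typed query
    needs_login: bool = False                            # content sits behind a password


# Hypothetical newspaper site.
SITE = {
    "/": Page(links=["/news", "/account", "/archive"]),
    "/news": Page(links=["/"]),
    "/account": Page(needs_login=True),
    "/archive": Page(search_results=["/archive/1999-story"]),
    "/archive/1999-story": Page(),
}


def crawl(start):
    """Index every page reachable through plain hyperlinks alone."""
    indexed, stack = set(), [start]
    while stack:
        path = stack.pop()
        if path in indexed:
            continue
        page = SITE[path]
        if page.needs_login:
            continue  # the crawler has no credentials to offer
        indexed.add(path)
        # Only static links are followed; the crawler never types a query,
        # so pages listed in search_results stay out of reach.
        stack.extend(page.links)
    return indexed


indexed = crawl("/")
deep = set(SITE) - indexed
print(sorted(indexed))
print(sorted(deep))
```

Under these assumptions the crawler indexes the home page, the news page, and the archive's search form, while the account page and the archived story remain in the site's own small slice of the deep Web.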
These numerous challenges, along with many others, make it significantly more difficult for search engines to locate and index data. Continue reading to learn more about the distinctions between the surface Web and the deep Web.
Just Below the Surface
If you visualize the Web as an iceberg, the vast section beneath the surface represents the deep Web, while the small portion visible above the water represents the surface Web.
©Jan Will/iStock/Thinkstock
As previously mentioned, there are millions of sub-pages scattered across millions of domains. These include internal pages with no external links, such as internal.Mytour.com, which serve maintenance purposes. Additionally, there are unpublished blog posts, image galleries, file directories, and countless other pieces of content that search engines cannot access.
For example, there are numerous independent news websites on the internet, and at times, search engines manage to index certain articles from these sites. This is particularly the case for high-profile news stories that attract significant media coverage. A simple search on Google will likely reveal a large number of articles on topics like the World Cup soccer teams.
However, if you're on the hunt for a more obscure story, you may need to visit a specific newspaper's website and either browse or search their content to find what you need. This becomes especially relevant as news stories age. Older stories are more likely to be archived in the newspaper's database, hidden from the public view of the surface Web, making them harder for search engines to find. This means these articles belong to the deep Web.
Deep Potential
Unlocking the deep Web could open up opportunities to search through specialized professional databases and access hard-to-find information, which could greatly benefit fields like medicine.
While the deep Web contains data that is difficult for search engines to detect, its obscurity doesn’t mean the information is any less important. As the newspaper example shows, data stored in the deep Web can hold immense value.
The deep Web is an enormous repository of information, containing vast databases for engineering, finance, medical studies, images, illustrations, and much more. The amount of data seems practically boundless.
As the deep Web continues to expand and grow more intricate, search engine developers face the task of diving deeper into this complex space to surface relevant data. Their challenge is not only to locate trustworthy information but also to present it in a manner that doesn’t overwhelm users.
As in any business, search engines have bigger concerns than helping us find the perfect apple crisp recipe. They also want to help large corporations explore and use the deep Web in novel and valuable ways.
For example, construction engineers could potentially search for academic papers across universities to discover the latest breakthroughs in bridge-building materials. Similarly, doctors could quickly find the most recent research about a specific illness.
While the possibilities in the deep Web are endless, the technical obstacles are formidable. This complexity is part of what makes the deep Web so intriguing. However, there is also a darker side, one that many find unsettling for various reasons.
Darkness Falls
The deep Web is like an unexplored wilderness, brimming with hidden opportunities. With the right skills and some luck, you can uncover valuable secrets that many have worked to keep out of sight. On the dark Web, where the goal is to keep things concealed, it's best if you keep the lights turned off.
The dark Web is a bit like the Web's hidden core. It thrives on privacy and anonymity, and it holds immense power. It brings out human instincts in every form, both the noble and the nefarious.
As is often the case, the darker aspects of the Web make the most noise. On the dark Web, you can find illegal trade of all sorts: narcotics, child exploitation materials, stolen financial data, human trafficking, weapons, exotic animals, pirated content, and more. In theory, it’s even possible to hire a hit man.
However, this information won't show up with a simple Google search. Accessing these types of sites requires special software, such as The Onion Router, commonly referred to as Tor.
Tor is software, most often used through its own browser, that establishes the connections you need to explore dark Web sites. It encrypts traffic to help safeguard users' anonymity online and routes it through multiple servers across the globe, making it significantly more difficult to track.
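The layered routing idea can be sketched with a toy model. This is not Tor's actual protocol or code. Real onion routing relies on genuine cryptography with separate keys for each relay, whereas this sketch replaces encryption with a simple name check. The relay names, the `.onion` address, and every function below are hypothetical.

```python
import json


def seal(relay, content):
    """Stand-in for encryption: mark a layer so only `relay` may open it."""
    return {"for": relay, "sealed": json.dumps(content)}


def unseal(relay, layer):
    """Open a layer, refusing if it was sealed for a different relay."""
    if layer["for"] != relay:
        raise PermissionError(f"{relay} cannot open a layer meant for {layer['for']}")
    return json.loads(layer["sealed"])


def build_onion(message, destination, route):
    """Wrap the message in one layer per relay, innermost layer first."""
    next_hop, payload = destination, message
    for relay in reversed(route):
        payload = seal(relay, {"next": next_hop, "payload": payload})
        next_hop = relay
    return payload


def send(onion, route):
    """Each relay peels one layer and learns only where to forward the rest."""
    layer, hops = onion, []
    for relay in route:
        content = unseal(relay, layer)
        hops.append((relay, content["next"]))
        layer = content["payload"]
    return hops, layer


route = ["entry", "middle", "exit"]  # hypothetical relays
onion = build_onion("hello", "hidden-service.onion", route)
hops, delivered = send(onion, route)
for relay, next_hop in hops:
    print(f"{relay} forwards to {next_hop}")
print(delivered)
```

In this model the entry relay sees only that the traffic goes on to the middle relay, and the exit relay sees the destination but not who built the packet. No single relay holds both ends of the path, which is the property that makes the traffic hard to trace.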
Tor enables users to visit so-called hidden services, secretive Web sites that are commonly associated with the dark Web. Unlike traditional domains ending in .com or .org, these hidden sites have the .onion extension. In the next section, we’ll delve deeper into these intriguing 'onion' sites.
Titillating Tor
In October 2013, U.S. authorities dismantled Silk Road after the suspected owner, Ross William Ulbricht, was apprehended.
© David Colbran/Demotix/Corbis
The most notorious of the onion sites was Silk Road, a now-defunct online marketplace where users could purchase illegal goods like drugs, weapons, and more. Though the FBI eventually arrested its creator, Ross Ulbricht, numerous similar sites, such as Black Market Reloaded, continue to exist on the dark Web.
Interestingly, Tor originated from research conducted by the U.S. Naval Research Laboratory. It was initially developed to help political activists and whistleblowers communicate securely, protecting them from potential retaliation.
Tor proved so successful in offering anonymity that it quickly became popular among those with criminal intentions as well, who began utilizing it for their own purposes.
This situation has placed U.S. law enforcement in the paradoxical position of trying to track criminals who are using government-created software to cover their tracks. Tor, it appears, is both a tool for privacy and a challenge for authorities.
Anonymity is a fundamental aspect of the dark Web, but you might wonder how financial transactions can take place when buyers and sellers are unable to identify one another. This is where Bitcoin comes into play.
If Bitcoin is unfamiliar to you, it's a digital currency secured by cryptography. For more information, check out How Bitcoin Works. Much like physical currency, Bitcoin can be used for all types of transactions. Importantly, payments are tied to addresses rather than to names, which gives users a degree of anonymity whether a transaction is legal or not, although every transfer is recorded on a public ledger and can sometimes be traced.
Bitcoin may be the currency of the future -- a decentralized and unregulated type of money free of the reins of any one government. But because Bitcoin isn't backed by any government, its value fluctuates, often wildly. It's anything but a safe place to store your life savings. But when paired properly with Tor, it's perhaps the closest thing to a foolproof way to buy and sell on the Web.
The Brighter Side of Darkness
A significant aspect of Bitcoin's appeal is the anonymity of transactions.
© audioundwerbung/iStock/Thinkstock
The dark Web has its ominous overtones. But not everything on the dark side is bad. There are all sorts of services that don't necessarily run afoul of the law.
The dark Web is home to alternate search engines, e-mail services, file storage, file sharing, social media, chat sites, news outlets and whistleblowing sites, as well as sites that provide a safer meeting ground for political dissidents and anyone else who may find themselves on the fringes of society.
At a time when widespread surveillance like that of the NSA is commonplace and privacy seems to be a thing of the past, the dark Web provides a refuge for those who value their anonymity. Search engines on the dark Web may not personalize results, but they also don't monitor your online activity or bombard you with constant ads. And while Bitcoin's value may fluctuate, it offers a level of privacy that your credit card company cannot match.
A study conducted by researchers at the University of Luxembourg aimed to rank the most visited sites on the dark Web. The research revealed that while illegal activities and adult content dominate, there is also a significant presence of sites dedicated to human rights and freedom of information [source: ArXiv].
While the dark Web certainly has its sinister side, it also holds immense potential.
Even Deeper
The deep Web is continuously growing. Its vast collection of both knowledge and trivial content expands every day, making it increasingly difficult to navigate and comprehend. Ultimately, this complexity represents one of the greatest challenges of the Internet we've built.
Programmers will keep refining search engine algorithms, enhancing their ability to explore the deeper parts of the Web. In doing so, they'll enable researchers and businesses to link and cross-reference information in ways that were previously unimaginable.
At the same time, the main role of a smart search engine isn't merely to find data. The real goal is to uncover the most pertinent information. Otherwise, users would be overwhelmed by a sea of irrelevant data, leaving them regretting their click.
This is the issue with so-called big data. Big data refers to enormous datasets that become too unwieldy and confusing to manage. As the Internet expands rapidly, our world is inundated with data, making it difficult for anyone, even the advanced computers at Bing and Google, to process it all.
As the Internet continues to grow, large corporations are investing more and more in data management and analysis. These efforts are crucial for maintaining internal operations and gaining competitive advantages. Mining and organizing the deep Web is a key element of these strategies. Companies that can use this data to their advantage will thrive, potentially creating groundbreaking technologies. In contrast, those who rely only on the surface Web will struggle to keep up.
Meanwhile, the deep Web will persist in both mystifying and captivating Internet users. It harbors a vast amount of knowledge that could propel technological and human progress when interconnected with other pieces of data. And, naturally, its darker side will always be present, as it is a reflection of human nature. The deep Web speaks to the limitless, fragmented potential not just of the Internet, but also of humanity as a whole.
