Search engines reconnaissance – The magic weapons

Pierluigi Paganini November 09, 2013

Search engines are formidable tools for reconnaissance, and Google Hacking is essential knowledge for professionals searching for website vulnerabilities.

Search engines are powerful tools for attackers who need to conduct passive reconnaissance. They help to gather information on the target’s network organization, the applications it uses and their vulnerabilities, sensitive documents, and company personnel. I decided to start from an interesting submodule of the hacking program proposed by The Hacker Academy, dedicated to the use of Google during a penetration test, and to extend the discussion with a proof of concept.

Johnny Long is the expert who first introduced the concept of Google Hacking in the book of the same name, and he has long been talking about the use of search engines like Google for hacking purposes. The book is considered a bible by hackers who use Google to collect information for the attack phase.

It is worth clarifying that web searches using these “tips” are not illegal, even though the data retrieved was not intended for public distribution, as stated by the authors of the book:

“Nothing I am going to describe to you is illegal, nor does it in any way involve accessing unauthorized data,”

Hackers who want to use the Google search engine for reconnaissance purposes need to know Google basics such as modifiers and operators.

The principal search modifiers are:

Modifier    Description
+           Requires a term to match exactly
-           Avoids results that match the term
*           Wildcard
“”          Searches for a specific phrase


The principal search operators are:

Search Operator    Description

allintext:    If you start your query with allintext:, Google restricts results to those containing all the query terms you specify in the text of the page.

allintitle:    If you start your query with allintitle:, Google restricts results to those containing all the query terms you specify in the title.

allinurl:    If you start your query with allinurl:, Google restricts results to those containing all the query terms you specify in the URL.

filetype:    If you include filetype:suffix in your query, Google will restrict the results to pages whose names end in suffix. For example, [ user guide filetype:pdf ] will return Adobe Acrobat PDF files that match the terms “user” and “guide”. filetype: is very useful for finding “hidden” documents and commonly exploited file types. Typical searches for vulnerabilities (e.g. searching for vulnerable scripts and files) include the suffixes php, cgi, jsp, swf and asp.

intext:    The query intext:term restricts results to documents containing term in the text. intext: makes it possible to find pages containing known phrases.

intitle:    The query intitle:term restricts results to documents containing term in the title. Using intitle: it is possible to find pages with common titles (e.g. “Administrator”).

inurl:    If you include inurl: in your query, Google will restrict the results to documents containing that word in the URL.


site:    If you include site: in your query, Google will restrict your search results to the site or domain you specify. For example, [ privacy site:nsa.gov ] will show privacy information from the NSA site and [ peace site:gov ] will find pages about peace within the .gov domain. The site: operator is useful to locate files within a specific domain and also allows searching all of its indexed pages. During the attack phase site: is useful to map all services provided by the target.


By combining the above operators and modifiers it is possible to execute complex queries. Suppose we are interested in finding the email applications present on a website, used to submit communications to the site management. The following table shows how combining the Google operators improves the quality of the search.


Searched string                        Number of results
[formmail.cgi]                         232,000
[inurl:formmail.cgi]                   3,940
[inurl:formmail.cgi filetype:cgi]      5,920
[inurl:formmail.cgi filetype:cgi]      56
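Combined queries like the ones in the table can also be assembled programmatically. The following is only a minimal sketch: the helper names build_query and search_url are my own, not part of any Google API, and the script simply builds the search string and the corresponding search URL without sending any request.

```python
from urllib.parse import urlencode


def build_query(*terms, phrase=None, exclude=(), **operators):
    """Compose a search string from plain terms, modifiers and operators.

    Hypothetical helper: the function and its parameters are illustrative.
    """
    parts = list(terms)
    if phrase:
        parts.append(f'"{phrase}"')             # "" modifier: exact phrase
    parts += [f"-{word}" for word in exclude]   # - modifier: avoid a term
    # Operators such as inurl, filetype or site become operator:value pairs.
    parts += [f"{name}:{value}" for name, value in operators.items()]
    return " ".join(parts)


def search_url(query):
    """Return a URL-encoded Google search URL for the given query."""
    return "https://www.google.com/search?" + urlencode({"q": query})


query = build_query(inurl="formmail.cgi", filetype="cgi")
print(query)
print(search_url(query))
```

Keeping the operators as keyword arguments makes it easy to add or drop a restriction (for example site=) and compare how the number of results changes.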


Google is the perfect instrument for finding information on people (e.g. email addresses, names, key management figures), and attackers could use it to search for key professionals within the target organization.

Hackers using Google could easily retrieve the following information on the target organization:

  • Staff Information
      • Organizational chart of the company and organization of internal departments.
      • Staff list and positions.
      • Contact information.
  • Facilities Information
      • Maps of facility locations, buildings and satellite images.
      • Maps of building interiors showing departments and function of areas.
  • Operational Information
      • Job listings including needed technical skills, which give an attacker information on the technology used by the company.
      • Help Desk Frequently Asked Questions.
      • Security policies.
  • Subcontractors Information


Hackers could use this information for social engineering attacks: they could contact victims pretending to be a member of the organization, for example a member of IT support.

Using the appropriate search operator (e.g. filetype) an attacker can retrieve documents that are crucial for enterprise security, such as security policies, networking policies, BYOD policies or detailed installation procedures for the applications and appliances used by the company. In the majority of cases these documents contain precious information. Digging on the Internet with Google it is easy to find lists of the network devices present in corporate data centers, detailed configurations of the appliances (e.g. ports exposed on the Internet, firewall rules), and lists of the mobile devices and phone numbers assigned to employees. I remember that the first time I produced documents for an ISO 27001 certification I found, using Google dorks, many Statements of Applicability (SoA), a strategic document that defines how a company implements a large part of its information security, and a gold mine for an attacker. The following is a sample query for this kind of search:

[statement of applicability filetype:xls]

Hackers could search for corporate documents that follow the known naming conventions proposed by the principal standards and that address sensitive corporate functions.
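To give an idea of how such searches can be enumerated, for instance by a security team checking what its own organization exposes, here is a small sketch that pairs document names with file types for a single domain. The lists of names and file types and the domain example.com are assumptions of mine, chosen only for illustration; the script prints the queries and performs no search.

```python
from itertools import product

# Hypothetical inputs: names and file types are illustrative, not exhaustive.
DOCUMENT_NAMES = ["statement of applicability", "security policy", "BYOD policy"]
FILE_TYPES = ["xls", "pdf", "doc"]


def document_queries(domain, names=DOCUMENT_NAMES, file_types=FILE_TYPES):
    """Yield one query per (document name, file type) pair for a domain."""
    for name, file_type in product(names, file_types):
        # Exact phrase + filetype: + site: restrict the search to one domain.
        yield f'"{name}" filetype:{file_type} site:{domain}'


for query in document_queries("example.com"):
    print(query)
```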

Another interesting way to exploit the Google engine for reconnaissance is to search through “Google Groups” postings to find computer network security policies posted on web pages.

Hackers use this technique to collect corporate documents, searching for combinations of the target site’s domain name plus mail provider names (e.g. “”, “”, “” … and so on).

Search engines are also powerful tools to identify known vulnerabilities within target systems. Recently we have observed that many large-scale attacks were characterized by an automated reconnaissance phase conducted using platforms such as Google. Hackers exploit the search engine to find evidence of software and applications that have documented flaws and could be easily exploited. Similar searches could be directed against specific targets, by properly playing with the above operators, or could be used blindly.

With Google dorks it is possible to search for websites vulnerable to SQL injection attacks or platforms with a default security setting that could be exploited. Suppose we are interested in finding a vulnerable website in the whole .it domain. A first query that I can use to identify candidate victims could be something like this:

Obviously you will receive a huge number of websites; with the manual method it is possible to search for errors that give us further info on the target. Once you have chosen your target site, check whether it is vulnerable by simply adding an apostrophe ( ‘ ) to the end of the URL.

In the presence of a vulnerability, the page returns an error like the following, or something similar, somewhere on the page.

” Error executing query: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘\\\’ ORDER BY date_added DESC’ at line 1 “

This kind of message is a gold mine for attackers, who can work out which version of the DB is under attack and which tables it contains.
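The same check can be automated on the text of a page, which is also useful for site owners who want to know whether their own pages leak database errors. The sketch below is only illustrative: the function name detect_sql_error is my own, the MySQL signature comes from the message quoted above, and the other patterns are common error fragments that vary with database version and driver.

```python
import re

# Illustrative signatures only; real error text varies by database and driver.
SQL_ERROR_PATTERNS = {
    "MySQL": re.compile(r"You have an error in your SQL syntax", re.I),
    "PostgreSQL": re.compile(r"syntax error at or near", re.I),
    "SQL Server": re.compile(r"Unclosed quotation mark after the character string", re.I),
    "Oracle": re.compile(r"ORA-\d{5}"),
}


def detect_sql_error(body):
    """Return the names of the databases whose error signature appears in body."""
    return [name for name, pattern in SQL_ERROR_PATTERNS.items()
            if pattern.search(body)]


sample = ("Error executing query: You have an error in your SQL syntax; "
          "check the manual that corresponds to your MySQL server version")
print(detect_sql_error(sample))
```

The function works on text that has already been retrieved, so it can be run against saved pages or application logs without touching any server.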

Using more complex queries an attacker could obtain a range of information on the status of the target, for example to discover whether it has already been “backdoored” and which vulnerabilities could potentially affect the system. The Google Hacking Database provides various examples of queries that can help a hacker to find vulnerable servers, gain information on the target, explore sensitive directories to find vulnerable files, and find password files or sensitive online shopping info.

  • inurl:”r00t.php” – This dork finds websites that were hacked and backdoored, and that expose their system information.
  • allintext:”fs-admin.php” – A foothold using allintext:”fs-admin.php” shows the world-readable directories of a plug-in that enables WordPress to be used as a forum. Many of the results of the search also show error logs, which give an attacker the server-side paths including the home directory name. This name is often also used for the login to FTP and shell access, which exposes the system to attack. There is also an undisclosed flaw in version 1.3 of the software: the author mentions a security fix in version 1.4, but does not say what was patched.
  • filetype:config inurl:web.config inurl:ftp – This Google dork finds sensitive information of MySqlServer, “uid, and password”, in web.config through FTP.
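Administrators can turn the same dorks against their own sites before someone else does. The following sketch checks a list of URLs, for example taken from a site’s own sitemap or access logs, against the patterns above. Note that matches_dork is a hypothetical helper of mine that only approximates the behavior of inurl: and filetype: with simple string matching, and the URLs are invented for the example.

```python
from urllib.parse import urlparse


def matches_dork(url, inurl=(), filetype=None):
    """Approximate inurl:/filetype: matching against a single URL.

    Assumption: plain substring and suffix checks stand in for the
    search engine's own matching rules.
    """
    lowered = url.lower()
    # inurl: every listed word must appear somewhere in the URL.
    if any(word.lower() not in lowered for word in inurl):
        return False
    # filetype: the path must end with the given suffix.
    if filetype is not None:
        path = urlparse(lowered).path
        if not path.endswith("." + filetype.lower()):
            return False
    return True


# Hypothetical URLs, e.g. from a site's own sitemap or access logs.
urls = [
    "http://example.com/forum/index.php",
    "http://example.com/uploads/r00t.php",
    "ftp://example.com/site/web.config",
]

flagged = [
    u for u in urls
    if matches_dork(u, inurl=["r00t.php"])
    or matches_dork(u, inurl=["web.config", "ftp"], filetype="config")
]
print(flagged)
```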

Attackers could use similar techniques to discover a wide range of web vulnerabilities. If you believe that the process could be very time consuming and boring, let me suggest you try one of the numerous tools available in the underground.

Almost every tool has a preset of dorks to use in the search for vulnerabilities, and many tools also make it possible to automatically scan a large number of websites, providing detailed information about the flaws discovered and how to exploit them.

(Image: Google hacking – Google Dorks SQL Injection Mass Web Site Hacking Tool)

Let’s conclude this short overview of Google hacking techniques by mentioning another excellent use of the popular search engine: its caching service. The content of web pages cached by Google could allow attackers to access the target’s resources without ever accessing the target servers.

Pierluigi Paganini

(Security Affairs – Search engines, Pen Testing)
