• Home
  • Cyber Crime
  • Cyber warfare
  • APT
  • Data Breach
  • Deep Web
  • Digital ID
  • Hacking
  • Hacktivism
  • Intelligence
  • Internet of Things
  • Laws and regulations
  • Malware
  • Mobile
  • Reports
  • Security
  • Social Networks
  • Terrorism
  • ICS-SCADA
  • POLICIES
  • Contact me
MUST READ

Microsoft Patch Tuesday security updates for July 2025 fixed a zero-day

 | 

Italian police arrested a Chinese national suspected of cyberespionage on a U.S. warrant

 | 

U.S. CISA adds MRLG, PHPMailer, Rails Ruby on Rails, and Synacor Zimbra Collaboration Suite flaws to its Known Exploited Vulnerabilities catalog

 | 

IT Worker arrested for selling access in $100M PIX cyber heist

 | 

New Batavia spyware targets Russian industrial enterprises

 | 

Taiwan flags security risks in popular Chinese apps after official probe

 | 

U.S. CISA adds Google Chromium V8 flaw to its Known Exploited Vulnerabilities catalog

 | 

Hunters International ransomware gang shuts down and offers free decryption keys to all victims

 | 

SECURITY AFFAIRS MALWARE NEWSLETTER ROUND 52

 | 

Security Affairs newsletter Round 531 by Pierluigi Paganini – INTERNATIONAL EDITION

 | 

North Korea-linked threat actors spread macOS NimDoor malware via fake Zoom updates

 | 

Critical Sudo bugs expose major Linux distros to local Root exploits

 | 

Google fined $314M for misusing idle Android users' data

 | 

A flaw in Catwatchful spyware exposed logins of +62,000 users

 | 

China-linked group Houken hit French organizations using zero-days

 | 

Cybercriminals Target Brazil: 248,725 Exposed in CIEE One Data Breach

 | 

Europol shuts down Archetyp Market, longest-running dark web drug marketplace

 | 

Kelly Benefits data breach has impacted 550,000 people, and the situation continues to worsen as the investigation progresses

 | 

Cisco removed the backdoor account from its Unified Communications Manager

 | 

U.S. Sanctions Russia's Aeza Group for aiding crooks with bulletproof hosting

 | 
  • Home
  • Cyber Crime
  • Cyber warfare
  • APT
  • Data Breach
  • Deep Web
  • Digital ID
  • Hacking
  • Hacktivism
  • Intelligence
  • Internet of Things
  • Laws and regulations
  • Malware
  • Mobile
  • Reports
  • Security
  • Social Networks
  • Terrorism
  • ICS-SCADA
  • POLICIES
  • Contact me
  • Home
  • Breaking News
  • Hacking
  • How hackers use Robots txt to harvest information

How hackers use Robots txt to harvest information

Pierluigi Paganini May 19, 2015

The penetration tester Thiebauld Weksteen has published an interesting analysis to explaine the importance of robots.txt for the hacking activities.

Thiebauld Weksteen, a penetration tester from Melbourne is advising system administrators that robots.txt can give precious details to hackers, when it comes to attacks, because robots.txt as the capability to tell search engines which directories can and cannot be crawled on a web server.

Weksteen, a security expert at Securus Global explained that system administrators offer clues to hackers related the storage of sensitive assets and information by including the path in the robots.txt file. The analysis of robots.txt could help the intruder to target the attack, instead of trying to strike blindly.

“In the simplest cases, it (robots.txt) will reveal restricted paths and the technology used by your servers, ” From a defender perspective, two common fallacies remain; that robots.txt somewhat is acting as an access control mechanism [and that] content will only be read by search engines and not by humans.”

In the same way that portal administrators don’t invest enough time in patching vulnerabilities, robots.txt normally shows the lack of caring, regularly reporting lists of text to disallow.

robots txt

Reconnaissance is an essential activity for a penetration tester, by harvesting robots.txt it is possible to get precious information for the sensitive assets.

“During the reconnaissance stage of a web application testing, the tester usually uses a list of known subdirectories to brute force the server and find hidden resources.

Depending on the uptake of certain web technologies, it needs to be refreshed on a regular basis. As you may see, the directive disallow gives an attacker precious knowledge on what may be worth looking at. Additionally, if that is true for one site, it is worth checking for another. ”

This is done by using a crawling technique, and as reported by Weksteen, on a total of 59,558 crawled website, 59,436 sent us a response.

“This is a good indicator of the freshness of Common Crawl results. Of these, 37,431 responded with a HTTP status of 200. Of these, 35,376 returned something that looked like a proper robots.txt (i.e., matching at least one standard directive).”

Administrators need to spend more time in hardening their systems, and if possible excluding assets based in generic terms, not specific ones.

Let me suggest you to read the analysis published by Weksteen at the following address
http://thiébaud.fr/robots.txt.html

About the Author Elsio Pinto

Elsio Pinto (@high54security) is at the moment the Lead Mcafee Security Engineer at Swiss Re, but he also as knowledge in the areas of malware research, forensics, ethical hacking. He had previous experiences in major institutions being the European Parliament one of them. He is a security enthusiast and tries his best to pass his knowledge. He also owns his own blog http://high54security.blogspot.com/

Edited by Pierluigi Paganini

(Security Affairs – Robots.txt, hacking)


facebook linkedin twitter

Hacking penetration testing Pierluigi Paganini Robots.txt security Security Affairs

you might also like

Pierluigi Paganini July 08, 2025
Microsoft Patch Tuesday security updates for July 2025 fixed a zero-day
Read more
Pierluigi Paganini July 08, 2025
Italian police arrested a Chinese national suspected of cyberespionage on a U.S. warrant
Read more

leave a comment

newsletter

Subscribe to my email list and stay
up-to-date!

    recent articles

    Microsoft Patch Tuesday security updates for July 2025 fixed a zero-day

    Security / July 08, 2025

    Italian police arrested a Chinese national suspected of cyberespionage on a U.S. warrant

    Intelligence / July 08, 2025

    U.S. CISA adds MRLG, PHPMailer, Rails Ruby on Rails, and Synacor Zimbra Collaboration Suite flaws to its Known Exploited Vulnerabilities catalog

    Hacking / July 08, 2025

    IT Worker arrested for selling access in $100M PIX cyber heist

    Cyber Crime / July 08, 2025

    New Batavia spyware targets Russian industrial enterprises

    Malware / July 07, 2025

    To contact me write an email to:

    Pierluigi Paganini :
    pierluigi.paganini@securityaffairs.co

    LEARN MORE

    QUICK LINKS

    • Home
    • Cyber Crime
    • Cyber warfare
    • APT
    • Data Breach
    • Deep Web
    • Digital ID
    • Hacking
    • Hacktivism
    • Intelligence
    • Internet of Things
    • Laws and regulations
    • Malware
    • Mobile
    • Reports
    • Security
    • Social Networks
    • Terrorism
    • ICS-SCADA
    • POLICIES
    • Contact me

    Copyright@securityaffairs 2024

    We use cookies on our website to give you the most relevant experience by remembering your preferences and repeat visits. By clicking “Accept All”, you consent to the use of ALL the cookies. However, you may visit "Cookie Settings" to provide a controlled consent.
    Cookie SettingsAccept All
    Manage consent

    Privacy Overview

    This website uses cookies to improve your experience while you navigate through the website. Out of these cookies, the cookies that are categorized as necessary are stored on your browser as they are essential for the working of basic functionalities...
    Necessary
    Always Enabled
    Necessary cookies are absolutely essential for the website to function properly. This category only includes cookies that ensures basic functionalities and security features of the website. These cookies do not store any personal information.
    Non-necessary
    Any cookies that may not be particularly necessary for the website to function and is used specifically to collect user personal data via analytics, ads, other embedded contents are termed as non-necessary cookies. It is mandatory to procure user consent prior to running these cookies on your website.
    SAVE & ACCEPT