Malicious Python Package uses Unicode to evade detection

Pierluigi Paganini March 27, 2023

Researchers discovered a malicious package on PyPI that uses Unicode to evade detection while stealing sensitive data.

Supply chain security firm Phylum discovered a malicious Python package on the Python Package Index (PyPI) repository that uses Unicode to evade detection and deliver information-stealing malware.

The package, named onyxproxy, was uploaded to the PyPI repository on March 15, 2023. Analysis of the package revealed that it includes data-harvesting capabilities.

“Phylum’s automated platform recently detected the onyxproxy package on PyPI, a malicious package that harvests and exfiltrates credentials and other sensitive data. In many ways, this package typifies other token stealers that we have found prevalent in PyPI.” reads the analysis published by Phylum. “However, one feature of this particular package caught our eye: an obfuscation technique that was foreseen in 2007 during a discussion about Python’s support for Unicode, documented in PEP-3131.”

While inspecting the code, the experts noticed multiple strange, non-monospaced, sans-serif characters with mixed bold and italics. The attackers used Unicode variants of characters that look nearly identical to a human reader (homoglyphs), e.g., self vs. 𝘀𝘦𝘭𝘧. The attackers used this trick to evade detection, yet the code still runs: the Python interpreter normalizes identifiers (to NFKC form, as specified in PEP-3131) when it parses the source, so the malicious code is executed as if it had been written in plain ASCII.
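The effect can be reproduced with a harmless snippet. The sketch below is not taken from the onyxproxy package; the variable names and the assigned value are illustrative assumptions. It shows that a lookalike spelling of self is a different string from the ASCII one, yet binds the same name once Python parses it.

```python
import unicodedata

# "self" spelled with Mathematical Sans-Serif Bold/Italic letters (not ASCII).
# Illustrative value only; this is not code from the malicious package.
fancy = "\U0001D600\U0001D626\U0001D62D\U0001D627"

# A plain substring search for the ASCII name misses the lookalike spelling.
source = f"{fancy} = 41\n"
print("self" in source)  # False

# NFKC normalization folds the lookalike letters back to ASCII.
print(unicodedata.normalize("NFKC", fancy) == "self")  # True

# The interpreter applies the same normalization to identifiers when it
# parses the source, so the assignment binds the ordinary name "self".
namespace = {}
exec(source, namespace)
print(namespace["self"])  # 41
```

A scanner that matches on raw ASCII strings sees nothing suspicious in such a file, while the interpreter treats both spellings as the same identifier.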

“An obvious and immediate benefit of this strange scheme is readability. We can still easily reason about this code, because our eyes and brains can still read the words, despite the intermixed fonts. Moreover, these visible differences do not prevent the code from running, which it does.” continues the analysis. “One might dismiss this as a developer trying to show how clever they can be, except that this package is trying to steal and exfiltrate things immediately upon installation.”

A similar technique was detailed by the researchers Nicholas Boucher and Ross Anderson, who explained how to abuse bidirectional override characters and homoglyphs in a variety of programming languages.
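Both tricks can be looked for with a simple character scan. The following is a minimal sketch under stated assumptions, not Phylum's detection logic: the function name scan_source and the finding labels are hypothetical, and the check does not distinguish identifiers from strings or comments. It flags bidirectional control characters and non-ASCII characters that NFKC normalization folds to an ASCII letter or digit. It will not catch cross-script lookalikes (for example Cyrillic letters), which NFKC leaves unchanged.

```python
import unicodedata

# Bidirectional embedding/override (U+202A..U+202E) and isolate
# (U+2066..U+2069) control characters.
BIDI_CONTROLS = {chr(c) for c in (*range(0x202A, 0x202F), *range(0x2066, 0x206A))}


def scan_source(text):
    """Return (line, column, kind, detail) tuples for suspicious characters.

    Hypothetical helper for illustration; line numbers are 1-based and
    columns are 0-based.
    """
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line):
            if ch in BIDI_CONTROLS:
                findings.append((lineno, col, "bidi-control", f"U+{ord(ch):04X}"))
            elif ord(ch) > 127:
                # Compatibility lookalikes fold to plain ASCII under NFKC.
                folded = unicodedata.normalize("NFKC", ch)
                if folded.isascii() and folded.isalnum():
                    findings.append((lineno, col, "ascii-lookalike", folded))
    return findings


# Illustrative input: a lookalike "self" and a right-to-left override.
sample = "x = 1\n\U0001D600\U0001D626\U0001D62D\U0001D627 = 2\n# \u202e note\n"
for finding in scan_source(sample):
    print(finding)
```

Legitimate non-ASCII text in comments or string literals can also trigger the lookalike check, so the output is better treated as a prompt for manual review than as a verdict.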

The experts pointed out that the author of onyxproxy is not sophisticated and likely just cut and pasted code from various sources. The obfuscation technique is absent from other parts of the code in setup.py, and many Python modules are imported multiple times.

“But, whomever this author copied this obfuscated code from is clever enough to know how to use the internals of the Python interpreter to generate a novel kind of obfuscated code, a kind that is somewhat readable without divulging too much of exactly what the code is trying to steal.” concludes the report. “This novelty is something that we will be keeping an eye on at Phylum, because now that this technique has proven viable in the wild, we fully anticipate others to copy and improve their attempts to attack developers.”

Pierluigi Paganini

(SecurityAffairs – hacking, Python)
