Supply chain security firm Phylum discovered a malicious Python package on the Python Package Index (PyPI) repository that uses Unicode to evade detection and deliver information-stealing malware.
The package, named onyxproxy, was uploaded to PyPI on March 15, 2023. Analysis of the package revealed data-harvesting capabilities.
“Phylum’s automated platform recently detected the onyxproxy package on PyPI, a malicious package that harvests and exfiltrates credentials and other sensitive data. In many ways, this package typifies other token stealers that we have found prevalent in PyPI,” reads the analysis published by Phylum. “However, one feature of this particular package caught our eye: an obfuscation technique that was foreseen in 2007 during a discussion about Python’s support for Unicode, documented in PEP-3131.”
While inspecting the code, the experts noticed multiple identifiers written in strange, non-monospaced, sans-serif fonts with mixed bold and italics. The attackers used Unicode variants of characters that a human reader perceives as the same letters (homoglyphs), e.g., self vs. 𝘀𝘦𝘭𝘧. The trick is meant to evade detection, yet the code still runs: the Python interpreter normalizes identifiers to NFKC form while parsing, as specified in PEP 3131, so the styled names resolve to their plain ASCII equivalents and the malicious code is executed.
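The normalization behavior can be reproduced with a minimal sketch. The variable names (value, result) and the helper to_sans_italic are hypothetical and chosen for illustration; none of this is taken from the onyxproxy source. The sketch assumes Python 3 and relies only on the standard unicodedata module.

```python
import unicodedata

# "self" spelled with Mathematical Sans-Serif letters: a bold "s"
# followed by italic "e", "l", "f" (written as escapes to avoid ambiguity).
fancy = "\U0001D600\U0001D626\U0001D62D\U0001D627"
normalized = unicodedata.normalize("NFKC", fancy)
print(fancy == "self", normalized == "self")


def to_sans_italic(word: str) -> str:
    """Map lowercase ASCII letters to Mathematical Sans-Serif Italic letters."""
    return "".join(chr(0x1D622 + ord(c) - ord("a")) for c in word)


# Define a name in plain ASCII, then read it back through the styled
# spelling. Python folds identifiers to NFKC while parsing, so both
# spellings refer to the same variable.
source = f"value = 42\nresult = {to_sans_italic('value')} + 1\n"
namespace = {}
exec(source, namespace)
print(namespace["result"])
```

The styled string does not compare equal to "self" as raw text, which is why a naive string search can miss it, while the interpreter sees an ordinary identifier.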
“An obvious and immediate benefit of this strange scheme is readability. We can still easily reason about this code, because our eyes and brains can still read the words, despite the intermixed fonts. Moreover, these visible differences do not prevent the code from running, which it does.” continues the analysis. “One might dismiss this as a developer trying to show how clever they can be, except that this package is trying to steal and exfiltrate things immediately upon installation.”
A similar technique was detailed by the researchers Nicholas Boucher and Ross Anderson, who explained how to abuse bidirectional override characters and homoglyphs in a variety of programming languages.
The experts pointed out that the author of onyxproxy is not sophisticated and likely just cut and pasted code from various sources and stitched it together. The obfuscation technique is absent from other parts of the code in setup.py, and many Python modules are imported multiple times.
“But, whomever this author copied this obfuscated code from is clever enough to know how to use the internals of the Python interpreter to generate a novel kind of obfuscated code, a kind that is somewhat readable without divulging too much of exactly what the code is trying to steal.” concludes the report. “This novelty is something that we will be keeping an eye on at Phylum, because now that this technique has proven viable in the wild, we fully anticipate others to copy and improve their attempts to attack developers.”
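Identifiers obfuscated this way can be flagged mechanically. The following is a minimal sketch, not Phylum's detection method: it tokenizes Python source with the standard tokenize module and reports every name whose spelling changes under NFKC normalization. The function name find_styled_identifiers and the sample source are hypothetical.

```python
import io
import tokenize
import unicodedata


def find_styled_identifiers(source: str):
    """Return (line, column, raw, folded) for NAME tokens altered by NFKC."""
    hits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.NAME:
            continue
        folded = unicodedata.normalize("NFKC", tok.string)
        if folded != tok.string:
            # The visible spelling differs from the name the interpreter uses.
            hits.append((tok.start[0], tok.start[1], tok.string, folded))
    return hits


# Illustrative input: "token" written once in ASCII and once in
# Mathematical Sans-Serif Italic letters.
styled = "".join(chr(0x1D622 + ord(c) - ord("a")) for c in "token")
sample = f"token = 'abc'\nprint({styled})\n"

hits = find_styled_identifiers(sample)
for line, col, raw, folded in hits:
    print(f"line {line}, col {col}: {raw!r} is parsed as {folded!r}")
```

Because only NAME tokens are inspected, legitimate non-ASCII text in string literals and comments is not reported; identifiers that are already in NFKC form, including genuine non-English names, pass through unflagged.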
(SecurityAffairs – hacking, Python)