A drive-by download attack occurs when a user visits a webpage which attempts to automatically download malware without the user’s consent. Attackers sometimes use a malware distribution network (MDN) to manage a large number of malicious webpages, exploits, and malware executables. In this paper, we provide a new method to determine these MDNs from the secondary URLs and redirect chains recorded by a high-interaction client honeypot. In addition, we propose a novel drive-by download detection method. Instead of depending on the malicious content used by previous methods, our algorithm first identifies and then leverages the URLs of the MDN’s central servers, where a central server is a common server shared by a large percentage of the drive-by download attacks in the same MDN. A set of regular expression-based signatures are then generated based on the URLs of each central server. This method allows additional malicious webpages to be identified which launched but failed to execute a successful drive-by download attack. The new drive-by detection system named ARROW has been implemented, and we provide a large-scale evaluation on the output of a production drive-by detection system. The experimental results demonstrate the effectiveness of our method, where the detection coverage has been boosted by 96% with an extremely low false positive rate.
Available at: http://works.bepress.com/junjie_zhang/2/