The lab has two papers at Financial Crypto this year: Excision, our system for in-browser detection of malware using inclusion sequence analysis, and CuriousDroid, our system for intelligently exercising mobile applications to improve dynamic analysis.
As others have observed, drive-by downloads often direct victims along redirection chains that trace a path through malicious infrastructure, which performs tasks such as fingerprinting the user’s browser and selecting an appropriate exploit to deliver. While recognizing traversals of malicious infrastructure has been used in an offline capacity, Excision provides online detection of suspicious resource inclusion sequences that correspond to exploits in progress.
We implemented this idea as a modification to Chromium that incrementally
builds an inclusion tree as a page loads. Unlike the DOM, this inclusion
tree records, for a subset of security-critical elements (e.g., img),
each element’s origin and the element responsible for loading it.
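To make the data structure concrete, here is a minimal Python sketch of an inclusion tree. It is not Excision's Chromium code; the class name `InclusionNode`, its fields, the "script" tag, and the example origins are all assumptions chosen for illustration. Each node stores its origin and a link to the element that loaded it, so an inclusion sequence is just the path from the root document to a node.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InclusionNode:
    """One tracked resource: its type, its origin, and who loaded it.

    Hypothetical structure for illustration, not Excision's actual code.
    """
    tag: str      # element type, e.g. "img"
    origin: str   # origin the resource was fetched from
    # Element responsible for the load; excluded from repr/eq to avoid cycles.
    parent: Optional["InclusionNode"] = field(
        default=None, repr=False, compare=False)
    children: List["InclusionNode"] = field(default_factory=list)

    def include(self, tag: str, origin: str) -> "InclusionNode":
        """Record that this node caused a new resource to be loaded."""
        child = InclusionNode(tag, origin, parent=self)
        self.children.append(child)
        return child

    def inclusion_sequence(self) -> List[str]:
        """Origins on the path from the root document down to this node."""
        path = []
        node: Optional["InclusionNode"] = self
        while node is not None:
            path.append(node.origin)
            node = node.parent
        return list(reversed(path))


# The tree grows incrementally as (hypothetical) resources are included.
root = InclusionNode("document", "https://news.example")
script = root.include("script", "https://ads.example")
img = script.include("img", "https://tracker.example")
print(img.inclusion_sequence())
```

Keeping a parent link on every node means the sequence leading to a resource can be read off at the moment that resource is about to load, without re-walking the whole page.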
In an offline phase, we built a knowledge base of benign and malicious inclusion trees by crawling the Alexa Top 10K and a blacklist of known malicious sites using Excision-instrumented Chromium instances. Using this knowledge base, we then classify inclusion sequences at run-time to determine whether a sequence is likely to be malicious before dangerous elements are loaded and have the chance to attack the system.
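The two phases can be illustrated with a deliberately simple Python toy. This is not the classifier used in Excision, which is not described here. As an assumption, the sketch counts origin-to-origin transitions in labeled sequences and flags a new sequence when most of its transitions were seen more often in malicious crawls. `KnowledgeBase`, `should_block`, the threshold, and the origins are all hypothetical.

```python
from collections import Counter
from typing import List, Tuple

Sequence = List[str]  # an inclusion sequence as a list of origins


def transitions(seq: Sequence) -> List[Tuple[str, str]]:
    """Adjacent (includer, included) origin pairs along a sequence."""
    return list(zip(seq, seq[1:]))


class KnowledgeBase:
    """Toy stand-in for the offline knowledge base (an assumption)."""

    def __init__(self) -> None:
        self.benign: Counter = Counter()
        self.malicious: Counter = Counter()

    def add(self, seq: Sequence, malicious: bool) -> None:
        """Offline phase: record a labeled sequence from a crawl."""
        target = self.malicious if malicious else self.benign
        target.update(transitions(seq))

    def score(self, seq: Sequence) -> float:
        """Fraction of transitions seen more often in malicious crawls."""
        pairs = transitions(seq)
        if not pairs:
            return 0.0
        bad = sum(1 for p in pairs if self.malicious[p] > self.benign[p])
        return bad / len(pairs)


def should_block(kb: KnowledgeBase, seq: Sequence,
                 threshold: float = 0.5) -> bool:
    """Run-time phase: decide before the final element is loaded."""
    return kb.score(seq) >= threshold


kb = KnowledgeBase()
kb.add(["https://news.example", "https://cdn.example"], malicious=False)
kb.add(["https://news.example", "https://redir.example",
        "https://exploit.example"], malicious=True)

suspect = ["https://news.example", "https://redir.example",
           "https://exploit.example"]
print(should_block(kb, suspect))
```

The point of the sketch is the ordering: the decision is made on the sequence leading up to an element, so the element can be withheld if the sequence looks malicious.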
Dynamic analysis is the predominant technique for identifying malware and characterizing its behavior, and the mobile domain is no exception. There are numerous dynamic analysis sandboxes these days, including iSecLab’s own Andrubis.
The main difficulty with dynamic analysis is that high-quality test inputs are required to achieve high coverage of the artifact under analysis. For Android applications, this largely involves exercising the app’s user interface. However, automatically generating realistic user inputs is no simple task.
CuriousDroid uses several novel techniques to extract, decompose, and process UIs. For instance, it can infer realistic data to enter into text fields presented to users, and it can infer the correct UI action to perform in order to explore new paths through an application. These techniques allow it to achieve higher coverage than prior approaches, thereby eliciting more behavior from apps that can then be observed by a dynamic analyzer.
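As a rough illustration of what inferring realistic text-field data could look like, the Python sketch below maps keywords in a field's hint text to plausible sample values. This is not CuriousDroid's actual technique. The keyword rules, the sample values, and the function name `infer_text_input` are assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical keyword rules: if any keyword appears in the field's hint,
# use the paired sample value. Earlier rules take precedence.
RULES: List[Tuple[Tuple[str, ...], str]] = [
    (("email", "e-mail"), "alice@example.com"),
    (("phone", "mobile"), "5550100"),
    (("password",), "Passw0rd!"),
    (("zip", "postal"), "02115"),
    (("name",), "Alice Example"),
]
DEFAULT_INPUT = "test"  # fallback when no rule matches


def infer_text_input(hint: str) -> str:
    """Pick a plausible value for a text field based on its hint text."""
    lowered = hint.lower()
    for keywords, value in RULES:
        if any(keyword in lowered for keyword in keywords):
            return value
    return DEFAULT_INPUT


for hint in ["Email address", "Phone number", "Search"]:
    print(hint, "->", infer_text_input(hint))
```

Even a crude mapping like this shows why input inference matters for coverage: a form that validates its fields will reject random strings, and the code behind it will never run.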