Conference

To appear in Proceedings of the Annual Computer Security Applications Conference (ACSAC).
New Orleans, LA, USA, December 2014.
Abstract

The ubiquity of Internet advertising has made it a popular target for attackers. One well-known instance of these attacks is the widespread use of trick banners that use social engineering techniques to lure victims into clicking on deceptive fake links, potentially leading to a malicious domain or malware. A recent and pervasive trend by attackers is to imitate the “download” or “play” buttons in popular file sharing sites (e.g., one-click hosters, video-streaming sites, bittorrent sites) in an attempt to trick users into clicking on these fake banners instead of the genuine link.

In this paper, we explore the problem of automatically assisting Internet users in detecting malicious trick banners and helping them identify the correct link. We present a set of features to characterize trick banners based on their visual properties such as image size, color, placement on the enclosing webpage, whether they contain animation effects, and whether they consistently appear with the same visual properties on consecutive loads of the same webpage. We have implemented a tool called TrueClick, which uses image processing and machine learning techniques to build a classifier based on these features to automatically detect the trick banners on a webpage. Our approach automatically classifies trick banners and, unlike current approaches, requires no manual effort to compile blacklists. Our user experiments show that TrueClick is useful in practice, resulting in a factor of 3.55 improvement in correct link selection.
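
As a rough illustration of the classification step (not the authors' implementation), the Python sketch below trains an off-the-shelf random forest on the kinds of visual features described above. The feature values, labels, and library choice (scikit-learn) are hypothetical.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    # Hypothetical feature vector for each banner-like element on a page:
    # [width, height, dominant_hue, distance_from_page_center,
    #  is_animated (0/1), appears_consistently_across_loads (0/1)]
    X_train = np.array([
        [728,  90, 0.05, 900, 1, 0],   # example trick banner
        [300, 250, 0.60, 850, 1, 0],   # example trick banner
        [120,  40, 0.33, 120, 0, 1],   # example genuine download link
        [160,  48, 0.30, 100, 0, 1],   # example genuine download link
    ])
    y_train = np.array([1, 1, 0, 0])   # 1 = trick banner, 0 = genuine link

    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X_train, y_train)

    # Classify the candidate elements extracted from a newly loaded page.
    candidates = np.array([[728, 90, 0.07, 880, 1, 0]])
    print(clf.predict(candidates))     # -> [1]: flag as a trick banner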

BibTeX
@inproceedings{acsac2014trueclick,
    title = {{TrueClick: Automatically Distinguishing Trick Banners from Genuine Download Links}},
    author = {Sevtap Duman and Kaan Onarlioglu and Ali Osman and William Robertson and Engin Kirda},
    booktitle = {{Proceedings of the Annual Computer Security Applications Conference (ACSAC)}},
    month = {December},
    year = {2014},
    location = {{New Orleans, LA, USA}},
}
To appear in Proceedings of the International Symposium on Research in Attacks, Intrusions, and Defenses (RAID).
Gothenburg, Sweden, September 2014.
Abstract

Content Security Policy (CSP) has been proposed as a principled and robust browser security mechanism against content injection attacks such as XSS. When configured correctly, CSP renders malicious code injection and data exfiltration exceedingly difficult for attackers. However, despite the promise of these security benefits and being implemented in almost all major browsers, CSP adoption is minuscule—our measurements show that CSP is deployed in enforcement mode on only 1% of the Alexa Top 100.

In this paper, we present the results of a long-term study to determine challenges in CSP deployments that can prevent wide adoption. We performed weekly crawls of the Alexa Top 1M to measure adoption of web security headers, and find that CSP both significantly lags other security headers, and that the policies in use are often ineffective at actually preventing content injection. In addition, we evaluate the feasibility of deploying CSP from the perspective of a security-conscious website operator. We used an incremental deployment approach through CSP’s report-only mode on four websites, collecting over 10M reports. Furthermore, we used semi-automated policy generation through web application crawling on a set of popular websites. We found both that automated methods do not suffice and that significant barriers exist to producing accurate results.

Finally, based on our observations, we suggest several improvements to CSP that could help to ease its adoption by the web community.
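
To make the report-only deployment mode concrete, the sketch below shows a minimal Python WSGI middleware that attaches a Content-Security-Policy-Report-Only header. The policy, the /csp-report endpoint, and the cdn.example.com origin are hypothetical and unrelated to the sites studied in the paper.

    # Minimal WSGI middleware that adds a report-only CSP header, so violations
    # are reported to a collection endpoint without blocking any content.
    POLICY = (
        "default-src 'self'; "
        "script-src 'self' https://cdn.example.com; "
        "report-uri /csp-report"          # hypothetical report collection endpoint
    )

    def csp_report_only(app):
        def wrapped(environ, start_response):
            def start(status, headers, exc_info=None):
                headers.append(("Content-Security-Policy-Report-Only", POLICY))
                return start_response(status, headers, exc_info)
            return app(environ, start)
        return wrapped

    def demo_app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/html")])
        return [b"<html><body>hello</body></html>"]

    application = csp_report_only(demo_app)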

BibTeX
@inproceedings{raid2014csp,
    title = {{Why is CSP Failing? Trends and Challenges in CSP Adoption}},
    author = {Michael Weissbacher and Tobias Lauinger and William Robertson},
    booktitle = {{Proceedings of the International Symposium on Research in Attacks, Intrusions, and Defenses (RAID)}},
    month = {September},
    year = {2014},
    location = {{Gothenburg, Sweden}},
}
In Proceedings of the IEEE/IFIP International Conference on Dependable Systems and Networks (DSN).
Atlanta, GA, USA, June 2014.
Abstract

QR codes, a form of 2D barcode, allow easy interaction between mobile devices and websites or printed material by removing the burden of manually typing a URL or contact information. QR codes are increasingly popular and are likely to be adopted by malware authors and cyber-criminals as well. In fact, while a link can “look” suspicious, malicious and benign QR codes cannot be distinguished by simply looking at them. However, despite public discussions about increasing use of QR codes for malicious purposes, the prevalence of malicious QR codes and the kinds of threats they pose are still unclear.

In this paper, we examine attacks on the Internet that rely on QR codes. Using a crawler, we performed a large-scale experiment by analyzing QR codes across 14 million unique web pages over a nine-month period. Our results show that QR code technology is already used by attackers, for example to distribute malware or to lead users to phishing sites. However, the relatively few malicious QR codes we found in our experiments suggest that, on a global scale, the frequency of these attacks is not alarmingly high and users are rarely exposed to the threats distributed via QR codes while surfing the web.
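
For readers unfamiliar with the mechanics, the following Python sketch shows how a crawler component might decode QR codes from downloaded images. It assumes the third-party requests, Pillow, and pyzbar libraries and a hypothetical image URL; it is not the crawler used in the study.

    import io
    import requests
    from PIL import Image
    from pyzbar.pyzbar import decode

    def qr_payloads(image_urls):
        """Yield (image_url, decoded_payload) pairs for any QR codes found."""
        for url in image_urls:
            resp = requests.get(url, timeout=10)
            img = Image.open(io.BytesIO(resp.content))
            for symbol in decode(img):
                yield url, symbol.data.decode("utf-8", errors="replace")

    # Decoded payloads (often URLs) can then be checked against blacklists or
    # submitted to URL scanners for further analysis.
    for img_url, payload in qr_payloads(["https://example.com/poster.png"]):
        print(img_url, "->", payload)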

BibTeX
@inproceedings{dsn2014qrcodes,
    title = {{Optical Delusions: A Study of Malicious QR Codes in the Wild}},
    author = {Amin Kharraz and Engin Kirda and William Robertson and Davide Balzarotti and Aurelien Francillon},
    booktitle = {{Proceedings of the IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)}},
    month = {June},
    year = {2014},
    location = {{Atlanta, GA, USA}},
}
In Proceedings of the ACM Symposium on Information, Computer and Communications Security (ASIACCS).
Kyoto, Japan, June 2014.
Abstract

Since its introduction, Android’s in-app billing service has quickly gained popularity. The in-app billing service allows users to pay for options, services, subscriptions, and virtual goods from within mobile apps themselves. In-app billing is attractive for developers because it is easy to integrate and has the advantage that the developer does not need to be concerned with managing financial transactions.

In this paper, we present the first fully-automated attack against the in-app billing service on Android. Using our prototype, we conducted a robustness study against our attack, analyzing 85 of the most popular Android apps that make use of in-app billing. We found that 60% of these apps were easily and automatically crackable. We were able to bypass highly popular and prominent games such as Angry Birds and Temple Run, each of which has millions of users. Based on our study, we developed a defensive technique that specifically counters automated attacks against in-app billing. Our technique is lightweight and can be easily added to existing applications.

BibTeX
@inproceedings{asiaccs2014virtualswindle,
    title = {{VirtualSwindle: An Automated Attack Against In-App Billing on Android}},
    author = {Collin Mulliner and William Robertson and Engin Kirda},
    booktitle = {{Proceedings of the ACM Symposium on Information, Computer and Communications Security (ASIACCS)}},
    month = {June},
    year = {2014},
    location = {{Kyoto, Japan}},
}
In Proceedings of the IEEE Symposium on Security and Privacy (Oakland).
San Jose, CA, USA, May 2014.
Abstract

Graphical user interfaces (GUIs) are the predominant means by which users interact with modern programs. GUIs contain a number of common visual elements or widgets such as labels, textfields, buttons, and lists, and GUIs typically provide the ability to set attributes on these widgets to control their visibility, enabled status, and whether they are writable. While these attributes are extremely useful to provide visual cues to users to guide them through an application’s GUI, they can also be misused for purposes they were not intended. In particular, in the context of GUI-based applications that include multiple privilege levels within the application, GUI element attributes are often misused as a mechanism for enforcing access control policies.

In this work, we introduce GEMs, or instances of GUI element misuse, as a novel class of access control vulnerabilities in GUI-based applications. We present a classification of different GEMs that can arise through misuse of widget attributes, and describe a general algorithm for identifying and confirming the presence of GEMs in vulnerable applications. We then present GEM Miner, an implementation of our GEM analysis for the Windows platform. We evaluate GEM Miner over a test set of three complex, real-world GUI-based applications targeted at the small business and enterprise markets, and demonstrate the efficacy of our analysis by finding numerous previously unknown access control vulnerabilities in these applications. We have reported the vulnerabilities we discovered to the developers of each application, and in one case have received confirmation of the issue.

BibTeX
@inproceedings{oakland2014gems,
    title = {{Hidden GEMs: Automated Discovery of Access Control Vulnerabilities in Graphical User Interfaces}},
    author = {Collin Mulliner and William Robertson and Engin Kirda},
    booktitle = {{Proceedings of the IEEE Symposium on Security and Privacy (Oakland)}},
    month = {May},
    year = {2014},
    location = {{San Jose, CA, USA}},
}
In Proceedings of the Annual Computer Security Applications Conference (ACSAC).
New Orleans, LA, USA, December 2013.
Abstract

As more and more Internet-based attacks arise, organizations are responding by deploying an assortment of security products that generate situational intelligence in the form of logs. These logs often contain high volumes of interesting and useful information about activities in the network, and are among the first data sources that information security specialists consult when they suspect that an attack has taken place. However, security products often come from a patchwork of vendors, and are inconsistently installed and administered. They generate logs whose formats differ widely and that are often incomplete, mutually contradictory, and very large in volume. Hence, although this collected information is useful, it is often dirty.

We present a novel system called Beehive that attacks the problem of automatically mining and extracting knowledge from the dirty log data produced by a wide variety of security products in a large enterprise. We improve on signature-based approaches to detecting security incidents and instead achieve behavioral detection of suspicious host activities that Beehive reports as potential security incidents. These incidents can then be further analyzed by incident response teams to determine whether a policy violation or attack has occurred. We have evaluated Beehive on the log data collected in a large enterprise, EMC, over a period of two weeks. We compare the incidents identified by Beehive against enterprise Security Operations Center reports, antivirus software alerts, and feedback received from enterprise security specialists. We show that Beehive is able to identify malicious events and policy violations within the enterprise network which would otherwise go undetected.
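
As a toy illustration of behavioral (rather than signature-based) detection, the Python sketch below scores per-host feature vectors against the population and flags outliers. The features, data, and threshold are hypothetical and far simpler than Beehive's actual normalization, feature extraction, and clustering pipeline.

    import math

    # Hypothetical per-host features, e.g. [new external domains contacted,
    # blocked connections, off-hours events] aggregated over one day.
    host_features = {
        "host-001": [3, 0, 1],
        "host-002": [5, 1, 0],
        "host-003": [4, 0, 2],
        "host-004": [120, 40, 35],   # hypothetical suspicious host
    }

    dims = len(next(iter(host_features.values())))
    means = [sum(v[i] for v in host_features.values()) / len(host_features)
             for i in range(dims)]
    stds = [max(1e-9, math.sqrt(sum((v[i] - means[i]) ** 2
                                    for v in host_features.values())
                                / len(host_features)))
            for i in range(dims)]

    for host, v in host_features.items():
        # Distance from the population mean in standard-deviation units.
        score = math.sqrt(sum(((v[i] - means[i]) / stds[i]) ** 2
                              for i in range(dims)))
        if score > 2.0:   # threshold chosen arbitrarily for the example
            print(f"{host}: outlier score {score:.1f} -> report as incident")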

BibTeX
@inproceedings{acsac2013beehive,
    title = {{Beehive: Large-Scale Log Analysis for Detecting Suspicious Activity in Enterprise Networks}},
    author = {Ting-Fang Yen and Alina Oprea and Kaan Onarlioglu and Todd Leetham and William Robertson and Ari Juels and Engin Kirda},
    booktitle = {{Proceedings of the Annual Computer Security Applications Conference (ACSAC)}},
    month = {December},
    year = {2013},
    location = {{New Orleans, LA, USA}},
}
In Proceedings of the Annual Computer Security Applications Conference (ACSAC).
New Orleans, LA, USA, December 2013.
Abstract

Android is currently the largest mobile platform with around 750 million devices worldwide. Unfortunately, more than 30% of all devices contain publicly known security vulnerabilities and, in practice, cannot be updated through normal mechanisms since they are no longer supported by the manufacturer and mobile operator. This failure of traditional patch distribution systems has resulted in the creation of a large population of vulnerable mobile devices.

In this paper, we present PatchDroid, a system to distribute and apply third-party security patches for Android. Our system is designed for device-independent patch creation, and uses in-memory patching techniques to address vulnerabilities in both native and managed code. We created a fully usable prototype of PatchDroid, including a number of patches for well-known vulnerabilities in Android devices. We evaluated our system on different devices from multiple manufacturers and show that we can effectively patch security vulnerabilities on Android devices without impacting performance or usability. Therefore, PatchDroid represents a realistic path towards dramatically reducing the number of exploitable Android devices in the wild.

BibTeX
@inproceedings{acsac2013patchdroid,
    title = {{PatchDroid: Scalable Third-Party Patches for Android Devices}},
    author = {Collin Mulliner and Jon Oberheide and William Robertson and Engin Kirda},
    booktitle = {{Proceedings of the Annual Computer Security Applications Conference (ACSAC)}},
    month = {December},
    year = {2013},
    location = {{New Orleans, LA, USA}},
}
In Proceedings of the International Symposium on Research in Attacks, Intrusions, and Defenses (RAID).
Amsterdam, The Netherlands, October 2013.
Abstract

According to copyright holders, One-Click Hosters (OCHs) such as Megaupload are frequently used to host and distribute copyright infringing content. This has spurred numerous initiatives by legislators, law enforcement and content producers. Due to a lack of representative data sets that properly capture private uses of OCHs (such as sharing holiday pictures among friends), to date, there are no reliable estimates of the proportion of legitimate and infringing files being uploaded to OCHs. This situation leaves the field to the partisan arguments brought forward by copyright owners and OCHs. In this paper, we provide empirical data about the uses and misuses of OCHs by analysing six large data sets containing file metadata that we extracted from a range of popular OCHs. We assess the status of these files with regard to copyright infringement and show that at least 26% to 79% of them are potentially infringing. Perhaps surprisingly, given its shutdown by the FBI for alleged copyright infringement, we found Megaupload to have the second highest proportion of legitimate files in our study.

BibTeX
@inproceedings{raid2013och,
    title = {{Holiday Pictures or Blockbuster Movies? Insights into Copyright Infringement in User Uploads to One-Click File Hosters}},
    author = {Tobias Lauinger and Kaan Onarlioglu and Abdelberi Chaabane and Engin Kirda and William Robertson and Mohamed Kaafar},
    booktitle = {{Proceedings of the International Symposium on Research in Attacks, Intrusions, and Defenses (RAID)}},
    month = {October},
    year = {2013},
    location = {{Amsterdam, The Netherlands}},
}
In Proceedings of the International Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA).
Berlin, Germany, July 2013.
Abstract

A poorly designed web browser extension with a security vulnerability may expose the whole system to an attacker. Therefore, attacks directed at “benign-but-buggy” extensions, as well as extensions that have been written with malicious intent, pose significant security threats to a system running such components. Recent studies have indeed shown that many Firefox extensions are over-privileged, making them attractive attack targets. Unfortunately, users currently do not have many options when it comes to protecting themselves from extensions that may potentially be malicious. Once installed and executed, the extension needs to be trusted. This paper introduces Sentinel, a policy enforcer for the Firefox browser that gives fine-grained control to the user over the actions of existing JavaScript Firefox extensions. The user is able to define policies (or use predefined ones) and block common attacks such as data exfiltration, remote code execution, saved password theft, and preference modification. Our evaluation of Sentinel shows that our prototype implementation can effectively prevent concrete, real-world Firefox extension attacks without a detrimental impact on the user’s browsing experience.

BibTeX
@inproceedings{dimva2013sentinel,
    title = {{Securing Legacy Firefox Extensions with Sentinel}},
    author = {Kaan Onarlioglu and Mustafa Battal and William Robertson and Engin Kirda},
    booktitle = {{Proceedings of the International Conference on Detection of Intrusions and Malware \& Vulnerability Assessment (DIMVA)}},
    month = {July},
    year = {2013},
    location = {{Berlin, Germany}},
}
In Proceedings of the IEEE Symposium on Security and Privacy (Oakland).
San Francisco, CA, USA, May 2013.
Abstract

Privacy has become an issue of paramount importance for many users. As a result, encryption tools such as TrueCrypt, OS-based full-disk encryption such as FileVault, and privacy modes in all modern browsers have become popular. However, although such tools are useful, they are not perfect. For example, prior work has shown that browsers still leave many traces of user information on disk even if they are started in private browsing mode. In addition, disk encryption alone is not sufficient, as key disclosure through coercion remains possible. Clearly, it would be useful and highly desirable to have OS-level support that provides strong privacy guarantees for any application – not only browsers.

In this paper, we present the design and implementation of PrivExec, the first operating system service for private execution. PrivExec provides strong, general guarantees of private execution, allowing any application to execute in a mode where storage writes, either to the filesystem or to swap, will not be recoverable by others during or after execution. PrivExec does not require any explicit application support, recompilation, or any other preconditions. We have implemented a prototype of PrivExec as an extension to the Linux kernel; it is performant, practical, and secures sensitive data against disclosure. Our prototype implementation is open source, and we provide an anonymous YouTube link as a real-world demonstration of its capabilities.

BibTeX
@inproceedings{oakland2013privexec,
    title = {{PrivExec: Private Execution as an Operating System Service}},
    author = {Kaan Onarlioglu and Collin Mulliner and William Robertson and Engin Kirda},
    booktitle = {{Proceedings of the IEEE Symposium on Security and Privacy (Oakland)}},
    month = {May},
    year = {2013},
    location = {{San Francisco, CA, USA}},
}
In Proceedings of the Network and Distributed System Security Symposium (NDSS).
San Diego, CA, USA, February 2013.
Abstract

Wireless networking technologies have fundamentally changed the way we compute, allowing ubiquitous, anytime, anywhere access to information. At the same time, wireless technologies come with the security cost that adversaries may receive signals and engage in unauthorized communication even when not physically close to a network. Because of the utmost importance of wireless security, many standards have been developed that are in wide use to secure sensitive wireless networks; one such popular standard is WPA Enterprise.

In this paper, we present a novel, highly practical, and targeted variant of a wireless evil twin attack against WPA Enterprise networks. We show significant design deficiencies in wireless management user interfaces for commodity operating systems, and also highlight the practical importance of the weak binding between wireless network SSIDs and authentication server certificates. We describe a prototype implementation of the attack, and discuss countermeasures that should be adopted. Our user experiments with 17 technically-sophisticated users show that the attack is stealthy and effective in practice. None of the victims were able to detect the attack.

BibTeX
@inproceedings{ndss2013wpa,
    title = {{A Practical, Targeted, and Stealthy Attack Against WPA Enterprise Authentication}},
    author = {Aldo Cassola and William Robertson and Engin Kirda and Guevara Noubir},
    booktitle = {{Proceedings of the Network and Distributed System Security Symposium (NDSS)}},
    month = {February},
    year = {2013},
    location = {{San Diego, CA, USA}},
}
In Proceedings of the Annual Computer Security Applications Conference (ACSAC).
Orlando, FL, USA, December 2012.
Abstract

Hard disk encryption is known to be vulnerable to a number of attacks that aim to directly extract cryptographic key material from system memory. Several approaches to preventing this class of attacks have been proposed, including TRESOR and LoopAmnesia. The common goal of these systems is to confine the encryption key and encryption process itself to the CPU, such that sensitive key material is never released into system memory where it could be accessed by a DMA attack.

In this work, we demonstrate that these systems are nevertheless vulnerable to such DMA attacks. Our attack, which we call TRESOR-HUNT, relies on the insight that DMA-capable adversaries are not restricted to simply reading physical memory, but can write arbitrary values to memory as well. TRESOR-HUNT leverages this insight to inject a ring 0 attack payload that extracts disk encryption keys from the CPU into the target system’s memory, from which it can be retrieved using a normal DMA transfer.

Our implementation of this attack demonstrates that it can be constructed in a reliable and OS-independent manner that is applicable to any CPU-bound encryption technique, IA32-based system, and DMA-capable peripheral bus. Furthermore, it does not crash the target system or otherwise significantly compromise its integrity. Our evaluation supports the OS-independent nature of the attack, as well as its feasibility in real-world scenarios. Finally, we discuss several countermeasures that might be adopted to mitigate this attack and render CPU-bound encryption systems viable.

BibTeX
@inproceedings{acsac2012dma,
    title = {{TRESOR-HUNT: Attacking CPU-Bound Encryption}},
    author = {Erik-Oliver Blass and William Robertson},
    booktitle = {{Proceedings of the Annual Computer Security Applications Conference (ACSAC)}},
    month = {December},
    year = {2012},
    location = {{Orlando, FL, USA}},
}
In Proceedings of the Annual Computer Security Applications Conference (ACSAC).
Orlando, FL, USA, December 2012.
Abstract

Botnets continue to be a significant problem on the Internet. Accordingly, a great deal of research has focused on methods for detecting and mitigating the effects of botnets. Two of the primary factors preventing the development of effective large-scale, wide-area botnet detection systems are seemingly contradictory. On the one hand, technical and administrative restrictions result in a general unavailability of raw network data that would facilitate botnet detection on a large scale. On the other hand, were this data available, real-time processing at that scale would be a formidable challenge. In contrast to raw network data, NetFlow data is widely available. However, NetFlow data imposes several challenges for performing accurate botnet detection.

In this paper, we present DISCLOSURE, a large-scale, wide-area botnet detection system that incorporates a combination of novel techniques to overcome the challenges imposed by the use of NetFlow data. In particular, we identify several groups of features that allow DISCLOSURE to reliably distinguish C&C channels from benign traffic using NetFlow records (i.e., flow sizes, client access patterns, and temporal behavior). To reduce DISCLOSURE’s false positive rate, we incorporate a number of external reputation scores into our system’s detection procedure. Finally, we provide an extensive evaluation of DISCLOSURE over two large, real-world networks. Our evaluation demonstrates that DISCLOSURE is able to perform real-time detection of botnet C&C channels over datasets on the order of billions of flows per day.
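
The Python sketch below illustrates the general idea of deriving detection features from NetFlow records grouped by destination server. The specific features are illustrative simplifications, not DISCLOSURE's actual feature set.

    import statistics
    from collections import defaultdict

    # Each flow record: (timestamp, client_ip, server_ip, bytes)
    def server_features(flows):
        by_server = defaultdict(list)
        for ts, client, server, size in flows:
            by_server[server].append((ts, client, size))
        features = {}
        for server, records in by_server.items():
            times = sorted(ts for ts, _, _ in records)
            gaps = [b - a for a, b in zip(times, times[1:])] or [0.0]
            sizes = [s for _, _, s in records]
            clients = {c for _, c, _ in records}
            features[server] = [
                statistics.mean(sizes),       # typical flow size
                statistics.pstdev(sizes),     # beaconing C&C flows vary little
                statistics.pstdev(gaps),      # regular check-ins -> low timing spread
                len(clients),                 # how many distinct clients connect
            ]
        return features

    # Given labeled examples of known C&C and benign servers, these vectors can
    # then be fed to an off-the-shelf classifier (e.g., a random forest).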

BibTeX
@inproceedings{acsac2012disclosure,
    title = {{DISCLOSURE: Detecting Botnet Command and Control Servers Through Large-Scale NetFlow Analysis}},
    author = {Leyla Bilge and Davide Balzarotti and William Robertson and Engin Kirda and Christopher Kruegel},
    booktitle = {{Proceedings of the Annual Computer Security Applications Conference (ACSAC)}},
    month = {December},
    year = {2012},
    location = {{Orlando, FL, USA}},
}
In Proceedings of the IEEE Computer Software and Applications Conference.
Izmir, Turkey, July 2012.
Abstract

Web applications have become an integral part of the daily lives of millions of users. Unfortunately, web applications are also frequently targeted by attackers, and critical vulnerabilities such as XSS and SQL injection are still common. As a consequence, much effort in the past decade has been spent on mitigating web application vulnerabilities.

Current techniques focus mainly on sanitization: automated sanitization, the detection of missing sanitizers, the correctness of sanitizers, or the correct placement of sanitizers. However, these techniques are unable to prevent new forms of input validation vulnerabilities such as HTTP Parameter Pollution, incur large runtime overhead, lack precision, or require significant modifications to the client and/or server infrastructure.

In this paper, we present IPAAS, a novel technique for preventing the exploitation of XSS and SQL injection vulnerabilities based on automated data type detection of input parameters. IPAAS automatically and transparently augments otherwise insecure web application development environments with input validators that result in significant and tangible security improvements for real systems. We implemented IPAAS for PHP and evaluated it on five real-world web applications with known XSS and SQL injection vulnerabilities. Our evaluation demonstrates that IPAAS would have prevented 83% of XSS vulnerabilities and 65% of SQL injection vulnerabilities while incurring no developer burden.
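
The following Python sketch illustrates the general idea of learning a data type for each parameter from observed benign values and validating later requests against it. The type patterns and fallback behavior are hypothetical simplifications of what IPAAS does.

    import re

    PATTERNS = [
        ("integer", re.compile(r"^-?\d+$")),
        ("boolean", re.compile(r"^(true|false|0|1)$", re.I)),
        ("email",   re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")),
        ("token",   re.compile(r"^[\w-]+$")),
    ]

    def infer_type(values):
        # Pick the most restrictive type that matches every observed value.
        for name, pattern in PATTERNS:
            if all(pattern.match(v) for v in values):
                return name
        return "free-text"          # fall back to the most permissive type

    def validate(param_type, value):
        for name, pattern in PATTERNS:
            if name == param_type:
                return bool(pattern.match(value))
        return True                 # free-text parameters are not constrained

    learned = infer_type(["42", "17", "1003"])        # -> "integer"
    print(learned, validate(learned, "42"))            # -> integer True
    print(validate(learned, "42 OR 1=1"))              # -> False (rejected)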

BibTeX
@inproceedings{compsac2012input,
    title = {{Preventing Input Validation Vulnerabilities in Web Applications through Automated Type Analysis}},
    author = {Theodoor Scholte and William Robertson and Davide Balzarotti and Engin Kirda},
    booktitle = {{Proceedings of the IEEE Computer Software and Applications Conference}},
    month = {July},
    year = {2012},
    location = {{Izmir, Turkey}},
}
In Proceedings of the ACM Symposium on Applied Computing.
Trento, Italy, March 2012.
Abstract

Web applications have become an integral part of the daily lives of millions of users. Unfortunately, web applications are also frequently targeted by attackers, and attacks such as XSS and SQL injection are still common. In this paper, we present an empirical study of more than 7,000 input validation vulnerabilities with the aim of gaining deeper insights into how these common web vulnerabilities can be prevented. In particular, we focus on the relationship between the specific programming language used to develop web applications and the vulnerabilities that are commonly reported. Our findings suggest that most SQL injection and a significant number of XSS vulnerabilities can be prevented using straightforward validation mechanisms based on common data types. We elaborate on these common data types, and discuss how support could be provided in web application frameworks.

BibTeX
@inproceedings{sac2012input,
    title = {{An Empirical Analysis of Input Validation Mechanisms in Web Applications and Languages}},
    author = {Theodoor Scholte and William Robertson and Davide Balzarotti and Engin Kirda},
    booktitle = {{Proceedings of the ACM Symposium on Applied Computing}},
    month = {March},
    year = {2012},
    location = {{Trento, Italy}},
}
In Proceedings of the Network and Distributed System Security Symposium (NDSS).
San Diego, CA, USA, February 2010.
Abstract

Learning-based anomaly detection has proven to be an effective black-box technique for detecting unknown attacks. However, the effectiveness of this technique crucially depends upon both the quality and the completeness of the training data. Unfortunately, in most cases, the traffic to the system (e.g., a web application or daemon process) protected by an anomaly detector is not uniformly distributed. Therefore, some components (e.g., authentication, payments, or content publishing) might not be exercised enough to train an anomaly detection system in a reasonable time frame. This is of particular importance in real-world settings, where anomaly detection systems are deployed with little or no manual configuration, and they are expected to automatically learn the normal behavior of a system to detect or block attacks.

In this work, we first demonstrate that the features utilized to train a learning-based detector can be semantically grouped, and that features of the same group tend to induce similar models. Therefore, we propose addressing local training data deficiencies by exploiting clustering techniques to construct a knowledge base of well-trained models that can be utilized in case of undertraining. Our approach, which is independent of the particular type of anomaly detector employed, is validated using the realistic case of a learning-based system protecting a pool of web servers running several web applications such as blogs, forums, or Web services. We run our experiments on a real-world data set containing over 58 million HTTP requests to more than 36,000 distinct web application components. The results show that by using the proposed solution, it is possible to achieve effective attack detection even with scarce training data.

BibTeX
@inproceedings{ndss2010scarcity,
    title = {{Effective Anomaly Detection with Scarce Training Data}},
    author = {William Robertson and Federico Maggi and Christopher Kruegel and Giovanni Vigna},
    booktitle = {{Proceedings of the Network and Distributed System Security Symposium (NDSS)}},
    month = {February},
    year = {2010},
    location = {{San Diego, CA, USA}},
}
In Proceedings of the International Symposium on Recent Advances in Intrusion Detection (RAID).
Saint-Malo, Brittany, France, September 2009.
Abstract

Because of the ad hoc nature of web applications, intrusion detection systems that leverage machine learning techniques are particularly well-suited for protecting websites. The reason is that these systems are able to characterize the applications’ normal behavior in an automated fashion. However, anomaly-based detectors for web applications suffer from false positives that are generated whenever the applications being protected change. These false positives need to be analyzed by the security officer who then has to interact with the web application developers to confirm that the reported alerts were indeed erroneous detections.

In this paper, we propose a novel technique for the automatic detection of changes in web applications, which allows for the selective retraining of the affected anomaly detection models. Our technique identifies changes in both the interface of the components of a web application and its navigational structure. By correctly identifying legitimate changes in web applications, we can reduce false positives and allow for the automated retraining of the anomaly models.

We have evaluated our approach by analyzing a number of real-world applications. Our analysis shows that web applications indeed change substantially over time, and that our technique is able to effectively detect changes and automatically adapt the anomaly detection models to the new structure of the changed web applications.

BibTeX
@inproceedings{raid2009drift,
    title = {{Protecting a Moving Target: Addressing Web Application Concept Drift}},
    author = {Federico Maggi and William Robertson and Christopher Kruegel and Giovanni Vigna},
    booktitle = {{Proceedings of the International Symposium on Recent Advances in Intrusion Detection (RAID)}},
    month = {September},
    year = {2009},
    location = {{Saint-Malo, Brittany, France}},
}
In Proceedings of the USENIX Security Symposium.
Montreal, Quebec, Canada, August 2009.
Abstract

Security vulnerabilities continue to plague web applications, allowing attackers to access sensitive data and co-opt legitimate web sites as a hosting ground for malware. Accordingly, researchers have focused on various approaches to detecting and preventing common classes of security vulnerabilities in web applications, including anomaly-based detection mechanisms, static and dynamic analyses of server-side web application code, and client-side security policy enforcement.

This paper presents a different approach to web application security. In this work, we present a web application framework that leverages existing work on strong type systems to statically enforce a separation between the structure and content of both web documents and database queries generated by a web application, and show how this approach can automatically prevent the introduction of both cross-site scripting and SQL injection vulnerabilities. We present an evaluation of the framework, and demonstrate both the coverage and correctness of our sanitization functions. Finally, experimental results suggest that web applications developed using this framework perform competitively with applications developed using traditional frameworks.
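
The framework in the paper enforces the structure/content separation statically through the type system; the Python sketch below only conveys the underlying idea dynamically: the SQL and HTML skeletons are fixed, and user-supplied content is passed separately or escaped, so it can never alter them.

    import html
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, bio TEXT)")

    def add_user(name, bio):
        # Structure: the SQL skeleton with placeholders. Content: the values,
        # passed separately so they cannot change the query's parse tree.
        conn.execute("INSERT INTO users (name, bio) VALUES (?, ?)", (name, bio))

    def render_profile(name, bio):
        # Structure: the HTML skeleton. Content: escaped before interpolation,
        # so it is always treated as text, never as markup or script.
        return "<h1>{}</h1><p>{}</p>".format(html.escape(name), html.escape(bio))

    add_user("mallory", "x'); DROP TABLE users; --")
    print(render_profile("mallory", "<script>alert(1)</script>"))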

BibTeX
@inproceedings{sec2009typing,
    title = {{Static Enforcement of Web Application Integrity Through Strong Typing}},
    author = {William Robertson and Giovanni Vigna},
    booktitle = {{Proceedings of the USENIX Security Symposium}},
    month = {August},
    year = {2009},
    location = {{Montreal, Quebec, Canada}},
}
Davide Balzarotti, Greg Banks, Marco Cova, Viktoria Felmetsger, William Robertson, Fredrik Valeur, Giovanni Vigna, and Richard Kemmerer.
In Proceedings of the International Symposium on Software Testing and Analysis (ISSTA).
Seattle, WA, USA, July 2008.
Abstract

Electronic voting systems play a critical role in today’s democratic societies, as they are responsible for recording and counting the votes of citizens. Unfortunately, there is an alarming number of reports describing the malfunctioning of these systems, suggesting that their quality is not up to the task. Recently, there has been a focus on the security testing of voting systems to determine if they can be compromised in order to control the results of an election. We have participated in two large-scale projects, sponsored by the Secretaries of State of California and Ohio, whose respective goals were to perform the security testing of the electronic voting systems used in those two states. The testing process identified major flaws in all the systems analyzed, and resulted in substantial changes in the voting procedures of both states. In this paper, we describe the testing methodology that we used in testing two real-world electronic voting systems, the findings of our analysis, and the lessons we learned.

BibTeX
@inproceedings{issta2008voting,
    title = {{Are Your Votes Really Counted? Testing the Security of Real-world Voting Systems}},
    author = {Davide Balzarotti and Greg Banks and Marco Cova and Viktoria Felmetsger and William Robertson and Fredrik Valeur and Giovanni Vigna and Richard Kemmerer},
    booktitle = {{Proceedings of the International Symposium on Software Testing and Analysis (ISSTA)}},
    month = {July},
    year = {2008},
    location = {{Seattle, WA, USA}},
}
In Proceedings of the Annual Computer Security Applications Conference (ACSAC).
Miami Beach, FL, USA, December 2007.
Abstract

The effectiveness and precision of network-based intrusion detection signatures can be evaluated either by direct analysis of the signatures (if they are available) or by using black-box testing (if the system is closed-source). Recently, several techniques have been proposed to generate test cases by automatically deriving variations (or mutations) of attacks. Even though these techniques have been useful in identifying “blind spots” in the signatures of closed-source, network-based intrusion detection systems, the generation of test cases is performed in a random, unguided fashion. The reason is that there is no information available about the signatures to be tested. As a result, identifying a test case that is able to evade detection is difficult.

In this paper, we propose a novel approach to drive the generation of test cases by using the information gathered by analyzing the dynamic behavior of the intrusion detection system. Our approach applies dynamic data flow analysis techniques to the intrusion detection system to identify which parts of a network stream are used to detect an attack and how these parts are matched by a signature. The result of our analysis is a set of constraints that is used to guide the black-box testing process, so that the mutations are applied to only those parts of the attack that are relevant for detection. By doing this, we are able to perform a more focused generation of the test cases and improve the process of identifying an attack variation that evades detection.

BibTeX
@inproceedings{acsac2007signatures,
    title = {{Improving Signature Testing Through Dynamic Data Flow Analysis}},
    author = {Davide Balzarotti and William Robertson and Christopher Kruegel and Giovanni Vigna},
    booktitle = {{Proceedings of the Annual Computer Security Applications Conference (ACSAC)}},
    month = {December},
    year = {2007},
    location = {{Miami Beach, FL, USA}},
}
In Proceedings of the International Symposium on Recent Advances in Intrusion Detection (RAID).
Gold Coast, Queensland, Australia, September 2007.
Abstract

Attacks against privileged applications can be detected by analyzing the stream of system calls issued during process execution. In the last few years, several approaches have been proposed to detect anomalous system calls. These approaches are mostly based on modeling acceptable system call sequences. Unfortunately, the techniques proposed so far are either vulnerable to certain evasion attacks or are too expensive to be practical. This paper presents a novel approach to the analysis of system calls that uses a composition of dynamic analysis and learning techniques to characterize anomalous system call invocations in terms of both the invocation context and the parameters passed to the system calls. Our technique provides a more precise detection model with respect to solutions proposed previously, and, in addition, it is able to detect data modification attacks, which cannot be detected using only system call sequence analysis.

BibTeX
@inproceedings{raid2007anomaly,
    title = {{Exploiting Execution Context for the Detection of Anomalous System Calls}},
    author = {Darren Mutz and William Robertson and Giovanni Vigna and Richard Kemmerer},
    booktitle = {{Proceedings of the International Symposium on Recent Advances in Intrusion Detection (RAID)}},
    month = {September},
    year = {2007},
    location = {{Gold Coast, Queensland, Australia}},
}
In Proceedings of the Network and Distributed System Security Symposium (NDSS).
San Diego, CA, USA, February 2006.
Abstract

The custom, ad hoc nature of web applications makes learning-based anomaly detection systems a suitable approach to provide early warning about the exploitation of novel vulnerabilities. However, anomaly-based systems are known for producing a large number of false positives and for providing poor or non-existent information about the type of attack that is associated with an anomaly.

This paper presents a novel approach to anomaly-based detection of web-based attacks. The approach uses an anomaly generalization technique that automatically translates suspicious web requests into anomaly signatures. These signatures are then used to group recurrent or similar anomalous requests so that an administrator can easily deal with a large number of similar alerts.

In addition, the approach uses a heuristics-based technique to infer the type of attacks that generated the anomalies. This enables the prioritization of the attacks and provides better information to the administrator. Our approach has been implemented and evaluated experimentally on real-world data gathered from web servers at two universities.
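
As a toy illustration of anomaly generalization, the Python sketch below abstracts anomalous parameter values into character-class signatures so that recurring, similar anomalies collapse into a single alert group. The generalization rules are hypothetical and much cruder than the technique in the paper.

    import re
    from collections import defaultdict

    def generalize(value):
        sig = re.escape(value)
        sig = re.sub(r"[A-Za-z]+", "[A-Za-z]+", sig)   # any run of letters
        sig = re.sub(r"\d+", r"\\d+", sig)             # any run of digits
        return "^" + sig + "$"

    def group_alerts(anomalous_values):
        groups = defaultdict(list)
        for v in anomalous_values:
            groups[generalize(v)].append(v)
        return groups

    alerts = ["../../etc/passwd", "../../etc/shadow", "id=1 OR 1=1"]
    for sig, members in group_alerts(alerts).items():
        # The two traversal attempts share one signature; the injection gets its own.
        print(sig, "->", members)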

BibTeX
@inproceedings{ndss2006anomaly,
    title = {{Using Generalization and Characterization Techniques in the Anomaly-based Detection of Web Attacks}},
    author = {William Robertson and Giovanni Vigna and Christopher Kruegel and Richard Kemmerer},
    booktitle = {{Proceedings of the Network and Distributed System Security Symposium (NDSS)}},
    month = {February},
    year = {2006},
    location = {{San Diego, CA, USA}},
}
In Proceedings of the International Symposium on Recent Advances in Intrusion Detection (RAID).
Seattle, WA, USA, September 2005.
Abstract

Network worms are malicious programs that spread automatically across networks by exploiting vulnerabilities that affect a large number of hosts. Because of the speed at which worms spread to large computer populations, countermeasures based on human reaction time are not feasible. Therefore, recent research has focused on devising new techniques to detect and contain network worms without the need of human supervision. In particular, a number of approaches have been proposed to automatically derive signatures to detect network worms by analyzing a number of worm-related network streams. Most of these techniques, however, assume that the worm code does not change during the infection process. Unfortunately, worms can be polymorphic. That is, they can mutate as they spread across the network. To detect these types of worms, it is necessary to devise new techniques that are able to identify similarities between different mutations of a worm.

This paper presents a novel technique based on the structural analysis of binary code that allows one to identify structural similarities between different worm mutations. The approach is based on the analysis of a worm’s control flow graph and introduces an original graph coloring technique that supports a more precise characterization of the worm’s structure. The technique has been used as a basis to implement a worm detection system that is resilient to many of the mechanisms used to evade approaches based on instruction sequences only.

BibTeX
@inproceedings{raid2005worm,
    title = {{Polymorphic Worm Detection Using Structural Information of Executables}},
    author = {Christopher Kruegel and Engin Kirda and Darren Mutz and William Robertson and Giovanni Vigna},
    booktitle = {{Proceedings of the International Symposium on Recent Advances in Intrusion Detection (RAID)}},
    month = {September},
    year = {2005},
    location = {{Seattle, WA, USA}},
}
In Proceedings of the USENIX Security Symposium.
Baltimore, MD, USA, July 2005.
Abstract

Intrusion detection systems that monitor sequences of system calls have recently become more sophisticated in defining legitimate application behavior. In particular, additional information, such as the value of the program counter and the configuration of the program’s call stack at each system call, has been used to achieve better characterization of program behavior. While there is common agreement that this additional information complicates the task for the attacker, it is less clear to what extent an intruder is constrained.

In this paper, we present a novel technique to evade the extended detection features of state-of-the-art intrusion detection systems and reduce the task of the intruder to a traditional mimicry attack. Given a legitimate sequence of system calls, our technique allows the attacker to execute each system call in the correct execution context by obtaining and relinquishing control of the application’s execution flow through manipulation of code pointers.

We have developed a static analysis tool for Intel x86 binaries that uses symbolic execution to automatically identify instructions that can be used to redirect control flow and to compute the necessary modifications to the environment of the process. We used our tool to successfully exploit three vulnerable programs and evade detection by existing state-of-the-art system call monitors. In addition, we analyzed three real-world applications to verify the general applicability of our techniques.

BibTeX
@inproceedings{sec2005mimicry,
    title = {{Automating Mimicry Attacks Using Static Binary Analysis}},
    author = {Christopher Kruegel and Engin Kirda and Darren Mutz and William Robertson and Giovanni Vigna},
    booktitle = {{Proceedings of the USENIX Security Symposium}},
    month = {July},
    year = {2005},
    location = {{Baltimore, MD, USA}},
}
In Proceedings of the Annual Asia Pacific Information Technology Security Conference (AusCERT).
Gold Coast, Queensland, Australia, May 2005.
Abstract

Network-based intrusion detection systems analyze network traffic looking for evidence of attacks. The analysis is usually performed using signatures, which are rules that describe what traffic should be considered malicious. If the signatures are known, it is possible to either craft an attack to avoid detection or to send synthetic traffic that will match the signature to over-stimulate the network sensor, causing a denial-of-service attack. To prevent these attacks, commercial systems usually do not publish their signature sets or their analysis algorithms. This paper describes a reverse engineering process and a reverse engineering tool that are used to analyze the way signatures are matched by network-based intrusion detection systems. The results of the analysis are used to either generate variations of attacks that evade detection or produce non-malicious traffic that over-stimulates the sensor. This shows that security through obscurity does not work. That is, keeping the signatures secret does not necessarily increase the resistance of a system to evasion and over-stimulation attacks.

BibTeX
@inproceedings{auscert2005reveng,
    title = {{Reverse Engineering of Network Signatures}},
    author = {Darren Mutz and Christopher Kruegel and William Robertson and Giovanni Vigna and Richard Kemmerer},
    booktitle = {{Proceedings of the Annual Asia Pacific Information Technology Security Conference (AusCERT)}},
    month = {May},
    year = {2005},
    location = {{Gold Coast, Queensland, Australia}},
}
In Proceedings of the Annual Computer Security Applications Conference (ACSAC).
Tucson, AZ, USA, December 2004.
Abstract

A rootkit is a collection of tools used by intruders to keep the legitimate users and administrators of a compromised machine unaware of their presence. Originally, rootkits mainly included modified versions of system auditing programs (e.g., ps or netstat on a Unix system). However, for operating systems that support loadable kernel modules (e.g., Linux and Solaris), a new type of rootkit has recently emerged. These rootkits are implemented as kernel modules, and they do not require modification of user space binaries to conceal malicious activity. Instead, the rootkit operates within the kernel, modifying critical data structures such as the system call table or the list of currently-loaded kernel modules.

This paper presents a technique that exploits binary analysis to ascertain, at load time, if a module’s behavior resembles the behavior of a rootkit. Through this method, it is possible to provide additional protection against this type of malicious modification of the kernel. Our technique relies on an abstract model of module behavior that is not affected by small changes in the binary image of the module. Therefore, the technique is resistant to attempts to conceal the malicious nature of a kernel module.

BibTeX
@inproceedings{acsac2004lkrm,
    title = {{Detecting Kernel-Level Rootkits Through Binary Analysis}},
    author = {Christopher Kruegel and William Robertson and Giovanni Vigna},
    booktitle = {{Proceedings of the Annual Computer Security Applications Conference (ACSAC)}},
    month = {December},
    year = {2004},
    location = {{Tucson, AZ, USA}},
}
In Proceedings of the ACM Conference on Computer and Communications Security (CCS).
Washington DC, USA, October 2004.
Abstract

Misuse-based intrusion detection systems rely on models of attacks to identify the manifestation of intrusive behavior. Therefore, the ability of these systems to reliably detect attacks is strongly affected by the quality of their models, which are often called “signatures.” A perfect model would be able to detect all the instances of an attack without making mistakes, that is, it would produce a 100% detection rate with 0 false alarms. Unfortunately, writing good models (or good signatures) is hard. Attacks that exploit a specific vulnerability may do so in completely different ways, and writing models that take into account all possible variations is very difficult. For this reason, it would be beneficial to have testing tools that are able to evaluate the “goodness” of detection signatures.

This work describes a technique to test and evaluate misuse detection models in the case of network-based intrusion detection systems. The testing technique is based on a mechanism that generates a large number of variations of an exploit by applying mutant operators to an exploit template. These mutant exploits are then run against a victim host protected by a network-based intrusion detection system. The results of the systems in detecting these variations provide a quantitative basis for the evaluation of the quality of the corresponding detection model.
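
The Python sketch below conveys the flavor of mutant exploit generation: simple, semantics-preserving operators are applied in combination to an attack template to produce variants for testing an IDS. The template and operators are illustrative, not the paper's operator set.

    import itertools

    TEMPLATE = "GET /cgi-bin/vuln.cgi?file=../../etc/passwd HTTP/1.0\r\n\r\n"

    def dot_slash_padding(req):
        # Insert semantically neutral "./" segments into the path.
        return req.replace("/cgi-bin/", "/./cgi-bin/./")

    def hex_encode_traversal(req):
        # Encode the "../" traversal so naive string matching misses it.
        return req.replace("../", "%2e%2e/")

    def add_noise_parameter(req):
        # Append an innocuous query parameter after the malicious one.
        return req.replace(" HTTP/1.0", "&x=harmless HTTP/1.0")

    MUTATORS = [dot_slash_padding, hex_encode_traversal, add_noise_parameter]

    def mutants(template):
        # Apply every non-empty combination of operators, in order.
        for r in range(1, len(MUTATORS) + 1):
            for combo in itertools.combinations(MUTATORS, r):
                req = template
                for op in combo:
                    req = op(req)
                yield req

    for m in mutants(TEMPLATE):
        print(repr(m))

Each variant is then run against the protected victim host, and the fraction of variants that evade detection gives a rough measure of signature quality.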

BibTeX
@inproceedings{ccs2004sploit,
    title = {{Testing Network-based Intrusion Detection Signatures Using Mutant Exploits}},
    author = {Giovanni Vigna and Davide Balzarotti and William Robertson},
    booktitle = {{Proceedings of the ACM Conference on Computer and Communications Security (CCS)}},
    month = {October},
    year = {2004},
    location = {{Washington DC, USA}},
}
In Proceedings of the USENIX Security Symposium.
San Diego, CA, USA, August 2004.
Abstract

Disassembly is the process of recovering a symbolic representation of a program’s machine code instructions from its binary representation. Recently, a number of techniques have been proposed that attempt to foil the disassembly process. These techniques are very effective against state-of-the-art disassemblers, preventing a substantial fraction of a binary program from being disassembled correctly. This could allow an attacker to hide malicious code from static analysis tools that depend on correct disassembler output (such as virus scanners).

The paper presents novel binary analysis techniques that substantially improve the success of the disassembly process when confronted with obfuscated binaries. Based on control flow graph information and statistical methods, a large fraction of the program’s instructions can be correctly identified. An evaluation of the accuracy and the performance of our tool is provided, along with a comparison to several state-of-the-art disassemblers.

BibTeX
@inproceedings{sec2004disasm,
    title = {{Static Disassembly of Obfuscated Binaries}},
    author = {Christopher Kruegel and William Robertson and Fredrik Valeur and Giovanni Vigna},
    booktitle = {{Proceedings of the USENIX Security Symposium}},
    month = {August},
    year = {2004},
    location = {{San Diego, CA, USA}},
}
Christopher Kruegel, Darren Mutz, William Robertson, and Fredrik Valeur.
In Proceedings of the Annual Computer Security Applications Conference (ACSAC).
Las Vegas, NV, USA, December 2003.
Abstract

Intrusion detection systems (IDSs) attempt to identify attacks by comparing collected data to predefined signatures known to be malicious (misuse-based IDSs) or to a model of legal behavior (anomaly-based IDSs). Anomaly-based approaches have the advantage of being able to detect previously unknown attacks, but they suffer from the difficulty of building robust models of acceptable behavior which may result in a large number of false alarms. Almost all current anomaly-based intrusion detection systems classify an input event as normal or anomalous by analyzing its features, utilizing a number of different models. A decision for an input event is made by aggregating the results of all employed models.

We have identified two reasons for the large number of false alarms, caused by incorrect classification of events in current systems. One is the simplistic aggregation of model outputs in the decision phase. Often, only the sum of the model results is calculated and compared to a threshold. The other reason is the lack of integration of additional information into the decision process. This additional information can be related to the models, such as the confidence in a model’s output, or can be extracted from external sources. To mitigate these shortcomings, we propose an event classification scheme that is based on Bayesian networks. Bayesian networks improve the aggregation of different model outputs and allow one to seamlessly incorporate additional information. Experimental results show that the accuracy of the event classification process is significantly improved using our proposed approach.
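
The paper uses full Bayesian networks, which can also encode dependencies between models and external information; the Python sketch below shows only a naive-Bayes combination of per-model outputs, as a contrast to summing scores against a threshold. The model names and probabilities are made up for illustration.

    PRIOR_ATTACK = 0.01

    # For each model: P(model reports "anomalous" | attack) and
    #                 P(model reports "anomalous" | normal).
    MODELS = {
        "length":       (0.90, 0.10),
        "char_dist":    (0.70, 0.20),
        "token_finder": (0.60, 0.05),
    }

    def posterior_attack(observations):
        """observations maps model name -> True if that model flagged the event."""
        p_attack, p_normal = PRIOR_ATTACK, 1.0 - PRIOR_ATTACK
        for name, flagged in observations.items():
            p_anom_attack, p_anom_normal = MODELS[name]
            p_attack *= p_anom_attack if flagged else (1.0 - p_anom_attack)
            p_normal *= p_anom_normal if flagged else (1.0 - p_anom_normal)
        return p_attack / (p_attack + p_normal)

    print(posterior_attack({"length": True, "char_dist": True, "token_finder": False}))
    print(posterior_attack({"length": True, "char_dist": False, "token_finder": False}))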

BibTeX
@inproceedings{acsac2003bayes,
    title = {{Bayesian Event Classification for Intrusion Detection}},
    author = {Christopher Kruegel and Darren Mutz and William Robertson and Fredrik Valeur},
    booktitle = {{Proceedings of the Annual Computer Security Applications Conference (ACSAC)}},
    month = {December},
    year = {2003},
    location = {{Las Vegas, NV, USA}},
}
In Proceedings of the Annual Computer Security Applications Conference (ACSAC).
Las Vegas, NV, USA, December 2003.
Abstract

Web servers are ubiquitous, remotely accessible, and often misconfigured. In addition, custom web-based applications may introduce vulnerabilities that are overlooked even by the most security-conscious server administrators. Consequently, web servers are a popular target for hackers. To mitigate the security exposure associated with web servers, intrusion detection systems are deployed to analyze and screen incoming requests. The goal is to perform early detection of malicious activity and possibly prevent more serious damage to the protected site. Even though intrusion detection is critical for the security of web servers, the intrusion detection systems available today only perform very simple analyses and are often vulnerable to simple evasion techniques. In addition, most systems do not provide sophisticated attack languages that allow a system administrator to specify custom, complex attack scenarios to be detected.

This paper presents WebSTAT, an intrusion detection system that analyzes web requests looking for evidence of malicious behavior. The system is novel in several ways. First of all, it provides a sophisticated language to describe multi-step attacks in terms of states and transitions. In addition, the modular nature of the system supports the integrated analysis of network traffic sent to the server host, operating system-level audit data produced by the server host, and the access logs produced by the web server. By correlating different streams of events, it is possible to achieve more effective detection of web-based attacks.
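
As a toy illustration of describing a multi-step attack in terms of states and transitions, the Python sketch below tracks a hypothetical scenario (repeated failed logins from one client followed by access to an admin page). The scenario language described in the paper is far more expressive.

    from collections import defaultdict

    FAIL_THRESHOLD = 3

    class ScenarioTracker:
        def __init__(self):
            self.failed = defaultdict(int)     # client ip -> failed login count

        def on_request(self, client, path, status):
            # Transition 1: accumulate failed authentication attempts.
            if path == "/login" and status == 401:
                self.failed[client] += 1
            # Transition 2: from the "many failures" state, an admin access fires.
            elif path.startswith("/admin") and self.failed[client] >= FAIL_THRESHOLD:
                print(f"ALERT: {client} reached /admin after "
                      f"{self.failed[client]} failed logins")

    tracker = ScenarioTracker()
    events = [("10.0.0.5", "/login", 401)] * 3 + [("10.0.0.5", "/admin/panel", 200)]
    for client, path, status in events:
        tracker.on_request(client, path, status)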

BibTeX
@inproceedings{acsac2003webstat,
    title = {{A Stateful Intrusion Detection System for World-Wide Web Servers}},
    author = {Giovanni Vigna and William Robertson and Vishal Kher and Richard Kemmerer},
    booktitle = {{Proceedings of the Annual Computer Security Applications Conference (ACSAC)}},
    month = {December},
    year = {2003},
    location = {{Las Vegas, NV, USA}},
}
William Robertson, Christopher Kruegel, Darren Mutz, and Fredrik Valeur.
In Proceedings of the USENIX Large Installations Systems Administration Conference (LISA).
San Diego, CA, USA, October 2003.
Abstract

Buffer overflows belong to the most common class of attacks on today’s Internet. Although stack-based variants are still far more frequent and better understood, heap-based overflows have recently gained more attention. Several real-world exploits have been published that corrupt heap management information and allow arbitrary code execution with the privileges of the victim process.

This paper presents a technique that protects heap management information and allows for run-time detection of heap-based overflows. We discuss the structure of these attacks and our proposed detection scheme that has been implemented as a patch to the GNU libc. We report the results of our experiments, which demonstrate the detection effectiveness and performance impact of our approach. In addition, we discuss different mechanisms to deploy the memory protection.
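
The following toy simulation illustrates the general idea of guarding heap management information with a checksum computed over a chunk's boundary fields and verified when the chunk is freed. It models the technique in Python for readability only; the secret value, field layout, and check are hypothetical and do not reflect the actual GNU libc patch.

# Toy model of checksum-protected heap chunk headers. An overflow that
# rewrites a chunk's management fields is caught when the chunk is freed.
# Purely illustrative, not the libc implementation.

import hashlib

SECRET = b"per-process-secret"  # hypothetical per-process random value

def checksum(size, prev_size):
    data = SECRET + size.to_bytes(8, "little") + prev_size.to_bytes(8, "little")
    return hashlib.sha256(data).digest()[:8]

class Chunk:
    def __init__(self, size, prev_size):
        self.size = size
        self.prev_size = prev_size
        self.canary = checksum(size, prev_size)

def free(chunk):
    # Recompute the checksum over the management fields before trusting them.
    if checksum(chunk.size, chunk.prev_size) != chunk.canary:
        raise RuntimeError("heap corruption detected: chunk header modified")
    # ... unlink/coalesce would happen here ...

victim = Chunk(size=64, prev_size=32)
victim.size = 0xDEADBEEF        # simulate an overflow overwriting the header
try:
    free(victim)
except RuntimeError as err:
    print(err)                  # corruption detected at free time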

BibTeX
@inproceedings{lisa2003heap,
    title = {{Run-time Detection of Heap-based Overflows}},
    author = {William Robertson and Christopher Kruegel and Darren Mutz and Fredrik Valeur},
    booktitle = {{Proceedings of the USENIX Large Installation System Administration Conference (LISA)}},
    month = {October},
    year = {2003},
    location = {{San Diego, CA, USA}},
}
Christopher Kruegel, Darren Mutz, William Robertson, and Fredrik Valeur.
In Proceedings of the International Symposium on Recent Advances in Intrusion Detection (RAID).
Pittsburgh, PA, USA, September 2003.
Abstract

The Border Gateway Protocol (BGP) is a fundamental component of the current Internet infrastructure. Due to the inherent trust relationship between peers, control of a BGP router could enable an attacker to redirect traffic allowing man-in-the-middle attacks or to launch a large-scale denial of service. It is known that BGP has weaknesses that are fundamental to the protocol design. Many solutions to these weaknesses have been proposed, but most require resource intensive cryptographic operations and modifications to the existing protocol and router software. For this reason, none of them have been widely adopted. However, the threat necessitates an effective, immediate solution.

We propose a system that is capable of detecting malicious inter-domain routing update messages through passive monitoring of BGP traffic. This approach requires no protocol modifications and utilizes existing monitoring infrastructure. The technique relies on a model of the autonomous system connectivity to verify that route advertisements are consistent with the network topology. By identifying anomalous update messages, we prevent routers from accepting invalid routes. Utilizing data provided by the Route Views project, we demonstrate the ability of our system to distinguish between legitimate and potentially malicious traffic.
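
As an illustration of checking advertisements against a model of autonomous system connectivity, the sketch below learns an AS-level adjacency set from previously observed AS paths and flags an update whose path uses a link never seen before. The AS numbers and the single-pass learning are made up for the example and greatly simplify the paper's approach.

# Sketch of topology-based checking of BGP updates: build an AS-level
# adjacency set from observed AS paths, then flag updates whose path
# contains a link absent from the learned topology. Illustrative only.

known_links = set()

def learn(as_path):
    for a, b in zip(as_path, as_path[1:]):
        known_links.add(frozenset((a, b)))

def check(as_path):
    # Return the list of unknown AS-AS links; an empty list means the
    # advertised path is consistent with the learned topology.
    return [(a, b) for a, b in zip(as_path, as_path[1:])
            if frozenset((a, b)) not in known_links]

# Learn from (hypothetical) historical routing data.
for path in [[701, 1239, 3356], [701, 7018, 2914], [1239, 3356, 13030]]:
    learn(path)

print(check([701, 1239, 3356]))   # []                          -> consistent
print(check([701, 64512, 3356]))  # [(701, 64512), (64512, 3356)] -> suspicious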

BibTeX
@inproceedings{raid2003bgp,
    title = {{Topology-based Detection of Anomalous BGP Messages}},
    author = {Christopher Kruegel and Darren Mutz and William Robertson and Fredrik Valeur},
    booktitle = {{Proceedings of the International Symposium on Recent Advances in Intrusion Detection (RAID)}},
    month = {September},
    year = {2003},
    location = {{Pittsburgh, PA, USA}},
}

Journal

Davide Balzarotti, Marco Cova, Viktoria Felmetsger, Richard Kemmerer, William Robertson, Fredrik Valeur, and Giovanni Vigna.
In IEEE Transactions on Software Engineering, 36(4), July 2010.
Abstract

Voting is the process through which a democratic society determines its government. Therefore, voting systems are as important as other well-known critical systems, such as air traffic control systems or nuclear plant monitors. Unfortunately, voting systems have a history of failures that seems to indicate that their quality is not up to the task. Because of the alarming frequency and impact of the malfunctions of voting systems, in recent years a number of vulnerability analysis exercises have been carried out against voting systems to determine if they can be compromised in order to control the results of an election. We have participated in two such large-scale projects, sponsored by the Secretaries of State of California and Ohio, whose goals were to perform the security testing of the electronic voting systems used in their respective states. As the result of the testing process, we identified major vulnerabilities in all the systems analyzed. We then took advantage of a combination of these vulnerabilities to generate a series of attacks that would spread across the voting systems and would “steal” votes by combining voting record tampering with social engineering approaches. As a response to the two large-scale security evaluations, the Secretaries of State of California and Ohio recommended changes to improve the security of the voting process. In this paper, we describe the methodology that we used in testing the two real-world electronic voting systems we evaluated, the findings of our analysis, our attacks, and the lessons we learned.

BibTeX
@article{tse2010voting,
    title = {{An Experience in Testing the Security of a Real-World Electronic Voting System}},
    author = {Davide Balzarotti and Marco Cova and Viktoria Felmetsger and Richard Kemmerer and William Robertson and Fredrik Valeur and Giovanni Vigna},
    journal = {{IEEE Transactions on Software Engineering}},
    month = {July},
    year = {2010},
    volume = {36},
    number = {4},
}
In Journal of Computer Security, 17(3), May 2009.
Abstract

Web-based applications have become a popular means of exposing functionality to large numbers of users by leveraging the services provided by web servers and databases. The wide proliferation of custom-developed web-based applications suggests that anomaly detection could be a suitable approach for providing early warning and real-time blocking of application-level exploits. Therefore, a number of research prototypes and commercial products that learn the normal usage patterns of web applications have been developed. Anomaly detection techniques, however, are prone to both false positives and false negatives. As a result, if anomalous web requests are simply blocked, it is likely that some legitimate requests would be denied, resulting in decreased availability. On the other hand, if malicious requests are allowed to access a web application’s data stored in a back-end database, security-critical information could be leaked to an attacker.

To ameliorate this situation, we propose a system composed of a web-based anomaly detection system, a reverse HTTP proxy, and a database anomaly detection system. Serially composing a web-based anomaly detector and a SQL query anomaly detector increases the detection rate of our system. To address a potential increase in the false positive rate, we leverage an anomaly-driven reverse HTTP proxy to serve anomalous-but-benign requests that do not require access to sensitive information.

We developed a prototype of our approach and evaluated its applicability with respect to several existing web-based applications, showing that our approach is both feasible and effective in reducing both false positives and false negatives.
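
A rough sketch of the serial composition described above: a request is first scored by a web-request anomaly detector; if it is anomalous, the SQL query it would issue is scored as well, and an anomaly-driven proxy routes anomalous-but-benign requests to a handler without access to sensitive data. The detector internals, function names, and thresholds below are invented stand-ins for illustration.

# Sketch of serially composed anomaly detectors behind an anomaly-driven
# reverse proxy. Detector logic is stubbed out; names and thresholds are
# illustrative only.

WEB_THRESHOLD = 0.7
SQL_THRESHOLD = 0.7

def web_anomaly_score(request):
    # Stand-in for the web-request models (length, character distribution, ...).
    return 0.9 if "'" in request["params"].get("id", "") else 0.1

def sql_anomaly_score(query):
    # Stand-in for the SQL-query models (structure of issued queries).
    return 0.9 if " OR " in query.upper() else 0.1

def handle(request):
    query = f"SELECT * FROM users WHERE id = '{request['params'].get('id', '')}'"
    if web_anomaly_score(request) < WEB_THRESHOLD:
        return "full_application"       # normal request, full access
    if sql_anomaly_score(query) >= SQL_THRESHOLD:
        return "blocked"                # anomalous request and anomalous query
    return "restricted_replica"         # anomalous but benign: no sensitive data

print(handle({"params": {"id": "42"}}))            # full_application
print(handle({"params": {"id": "1' OR '1'='1"}}))  # blocked
print(handle({"params": {"id": "O'Brien"}}))       # restricted_replica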

BibTeX
@article{jcs2009anomaly,
    title = {{Reducing Errors in the Anomaly-based Detection of Web-based Attacks Through the Combined Analysis of Web Requests and SQL Queries}},
    author = {Giovanni Vigna and Fredrik Valeur and Davide Balzarotti and William Robertson and Christopher Kruegel and Engin Kirda},
    journal = {{Journal of Computer Security}},
    month = {May},
    year = {2009},
    volume = {17},
    number = {3},
}
In Computer Networks, 48(5), July 2005.
Abstract
BibTeX
@article{jcn2005webanomaly,
    title = {{A Multi-Model Approach to the Detection of Web-based Attacks}},
    author = {Christopher Kruegel and William Robertson and Giovanni Vigna},
    journal = {{Computer Networks}},
    month = {July},
    year = {2005},
    volume = {48},
    number = {5},
}
In Journal of Practice in Information Processing and Communication (PIK), 27(4), August 2004.
Abstract

Intrusion detection systems monitor protected networks and attempt to identify evidence of malicious activity. When an attack is detected, an alert is produced, and, possibly, a countermeasure is executed. A perfect intrusion detection system would be able to identify all the attacks without raising any false alarms. In addition, a countermeasure would be executed only when an attack is actually successful. Unfortunately, false alarms are commonplace in intrusion detection systems, and perfectly benign events are interpreted as malicious. In addition, non-relevant alerts are also common. These are alerts associated with attacks that were not successful. Such alerts should be tagged appropriately so that their priority can be lowered.

The process of identifying alerts associated with successful attacks is called alert verification. This paper describes the different issues involved in alert verification and presents a tool that performs real-time verification of attacks detected by an intrusion detection system. The experimental evaluation of the tool shows that verification can dramatically reduce both false and non-relevant alerts.
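
The sketch below illustrates one verification strategy consistent with the description above: when an alert fires for a known vulnerability, actively probe the targeted host and lower the alert's priority if the attacked service is unreachable or not affected. The hostname, port, and banner strings are hypothetical, and real deployments would use richer checks than a banner grab.

# Sketch of real-time alert verification: confirm whether the attacked host
# actually runs a service affected by the exploited vulnerability, and mark
# alerts for attacks that cannot have succeeded as non-relevant.
# Hostnames, ports, and banner strings are hypothetical.

import socket

VULNERABLE_BANNERS = {"vuln-ftpd 1.0", "vuln-ftpd 1.1"}  # assumed affected versions

def grab_banner(host, port, timeout=2.0):
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            return s.recv(256).decode(errors="replace").strip()
    except OSError:
        return None

def verify(alert):
    banner = grab_banner(alert["dst_host"], alert["dst_port"])
    if banner is None:
        return "non-relevant"    # target unreachable, the attack cannot succeed
    if any(v in banner for v in VULNERABLE_BANNERS):
        return "verified"        # vulnerable service present, keep high priority
    return "non-relevant"        # service present but not an affected version

alert = {"dst_host": "192.0.2.10", "dst_port": 21, "signature": "ftpd overflow"}
print(verify(alert))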

BibTeX
@article{pik2004alerts,
    title = {{Using Alert Verification to Identify Successful Intrusion Attempts}},
    author = {Christopher Kruegel and Giovanni Vigna and William Robertson},
    journal = {{Journal of Practice in Information Processing and Communication (PIK)}},
    month = {August},
    year = {2004},
    volume = {27},
    number = {4},
}

Workshop

In Proceedings of the Workshop on the Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA).
Dortmund, North Rhine-Westphalia, Germany, July 2004.
BibTeX
@inproceedings{dimva2004alerts,
    title = {{Alert Verification: Determining the Success of Intrusion Attempts}},
    author = {Christopher Kruegel and William Robertson},
    booktitle = {{Proceedings of the Workshop on the Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA)}},
    month = {July},
    year = {2004},
    location = {{Dortmund, North Rhine-Westphalia, Germany}},
}

Dissertation

UC Santa Barbara, June 2009.
Abstract

The World Wide Web has evolved from a system for serving an interconnected set of static documents to what is now a powerful, versatile, and largely democratic platform for application delivery and information dissemination. Unfortunately, with the web’s explosive growth in power and popularity has come a concomitant increase in both the number and impact of web application-related security incidents. The magnitude of the problem has prompted much interest within the security community towards researching mechanisms that can mitigate this threat. To this end, intrusion detection systems have been proposed as a potential means of identifying and preventing the successful exploitation of web application vulnerabilities. The current state-of-the-art, however, has failed to deliver on the promise of intrusion detection. Misuse-based detection systems are unable to generalize to previously unknown attacks for which no signatures exist. In the context of the web, this is especially problematic in light of the wide proliferation of unique, custom-written web applications. On the other hand, anomaly-based intrusion detection systems seem well-suited for detecting attacks against web applications. Existing anomaly detection techniques, however, have so far proven infeasible due to several factors: unacceptably high false positive rates, susceptibility to evasion, an inability to adapt to changes in monitored applications, and a lack of explanatory power.

In this dissertation, I present WEBANOMALY, an advanced black-box anomaly detection system that accurately detects attacks against web applications with low performance overhead. WEBANOMALY addresses several of the aforementioned fundamental challenges to anomaly detection using a combination of novel techniques. In particular, the relatively high rate of false positives and lack of explanatory power is ameliorated using anomaly signatures, a technique for clustering related anomalies and classifying the type of attack they represent. The problem of local training data scarcity is addressed through the use of global knowledge bases of well-trained profiles collected from other web applications. Changes in web application behavior over time, known as concept drift, are addressed by treating the web application itself as an oracle of legitimate change. Finally, a novel framework for developing web applications that are secure by construction against many common classes of attacks is presented.
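
As a loose illustration of the anomaly-signature idea mentioned above, the sketch below groups anomalous requests by which models they violated and on which resource and parameter, so recurring anomalies can be reported once with a coarse label rather than as individual alerts. The grouping key, model names, and labels are invented for the example and only gesture at the dissertation's technique.

# Sketch of grouping related anomalies into "anomaly signatures": anomalies
# that violate the same models on the same resource/parameter are clustered
# and reported together. Grouping key and labels are illustrative only.

from collections import defaultdict

signatures = defaultdict(list)

def record_anomaly(resource, parameter, violated_models, request):
    key = (resource, parameter, frozenset(violated_models))
    signatures[key].append(request)

def label(violated_models):
    # Crude heuristic mapping of violated models to a likely attack class.
    if "char_distribution" in violated_models and "length" in violated_models:
        return "possible overflow / injection"
    if "token_structure" in violated_models:
        return "possible parameter tampering"
    return "unclassified anomaly"

record_anomaly("/login", "user", {"char_distribution", "length"}, "user=AAAA...")
record_anomaly("/login", "user", {"char_distribution", "length"}, "user=%41%41...")

for (resource, parameter, models), requests in signatures.items():
    print(f"{resource} [{parameter}]: {label(models)} ({len(requests)} anomalies)")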

BibTeX
@thesis{ucsb2009thesis,
    title = {{Detecting and Preventing Attacks Against Web Applications}},
    author = {William Robertson},
    institution = {{UC Santa Barbara}},
    month = {June},
    year = {2009},
}