Offensive Anomaly Detection

I hope you are all keeping well and managing as best you can, under whatever lockdown circumstances you may find yourself in. In such stressful times, I find it’s very easy to overwork and subsequently burn out as diving into work takes my mind off the greater situation. Take your holidays when you can, watch your working hours and automate where possible to make things easier for yourself. With vaccines on the way, we’ll all hopefully be back to exploring meat-space in a few short months. For now, welcome to 2021: a hacking odyssey!

Something like this post has been fermenting in the back of my mind for quite a while, but whenever I tried to write it I was too busy or too distracted to clearly formulate my thoughts. I’m going to attempt to touch on some of the fundamentals of blackbox web application security testing: how to approach and rationalize such a thing, what to spot when testing, and some discussion of tooling, techniques and automation in general. I have a big year planned. I’ve written more code in 2020 than in the rest of my life combined, and I’ve been building some fun projects and fuzzing tools in Rust and Python to assist me with my escapades. I’ll also be undertaking some new business ventures in 2021.


Back in 2016 when I wrote parameth, it was a simple proof of concept (PoC) script to automate something I was tired of doing manually in Burp Intruder. There were no other tools that implemented the same technique at the time, and that’s also what I’ve pursued with my follow-up research and development. By the very nature of doing something others aren’t, or doing things in a different way, you are more likely to find the things they aren’t.

This PoC had a great impact: the technique was expanded on further by the community and eventually even fully integrated into Burp Suite as a feature, which is awesome. At the time this script solved a simple situation I found myself in quite often while blackbox testing. Writing it fundamentally changed how I thought about security testing and gave me some baseline code and concepts that could be expanded on as I developed further automation and tooling.

At the most basic level, finding additional input parameters identifies more functionality of a target application. This can also be applied to JSON/XML object parameters, request headers and cookies… essentially anywhere user or application input can be received and processed. Identifying more inputs means more coverage and a deeper exploration of the application attack surface. When it comes to security testing and fuzzing, coverage is hands-down the most important thing.
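
To make the idea concrete, here’s a minimal Python sketch of parameter discovery by response comparison. This is not parameth’s actual code, just an illustration under stated assumptions: `send` is a hypothetical callable that wraps whatever HTTP client you use and returns a status code and body, and `fake_app` is a stand-in target so the example runs without a network.

```python
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical transport: takes query parameters, returns (status_code, body).
# In real use this would wrap an HTTP client; here it is an assumption.
Send = Callable[[Dict[str, str]], Tuple[int, str]]


def fingerprint(response: Tuple[int, str]) -> Tuple[int, int]:
    """Reduce a response to two cheap signals: status code and body length."""
    status, body = response
    return status, len(body)


def discover_params(send: Send, candidates: Iterable[str], value: str = "1") -> List[str]:
    """Return the candidate names whose presence changes the response."""
    # Baseline: a parameter name that almost certainly does not exist.
    baseline = fingerprint(send({"zzqxnotreal": value}))
    found = []
    for name in candidates:
        if fingerprint(send({name: value})) != baseline:
            found.append(name)
    return found


def fake_app(params: Dict[str, str]) -> Tuple[int, str]:
    """Stand-in target with one hidden parameter, used only for this demo."""
    if "debug" in params:
        return 200, "<html>home<!-- debug on --></html>"
    return 200, "<html>home</html>"
```

Calling `discover_params(fake_app, ["id", "debug", "admin"])` flags only `debug`. A real target that reflects parameter names or returns dynamic content would need a tolerance on the size comparison, which this sketch leaves out.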

Coverage is also why whitebox testing generally yields more and better vulnerabilities than blackbox testing. When you can see all the inputs and what is done to them in the backend, you can spend more time formulating attacks instead of spending it blindly discovering attack surface. It’s also why, as a community, we should be majorly pushing for security testing to occur in the development stages and before security issues have a chance of ending up in production systems.

Since implementing an SDLC is more of a business problem and undertaking, it’s also particular to each business and their individual processes. It isn’t a simple generic technical problem I can solve, hack on or try to write code for. Thus, I’ve been focusing my research efforts on the more fun problem of how to catch vulnerabilities on production systems. This is why I will talk about application responses and the identification of anomalies within them, which very often leads to the discovery of unintentional useful behavior or directly to security vulnerabilities.

As with exploring and mapping an organisation’s network, you need to map out as much of an application and its functionality as possible. This can be achieved through crawling the application endpoints and its client-side code (JavaScript), browsing through the app, brute forcing directories or files, or reading and discovering API documentation. Ideally you want to be collecting all the documents, names, formats, emails, scripts, endpoints and parameters as you go, as they can be used later to identify additional context and assist greatly when you start fuzzing.
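
As a rough illustration of collecting endpoints and parameter names from client-side code, the Python sketch below pulls path-like strings and query parameter names out of a blob of JavaScript. The two regular expressions and the `harvest` function are my own simplifications for the example; a real crawler would need far more cases (template strings, relative paths, JSON keys).

```python
import re
from typing import Set, Tuple

# A quote followed by something that looks like an absolute path,
# e.g. "/api/v1/users". Deliberately loose: this is an illustrative pattern.
PATH_RE = re.compile(r"""["'](/[A-Za-z0-9_\-./]+)""")

# Query-string style parameter names, e.g. ?id= or &page=
PARAM_RE = re.compile(r"[?&]([A-Za-z0-9_\-]+)=")


def harvest(source: str) -> Tuple[Set[str], Set[str]]:
    """Return (paths, parameter names) found in a chunk of client-side code."""
    return set(PATH_RE.findall(source)), set(PARAM_RE.findall(source))
```

Feeding the collected names back in as wordlist entries for directory and parameter brute forcing is where this kind of harvesting pays off.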

Detection and Automation

For web applications, the vast majority of vulnerabilities can be detected by identifying different HTTP status codes or response sizes returned from a request. Other indicators can be the response times, new response headers or cookies being returned, or any noticeable corruption or change of the response content. All of these things are detectable in an automated way and in essence are how DAST tools like Burp Suite work, albeit for specific vulnerabilities. This testing formula distilled is:

  • Take a normal request
  • Mutate it in some way
  • Examine how it changes the response
  • Compare it to previous responses
  • Examine where else in the application these inputs appear
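
The first four steps can be sketched in a few lines of Python. Treat this as a minimal illustration rather than how any particular tool does it: the `Response` shape, the `send` callable and the `fake_app` target are all assumptions made for the example, and the last step (finding where inputs reappear elsewhere) is not covered.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Response:
    status: int
    headers: Tuple[str, ...]  # response header names only
    body: str
    elapsed: float  # seconds


def diff(base: Response, new: Response, slow_by: float = 2.0) -> List[str]:
    """List the reasons a response differs from the baseline response."""
    reasons = []
    if new.status != base.status:
        reasons.append(f"status {base.status}->{new.status}")
    if len(new.body) != len(base.body):
        reasons.append(f"size {len(base.body)}->{len(new.body)}")
    extra = sorted(set(new.headers) - set(base.headers))
    if extra:
        reasons.append("new headers: " + ",".join(extra))
    if new.elapsed - base.elapsed > slow_by:
        reasons.append("slow response")
    return reasons


def fuzz(
    send: Callable[[Dict[str, str]], Response],
    request: Dict[str, str],
    mutations: Iterable[Tuple[str, str]],
) -> List[Tuple[str, str, List[str]]]:
    """Send the normal request, then each mutation, and keep what differs."""
    baseline = send(request)
    findings = []
    for param, payload in mutations:
        mutated = {**request, param: payload}
        reasons = diff(baseline, send(mutated))
        if reasons:
            findings.append((param, payload, reasons))
    return findings


def fake_app(request: Dict[str, str]) -> Response:
    """Stand-in target that breaks on a single quote, for the demo only."""
    if "'" in request.get("id", ""):
        return Response(500, ("content-type",), "SQL syntax error", 0.01)
    return Response(200, ("content-type",), "ok", 0.01)
```

With this fake target, `fuzz(fake_app, {"id": "1"}, [("id", "2"), ("id", "1'")])` keeps only the quote payload, tagged with the status and size changes.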

This response could also be examined for signatures, particular keywords and error messages that indicate a known vulnerability type or fingerprint the underlying application libraries or functions in use. In my tools, I try and simply detect any difference in the response that might indicate there is something worth investigating.
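
Signature matching on a response body can be as simple as a table of regular expressions. The three patterns in this Python sketch are a tiny illustrative set I picked for the example, not a complete or authoritative list, and `match_signatures` is a hypothetical helper name.

```python
import re
from typing import List

# Small, illustrative signature table: label -> pattern.
# Real lists are much longer and tuned per technology stack.
SIGNATURES = {
    "sql error": re.compile(r"SQL syntax|ORA-\d{5}|SQLSTATE\[", re.IGNORECASE),
    "python traceback": re.compile(r"Traceback \(most recent call last\)"),
    "php warning": re.compile(r"<b>(?:Warning|Fatal error)</b>:"),
}


def match_signatures(body: str) -> List[str]:
    """Return the labels of every signature found in the response body."""
    return [label for label, pattern in SIGNATURES.items() if pattern.search(body)]
```

Signatures only catch what you already know to look for, which is why plain difference detection is worth running alongside them.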

The web application security community loves emphasizing how manual testing is superior to all automated testing, but what they really mean by this is that security testing requires human thought to interpret what is happening: understanding the context around the payloads used and the requests and responses being returned, all within the greater context of the application functionality and what the application is trying to achieve. Sending HTTP requests and retrieving responses is an automated process performed by a program, regardless of whether there is someone reading and interpreting the results or not.

I spent a lot of time working on how to best mutate requests in a context-aware way and then storing and highlighting the ones that caused interesting or unexpected responses. The idea is to fuzz like a demon from hell and then filter out the chaff, keeping the interesting stuff or anomalies for a human eye to further examine. It’s partially automated manual testing, not a scan for known specific things, which is how a large majority of tools work. I essentially built a more automated ffuf that stores full requests and response bodies so I can also use it for other things like detecting changes in an application over time, detecting new headers/cookies and filtering or fingerprinting things more efficiently.
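
One simple way to filter out the chaff is to bucket stored responses by a cheap fingerprint and keep only the rare buckets. The Python sketch below shows that idea plus a snapshot comparison for spotting changes over time. It is my own simplified take, not the tool described above: `Result`, `rare_results`, `snapshot` and `changed_urls` are hypothetical names, and the 5% threshold is an arbitrary assumption.

```python
import hashlib
from collections import Counter
from typing import Dict, List, NamedTuple, Tuple


class Result(NamedTuple):
    payload: str
    status: int
    body: str


def bucket(result: Result) -> Tuple[int, int]:
    """Group responses by status code and body length."""
    return result.status, len(result.body)


def rare_results(results: List[Result], max_share: float = 0.05) -> List[Result]:
    """Keep results whose bucket holds at most max_share of all responses."""
    counts = Counter(bucket(r) for r in results)
    limit = max_share * len(results)
    return [r for r in results if counts[bucket(r)] <= limit]


def snapshot(pages: Dict[str, str]) -> Dict[str, str]:
    """Map each URL to a SHA-256 digest of its stored body."""
    return {url: hashlib.sha256(body.encode()).hexdigest() for url, body in pages.items()}


def changed_urls(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """URLs that are new or whose digest differs between two snapshots."""
    return sorted(url for url in new if old.get(url) != new[url])
```

Out of 99 identical 404s and a single 200, `rare_results` returns just the 200, which is exactly the kind of thing you want a human eye on.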

Some of the most interesting vulnerabilities can be found by sheer brute force fuzzing. I try to replicate this by scanning apps with known vulnerabilities and seeing where my own tools can detect them, or what would allow the vulnerability to be identified. In blackbox testing it is widely true that the people who are more persistent and try more things, more quickly, are the ones who find more vulnerabilities faster. Even when doing the same thing that has been done before, there is a new emphasis on optimization for speed. This frees up time to be spent manually examining the things that matter.

Until Skynet is sentient, no automated tool can have human context awareness. Think of testing tools as an extension of your capabilities: they do data collection and repetitive tasks quicker and better than you can, and should be used for that. You still need to interpret the output in an intelligent way. Burp Suite Repeater is going to be a useful tool as long as websites communicate over HTTP, and it is reliable for manually digging in when you have something interesting to poke at.

The field of information security is primarily concerned with securing information. When you are testing applications, it’s worth heavily considering what information the company may want to keep secret and secure, both in the context of the greater business and in the context of the data an application relies on to function. Often developers build functionality that isn’t aware of this or of the data security context.

Developers tend to treat data or information as something that needs to be processed, as opposed to something that needs to be correctly hidden from prying eyes. So pry like fuck. Where and how is the data stored? Where and how is this sensitive or useful data accessed?

Think About Things!

I want to leverage what was obviously the catchiest and funkiest song of 2020, released in simpler times before the global pandemic, when things like massive climate change and impending world war were all we had to worry about. I want you to think about things.

I want you to think of an application like a baby you are trying to communicate with. You love your baby and you can’t wait to find out what the baby thinks about things (i.e. your requests). You need to find the right thing or quick hack to make it speak your language, feed it the right inputs and simply listen and wait lovingly for any interesting output. Like a real baby, applications generally first respond with cries, shits and tears.

You should not run out of new things to try, and it shouldn’t be weird to hit a single endpoint with a million requests when you are fuzzing. Have you run every payload list and permutation of all chars or payloads you can find? Against every location in a request? Every parameter, every directory, every header, every cookie, every index of every part of the data you are sending? Can you find new cookies, headers or parameters to target? DO YOU EVEN WANT YOUR BABY TO COMMUNICATE WITH YOU?

Try harder etc. You can’t find what you don’t look for. What type of input is expected? Do different types produce useful errors? How does it process the following data types?

  • An integer
  • A String
  • XML / JSON / Serialized Object
  • Binary / Hex / Url Encoding / Ascii / Non-Ascii / Unicode / Encrypted data
  • Really large or really small instances of the above data
  • Any mass assignment or type confusion possible (String -> array?)
  • Null bytes, special characters or symbols, spaces, no data, carriage returns, newlines?
  • Can you overload the parameter by including it multiple times?
  • Are there other similarly named parameters? Can you brute force them?
  • Can you inject into the parameter name itself?
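
A few of the checks in the list above can be generated mechanically. This Python sketch builds a handful of query-string variants for one parameter using the standard library’s `urllib.parse.urlencode`. The `type_variants` function and its labels are hypothetical, and the selection is nowhere near exhaustive.

```python
from typing import Dict
from urllib.parse import urlencode


def type_variants(name: str, value: str) -> Dict[str, str]:
    """Return label -> encoded query string for some type and shape variations."""
    variants = {
        "original": [(name, value)],
        "negative": [(name, "-1")],
        "large": [(name, "9" * 64)],
        "empty": [(name, "")],
        "null byte": [(name, "\x00")],
        "newline": [(name, "\r\n")],
        "json object": [(name, '{"a":1}')],
        # String -> array type confusion, as many frameworks parse name[] as a list.
        "array": [(name + "[]", value)],
        # Parameter overloading: the same name sent twice with different values.
        "duplicate": [(name, value), (name, "2")],
        # Injection into the parameter name itself.
        "name injection": [(name + "'", value)],
    }
    return {label: urlencode(pairs) for label, pairs in variants.items()}
```

Each resulting query string is one mutation to send and compare against the normal response; the interesting ones are those where the status, size or error text shifts.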

Additionally, how might the input be used and abused? Think about how the backend of the application may process the data you are sending.

  • Is it later used as part of a backend request? (SSRF)
  • Can you load local resources or traverse file directories? (LFI / RFI / Directory Traversal)
  • Is it used in a template? (SSTI)
  • Is it vulnerable to XSS or SQL or Command Injection or other injection types? (Code / LDAP / XPath)
  • If XML or JSON is it vulnerable to object injection? (XXE / Deserialization)
  • What happens if you increment or decrement the integer referenced?
  • What happens if you use a different user’s ID or signatures or tokens?
  • Can you make requests or view responses while unauthenticated?
  • Can you view the response for the same request while logged in as a different user or under another session?
  • Is DNS being resolved, what about other protocols in the parameter?
  • Is data mutated in the response?
  • Is the info stored? Where is it reproduced in the application and how?
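
The increment/decrement and cross-user questions lend themselves to automation. Here’s a hedged Python sketch: `fetch` is an assumed callable taking a session identifier (or `None` for unauthenticated) and a resource ID, and `fake_fetch` is a made-up API that forgets its ownership check on one record.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical transport: (session or None, resource id) -> (status, body).
Fetch = Callable[[Optional[str], str], Tuple[int, str]]


def neighbour_ids(value: str, spread: int = 2) -> List[str]:
    """IDs just below and above a known integer ID."""
    n = int(value)
    return [str(n + d) for d in range(-spread, spread + 1) if d != 0]


def cross_session_leaks(
    fetch: Fetch, ids: Iterable[str], owner: str, other: Optional[str]
) -> List[str]:
    """IDs where another session gets the same 200 response as the owner."""
    leaks = []
    for rid in ids:
        status, body = fetch(owner, rid)
        if status != 200:
            continue  # nothing readable here even for the owner
        if fetch(other, rid) == (status, body):
            leaks.append(rid)
    return leaks


def fake_fetch(session: Optional[str], rid: str) -> Tuple[int, str]:
    """Stand-in API: record 10 checks ownership, record 11 does not."""
    docs = {"10": ("alice", "alice invoice"), "11": ("alice", "alice payslip")}
    if rid not in docs:
        return 404, "not found"
    owner, body = docs[rid]
    if rid == "10" and session != owner:
        return 403, "forbidden"
    return 200, body
```

Passing `None` as the other session covers the unauthenticated case. Pages that are meant to be public will show up here too, so the output still needs a human to decide what counts as a leak.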

In April 2015, in a post on the Bugcrowd forum, I gave some ideas of how a beginner could approach fuzzing a parameter. I’ve seen this exact info copied and pasted to multiple blog posts elsewhere, so I’ve repeated and built on it here in my own blog! When targeting an application, the above can also apply to any part of a request. What happens if you test for the same things in headers, cookies, the request body or the URL path itself? There are too many things to try manually, so start automating now!
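
Hitting every location in a request is mostly bookkeeping, so it’s an easy thing to automate first. This Python sketch yields one mutated copy of a request per injection point. The dictionary layout of the request (`path`, `query`, `headers`, `cookies`, `body`) and the name `mutate_everywhere` are assumptions made for the example, not any tool’s real format.

```python
import copy
from typing import Any, Dict, Iterator, Tuple


def mutate_everywhere(request: Dict[str, Any], payload: str) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (location label, mutated copy) for each place the payload can go."""
    # Every path segment in turn.
    segments = request["path"].strip("/").split("/")
    for i in range(len(segments)):
        mutated = copy.deepcopy(request)
        replaced = segments[:i] + [payload] + segments[i + 1:]
        mutated["path"] = "/" + "/".join(replaced)
        yield f"path[{i}]", mutated
    # Every key in every key/value section of the request.
    for section in ("query", "headers", "cookies", "body"):
        for key in request.get(section, {}):
            mutated = copy.deepcopy(request)
            mutated[section][key] = payload
            yield f"{section}.{key}", mutated
```

The original request is never modified, so the same baseline can be reused for every payload in a list. Multiply the number of locations by the number of payloads and a million requests against one endpoint stops looking unusual.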

Wrapping up

My name is Ciaran McNally and I’ve been security testing applications for 8+ years as a contractor. A close friend and I are planning on launching a pretty unique security scanning service startup this year called Scanomaly. I do penetration testing and application security reviews. I have a high success rate of uncovering critical and high rated security issues and usually suggest a bare minimum of 5 days for a pentest. Feel free to reach out for any of your security consulting needs via