SIEM use cases development workflow – Agile all the things!

If you are into Splunk rules development, I am pretty sure this post will resonate with you. But before entering the main topic, let me quickly define what a SIEM use case is about, since it is another trendy, hot topic in the infosec industry today.

What is a SIEM use case after all?

To answer this question, I will simply borrow the takeaway from one slide of a presentation I use in my workshops on Splunk rules development.


At the end of the day, you may have the best, 100% available architecture, ingesting all sorts of logs; but if the platform does not provide value, you fail. That's all.

And don't fool yourself: Compliance/Regulation or Forensics use cases are out of scope here. If your use case is storing logs for later use, it's better to revisit the plan.

You likely don't need Splunk to simply store data and occasionally search over it. For that, you can rely on log management solutions, which are not only cheaper but also easier to use, and which avoid setting the wrong expectation of a threat detection capability.

OK, now into Agile!

Despite some reluctance in the beginning, the path to adopting the Agile methodology for developing security use cases with Splunk came naturally.

It comes as no surprise to those who already treat Splunk queries as code, and it is particularly applicable to customers who want to embrace custom development.

In this article, I am going to highlight some of the benefits around that and also propose a workflow for the ones willing to give it a try.

I want it all! #NOT

Just in case you’ve landed here from another planet, here’s a quick summary on “Agile software development” straight from Wikipedia:

It’s a set of principles for software development under which requirements and solutions evolve through the collaborative effort […] It advocates adaptive planning, evolutionary development, early delivery, and continuous improvement.

I’m not here to prove Agile is -the- solution for you nor am I saying you should become a Scrum master or anything like that. But I encourage you to get familiar with the concepts, to go deeper on a particular topic that interests you, try and experiment.

There’s absolutely no need to blindly follow or enforce anything, but to leverage what best suits your development practice.

Nevertheless, some still see that as another heavy process to bring in or something that will put more overhead on developers – which is not true. Agile processes are designed to work and evolve over time, getting tighter and faster.

Here are some of the benefits I’ve noticed over time after employing the Agile approach in my line of work:

  • Transparency, Transparency, Transparency. Some ideas are super cool and sometimes they seem pretty easy as well. It turns out you still need time and resources to make them happen. Following a methodology allows those otherwise blurry requirements to emerge. That's essential for better planning.
  • The transparency gives you and your team the ability to better handle expectations from stakeholders and management. For instance, to deliver a certain use case, you first need the right data on-boarded. To deliver more rules per development cycle, you need to enable more coders.
  • Easier prioritization. When there's something actively blocking progress, it's easy to link blockers and goals together and quickly evaluate the 'cost' of an unresolved issue, making it clear how much value is lost by not tackling certain issues first.
  • Visibility and versatility. The concept of 'Sprints' provides the highest impact given your (engineering) capabilities. It's easier to increase the pace and throughput, as well as to adapt, once work is done in small, incremental targets.
  • Better collaboration with easier project tracking. That's especially needed when working with 'virtual' teams located in different timezones. Instead of a big team working on everything together, one or two members can work on small, well-defined tasks.

The list goes on and on but those should give you a hint on what’s possible to achieve.

The workflow

I would call this a draft, as you will need to adjust it to your own practice or organization, given that many of those boxes can be broken down into multiple sub-processes.

It's more applicable to rules development (correlation searches) but may be easily adapted for managing more elaborate, long-term use cases.

It’s made with Draw.io which works pretty well (contact me for the XML/VSD version).

In case you are interested in suggestions for ranking or scoring your use case ideas, please refer to the following blog post I wrote on Medium:

Security Analytics: How to rank use cases based on the “Quick Wins” approach?

Feel free to reach out in case you have comments/feedback.

Splunk/ES: dynamic drilldown searches

One of the advantages of Splunk is the possibility to customize pretty much anything in terms of UI/workflow. Below is an example of how to build dynamic drilldown searches based on the output of aggregated results (post-stats).

Even though Enterprise Security (ES) comes with built-in correlation searches (rules), some mature/eager users leverage Splunk’s development appeal and write their own rules based on their use cases and ideas, especially if they are already familiar with SPL.

Likewise, customizing “drilldown searches” is also possible, enabling users to define their own triage workflows, facilitating investigation of notable events (alerts).

Workflow 101: Search > Analytics > Drilldown

Perhaps the simplest way to define a workflow in ES is to generate alerts grouped by victim or host, and then be able to quickly evaluate all the details, down to the RAW events related to a particular target or scenario.

As expected, there are many ways to define a workflow; here's a short summary of the stages listed above:

Search: here you define your base search, applying as many filters as possible so that only relevant data is processed down the pipe. Depending on how dense/rare your search is, enrichment and joins can also be done here.

Analytics: at this stage you should get the most out of the stats command. By using it, you systematically aggregate and summarize the search results, which is desirable given that every row returned will turn into a new notable event.

Drilldown: upon generating a notable event, the user should be able to quickly get to the RAW events building up the alert, enabling rapid assessment without exposing too many details for analysis right from the alert itself.

You may also want to craft a landing page (dashboard) from your drilldown search string, enabling advanced workflows such as Search > Analytics > Custom Dashboard (Dataviz, Enrichment) > RAW Events > Escalation (Case Management).

Example: McAfee ePO critical/high events

Taking McAfee’s endpoint security solution as an example (fictitious data, use case), here’s how a simple workflow would be built based on a custom correlation search that looks for high-severity ePO events.

First, the base search:

index=main sourcetype=mcafee:epo (severity=critical OR severity=high)

Next, use the stats command to aggregate and summarize the data, grouping by host:

| stats values(event_description) AS desc, values(signature) AS signature, values(file_name) AS file_path, count AS result BY dest

The above command also performs some quick normalization (field renames) to allow proper visualization within ES's Incident Review dashboard, and provides some quick statistics to facilitate alert evaluation (event count, unique file names, etc.).

Finally, it's time to define the dynamic drilldown search string based on the output of those two commands (search + stats):

| eval dd="index=main sourcetype=mcafee:epo (severity=critical OR severity=high) dest=".dest

Basically, the eval command is creating a new field/column named “dd” to store the exact search query needed to search for ePO events for a given host (dest).

In the end, putting it all together:

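Assembled from the three snippets above, the complete correlation search looks like this (each row returned, i.e. each unique dest, will turn into one notable event):

index=main sourcetype=mcafee:epo (severity=critical OR severity=high)
| stats values(event_description) AS desc, values(signature) AS signature, values(file_name) AS file_path, count AS result BY dest
| eval dd="index=main sourcetype=mcafee:epo (severity=critical OR severity=high) dest=".dest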

Despite having more than 150 matching events (result) for each of those hosts, the maximum number of alerts that can possibly be generated per correlation search execution is limited to the number of unique hosts affected.

And here’s how that translates into a correlation search definition:

(Screenshots: the correlation search definition in the ES correlation search editor, including the drill-down name and drill-down search settings.)

Note that the “Drill-down search” value is based on a token expansion: search $dd$. This way, the value of “dd” is used to dynamically build the drilldown link.
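If you prefer managing this content as configuration rather than through the UI, the same definition can typically be captured in savedsearches.conf. The sketch below is only an illustration: the stanza name is hypothetical and the notable action parameter names may differ across ES versions, so check the reference for your version before relying on them.

# Hypothetical stanza name; the action.notable.* parameter names are assumptions and may vary by ES version
[Threat - McAfee ePO High Severity - Rule]
search = index=main sourcetype=mcafee:epo (severity=critical OR severity=high) \
| stats values(event_description) AS desc, values(signature) AS signature, values(file_name) AS file_path, count AS result BY dest \
| eval dd="index=main sourcetype=mcafee:epo (severity=critical OR severity=high) dest=".dest
action.notable = 1
action.notable.param.drilldown_name = Search for raw events
action.notable.param.drilldown_search = search $dd$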

Now, once the correlation search generates an alert, a link called "Search for raw events" should become available under "Contributing Events" after expanding the notable event details in the Incident Review dashboard.

By clicking the link, the user is directed to a new search containing all raw events for the specific host, within the same time window used by the correlation search:

(Screenshot: the resulting search showing all raw ePO events for the selected host, scoped to the correlation search's time window.)

Defining a "dd" field within your code not only enables custom dashboard development with easy access to the drilldown search (via index=notable), but also standardizes the drilldown search value in the correlation search definition.
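As a minimal sketch of that dashboard idea, a panel listing recent notables from this rule together with their ready-to-use drilldown strings could be as simple as the search below. The correlation search name is hypothetical, and depending on your ES version the filter field may be source or search_name, so treat both as assumptions to verify in your environment:

index=notable source="Threat - McAfee ePO High Severity - Rule"
| table _time, dest, desc, signature, result, dd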

As always, the same drilldown search may also be triggered via a Workflow Action. Feel free to get in touch in case you are interested in this approach as well.

Happy Splunking!