Splunkers on Twitter

Below is a list of Splunk users I am following on Twitter, including Splunkers, partners and awesome users. Most of them are also into #Infosec. The list is not sorted in any particular order.

Missing someone, maybe you?! Please feel free to contact me and I will add them. If you prefer to follow a list, it is also available on Twitter here.

Ryan Kovar @meansec
Staff Security Strategist @Splunk. Enjoys clicking too fast, long walks in the woods, and data visualizations.

Holger Sesterhenn @sesterhenn_splk
Sales Engineer, CISSP, Security Know-How, Machinedata, Security Intelligence, IoT, Industrie 4.0, BigData, Hadoop, NoSQL, User Behavior Analytics

The Dark Overlord @StephenGailey
Towering intellect; effortlessly charming…

Cédric @_CLX
Let me grep you. #infosec and useless stuff. Using security buzzwords since 2005. https://github.com/c-x

Brad Shoop @bradshoop
Security Onion for Splunk app developer, infosec, devops, infrastructure, cloud and homebrewer.

monzy merza @monzymerza
Chief Security Evangelist @Splunk. Thoughts are my own.

Damien Dallimore @damiendallimore
Splunk Dev Evangelist, Golfer, Rugby Player, Musician, Scuba Diver, Thai linguist, Chef.

Adam Sealey @AdamSealey
Information security, both applied and research. CSIRT, DFIR, and analytics Generalist geek. Husband & father of 3. Tweets are my own.

Hacker Hurricane @HackerHurricane
Austin TX. area Information Security Professional

Mika Borner @my2ndhead
Splunk Artisan. Because Splunking is an art.

David Shpritz @automine
I Splunk all the things. Blieve, hon. Splunk, Web App Sec, Open source, EDC

Dimitri McKay @dimitrimckay
Glazed donut connoisseur, plus size hand model, technologist, splunker, replicant, security nerd, CISSP, MMA fighter, zombie killer & lover of pitbulls.

Luke Murphey @LukeMurphey
Developer of network security solutions at #splunk. Founding member of Threatfactor (http://ThreatFactor.com ) and Converged Security (acquired by GlassHouse).

Sebastien Tricaud @tricaud
Principal Security Strategist @Splunk. Playing with data, binary-ascii-utf16-whatever. Opinions are my own, not my employers. Re-tweeting != Agreeing

Dave Herrald @daveherrald
dad | husband | splunk security architect | GIAC GSE | tweets=mine

Ryan Chapman @rj_chap
Security enthusiast. Incident response analyst. Malware hobbyist. Retro game lover. Husband and father. TnVsbGl1cyBpbiB2ZXJiYS4= http://github.com/BechtelCIRT

Michael Porath @poezn
Product Manager for Data Visualization @splunk. Bay Area based Swiss Information Scientist

skywalka @skywalka
my daughter, basketball, hip hop, film, comics, linux, puppet, splunk, nagios, and sensu keep me awake

James Bower (Hando) @jamesbower
Pentester / Threat intelligence / #OSINT / #Honeypots / #Bro_IDS / #Splunk | #Python / Follower of Christ and occasional blogger – http://jamesbower.com

georgestarcher @georgestarcher
Information Security, Log analysis and Splunk, Forensics, Podcasting. Photography and OSX Fan. GnuPGP key ID: 875A3320BD558C9E

Brian Warehime @brian_warehime
Security Analyst | Threat Researcher | #Honeypots | #Splunk | #Python | #OSINT | #DFIR

Michel Oosterhof @micheloosterhof
Splunk // My opinions are my own.

Hal Rottenberg @halr9000
I am the Lorax. I speak for the Developers! @Splunk, Author, Podcaster @powerscripting , Speaker, #PowerShell MVP, #CiscoChampion, husband, father of four!

Jason McCord @digirati82
Security analyst, software developer, #Splunk fan. Log everything. #WLS #DFIR


Challenge your MSSP/SOC/CSIRT: what metrics can they provide you?

I was trying to recall a famous quote related to “Metrics” to include here, and below is what Mr. Google hinted at:

The quote has a few variations, but that seems to be the most famous one. Perhaps now it will finally stick. So, does it make sense or is it just another unquestioned corporate adage?

Basically, the idea here is to give you more food for thought in case you are into this metrics thing and trying to apply it to Security Operations.

Actually, let me start by saying I like measuring data, so metrics are an interesting topic to me. Simply put, translating your effort and progress to management is much easier if you can come up with a metric from which they can understand what you are doing and why.

As usual, bonus points if a metric ties to a business goal (more on that below). Working on a good, easily digestible metric also saves management time, which is scarce, rarely dedicated only to you and not quickly allocated. Selecting key metrics and meaningful charts is therefore an opportunity security practitioners cannot miss if they want to keep their budgets flowing in.

Many questions, few metrics

How do you evaluate the work done by your SOC or SecOps team? How do you verify that your MSSP is providing a good service?

Within Security Operations (I dare to use this term to refer to the tasks carried out by MSSPs, SOCs or CSIRTs), you should generate metrics that help answer the following questions:

  1. How many investigations ended up being a false positive (FP) or a real threat (TP)?
  2. From the answers above, which scenarios are seen or involved most often? Is there a technology, NIDS signature, correlation rule or process clearly performing better (or worse) than the others?
  3. Which analysts are involved in the process of developing or tuning signatures/rules that lead to real investigations?
  4. In a multi-tier environment, which analysts were responsible for the triage of most FP cases?
  5. MSSP only – Are customers responding to or interacting with the cases raised to their security teams?
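
Most of these questions boil down to simple aggregations over your case data. As a rough sketch, and purely as an illustration assuming a hypothetical index=soc_cases with made-up fields such as outcome (FP/TP), rule_name and analyst, searches along these lines would get you started:

index=soc_cases | stats count by outcome

index=soc_cases outcome="TP" | stats count by rule_name, analyst | sort - count

The first gives the FP/TP split (question #1); the second ranks rules and the analysts behind them by the real investigations they produced (questions #2 and #3).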

Linking Metrics to benefits

Now, read question #1 and ask yourself: do you really believe a properly deployed security infrastructure will never, ever detect a real threat? So why are you still paying an MSSP that provides you with nothing but FPs? Checkbox Security?

Ever wondered why your Snort/Bro guy, with a single sensor, is able to provide 10 times more consumable alerts than your five super-duper Checkpoint NG IPS blades? Track the answers to questions #2 and #3 to find out.

From #4 you will have a better idea about where to invest your budget for training and which analysts might need some mentoring.

A high number of evaluated incidents doesn’t mean people are busy doing analysis, nor does it mean good work. The higher the FP rate of SOC escalations, the less interest your customer will have, which means less engagement in following up on investigations. Refer to #5.

And what about the relationship with business goals? That’s easier to exemplify for MSSPs: sound metrics performing as expected are the best ammunition you can bring to the table for contract renewals or upselling.

Here are some (measurable) metrics examples:

  • Alerts to Escalations ratio
  • Escalations to real investigations ratio
  • Alerts handled per shift/analyst
  • Time to triage (evaluate a new alert)
  • Time to close an investigation (by outcome)
  • Number of FPs/TPs per rule, signature, use case

If you embrace gamification, there are many more that might be interesting, assuming you accept the risks involved, for example: escalations to real investigations (TPs) ratio per analyst or shift.
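
Sticking with the same hypothetical soc_cases data, and assuming created_time and closed_time are epoch fields, a couple of the examples above could be sketched like this (an equally hypothetical soc_alerts index is used for the ratio):

index=soc_cases | eval time_to_close=closed_time-created_time | stats avg(time_to_close) as avg_seconds by outcome

index=soc_alerts | stats count as alerts, count(eval(escalated="true")) as escalations | eval escalation_ratio=round(escalations/alerts,3)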

No Case Management = No Game

An investigation must have a start and an end; otherwise it’s impossible to measure its output. Even if you want to monitor an attacker’s behavior for a while, that decision (observe, follow up) was most likely the result of an investigation.

Now, scroll up to the list and ask yourself how many of those questions are easily answered by data mining the ticket or case management database. Doing analytics on your case management DB might be challenging, but it is definitely worth it.

“I don’t have a case management system!” Then go get one before you start the metrics conversation. If you don’t have an incident workflow in place, those systems might even drive you towards designing one.

Want to discuss this stuff further? Feel free to comment here or message me on Twitter.

My TOP 5 Security (and techie) talks from Splunk .conf 2015

If you are into Security and didn’t have an opportunity to attend the Splunk conference in Las Vegas this year (maybe you were busy playing Blackjack instead?), here’s what you cannot miss.

The list is not sorted in any particular order and, whenever possible, entries include presenters’ Twitter handles as well as takeaways or comments that might help you choose where to start.

  1. Security Operations Use Cases at Bechtel (recording / slides)
    That’s the coolest customer talk of the ones I could watch. The presenters (@ltawfall / @rj_chap) discussed some interesting use cases and provided a lot of input for those willing to make Splunk their nerve center for security.
  2. Finding Advanced Attacks and Malware with Only 6 Windows EventIDs (recording / slides)
    This presentation is a must for those willing to monitor Windows events either via native or 3rd party endpoint solutions. @HackerHurricane really knows his stuff, which is not a surprise for someone calling himself a Malware Archaeologist.
  3. Hunting the Known Unknowns (with DNS) (recording / slides)
    If you are looking for concrete security use case ideas to build on top of DNS data, this one is gold. Don’t forget to provide feedback to Ryan Kovar and Steve Brant; I’m sure they will like it.
  4. Building a Cyber Security Program with Splunk App for Enterprise Security (recording / slides)
    The Enterprise Security (ES) app relies heavily on accelerated data models, so besides interesting tips on how to leverage ES, Jeff Campbell provides ways to optimize your setup, showing what goes on under the hood.
  5. Build A Sample App to Streamline Security Operations – And Put It to Use Immediately (recording)
    This talk was delivered by Splunkers @dimitrimckay and @daveherrald. They presented an example of how to build custom content on top of ES to enhance the context around an asset, which is packaged as an app available on GitHub.

Now, even if you are not into Security, in case you enjoy watching hardcore, techie talks, here’s my TOP 5 list:

  1. Optimizing Splunk Knowledge Objects – A Tale of Unintended Consequences (recording / slides)
    Martin gives an a-w-e-s-o-m-e presentation on Knowledge Objects, unraveling what happens under the hood when using tags and eventtypes. Want to provide him feedback? Martin is often found on IRC; join #splunk and say ‘Hi’!
  2. Machine Learning and Analytics in Splunk (recording / slides)
    If you are into ML and the likes of R programming, the app presented here will definitely catch your attention. Just have a quick look at the slides to see what I mean. A lot of use cases for Security here as well.
  3. Beyond the Lookup Glass: Stepping Beyond Basic Lookups (recording)
    Wanna know about the challenges with CSV Lookups and KV store in big deployments? Stop here. Kudos to Duane Waddle and @georgestarcher!
  4. Splunk Search Pro Tips (recording / slides)
    Just do the following: open the video recording and skip to around 30′ (magic!). Now, try not to watch the entire presentation, and thank Dan Aiello.
  5. Building Your App on an Accelerated Data Model (recording / slides)
    In this presentation, the creator of uberAgent, @HelgeKlein, describes how to make the most of data models in great detail.

Still eager for more security-related Splunk .conf stuff? Simply pick one below (recordings only).

For all presentations (recordings and slides), please visit the conference website.

Splunk > Self-Learning Path & The Community Factor

Splunk is gaining tremendous traction in the market due to its ability to harness the value of machine data. The idea here is to highlight a few reasons for such success: free access and community-driven approaches.

Being familiar with the ways in which knowledge can be freely attained is a great advantage. Couple that with your curiosity and pretty much nothing more is needed to become an independent learner these days.

Below you will find the main references I’ve been using to learn Splunk and get up to speed with this great technology.

Splunk Platform: Free, Easy Access

Splunk provides free access to its flagship product, Splunk Enterprise. Users evaluating the product can also get a free, perpetual license. That means no initial costs for installing and evaluating most of its primary capabilities.

For developers, there is also a developer license, which allows indexing up to 10 GB of data a day.

TLDR? Just hit Play!

Besides the excellent Just Ask campaign, the following short videos help show Splunk’s benefits:

Are you looking for more technical stuff, easy to follow and digest? Below is a YouTube playlist with demo-like lessons available from Splunk’s channel:

Besides, if you are an Infosec pro, don’t forget to check the current Security-related apps at the portal. Aside from that, below you will find a few videos that might trigger inspiration for further research and ideas:

Q&A Forum, IRC and Wiki

The Splunk Answers forum is really an important knowledge base, and here’s why:

  • The discussions are built around questions and answers, so entries tend to be clear and narrowed to a specific topic, oftentimes matching an issue you are currently facing;
  • Not only Splunk team members provide answers; it’s common to get responses from partners and, of course, the whole Splunk community, including end users;
  • Scripts/code as well as images are allowed for easier understanding of a question or an answer. Top contributors are also awarded points and badges to promote user interaction;
  • There is a rating system for answers, so users can also rely on that when choosing where to start.

I was also surprised when I joined the IRC channel, as several Splunk staff members (PS, Devel, Support) take part in the discussions there. Sometimes an answer not found in the documentation, or a bug report, might well be the subject of a quick chat.

Besides that, there is, of course, a Splunk Wiki! As with the other examples listed here, it’s also community-driven, so anyone is able to add and edit content.

Documentation Portal

Splunk provides a well organized documentation portal, which serves as a quick reference guide (e.g., search commands) and also enables you to learn about more advanced topics such as Distributed Deployment, or the Common Information Model Add-on Manual.

Also, there are some dedicated tutorials available, such as the Search Tutorial. I am listing below some doc bookmarks that I constantly query:

It’s worth noting that most areas of the documentation portal provide a Comments section, where the answer to your issue might be found, so always keep an eye on that.

UPDATE 9-Mar-15: Also, don’t forget to bookmark Splexicon, a documentation reference that defines technical terms that are specific to Splunk. Definitions include links to related information from the Splunk documentation.

Cheatsheets

For those Splunk ninja pros out there who love having neat docs around, there are some cool ones available for Splunk as well. Some of them are listed below:

The Community Factor: BIG Win!

Community engagement is a huge win with respect to knowledge sharing and as a business strength. Simply setting up a web forum doesn’t enable community integration. In my opinion, here are some of the great initiatives Splunk has been carrying out to accomplish that:

Missing something? Just let me know so I can add them here as well.

My 1st Splunk app: RAW Charts

After some days playing around with a few interesting apps, I’ve decided to give it a try and learn how to integrate the RAW data visualization project into Splunk.

It turns out that, by reading the right (latest) App Development documentation (thanks, IRC!) and checking good examples, it’s quite an easy job, especially if you are already familiar with web development technologies (HTML, JS/jQuery and the like).

Here’s a bit of motivation to do it:

  • Connecting with the Splunk community;
  • Getting up to speed with the Splunk Web Framework for quickly developing custom content (views, dashboards, apps, etc);
  • Easily visualizing search results in different formats by leveraging the search bar functionality, rather than editing hard-coded dashboard searches;
  • Helping to spread the word about the power of data visualization by demonstrating the incredible D3 library and the RAW project;
  • Having fun! (a must for any learning experience nowadays, right?)

RAW project?

I will not dare to describe it better than the creators of this great project:

“The missing link between spreadsheets and vector graphics.”

A more detailed description is also found from the project’s README file:

RAW is an open web tool developed at the DensityDesign Research Lab (Politecnico di Milano) to create custom vector-based visualizations on top of the amazing d3.js library by Mike Bostock. Primarily conceived as a tool for designers and vis geeks, RAW aims at providing a missing link between spreadsheet applications (e.g. Microsoft Excel, Apple Numbers, Google Docs, OpenRefine, …) and vector graphics editors (e.g. Adobe Illustrator, Inkscape, …).

What you can do instead is simply browse the project interface here: app.raw.densitydesign.org. Paste your data or just pick one of the data samples to realize how easy it is to create a chart without a single line of code.

And since we are talking about one line of code, let’s get straight to the point. Here’s a quick and dirty hack for automatically copying the search results into RAW’s workflow:

$scope.text = localStorage.getItem('searchresults')

In fact, I’m not sure if that’s the optimal way to accomplish it, but that’s the only change needed within RAW’s code (controllers.js). The wonderful Italian mafia team at Density Design might be reading this now, so guys please advise! (I know you are very busy).

Nevertheless, after a quick read through AngularJS, that change looks like a quick win. What it does is tell the browser to load the data from local storage into RAW’s textarea. Local storage? Remember Cookies and the HotDog editor? That’s history! Actually, not.

The Splunk Code

By using the Web Framework Toolkit, creating an app is really easy. Just use the splunkdj createapp <app-name> command and start customizing the default view that is built in, home.html. Here’s the main code piece used for this app (JavaScript block):

{% block js %}
<script>

function createIframe(){
    // reset div contents
    document.getElementById("raw-charts").innerHTML = "";

    // create an iframe
    var rawframe = document.createElement("iframe");
    rawframe.id = "rawframe";
    rawframe.src = "{{STATIC_URL}}{{app_name}}/raw/index.html";
    rawframe.scrolling = "no";
    rawframe.style.border = "none";
    rawframe.width = "100%";
    rawframe.height = "3700px";

    // insert iframe
    document.getElementById("raw-charts").appendChild(rawframe);

};

var deps = [
	"splunkjs/ready!",
	"splunkjs/mvc/searchmanager"
];

require(deps, function(mvc) {

	// this guy handles the search/results
	var SearchManager = require("splunkjs/mvc/searchmanager");

	// initial search definition
	var mainSearch = new SearchManager({
		id: "search1",
		//search: "startminutesago=1 index=_internal | stats c by group | head 2",
		search: "",
		max_count: 999999,
		preview: false,
		cache: false
	});

	// count: 0 needed for avoiding the 100 limit (Thanks IRC #splunk!)
	var myResults = mainSearch.data("results", {count: 0});

	// tested with "on search:done" but unexpected results happened
	myResults.on("data", function() {  

		// field names separated by comma
		var searchresults = myResults.data().fields.join();

		// debug code
		//console.log(myResults.collection());

		// loop through the result set
		for (var i=0; i < myResults.data().rows.length; i++) {
			searchresults = searchresults + '\n' + myResults.data().rows[i];
		}

		// better than cookie!
		localStorage.setItem('searchresults',searchresults);

		// search loaded, triggering iframe creation
		createIframe();

	});

	// keep search bar and manager in sync
	var searchbar1 = mvc.Components.getInstance('searchbar1');
	var search1 = mvc.Components.getInstance('search1');

	searchbar1.on('change', function(){
		search1.settings.unset('search');
		search1.settings.set('search', searchbar1.val());
	});
});

</script>

{% endblock js %}

The initial page for the app loads an empty search bar with a table view component right below it. After running a search, the table displays the search results and also triggers the RAW workflow, by loading the textarea with the table’s content.

Meet the workflow

In a nutshell, the visualization workflow works like Splunk’s default one. The user runs a search command, formats the results and finally clicks on the “Visualization” tab. Likewise, using this app the user is also able to customize chart options and export the results in different formats.

First Example

Here’s the first example in action, reachable via the Chart Examples menu. The data comes from the Transport for London data portal; this specific data set (CSV) is a sample from the Rolling Origin & Destination Survey (RODS), available under the “Network Statistics” section of the portal.

Before handling the CSV file, the following command is needed to clean up the file header, basically replacing slashes and spaces with a “_” character:

sed -i '1,1s/[[:blank:]]*\/[[:blank:]]*\|\([[:alnum:]]\)[[:blank:]]\+\([[:alnum:]]\)/\1_\2/g;' rods-access-mode-2010-sample.csv

After clicking the example link, the search bar gets preloaded with a specific search command, which triggers the table reload:
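
The actual search is preloaded by the app, but to give an idea of what such a search could look like, assuming the cleaned CSV was uploaded as a lookup (field names below are purely illustrative):

| inputlookup rods-access-mode-2010-sample.csv | table access_mode, origin_zone, destination_zone, journeys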

The results are synced to RAW’s input component, which is fully editable, just in case:

The user is then able to choose a chart type (multiple are available). Here, the Alluvial/Sankey diagram is chosen:

There’s also an option to add your own chart in case you want to integrate your own D3 code with the project.

The next step is to select which fields (columns) will be part of the diagram/chart, and also how they will relate to the chart’s components (dimensions, steps, hierarchy, etc.). To do so, a nice drag-and-drop interface eases the job.

Just follow the step-by-step instructions included within the example. The final mapping setup should look like the following:

Finally, here’s the chart generated in the end:

As you can see from this simple example, the chart better conveys the idea of flow & proportionality among the dimensions as compared to other usual charting options out there.

Optionally, the user is able to customize colors, sorting and other stuff, which may differ depending on the chart chosen. Exporting options are also available (SVG/HTML, PNG, etc).

Second Example

The second example leverages data from the World Bank data portal related to Internet subscribers. For this case, I’ve decided to apply a few constraints so that the results become a bit simpler to render:

  • Only a few countries are filtered in;
  • Time period considered is 2000-2009.

By following roughly the same steps described in the previous example, the search bar gets preloaded with a search command and the user is instructed to follow a few steps to generate the graph, in this case a Bump Chart, similar to the one featured in the NYT.
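
Again, the exact search ships with the app, but the constraints above could be expressed with something as simple as the following sketch (lookup and field names are illustrative):

| inputlookup internet_subscribers.csv | search country="Brazil" OR country="Germany" OR country="Japan" | where year>=2000 AND year<=2009 | table country, year, subscribers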

I hope the screenshots speak for themselves (click for full size). Detailed instructions are available from the app’s documentation and examples.

Here’s a list of currently supported charts/diagrams: Sankey / Alluvial, Bump Chart, Circle Packing, Circular / Cluster Dendrogram, Clustered Force Layout, Convex Hull, Delaunay Triangulation, Hexagonal Binning, Parallel Coordinates, Reingold-Tilford Tree, Streamgraph, Treemap, Voronoi Tessellation.

Comments and suggestions are more than welcome! The app is available at Splunk’s app portal, and I will later upload the code to a common place (GitHub?) to make it easier for everyone to access and modify it.

Security Analytics: having fun with Splunk and a packet capture file

It’s been quite a long time since my last post here. I’m now taking the opportunity to share an article I wrote about Splunk, which might be of some help to the community.

Since I’ve been using the technology for a while, I’ve decided to leverage that knowledge to renew a GIAC certification I got in the past (GCIA). Basically, the paper covers installing Splunk Enterprise (the freely available version) on a Linux machine, processing network data based on tshark’s output, and finally extracting some interesting stats and charts out of it.
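
To give a flavor of the kind of searches the paper builds on, assume the tshark output was indexed into a hypothetical index=pcap with fields such as ip_src, ip_dst and tcp_dstport extracted; typical stats would then look like this:

index=pcap | top limit=10 ip_src

index=pcap tcp_dstport=* | stats count by ip_dst, tcp_dstport | sort - count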

It was also a fun way to introduce Splunk’s data mining features, which might hopefully enable users to develop new ideas based on the approach presented in there. As expected, there should be many other ways to accomplish the same results while processing IP packet headers, whether it’s using Splunk or not, so I would really appreciate receiving feedback about other approaches used out there.

The link to the paper is provided below:

Security Analytics: having fun with Splunk and a packet capture file
www.giac.org/paper/gcia/5374/security-analytics-fun-splunk-packet-capture-file-pcap/121502

UPDATE: In case you are looking for Splunk transaction examples, I also wrote a post about that here. And of course, the community forum is full of information around this topic as well.