GreyNoise App for Splunk with Nick Roy

Summary

At GreyNoise, we collect, analyze and label data on IPs that saturate security tools with noise. This unique perspective helps analysts waste less time on irrelevant or harmless activity, and spend more time focused on targeted and emerging threats.

Use the GreyNoise Splunk app to reduce false positives and filter Internet-wide scanners from your logs. The GreyNoise Splunk app provides multiple dashboards to analyze and visualize the contextual and statistical data provided by GreyNoise. It also includes custom commands and alert actions that can be used along with Splunk searches to leverage GreyNoise APIs for custom use cases. It periodically scans the Splunk deployment through a saved search to identify the noise and RIOT IPs across the complete Splunk deployment. Along with this, the workflow action provided can be used to obtain live context information for any CIM-compliant field containing an IP address.

Read the transcript

Hi everyone, my name is Nick Roy. Today we're going to take a look at the GreyNoise app on Splunk, how we can set it up and install it, and look at some of the dashboards that are available. And finally, we'll look at the custom commands provided by the app. Now, I already have the app installed on my Splunk instance. If you don't have it installed, we can search Splunkbase for GreyNoise and we can install it right from here.

Now once we have our app installed, we want to come into the configuration tab. Under the GreyNoise setup tab, this is where we can enter our API key. And once we have our API key entered, we can also configure our scan deployment. What this is going to do is, GreyNoise is going to scan our indexes for any IP addresses. We can use this data in the dashboards that we'll look at, and we'll see where we can also use this data as part of the searches that we're building. So in my case, I have this configured just to scan my main index. We can specify any CIM fields that have the IP addresses that we want to scan. And then for any data that's not CIM compliant, we can always specify additional fields here.
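To make the scan deployment idea concrete, here is a minimal Python sketch of collecting IP addresses from a configured list of fields. It is not the app's implementation: the function name `collect_ips`, the field names, and the sample events are all assumptions made for illustration.

```python
import ipaddress


def collect_ips(events, ip_fields):
    """Return the set of valid IP addresses found in the given fields.

    `events` is a list of dicts and `ip_fields` is the list of field names
    to inspect. Both are hypothetical stand-ins for indexed events and the
    fields chosen in the scan configuration.
    """
    found = set()
    for event in events:
        for field in ip_fields:
            value = event.get(field)
            if value is None:
                continue
            try:
                found.add(str(ipaddress.ip_address(value)))
            except ValueError:
                # Not an IP address (for example a hostname), so skip it.
                continue
    return found


# Made-up events using documentation-range addresses.
sample_events = [
    {"src_ip": "198.51.100.7", "dest_ip": "10.0.0.5"},
    {"src_ip": "not-an-ip", "clientip": "203.0.113.9"},
]
ips = collect_ips(sample_events, ["src_ip", "dest_ip", "clientip"])
```

Passing the extra `clientip` field mirrors the option of adding fields beyond the standard ones when your data uses different names.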

And finally, we can set how far back we want to run our scan. For mine, I don't have a ton of data in here, so I'm going to set this for the last 24 hours. If you do have a lot of data in here, you probably want to start with either the last 5 or the last 60 minutes first. Now when we get into looking at the custom commands provided by the GreyNoise app, one of the things we'll see is we can provide a field to look up information about IP addresses. The app is going to cache those IP addresses if we have that enabled, which I do here. And we can set a time to live. So in my case, I have this set to 24 hours. After that, we'll pull down the latest data from GreyNoise. But you can configure the caching mechanisms for the app under this tab. And then finally, we have our logging. If we want to change the log level for the app, we can set that in here as well.
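The caching behavior described above can be pictured with a small time-to-live cache. This is only a sketch under stated assumptions, not the app's code: `TTLCache`, `fake_fetch`, and the injectable clock are hypothetical names introduced for the example.

```python
import time


class TTLCache:
    """Cache lookup results per IP and refetch once an entry is too old."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the expiry logic is easy to test
        self._store = {}    # ip -> (time stored, value)

    def get(self, ip, fetch):
        """Return the cached value for `ip`, calling `fetch(ip)` on a miss."""
        now = self.clock()
        entry = self._store.get(ip)
        if entry is not None and now - entry[0] < self.ttl:
            return entry[1]
        value = fetch(ip)
        self._store[ip] = (now, value)
        return value


# Usage with a stand-in for the remote lookup and a fake clock.
calls = []


def fake_fetch(ip):
    calls.append(ip)
    return {"ip": ip, "noise": True}


fake_now = [0.0]
cache = TTLCache(ttl_seconds=24 * 3600, clock=lambda: fake_now[0])
cache.get("198.51.100.7", fake_fetch)  # miss: fetches
cache.get("198.51.100.7", fake_fetch)  # hit: served from cache
fake_now[0] = 24 * 3600 + 1            # just past the 24-hour time to live
cache.get("198.51.100.7", fake_fetch)  # expired: fetches again
```

A longer time to live means fewer remote lookups but staler answers, which is the trade-off behind the 24-hour setting mentioned above.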

Now that we've set that up, we'll go back to our overview. And we can see a little bit of an overview now of my main index. So I have 9,300 IP addresses in there. And of those, 92% are found in the noise data set in GreyNoise. Again, these are going to be the IP addresses that we've observed scanning the internet. And then I also have 3% of my IP addresses that are part of the RIOT data set, which are the common business services. And then at the bottom of the page, we can also see additional details and overview data provided from GreyNoise.

And this is just going to give us some information about the top organizations that we've observed scanning the internet, the top tags that we've seen for classifying traffic from these hosts that are performing these scans. And it's a good way where we can just get a quick look at some of these details that we're seeing across our sensor network.

And if we want to filter this down, maybe I just want to see the top organizations that have been running malicious scans, or the top tags associated with malicious scans. We can make those changes from the drop-down and see that reflected in our dashboard panels. So we have this overview of the IP addresses in the index that we've specified. But if I want to start to get some more information, and I want to start to drill into the details of these IP addresses in my different indexes, I can come into my noise IP addresses section. And this is where I can start to filter down my data quickly.

So if, for example, I only want to see IP addresses that are part of the RIOT data set, I can change my filters here to quickly find those. And same thing, if I want to see any IP addresses associated with the noise dataset, we can do that as well. Now we can also get up-to-date information. Because I can see that the last time we scanned some of these IP addresses was a couple of days ago, we can always click on get live status, and this is going to pull down the latest information from GreyNoise. Now if someone is asking me about an IP address, we can also query that here. So if there are specific IP addresses I want to look for to see if we've seen them in our indexes, we can search for them right here.

So this is going to show us IP addresses that are already in our indexes. But we also have our live investigation. And this is where we can query GreyNoise for additional details about any IP address, or other pieces of information like certain tags, organizations, or ASNs. I'm going to use the same IP address that we were looking at before, and I'll change my date range here to, we'll say, the last 30 days.

So we can see that the last time GreyNoise saw this IP address hitting our sensor network was back on January 17. And we'll also provide all the details associated with this IP address: the organization, the tags associated with the traffic that we've observed, as well as the country, the category of the IP address, and the ASN for it.

So this is going to allow us to query additional details about any IP address. If there are things that I'm just not sure about, whether it's something in our index or something that we just want to look up, we can always look that up in here. Again, for some IP addresses, if we haven't observed them scanning, we'll just show that we don't have any information to provide here. So we have our different dashboard panels that we can work with inside of our app. But we also provide a number of custom commands that we can use as part of our searching.

So what I want to look at now is the last 48 hours of my Apache logs. And I can see I have about 640 events in here. Now whether I have alerts set up that I want to add these custom commands into, or I just want to filter out some of the noise as I'm hunting through my events here, we have our GreyNoise commands that we can use as part of this data.

So one of the first things that I want to do is...I want to do just a quick lookup on any IP address that we've observed connecting to our Apache server in the last 48 hours. I just want to know is it part of the noise dataset? Is it part of the RIOT dataset?

And then we're just going to do a quick stats count so we can get an idea of what's being returned. So we can use these custom commands as part of the searches that we're building, and we can use them in any alerts that we have configured. We can even use this with, for example, Splunk Enterprise Security, and add this data into any of the notables that you're observing as well.

But now that I have my data returned from GreyNoise, I can see my different IP addresses here. A couple of them at the top are part of the GreyNoise noise data set, which are IP addresses that we've observed scanning the internet. You can see this one at the very top. And again, what this is going to allow me to do is it's going to allow me to either reprioritize my alerts that I'm building or as I'm hunting through my events in my log data, understand anything outside the noise, because those are things that I probably want to investigate further.

Now we have our gnquick command that we used here just to get a quick lookup of which dataset an IP address falls in. We can also use gnquick not only as a transforming command, but also as a generating command. So if I do want to just look up something quickly, I can always just put in my IP address and get my details returned from GreyNoise as well. So now that we've started to look up some of these IP addresses, one of the other things that we want to do is get more information about these IP addresses based off of the different data sets.

So now what I want to do is look at some sample logs that I have from an ASA. And I want to just perform a lookup against the RIOT dataset. I want to understand which IP addresses in which events are going out to these common business services. So I'm going to add that into my custom command here. And then I can output this data. In this case, I just want to table out the fields returned by GreyNoise, as well as the IP address in our logs. This way I can get an idea, for essentially anything connecting outbound, of what they're connecting to and what kind of service it is. And I can always change this as well.

So in this case, I'm looking where GreyNoise RIOT equals 1. But if I want to see any outbound traffic that isn't connecting to a known business service IP address, we can get a list of those here. And these might be IPs that I want to start looking at. If I am seeing any kind of anomalous traffic, these are going to be all of the IP addresses that we may need to investigate further.
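As a rough illustration of that split, here is a minimal Python sketch that separates outbound events into known business services and everything else. The field name `greynoise_riot`, the function name, and the sample events are assumptions for the example, not the app's actual output schema.

```python
def split_by_riot(events, flag_field="greynoise_riot"):
    """Split events into (known business services, everything else).

    An event counts as a known service when its flag field equals 1,
    mirroring the "RIOT equals 1" condition described above.
    """
    known, other = [], []
    for event in events:
        if event.get(flag_field) == 1:
            known.append(event)
        else:
            other.append(event)
    return known, other


# Made-up outbound events with an already-attached RIOT flag.
outbound = [
    {"dest_ip": "203.0.113.10", "greynoise_riot": 1},
    {"dest_ip": "198.51.100.23", "greynoise_riot": 0},
    {"dest_ip": "192.0.2.44"},  # no flag attached at all
]
known_services, to_investigate = split_by_riot(outbound)
```

Events with no flag land in the second list on purpose, so missing enrichment is treated as "not confirmed" rather than as safe.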

So using that noise and RIOT dataset, we can start to filter out a lot that we may not need to investigate right away in order to find some of that malicious traffic.

Now we've looked at how we can quickly look things up in Splunk, and we've seen how we can get additional details using the RIOT data set. What we can also do is look, again, at any inbound traffic over the last 48 hours to my Apache server.

Now I want to pull down the full data from GreyNoise. So in this case, I want to include the classification provided by GreyNoise, and I want to have any tags associated with that traffic and the organization. All of the details that we've seen in the GreyNoise Visualizer, we're making available as part of our events as well. So now I can see all of the additional metadata that we're adding in. I can see the tags from GreyNoise that we've added to this, we have our classification levels, and we have our geo data here as well.

So now if I am looking at my, let's say, firewall logs, or in this case my Apache logs, maybe I want to filter this down even further. I want to see anything that GreyNoise has classified as malicious as well. And I can start to filter this down even further using that enrichment, tabling out the IP address as well as all the details provided by GreyNoise.

So again, this is how we can start to really filter and better understand the data that we're looking at. So far, we've been doing just some different lookups, assuming we know what the data is. But let's say, for example, we want to start building out some more advanced searches. So I still want to use my Apache logs here. But in this case, we're going to use a new command, gnfilter. I'm going to look up all the IP addresses in my client IP field. But now I can also specify here whether or not I want to see noise events.

So in this case, I want to set this to true. And this is going to do that same lookup for us. It's going to look up our IP addresses connecting to our Apache server, but this is only going to show us IP addresses that GreyNoise has observed scanning the internet. So what I can also do with this data now is I can change this and I can set this to false.

And now what I can do is I can start to filter out a lot of those noise events. So in our case, the first search, we were looking at anything GreyNoise knows about just to get some additional context about those IPs. Now I can start to filter out all that noise data. And these are going to be just the IP addresses that are maybe a little bit more targeted towards my organization.
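The true/false switch described above can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the command's implementation: `filter_noise`, the `noise_events` parameter name, the `clientip` field, and the sample data are all made up for the example.

```python
def filter_noise(events, ip_field, noise_ips, noise_events=True):
    """Keep only noise events, or only non-noise events.

    `noise_ips` stands in for the set of addresses observed scanning
    the internet. With noise_events=True, events from those addresses
    are kept; with noise_events=False they are dropped instead.
    """
    return [
        event
        for event in events
        if (event.get(ip_field) in noise_ips) == noise_events
    ]


# Made-up web server events and a made-up noise set.
apache_events = [
    {"clientip": "198.51.100.7", "uri": "/admin"},
    {"clientip": "203.0.113.9", "uri": "/login"},
    {"clientip": "192.0.2.15", "uri": "/"},
]
noise_set = {"198.51.100.7", "192.0.2.15"}

only_noise = filter_noise(apache_events, "clientip", noise_set, noise_events=True)
without_noise = filter_noise(apache_events, "clientip", noise_set, noise_events=False)
```

Flipping the one boolean gives the two views from the walkthrough: context on known scanners, or the smaller remainder that may be more targeted.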

So now if I want to better understand what these are looking for, what they're scanning for, this is a good place where I can start hunting through my logs, because we've filtered out a lot of the noise that would probably fill this up and make it difficult to look through.

Now for one of the last commands that we'll look at here, as far as filtering through our data, I'm going to run this a little bit differently. As the GreyNoise app is scanning our indexes, like we configured when we first set up the app, we're storing all of that data in a lookup table. So now we can really start to build out some clever queries. The first thing I actually want to do is just check my local lookup table to see whether or not we already know an IP address is part of the noise dataset. And then from there, I can start to combine some of these custom commands. Now I'm filtering down to anything that isn't already known noise, and then doing a lookup.

So this way, we can check our lookup table first. For anything that we don't already know about, we can go and query GreyNoise for additional details. This way, we can make sure that we're building our searches as efficiently as possible. Now, we also have our caching mechanisms built in, which we'll use here as well. But in this case, we took all of our data, and based off of what we've already observed by scanning our indexes, the GreyNoise data, and our lookups here, we're now down to just two IP addresses that we found in our Apache logs that GreyNoise has never seen scanning the internet.
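The local-first pattern above, where the lookup table is checked before anything is sent out, can be sketched as follows. This is a minimal Python illustration under assumptions: `resolve_ips`, `stub_api`, and the data shapes are hypothetical and do not reflect the app's internals or the GreyNoise API.

```python
def resolve_ips(ips, local_lookup, query_remote):
    """Answer from the local lookup table first, then query the rest.

    `local_lookup` maps IP -> details already stored by earlier scans.
    `query_remote` takes a list of IPs and returns a dict of details;
    it is only called for addresses missing from the local table.
    """
    results = {}
    unknown = []
    for ip in ips:
        if ip in local_lookup:
            results[ip] = local_lookup[ip]
        else:
            unknown.append(ip)
    if unknown:
        results.update(query_remote(unknown))
    return results


# Stand-ins for the stored lookup table and the remote service.
lookup_table = {"198.51.100.7": {"noise": True}}
remote_calls = []


def stub_api(batch):
    remote_calls.append(list(batch))
    return {ip: {"noise": False} for ip in batch}


details = resolve_ips(["198.51.100.7", "203.0.113.9"], lookup_table, stub_api)
```

Only the one address missing from the table reaches the stub, which is the efficiency gain the walkthrough is pointing at.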

So again, if I am doing any kind of hunting through this, or if I'm building alerts, and I see these popping up, these would probably be the ones that I want to focus on first, because again, they could be a little bit more targeted towards our organization, as opposed to just opportunistically scanning the internet, and trying to exploit vulnerabilities at large.

So we've seen a lot of ways we can work with the data that we have in our indexes. We do also have two other custom commands that we can add in. The first one lets us query GreyNoise directly from Splunk. So in this case, I want anything tagged with Tomcat, for example, with the classification of malicious, that we've last seen in the previous two weeks. I want to run that GreyNoise query and ingest those results into my Splunk instance. From here, if I view my results, we can see all of the details about all these hosts that GreyNoise has observed running the scans. In this case, we can see all the tags and what each host was scanning for. So this is going to give us some additional context. If there are certain products or certain organizations where we want to better understand the scans and what they're looking for, we can take the GreyNoise queries that we previously ran through the Visualizer and run them right through our Splunk search as well.

And then, finally, the last search that we'll take a look at. We saw on the GreyNoise dashboard in our app that we have just some high-level statistics there. But maybe we want to build our own statistics using the GreyNoise data. So now we're going to use the gnstats command. We're going to specify our query here. In this case, I'm just looking at any host that's classified as unknown over the last two weeks, because I want to understand what these unknown hosts are scanning for. Again, these aren't things that we can attribute to an organization running these scans, and they're not doing anything malicious as part of the scans and the traffic that we're observing. So we just want to understand what these hosts are looking for. What are the top tags associated with those hosts? And so now when I run my search, I can see some high-level details about these tags. If I wanted to limit this down further, for example, maybe I just want to see the top 10 tags. I can filter that down and just get a better understanding of what these hosts are looking for.
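For a sense of what a "top tags" statistic involves, here is a minimal Python sketch that counts tags across a set of host records and keeps the most frequent ones. The record shape, the tag names, and the `top_tags` function are made up for illustration and are not real query results.

```python
from collections import Counter


def top_tags(records, limit=10):
    """Count how often each tag appears and return the most common ones.

    `records` is a list of dicts, each with an optional "tags" list.
    The result is a list of (tag, count) pairs, most frequent first.
    """
    counts = Counter(tag for record in records for tag in record.get("tags", []))
    return counts.most_common(limit)


# Made-up host records standing in for query results.
hosts = [
    {"ip": "198.51.100.7", "tags": ["Web Crawler", "SSH Scanner"]},
    {"ip": "203.0.113.9", "tags": ["Web Crawler"]},
    {"ip": "192.0.2.15", "tags": ["Web Crawler", "SSH Scanner", "Telnet Scanner"]},
    {"ip": "192.0.2.99"},  # a host with no tags at all
]
top_two = top_tags(hosts, limit=2)
```

Lowering `limit` plays the same role as narrowing the view to the top 10 tags in the walkthrough.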

Now finally, as we are wrapping this up, we do have all of our documentation that we provide as part of our GreyNoise app on Splunk. If we go over to our docs page and go down to integrations, we can go to our integration overview for Splunk. On the left-hand side, this is where we see information about release notes, configuring the app, permissions required, scan deployment information, and information about all of the commands that you can add in, again, whether you're just building ad hoc searches or you want to include those in any kind of alerts that you're working with. So all of the details that we've covered in this session are available in the documentation as well.

Thank you very much. And hopefully this helps with some of the custom commands that you can add in to your queries that you're building.