At GreyNoise, we collect, analyze and label data on IPs that saturate security tools with noise. This unique perspective helps analysts waste less time on irrelevant or harmless activity, and spend more time focused on targeted and emerging threats.
The GreyNoise Visualizer is our web user interface that allows users to look up IP addresses, drill down into the data, and identify emerging internet threats.
Hi everyone, my name is Nick Roy. Today we're going to walk through an overview of the GreyNoise Visualizer and some of the datasets that you can find in there. I'm starting out here on the GreyNoise homepage. If you haven't gotten started with GreyNoise before, in the top right corner, you can click on Get Started to register for your GreyNoise account. I'm going to just log in with my account credentials right now and we can go ahead and get started querying the GreyNoise datasets.
Before we start running these queries, a little bit of an overview of GreyNoise. We're listening for what we would consider internet background noise. These are going to be internet-wide scans for certain services and open ports, as well as scans for potential vulnerabilities, attempts to exploit vulnerabilities opportunistically, and even just brute forcing credentials on any devices connected to the internet. So we want to be able to understand this data. And in order to do that, we're going to look at our noise dataset first. This is data that's gathered from our sensor network, and it's going to be all of the scans that we see hitting our GreyNoise sensors. So we'll walk through today and see how we can query that data, and how we can understand the results that we're looking at.
Now, we also have our RIOT dataset. RIOT stands for “Rule It Out”. And this is going to be essentially IP addresses for common business services: Office 365, Slack, different CDNs. The idea behind this is that it's difficult to know whether an IP is used by one of these services and whether or not we can block it, and a lot of times this information can be difficult to find. So we make this readily available, and something that you can query as part of GreyNoise. We'll see how we can look at this data as well, and we'll talk about the results and how we can interpret those when we are doing our investigations.
So the first thing that I want to do is start off with a very basic query: “Show me anything that was seen in the GreyNoise network over the last day.” In the top right corner, we have about 230,000 results that we've observed across our sensors. And on the left hand side, we can see some of the information we might want: we can get a better understanding of the countries where these scans are originating from, and we can see a high-level overview of which tags are the most popular.
Tags are how we start to understand what type of scan is being run, and the intent of the scan: whether this is just a simple port scan, someone crawling for specific directories, or someone going beyond that and actually trying to brute force credentials or exploit a vulnerability. These tags are going to determine the way that we actually classify the hosts that we're observing. And we can look at our classification system. If we start with benign, benign is where we can assign an actor to the IP address conducting these scans. So these are going to be anything like Censys or Shodan, any kind of researchers scanning the internet, people who are publishing where they're scanning from and what they're scanning for, and who are generally conducting their scans in a responsible manner.
So when we assign an actor, that actor is different from what we might think of as a threat actor. Again, in our case, when we have something benign, we can assign an actor to it, and an actor is someone taking responsibility for these scans. Now, when we drill into the details for any IP address in GreyNoise, I will start to see additional details about what we've observed. So this can be ports that we've observed this IP scanning for, and different requests that they were making. We have JA3 and HASSH fingerprints that we're providing. And in the top right corner, we have information about when we first saw and when we last saw this IP hitting our sensor network. We have metadata about the geolocation of this IP, its operating system, and reverse DNS. And then we also have a list of tags here associated with this IP as well. So this is how we're going to start categorizing the traffic that we've seen originating from this IP address.
One of the other things we have at the top here, we have this noted as not spoofable. And all this really means is we saw a full TCP 3-way handshake from this IP address when it was conducting its scans. We'll see later on that some may show as spoofable, which means we didn't observe that. So there's always a case that someone is spoofing these scans or any of the activity associated with them.
So as we can see, all of our benign traffic here is, again, going to be IP addresses that we can assign an actor to. We can see different organizations, even crawlers like Googlebot, in here. And again, these are going to be a lot of the background noise of the internet, a lot of the scans that we're seeing, but where we can start to categorize these as to who's taking responsibility for that activity. Now, if we can't assign an actor to an IP address conducting these scans and to the traffic that we're observing, that's when we move things into this unknown classification. So we don't have an actor we can associate with this. But we're also not seeing any kind of traffic that we would consider malicious from this IP. So if we drill into one here, we can see that we're still providing the same details about this IP address, everything that we've observed. But we're not seeing anything malicious here. They're really just crawling for certain services, looking for different web directories, and scanning for open ports as well.
Now, this is going to be a lot of that internet background noise that we might be thinking about. And if we look at another one here, we can see where some of these have the spoofable tag. So in this case, we did not see the full 3-way handshake. This could be a SYN scan, it could be a UDP scan, there are a number of reasons. But there's always the possibility that whatever we observed from this IP didn't actually originate from that IP address. So this is one more way where we can start to filter down our data. We can always click on this if I want to see anything that was spoofable, or, more importantly, anything that wasn't spoofable, where we observed the full 3-way handshake.
Finally, we have our malicious classification. This is where we're starting to tag the traffic associated with this IP as either trying to exploit certain vulnerabilities or brute forcing credentials. We can see tags here for SSH worm traffic we've seen originating from this IP address. So this is going to go beyond just the typical port scans or looking for directories, and into trying to exploit vulnerabilities across the internet. Now again, we can still drill in and view all the same details that we've seen with all of the other IPs we've been looking through, and we can also pivot off of this as well. So if there are certain tags in here that I want to see more information about, such as whether there are more devices that are trying to exploit a certain vulnerability, I can always click on any one of these tags. And that's going to update my query, in this case looking for any hosts that we've observed trying to brute force credentials over Telnet.
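The three classifications walked through above can be summarized as a small decision rule. This is only an illustrative sketch, not GreyNoise's actual implementation: the `Observation` type, its field names, and the choice to check malicious tags before the actor are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Observation:
    """Hypothetical summary of what was seen from one IP address."""
    ip: str
    actor: Optional[str] = None            # who takes responsibility for the scans, if known
    malicious_tags: List[str] = field(default_factory=list)  # e.g. exploit or brute-force tags


def classify(obs: Observation) -> str:
    """Return 'malicious', 'benign', or 'unknown' following the talk's description."""
    # Assumption: any malicious tag wins, even if an actor is known.
    if obs.malicious_tags:
        return "malicious"
    # An identifiable actor taking responsibility for the scans makes it benign.
    if obs.actor:
        return "benign"
    # No actor and nothing malicious observed: plain background noise.
    return "unknown"
```

For example, an observation with no actor and no malicious tags comes back as `unknown`, which matches the "no actor, but nothing malicious either" case described above.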
So this is how we can start to see the data that GreyNoise has been collecting, a lot of this background noise of the internet. And this is really going to allow us to answer the question: if I'm seeing an alert for an IP address, is this something that's scanning the internet opportunistically, or is it something that is targeting my organization specifically? Now before we start to look into the Visualizer further, one of the things that I do want to show is the Documentation. Specifically, in our guides we have all of the information about how to get started, how to start building out a lot of these queries, and how to start working with the Visualizer. So everything that we're going to cover today is documented in here. Again, all the information about what this data is and how the GreyNoise Visualizer works is captured on this Docs page.
We're also going to take a look at building out some more advanced queries in a minute. All the fields that are searchable are documented here, as are all of the different GreyNoise datasets and the method for how we assign the classification that we looked at earlier to an IP address that's scanning GreyNoise.
We do also have some other resources here as well. We do have a blog that we publish, so there’s great information in here that we can use. At the end of the month, we have the Tag Roundup. This is going to have information about all the new tags that were created. And if I do want to see a list of these tags as well, I can always come to my top menu here. And under tags, we can search through any one of those tags available in GreyNoise. So if there are certain things that I'm looking for, for example, I want to see the DNS over HTTPS scanner. I can search that in my list of tags here. And we can see information about this tag, and then any external references. Again, if we want, we can always pivot off this list of tags, and jump right back into our results.
So far, we've been working with some pretty simple queries. But we can start to make these a little more complex. One of the things that we can also start to query for in GreyNoise is whether or not an IP is part of a VPN provider, or if it's a known Tor exit node. So I have both of those filters enabled right now. And again, I can always jump right back into my results if I want to see what this host is scanning for. I can see some additional fingerprints here that I can use.
Some other queries that we can use: we can also query for specific CVEs. So if we want to see any traffic, or any hosts that have been scanning for a particular CVE that we have a tag for, then I can put in my CVEs here, and I can query the GreyNoise dataset. So there are about 1,400 or 1,500 IP addresses scanning for this particular CVE that we've observed across our sensor network.
We can also search by ASNs or search by organizations. For example, if I want to look across multiple organizations (in this case, these are some bulletproof hosting providers), I can start to add Boolean logic into my queries (AND, OR, NOT). And I can start to build on this as well. So if I want to use any additional tags, I may want to say “Looks Like RDP Worm.” We don't have any results for this. But I can start to combine tags with organizations and really build some advanced queries against our datasets. Now we've spent a lot of time talking about querying the GreyNoise dataset and how we can start to see what has been scanning the internet.
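As a rough sketch of how a query like that could be assembled outside the Visualizer, the helpers below just build a query string with AND and OR. The helper names are made up for this example, the organization names are placeholders, and the field names (`metadata.organization`, `tags`, `last_seen`) are assumptions that should be checked against the searchable fields listed in the documentation.

```python
from typing import List


def any_of(field_name: str, values: List[str]) -> str:
    """Join several values for one field with OR, wrapped in parentheses."""
    terms = [f'{field_name}:"{value}"' for value in values]
    return "(" + " OR ".join(terms) + ")"


def all_of(*clauses: str) -> str:
    """Join clauses with AND so that every one of them must match."""
    return " AND ".join(clauses)


# Placeholder organization names; the tag name is the one used in the walkthrough.
query = all_of(
    any_of("metadata.organization", ["Example Hosting A", "Example Hosting B"]),
    'tags:"Looks Like RDP Worm"',
    "last_seen:1d",
)
print(query)
```

The resulting string can be pasted into the search bar; keeping the OR group in parentheses avoids any ambiguity about how it combines with the AND clauses.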
The other side of this is we have our RIOT dataset. Again, RIOT stands for Rule It Out. And this is going to be a list of common business services and the IP addresses that they're using. So there are about 60 million IP addresses in this database that we can see here. If we start with an example like quad8 (8.8.8.8), Google's DNS, everyone is probably familiar with this. But in case you're not, if we look this up in GreyNoise, we see that it's part of our RIOT dataset. So what this means is, it's probably not something that we want to just block, because everyone's going to say that the internet is down. We also provide some additional details here. We have a description, a category, and when this was last updated. But most importantly, we have a trust level that we assign.
Now trust level 1 in this case means that Google owns the IP address and they own the service provided. So barring something catastrophic happening, we can probably trust this IP address, and we need to focus our investigation somewhere else. Maybe we want to see why something has been beaconing out to this. And it gives us that context around how these IP addresses are being used if we are seeing any outbound alerts, for example. If we look at a different IP, if we pick one that's part of Akamai's CDN network for example, we can also see that same kind of information is provided. But in this case, we have the trust level set to 2. Again, what this means is Akamai owns this IP address, but in this case, it is a CDN. So Akamai isn't necessarily providing all of the content being served from this IP address. But again, if I do see any kind of alerts, and I'm performing my investigation, this is going to allow me to quickly pivot and better understand how these IP addresses are being used.
And now I know I probably need to look at the content that's being served by this IP address. So these are the two different datasets provided by GreyNoise. Now a couple of other features that we have in the Visualizer. We do have our Trends page available. And this is where we can see an overview for various organizations, various tags over the last 24 hours. It just gives us an idea of what the internet background noise is, and what these different devices that we're observing are scanning for. Now, we can also see some anomalies on the right hand side, I can see various ports and protocols and if we do see a spike in them. And again, if I am interested in any one of these, maybe port 31983. I want to know who's been scanning for that. Or maybe I'm seeing that hitting my firewall. And I want to know, “Is anyone else seeing that?” Or, “Is that targeted towards me?” I can jump right back into our dataset and we can see about 1,300 hosts that we've observed that are scanning for that port.
We also have alerts that we can build. Using the same queries that we've been working with, whether we want to monitor for certain organizations, different IP ranges, or different user agents, we can build our alerts in here. Once a day, when GreyNoise sees any activity relating to these alerts that we've created, we can send an email notifying you of the hosts that we've observed as part of this query that you've built.
Lastly, we have our analysis page. This is where now we can start to do a bulk analysis of either log files that we're uploading, or we can copy and paste text into here. I'm going to drop in a log file from one of my ASAs, for example. And we'll give this a second for GreyNoise to extract all the IPs and analyze all of our data. But what this is going to let us do is quickly focus on what's important to us.
So in our case, we can see that there are eight IP addresses that were unidentified. These are IPs that we haven't seen in GreyNoise at all. So if I'm looking at anything inbound that's generating alerts, these are probably the first IPs that I want to focus on, because they're probably more targeted towards my organization. And then from there, I can start to prioritize my alerts based off of the different classifications. I'd probably want to look at anything that's malicious next. Again, GreyNoise has also observed these, so we're going to prioritize them a little bit lower. But I can also get any kind of details right from this page. So we can see, in this case, this IP address is associated with Mirai. So as I am going through and doing my investigations, this is going to give me some additional context for what I might need to be looking for.
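The same triage idea can be sketched offline: pull the IPs out of some log text, label each one, and sort so that the unidentified ones come first and the malicious ones next. This is a minimal illustration and not how the Analysis page is implemented. The `known` dictionary stands in for lookup results, the log lines are invented, and ranking unknown ahead of benign is an assumption since the walkthrough only names the first two priorities.

```python
import ipaddress
import re

# Loose IPv4 pattern; candidates are validated with ipaddress afterwards.
IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

# Lower number = look at it sooner. Order after "malicious" is an assumption.
PRIORITY = {"unidentified": 0, "malicious": 1, "unknown": 2, "benign": 3}


def extract_ips(text):
    """Return the unique, valid IPv4 addresses in text, in order of appearance."""
    found = []
    for candidate in IPV4_CANDIDATE.findall(text):
        try:
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue  # e.g. 999.1.1.1 matches the pattern but is not an address
        if candidate not in found:
            found.append(candidate)
    return found


def triage(text, known):
    """Label each IP from `known` (missing means unidentified) and sort by priority."""
    labelled = [(ip, known.get(ip, "unidentified")) for ip in extract_ips(text)]
    return sorted(labelled, key=lambda pair: PRIORITY.get(pair[1], len(PRIORITY)))


# Invented log lines using documentation-reserved address ranges.
sample_log = """Denied inbound connection from 203.0.113.5 on port 23
Denied inbound connection from 198.51.100.7 on port 22
Denied inbound connection from 203.0.113.99 on port 3389"""

# Hypothetical lookup results; 203.0.113.99 is deliberately absent.
known = {"203.0.113.5": "malicious", "198.51.100.7": "benign"}

for ip, label in triage(sample_log, known):
    print(ip, label)
```

In this sample, the address that has no entry in `known` sorts to the top, which mirrors the point above that IPs GreyNoise has never seen are the ones most likely to be targeting you specifically.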
So, this was a brief overview of the GreyNoise Visualizer. Thank you very much for your time.