GreyNoise Open Forum


During this session, Andrew Morris (CEO & Founder), Nate Thai (Research Lead), and Greg Wells (Product) will be covering all things on our RIOT data.

About Open Forum

The GreyNoise Open Forum is a quarterly "town hall" style virtual meeting open to the entire GreyNoise community and any other interested parties. Open Forum is a way for you, our community members, to meet the GreyNoise team, learn more about our products (released and planned) and get your questions answered on everything in between.

Read the transcript

Supriya Mazumdar:

Well, again, thank you so much everyone for joining us. For those of you that are not lucky enough to have met me, my name is Supriya. I am the Community Manager here at GreyNoise. And this is our second ever Open Forum. And if the webinar invitation didn't give it away, we're going to talk about RIOT today. But first, I'd like to talk a little bit about GreyNoise. So, the plan is, we're going to go over kind of who's here, the whole team. We're going to talk about RIOT. And then most importantly, we're going to answer all of your questions.

We can talk a little bit about the team. If this is déjà vu and you joined us for our first ever Open Forum, we have doubled in size since April. It's actually pretty crazy, it's just been insane growth over the last few months. We've added quite a few new team members, including Austin, who started on Monday. So everyone give a very warm welcome to her. But growing really big. You know, lots of people joining in different departments. And yeah, onwards and upwards from here.

I'd like to take a minute though, before we get into the nitty gritty, to talk a little bit about the community. So maybe you are seeing this or hearing this a lot. I certainly am, I think it's one of those things where, when you're in it or you do it, you see it everywhere. Community means a lot of things to different organizations, but specifically at GreyNoise, community means that we are indebted to our users, both free and paid, whether that's a free user or a customer. And the idea is that we want to gather all of your feedback, you know, treat you the way that we would treat any other type of customer, or rather, "shareholder" is, I think, the word that Andrew likes to use. And gather your feedback, be able to deliver features and products based on things that you users need or want. And continue to bring you along on this lovely journey we're on.

So as always, you can join us on Slack, we have a pretty exciting Slack community, if I do say so myself. And you can also take time to chat with us, I do one-on-one demos, if this is your first exposure to GreyNoise and you'd like to get more familiar. Or if you're a seasoned user, and you want to tell me how you're using our product. We also host "How I Use GreyNoise" sessions where we like to spotlight members of our community doing cool and interesting things. So you can reach out to me at community@greynoise.io, you can DM me on Twitter too, a lot of people seem to do that and I'm totally fine with that. So lots of ways to reach me. But without further ado, I think we're just going to get into it. So Greg, would you like to educate us a little bit on the background about GreyNoise?

Greg Wells:

Sure, let's do that. So before we can talk about RIOT and the RIOT origin story, let's go through the old use cases everyone knows so well. So I'm going to just quickly, as fast as possible, describe GreyNoise. Andrew will go into this in more detail after this. But again, at GreyNoise, we have a global network of sensors, and we're collecting what we call internet background noise from around the world. It's just noise that is hitting everyone, talking to these sensors that don't provide any business value whatsoever. So we take that noise, and why is that useful? It's for analysts to use to separate the signal, basically IPs targeting their networks, from this internet background noise. And they do that by enriching through their SIEM or SOAR or threat intelligence platforms. And that's really use case number one, our core use case of increasing analyst efficiency, getting rid of that noise.

As a byproduct of collecting all that data, we also see compromised devices. You can think SSH worms, botnets, devices that have been popped and are now scanning the internet for other vulnerable devices or services. And we also have emerging threats, which is a lot of effort we're putting into tracking vulnerabilities and CVEs as they come out, modifying that sensor network to capture that activity, and really just exposing that to our users as soon as we see it emerge in the wild. But back to that core use case, increasing analyst efficiency. So we asked ourselves, how else can we provide insight into noise? What other sources of noise can we identify and provide context on for our users, saving them time? What else can we roll out? And that's where we came up with RIOT.

So really this is specifically traffic leaving your network, usually outbound connections to common business services. You can think of your employees or devices on your network talking to Slack, Office 365, Zoom, Atlassian, or CDNs. And having to identify what those IPs are is not the best use of your time. So that's why we pulled together what we call RIOT as a curated list of those IPs against those common business services. And we try to keep that as curated and accurate as possible, updating it every few hours and providing it as an IP lookup service to our users, so providing both the noise and the RIOT. And I'm handing off to Andrew for this piece, to go over those two different sets of data.

Andrew Morris:

Thanks, Greg. So basically, the long and short here is that there are tons of different kinds of data that you shouldn't care about. And you can't not care about it if you don't know what it is or don't have any kind of data-driven way of separating what is noise and what isn't. And noise means different things to different organizations. But we at GreyNoise have identified two pretty rock-solid kinds of noise that, if an analyst wants to ignore it, they should be able to, and be able to kind of whittle it down and spend more time looking at the things that really matter to them and their organization. And, you know, waste as little time as humanly possible. So of the two kinds of noise that we're going to talk about here today, the first is what we call actual internet background noise, right? So this is people that are scanning and crawling the internet and, as Greg was saying, just creating a ton of distractions for people that are on the internet or for people who are looking at a network perimeter.

The problem is that if you don't know what's hitting everyone, you have no real way of knowing what's hitting you specifically. And you have to be able to whittle that stuff out to be able to figure out the attacks that really matter to your organization. The way that this works, like Greg mentioned, is that we have a giant network of passive collector sensors. You can think of them as honeypots in thousands of places around the globe, in dozens of countries around the world. And they sit back, and they have no business value whatsoever. So the only traffic that they see is the stuff that is pretty much hitting everyone on the entire internet. They effectively stream all of the telemetry that is hitting them to a central place where Nate and his team are actually applying analytics to that data, and basically tagging it for a host of different purposes. And then RIOT, which is the data that we're going to talk about mostly today, is effectively the exact opposite. Instead of us trying to identify the standard, normal amount of distracting telemetry that's hitting your perimeter from the outside, it's the stuff that is totally normal that's going to be leaving your network or egressing from your network, right?

And so this is going to be those regular business services that your users are going to be using. These are IP addresses and domains, though it's specifically IP addresses that we're going to talk about today, that your network is almost certainly talking to, and that's not a bad thing. So the examples are, like Greg mentioned: Office 365, different Google products, CDNs, update servers, public DNS servers, things like that. Those IP addresses are going to show up in tons and tons of network security products, and they're going to make up quite a bit of traffic. So one of the first things that security people often do when they're hunting for badness is get rid of the normal. You have to get rid of the stuff that's normal for your network to be talking to, so you can find the more interesting things. And so our RIOT list is up to about 70 million unique IP addresses that we're collecting every day. And we're really excited about it because it actually can really, really help cut down the amount of noise and wasted time that you're seeing on your network as you're investigating stuff. It's great.

This is basically a diagram of how the universe works. So we have outbound noise, which is RIOT, that's what we're going to be talking about today. And we have inbound. So inbound noise, this is the stuff that's hitting your perimeter, your network’s perimeter. This is the stuff that's made up by people who are scanning the internet, crawling the internet. And it's some combination of good guys, security companies, bad guys, security researchers, universities, and tons and tons of different companies and organizations like that. They're scanning the internet all the time. We've got at least one person on this Open Forum who's probably scanning the internet right now. So all those different people create this sort of low-level background noise that's happening on people's networks all the time. And at scale, it's actually quite a bit of noise, and it can be really distracting to security analysts. And then on the outbound side, you can see architecturally here, you've got traffic that's leaving the organization that is going to include traffic to Google, Cloudflare, Zscaler, Microsoft, any of those other products that you're using, and it's completely normal for you to be talking to those hosts. But without knowing the list of what they are, it's really difficult or unwieldy to look at your data minus those things, which is probably where the more interesting threats are going to show up.

And so that's basically the paradigm that we're operating under. This diagram is really just to give you guys a little bit more understanding of exactly how, architecturally, these different offerings work and how RIOT actually fits into it.

So RIOT, as we were talking about, is us effectively interrogating and enumerating the IP address space of all of the CDNs that we can find on the internet. So that's going to be Akamais, Cloudflares, Azure CDNs, your Google CDN, etc., all of the different CDNs that we can find that are just content delivery networks, right? These are normal. The IP addresses of these things make up just a massive amount of the internet, but it's also going to be really, really normal for your network to be talking to them if you have anybody who's browsing the internet. Update servers: this is going to be your Microsoft update servers, some of the different Linux update servers, etc. Public DNS and NTP services. SaaS APIs, so the different APIs that a lot of the devices in your network are supposed to be talking to and are programmed to be talking to. And then different cloud security products, etc.

What RIOT isn't: it isn't a big safelist, allow list, or anything like that. And the reason for that, from the perspective of a network security product, is that effectively, there are ways that an advanced enough attacker can actually abuse your network using these services. That doesn't actually take away from the core functionality of what RIOT is: it serves as a time saver to help you very quickly identify these things and not accidentally block something that is going to cause problems in your network, get you in trouble, and that you're going to end up just having to unblock anyway. So basically, the way that you don't want to think about RIOT is like, "Oh, this is, you know, this is a list of IP addresses that if my network is talking to them, it's bad." That's absolutely not the case. But you also don't want to think about it like, "Oh, so I should only allow my network to talk to these things," because then you're probably going to break the internet as well.

This stuff is hairy and hard and complicated. And everyone's different. Everyone's networks and security postures and threat models are a little bit different. But without this information, you can't possibly make decisions in the SOC. And it's otherwise just a waste of time to be looking at some telemetry, you know, analyzing it, investigating it, triaging it for 30 minutes only to find out like, "Oh, man, this is an IP address that belongs to, you know, a StackPath CDN," right? "I just wasted 30 minutes, and now I'm going to have to do something completely different, I'm going to move on to something else, I can close this out, etc." So this is kind of the way that we recommend that you think about RIOT a little bit. And with that, Nate, I would like for you to talk about trust levels.

Nate T:

OK, great. So inside of RIOT, we initially put in a lot of these IPs when we were trying out and developing the product. And we realized that it doesn't necessarily make sense for all of them to have the same trust level. Take CDNs, for example, or something that is necessary for your business, right? If you block all CDNs, that's going to be really bad for your business, and your employees are going to be real mad because a large portion of the internet is just going to be unavailable to them. But also, CDNs have very poor, what I like to call, end user attribution. They have high business necessity but poor end user attribution, meaning what comes from or what is hosted behind the CDN is not necessarily attributable. What RIOT tries to do is identify services that are trustworthy and also attributable, for example, Microsoft, right? If Microsoft has a compromise, or Google has a compromise, right, and your users are browsing towards Google infrastructure, there's really not much anyone could have done about that, except for Google or Microsoft, right? That's just something that we accept in the ecosystem, where it's kind of just not your fault, not your problem. So when we built out RIOT, we split things into two trust levels. And then there are also things we would kind of never add to RIOT, even though we know where that infrastructure is.

And this allows you to basically stratify and tier out, you know, different IPs that can be effectively ruled out in one way or another. One way is effectively trust levels: trust level 1, which is, “Yeah, this is just very trustworthy. The users definitely need this. Can't really do anything about it.” And trust level 2, where it’s, “This is something that we definitely need. We can't just block wholesale because it could be problematic. But at the same time, you know, there is some risk potentially associated with that.” Meaning it's not something that you completely ignore, but it's also something where, you know, you can't just walk away from it.

So the way we kind of fleshed this out was we did a bit of analysis and a bit of scoring, which you can see here. The top of the triangle, or the top of this spider chart, is owner attribution, that is, "Is this IP clearly owned by this organization or this particular service?" Right? In this case, you see Datadog, Zscaler, Google DNS, even Tor, right? It's easy to know if an IP belongs to this organization. On the bottom left, that's the business necessity, like how crucial is this to your users? If it was blocked, would your users be unable to do their jobs? And on the bottom right, we have end user attribution, which is how associated is the organization, such as Datadog, or LogMeIn, or Gmail, with the actual content that comes from that IP. So in the case of Zscaler, or Dropbox, or that kind of stuff, these things are business necessary, and users will interface with them often. However, malicious files can be hosted on those.

So those have become level two and level one respectively. Right? So for example, you see the ones where there's high business necessity, high owner attribution, high end user attribution, meaning, "OK, Google DNS, we know it's coming from Google, Google owns it, Google has responsibility for this." And a lot of people use it. That's a trust level 1. Something like jsDelivr is kind of an arbitrary CDN. We know jsDelivr owns these IPs. We know a lot of people use them. So they're kind of business necessary. If you block them, things might act funky, but really, I can't fully trust the content that is coming out of it, because it's a CDN. So that's trust level 2. However, again, I can't block it because it'll cause things to go poorly.
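
The scoring described above can be sketched roughly as follows. This is a toy illustration only, not GreyNoise's actual scoring: the `ServiceProfile` type, its boolean fields, and the `trust_level` function are hypothetical names, and collapsing each axis of the spider chart to a yes/no answer is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServiceProfile:
    """Hypothetical yes/no summary of the three axes on the spider chart."""
    name: str
    owner_attribution: bool      # is the IP clearly owned by this organization?
    business_necessity: bool     # would blocking it stop users doing their jobs?
    end_user_attribution: bool   # does the owner control the content served?
    arbitrary_compute: bool = False  # can anyone rent a box and run anything?


def trust_level(profile: ServiceProfile) -> Optional[int]:
    """Return 1 or 2, or None when the service would not be listed at all."""
    # Arbitrary compute is described as something that is never added.
    if profile.arbitrary_compute:
        return None
    # Assumption: both ownership and necessity are required to be listed.
    if not (profile.owner_attribution and profile.business_necessity):
        return None
    # High on all three axes is level 1; poor end user attribution is level 2.
    return 1 if profile.end_user_attribution else 2


if __name__ == "__main__":
    examples = [
        ServiceProfile("Google DNS", True, True, True),
        ServiceProfile("jsDelivr", True, True, False),
        ServiceProfile("generic cloud VM", True, False, False, arbitrary_compute=True),
    ]
    for service in examples:
        print(service.name, trust_level(service))
```

The real analysis scores each axis on a scale rather than as a boolean, so treat the thresholds here as placeholders.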

Finally, here's kind of a more nuanced breakdown of one particular service. For example, Google's actual infrastructure, we put that as trust level 1. So that's when you're interfacing, again, with Google DNS. However, for something like Google CDN, we have this issue where anyone can be kind of behind that. So they can be serving malicious content. But again, since it's so dynamic, your users will have to interface with it. And then finally, there is Google Compute. How do we handle Google Cloud Compute IPs, which are kind of arbitrary? Well, some people host their infrastructure behind static IPs inside Google Cloud Compute. And they publish those. And so when they do publish those, we might add those as a separate trust level 1. Say, like, GreyNoise was hosted inside of, you know, maybe our main webpage was hosted inside Google Cloud for some reason. And your analysts are browsing to it, right? That would be something where it's like, OK, that's trust level 1, even though it's inside Google Cloud, where normally that would not be trust level 1, or even trust level 2, or even be in RIOT at all, because it's arbitrary compute. So happy to answer any further questions about that later on. For now, I will turn it over to Brad.

Brad Chiappetta:

Alright. So of course, you've heard all this wonderful information about what RIOT is. Now the question is, how the heck do I access the data? And how do I actually make the determinations of what to do? So a couple of pretty simple ways. We try to make sure that you can access this the same way you access all of our other data. And so the short and simple answer is, let's start by putting those IP addresses into the Visualizer just like you would for any other ones. This is a sample of a trust level 1 page that you would get from the Visualizer. So you're going to see that this is one that you could reasonably ignore, based on all the information that Nate just went through and how we make that determination.

And then here is an example of what a trust level 2 looks like as well. So again, we're showing you, or calling out, the fact that, you know what, go ahead and have a moment of pause here as you take a look at this. And you know, this is a CDN. So it's something you want to look into a little bit more as you go through. And of course, you know, we try to ensure that all of our data is available through our APIs and through our integrations as well.

We do have a subset of what is available for RIOT available in our free community API. Alright, so you can simply go ahead and just curl, you know, any IP address against the community API, and you'll get that Boolean there that tells you that this IP address is in fact, in RIOT. And so you'll sort of get that all here. And then we also have it in our enterprise API.
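
For readers who would rather script the lookup than use curl, here is a minimal Python sketch of that community API check. The endpoint URL, the optional `key` header, and the `riot` field name are assumptions not stated in this session, so check them against the current GreyNoise API documentation before relying on them. The helper names are hypothetical.

```python
import ipaddress
import json
import urllib.request
from typing import Optional

# Assumed endpoint; not given in the session.
COMMUNITY_URL = "https://api.greynoise.io/v3/community/{ip}"


def build_request(ip: str, api_key: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request for one IP; raises ValueError on a malformed IP."""
    ipaddress.ip_address(ip)  # validate before placing it in the URL
    headers = {"Accept": "application/json"}
    if api_key:
        headers["key"] = api_key  # assumed header name
    return urllib.request.Request(COMMUNITY_URL.format(ip=ip), headers=headers)


def is_in_riot(payload: dict) -> bool:
    """Read the boolean RIOT flag from a decoded response (assumed field name)."""
    return bool(payload.get("riot", False))


def lookup(ip: str, api_key: Optional[str] = None, timeout: float = 10.0) -> dict:
    """Perform the network call; HTTP errors propagate to the caller."""
    with urllib.request.urlopen(build_request(ip, api_key), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(is_in_riot(lookup("8.8.8.8")))
```

Keeping the parsing in `is_in_riot` separate from the network call makes it easy to exercise against a saved response without hitting the API.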

OK, so for those with paid subscriptions, you'll be able to hit our dedicated RIOT API and get the full context of all the details on a RIOT IP as well. This is everything that you would see in the Visualizer page, with our explanation and the trust level information and categories, and it's just more expanded than what you would get out of the community API. And then, for all of those that love integrations, here's just a couple of samples. You know, we do try to make sure that RIOT is available in as many integrations as we can offer. So a couple of samples that we're showing here are a screenshot from Anomali and one from our newly released Maltego enterprise integration. And so we've tried to expand these out and make sure that these are available for you. And you'll see these in all of our major enterprise integrations. So for those out there using things like Splunk Phantom, XSOAR, Swimlane, Tines, any of those platforms, they all have these RIOT lookups built in, so that you can go ahead and take advantage of this data in all those integrations and automation platforms as well. And now I'll hand it off to Greg.
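
One way an analyst might act on the trust level in that fuller context is sketched below. The `riot` and `trust_level` field names and the three outcome labels are assumptions for illustration; the session does not show the enterprise response format. The logic follows the guidance given in this session: trust level 1 can reasonably be deprioritized, trust level 2 deserves a moment of pause, and a RIOT hit is never treated as an allow list entry.

```python
def triage(record: dict) -> str:
    """Map a lookup record to a hypothetical triage outcome.

    Returns "investigate", "review", or "deprioritize". Nothing here blocks
    or allows traffic; it only orders an analyst's attention.
    """
    if not record.get("riot"):
        # Not a known common business service: normal investigation applies.
        return "investigate"
    # Accept the level as either an int or a string, since the format is assumed.
    level = str(record.get("trust_level", "")).strip()
    if level == "1":
        return "deprioritize"
    # Level 2, or any level this sketch does not know about, gets a second look.
    return "review"


if __name__ == "__main__":
    samples = [
        {"ip": "8.8.8.8", "riot": True, "trust_level": "1"},
        {"ip": "198.51.100.10", "riot": True, "trust_level": "2"},
        {"ip": "203.0.113.7", "riot": False},
    ]
    for sample in samples:
        print(sample["ip"], triage(sample))
```

Defaulting unknown levels to "review" is a deliberately cautious choice, so that any future trust level does not get silently ignored.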

Greg Wells:

Thanks. So you've heard some of our sources already. We have about 64 different sources, totaling about 45 million IPs. And again, we're constantly curating that list, adding new sources, or cutting back ones that really don't fit or aren't performing how we expected. We have some examples here from our sources. We are trying to get as creative as possible with what would be a source for RIOT. So we're always adding to this list. And as you can imagine, I'm guessing a lot of you also have some great ideas of what should be in RIOT.

We'll go through some of what's coming up next within RIOT. One of those is adding additional RIOT sources and reaching out to the community, both asking for requests and eventually allowing our community to provide submissions straight to RIOT, either via API or some other mechanism, and just continuing to build out this list of potentially benign or at least common business services. Some other steps for us from a product perspective: we want to support a higher volume of IP lookups. RIOT is becoming more popular than we expected at first, which is fantastic. We love that. That means we need to continue to scale up our infrastructure to support potentially millions and millions of IP lookups per day against RIOT. So that's something we're focusing on right now.

We also need to make sure that RIOT's integrated tightly within our APIs. So as Brad mentioned, we have it integrated within the community API. But for the enterprise side of the house, it's a separate API endpoint right now. We want to merge it in with the rest of the APIs and just continue to build additional pieces of RIOT throughout the product, both on the API side and in the Visualizer. So right now with the Visualizer, we have RIOT built into it, as Brad showed off. So we're just continuing that development on the product side, and potentially adding additional trust levels, as Nate covered. As we identify certain sources, maybe they're not as trustworthy as we want. Maybe we go to a trust level 3 or 4, where it's something that you probably need to look at a bit more, but is, again, a business service where, if you blocked it, you're going to cause problems throughout the organization.

Andrew Morris:

Yeah. And so before we even open it up to questions, I want to ask a question of the audience right now. What are some of the other things that you can think of that you didn't see in here, that we haven't listed, where as an analyst or as a security person you find yourself often basically saying, "Oh, wait, I think I might have something here," only to find out, you know, five minutes later, "Oh, actually, this is something that I don't care about"? We want to know what those things are. So if you aren't multitasking and doing something else right now, and you do match the criteria of what I'm describing, we'd love it if you popped it in the chat or shot us a note or anything like that, because I actually really want to hear some more ideas about, you know, the kinds of sources that we can look at. So from my perspective, that's my question to ask of you guys.

Supriya Mazumdar:

With that, we'd like to open it up for questions. I know that there are a few in the Q&A. But yeah, this is our opportunity to answer, so I think it'd be valuable to answer this verbally. Greg is asking, do you have known sinkholes in the data?

Andrew Morris:

So we don't have known sinkholes in the data. I think that's actually a really great idea. We've found known sinkholes before a number of times throughout the research that we've done, but we don't have that included in RIOT right now. And honestly, I think that's a really good idea. I knew this one a long time ago, and I know that's one of those things that can take up space in, you know, a security analyst’s brain. But I think that's a great idea, for what it's worth.

One more thing, the last thing that I wanted to say, is if you guys want to kick the tires on RIOT literally right now, you can just grab a file of a bunch of IP addresses, or if you have logs, whether they're network logs or anything like that, and you already have a GreyNoise account, just log in with your GreyNoise account and submit that file to our Analysis page. It'll enrich every single IP address that you have against our noise data and against our RIOT data. So you can check out GreyNoise and the RIOT feature that we're talking about. It's live right now, and you guys can give it a shot if you want to. If you're a GreyNoise customer, you've got access to this right now, and you can go ahead and start using these APIs. You've already got it. If you aren't a GreyNoise customer, and you want to kick the tires before you have any kind of conversation with us, then check out the community API or just create a free account and throw everything against the Analysis page. It's right there.

Supriya Mazumdar:

The next question is, if a customer has a large fleet of systems making GreyNoise API requests, do you recommend a local cache to avoid redundant requests?

Andrew Morris:

So Brad's done some incredible work on our API and on our SDK, effectively, in Python. Our Python one is the most mature and it's the one that we really support internally. There are some other, I don't want to call them aftermarket, because that sounds kind of demeaning. But there are some community SDKs that other people have built that we just don't technically support. I know that someone has a Ruby one. I've seen Go ones before. But yeah, so the answer is, "You're probably going to want to cache it, because it's going to be a high volume of requests." Brad has already built this out in the Python SDK. So we have native caching in our Python SDK, and you don't even have to change anything. You do one lookup, and it's going to be however many hundred milliseconds, but then you do another lookup on that same IP address again, and it's going to be a fraction of a millisecond, because it's just retrieving it from memory. Great question.
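
For a fleet that is not using the Python SDK, the same idea can be sketched as a small in-memory cache wrapped around whatever lookup function is in use. This is not the SDK's implementation; `CachedLookup`, `fetch`, and the one-hour default expiry are hypothetical. An expiry is included because the RIOT list was described earlier as being updated every few hours, so cached answers should not live forever.

```python
import time
from typing import Callable, Dict, Tuple


class CachedLookup:
    """Wrap a per-IP lookup function with an in-memory cache that expires."""

    def __init__(
        self,
        fetch: Callable[[str], dict],
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch          # the real (slow) lookup, e.g. an API call
        self._ttl = ttl_seconds      # how long a cached answer stays valid
        self._clock = clock          # injectable so expiry can be tested
        self._store: Dict[str, Tuple[float, dict]] = {}

    def __call__(self, ip: str) -> dict:
        now = self._clock()
        cached = self._store.get(ip)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]         # served from memory, no request made
        result = self._fetch(ip)
        self._store[ip] = (now, result)
        return result


if __name__ == "__main__":
    def slow_lookup(ip: str) -> dict:
        print("fetching", ip)        # stands in for a network round trip
        return {"ip": ip, "riot": True}

    lookup = CachedLookup(slow_lookup)
    lookup("8.8.8.8")   # fetches
    lookup("8.8.8.8")   # served from the cache
```

A per-process cache like this only removes repeats within one system; a large fleet would more likely want a shared cache in front of all requesters, which is outside the scope of this sketch.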

Supriya Mazumdar:

We actually have a request for a live demo of the Analysis page. So Andrew, would you like to steer?

Andrew Morris:

Alright, let me make sure there's nothing weird going on on my screen right now.

Supriya Mazumdar:

For the audience, I think I just asked him his favorite question.

Andrew Morris:

Let's see. Alright, I need a file of some kind. I don't want to pick on anyone and do a threat feed. So you know what I think I'm going to do, I'm going to just do a netstat from my machine. And I'm going to throw this against RIOT. So let's see. Right now, why don't I just do a netstat from my computer and grab all of the IP addresses that my computer's talking to. So yeah, this should work. So I think this is a faster version. Yeah. And so then I can rack and stack these real quick. So this is just me running netstat on my laptop, and then sorting it, doing a little bit of bash kung fu. Alright, so these are all of the IP addresses that my laptop is talking to right now. I'm going to grab these, pipe it straight into my clipboard, and I'm going to take this over to the Analysis page. And I'm going to dump all of these in here, blow this up a little bit, and I'm going to hit the Analyze button. So we can see here, that was fast.
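
The exact commands used in the demo are not captured in the transcript. As a rough equivalent, the same "rack and stack" step can be sketched in Python: pull IPv4 addresses out of saved netstat output, drop duplicates, and drop non-routable addresses before pasting the rest into the Analysis page. The function name and the sample output are hypothetical.

```python
import ipaddress
import re
from typing import List

# Loose IPv4 matcher; candidates are validated with ipaddress afterwards.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def routable_ips(text: str) -> List[str]:
    """Return the unique, globally routable IPv4 addresses found in text."""
    found = set()
    for candidate in IPV4_PATTERN.findall(text):
        try:
            address = ipaddress.ip_address(candidate)
        except ValueError:
            continue  # e.g. 999.1.1.1 matches the pattern but is not an IP
        if address.is_global:  # skips private, loopback, and unspecified ranges
            found.add(address)
    return [str(address) for address in sorted(found)]


if __name__ == "__main__":
    sample = """
    tcp 0 0 192.168.1.20:52344 140.82.112.3:443 ESTABLISHED
    tcp 0 0 192.168.1.20:52345 8.8.8.8:53 ESTABLISHED
    tcp 0 0 127.0.0.1:631 0.0.0.0:* LISTEN
    """
    print("\n".join(routable_ips(sample)))
```

Filtering out non-routable addresses up front addresses the point made during the demo that many of the unidentified entries were probably non-routable IPs.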

OK, so if I go over here to the RIOT data, we can see the connections that my laptop is talking to right now. Look, there's a lot of non-routable IPs in here and stuff. So I wasn't really planning on doing this. But we've got up to 55 IP addresses that we found. Probably a lot of the ones that are unidentified, that we don't know anything about, are going to be non-routable. Maybe we eliminated those, I don't know. But then if I go over here, and I look at the RIOT data, we can see that 34% of them, or rather 30 of these IP addresses, are in RIOT right now. And we can see exactly how my laptop's talking to Amazon CloudFront, Apple, Cloudflare, Google, GitHub, etc. And so these are some of the ones that we're looking at right now. Do you guys have any log files or anything like that, any kind of files on the internet with IP addresses, where you'd be interested in knowing what they would show up as in the Analysis page? Let me know and I'll demo that as well.

I'm going to go over to AbuseIPDB. And I'm actually just going to grab the list of IP addresses that people are reporting. These aren't IP addresses that they're saying are bad per se, so I'm not picking on them. I'm a huge fan of AbuseIPDB. I'm going to just grab all of these, and I'm going to go back over here. I'm not sure that any of these are going to be in RIOT, to be completely honest. So I'll run this right now. And then we'll just see how it returns. It's possible that there's going to be some RIOT IPs in there. And, you know, it's possible that there aren't any. We got six IP addresses. And right here we've got a Cloudflare CDN IP, a Facebook IP, two Office 365 IPs, a Cloudflare CDN IP and a Google IP. So we found, you know, six IPs in this list of, you know, random unstructured data. These are just IPs that people had reported to AbuseIPDB, and we found that, you know, at least six of these are things that you more than likely aren't going to have to worry about. So thank you to whoever asked that question. You totally put me on the spot. But I refuse to totally bend to the will of the demo gods. So I'm glad that it worked out reasonably well.

Supriya Mazumdar:

I think that was great. We have somehow exhausted all of our questions, which is highly unusual for this bunch. So we can sit and hang out for the next 30 minutes and make this more informal, or take it offline. But I'll actually keep it open for another minute.

Andrew Morris:

I'm going to roll through, so you guys can ask any questions that you want about RIOT, or any questions that you want about GreyNoise, or anything broadly or abstractly. I’m looking at some of the comments that are coming in here. It looks like we've got a couple of people recommending Qualys, Rapid7, Nessus, Bitsight, RiskRecon. So we've got tags for most of those already, right now. So you can actually use GreyNoise to figure out, you know, the IP addresses that those services are delivering their service from. You can do that right now. Ryan, you've got your hand up? What's up?

Ryan (attendee):

My question is about RIOT. Would it be appropriate to have at least DigitalOcean and OVHcloud in this, you know, almost as level two or level three, not trusting? Because from my experience, their inbound is terrible. And that's already in the normal GreyNoise dataset. But then I'm also worried about outbound traffic to DigitalOcean and OVHcloud. Just want to get your thoughts.

Andrew Morris:

Yeah. So that's a great question. The issue with us putting cloud providers in RIOT, or at least our thinking on it, comes down to this: at the end of the day, what we want to provide is something that solves problems for our users. So I'll tell you the way that we think about it, and then you can tell me if we're in the right or in the wrong. If an IP address, or one of the members of a group of IP addresses, can be used to launch attacks against you pretty easily, e.g., there's arbitrary compute and a person can log into a box and do whatever they want from it at your network, we're probably not going to put it into RIOT, specifically because we don't want to tell you, "Oh, don't worry about this thing," when by and large it's actually pretty easy for someone to attack you from it. And I would say more than anything, if you block a DigitalOcean IP address, it doesn't necessarily cause problems in your network. Blocking one atomic IP address isn't necessarily something that you would get in trouble for. Whereas with RIOT we're really trying to prevent analysts from blocking anything that's going to disrupt the network. Nate, what do you want to add?

Nate T:

Yeah. So on one of those slides, you noted that we had trust level 1 and trust level 2 for Google; we broke it down with the cloud computing there as well. I think a lot of people have asked us, "Will you just label all my IPs for that?" I think that's something that we're interested in doing. We're just trying to figure out how to do it in a way where it doesn't get conflated with the traditional Rule It Out model, right? So for example, in your case, you're asking for DigitalOcean. There are legitimate businesses that run out of DigitalOcean, right? They might even be trust level 1, because they've been hosting on those nodes for a really long time. They're just very small businesses, and they host one or two IPs there. We'd have to have something like trust level 3. That becomes a little bit complicated, but it's definitely something we are thinking about: trying to figure out how to provide you, as a customer, the context for every single IP, so that you have a one-stop shop. I hope that answers your question and gets at what you're thinking about. That's something we want to do if it's useful to you or your analysts to just get that context. But Rule It Out, or RIOT, effectively has more trust and more business necessity in that particular dataset.

Ryan (attendee):

Right, and I was just thinking through that same context there, meaning that my users are going to these big brands, Google, Cloudflare, etc., and just like what was pointed out previously, bad actors can abuse those sites. So given the volume of a provider, which for OVHcloud and DigitalOcean is pretty big, I just thought it might be useful to say, “Hey, there are these other users that are going to these destinations.” And I know that the trust level is kind of tricky, but I just thought that might be worth at least labeling.

Andrew Morris:

Yeah, that makes sense. That's a great point. Thank you.

Supriya Mazumdar:

So we have a couple more questions. Jerry, I'm going to ask your question first, and then I'll go to this next one. So Jerry is asking: "I'm curious if people are using RIOT to truly rule it out, escalating stuff that is not on RIOT, versus actually using it to color things bad or block things, raise incidents, etc." In his case, RIOT means kill it with fire, it's not mission critical.

Andrew Morris:

So my ability to talk intelligently about this is somewhat limited because we've only had people using RIOT for a few months, in production at high volume. So Jerry, my experience right now is that no one is necessarily escalating connections out of the organization that are not in RIOT, because that would just be a massive volume of stuff. That would be, basically, all of the different things that people are browsing to that we can't prove as safe in an enumerative and programmatic way, with the same kind of method that we're using with RIOT. What we're finding is that a lot of people are using RIOT as sort of a filter once they think something is doing something bad, or if they're trying to execute a hunt on a bunch of telemetry, to whittle stuff down and prioritize, if you think about it more like a funnel. The two fundamental ways that I would think about it, and what I've seen in practice, are: A) I have an alert for something, and before I do anything with it, I'm going to enrich it against RIOT so that it gives me information about this thing before I make a decision or raise it to an analyst; and B) as a threat hunter, I have all of these different places where bad might be happening, and I want to whittle it down to a smaller list of places where that is more likely to be happening.
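The two usage patterns described here, enriching a single alert and funneling hunt telemetry, might look something like the sketch below. The `Alert` type, the `enrich` and `funnel` helpers, and the `riot` dictionary are all hypothetical names for illustration; they are not part of any GreyNoise product.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    """Hypothetical alert record; field names are illustrative only."""
    alert_id: str
    remote_ip: str
    context: dict = field(default_factory=dict)


def enrich(alert, riot):
    """Pattern A: attach RIOT context to an alert before anyone acts on it."""
    name = riot.get(alert.remote_ip)
    alert.context["riot"] = name is not None
    if name is not None:
        alert.context["riot_name"] = name
    return alert


def funnel(ips, riot):
    """Pattern B: drop RIOT-listed IPs so a hunt starts from a shorter list."""
    return [ip for ip in ips if ip not in riot]


# Made-up documentation-range data standing in for a RIOT lookup.
riot = {"192.0.2.10": "Example CDN"}

enriched = enrich(Alert("a-1", "192.0.2.10"), riot)
remaining = funnel(["192.0.2.10", "203.0.113.5", "198.51.100.9"], riot)
```

Note that the funnel only reorders priorities: anything it drops is still available if the hunt later needs it.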

Supriya Mazumdar:

Awesome, it looks like we have a couple more questions. But I did want to give Brad an opportunity to answer that question. There's also one asking about the percentage of traffic customers or prospects are seeing as RIOT, even just anecdotally.

Andrew Morris:

Oh, like a massive amount, right. Brad, you go ahead.

Brad Chiappetta:

Yeah, I mean, it depends specifically on the use case and which data they're enriching. But we've seen as high as 85% of the IP addresses being looked up having hits in RIOT. You know, of course, that doesn't apply to everybody. And if you're looking at inbound traffic versus outbound traffic, that can make a huge difference. So I would say a more common average is probably in the 30 to 40% range. And that goes up and down as you tie in our noise dataset as well. But yeah, we're definitely seeing a pretty high hit rate pretty consistently across the board. And a lot of people are able to do a lot of different things with that data and filter that noise out.
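The hit rate Brad mentions is simply the share of looked-up IPs that come back as RIOT. A small sketch of that arithmetic, where `riot_ips` is a hypothetical set standing in for RIOT membership:

```python
def riot_hit_rate(looked_up_ips, riot_ips):
    """Percentage of looked-up IPs that are present in the RIOT stand-in."""
    if not looked_up_ips:
        return 0.0  # avoid dividing by zero when nothing was looked up
    hits = sum(1 for ip in looked_up_ips if ip in riot_ips)
    return 100.0 * hits / len(looked_up_ips)


# Made-up documentation-range data: 2 of 4 lookups are in the stand-in set.
rate = riot_hit_rate(
    ["192.0.2.10", "198.51.100.7", "203.0.113.5", "203.0.113.6"],
    {"192.0.2.10", "198.51.100.7"},
)
```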

Supriya Mazumdar:

Great, we've had quite a few questions. So I'm going to take the opportunity to step in, because we've had a few questions about our pricing. Open Forum typically is strictly community, right? I'll be pretty strict about that. That being said, if you have asked for it, I can send you the link to our pricing page. And I strongly encourage you to talk to one of our sales representatives, some of whom are on the call today. Or you can reach out to me if you're much earlier in the process. But Andrew, you look like you're going to say something.

Andrew Morris:

I was just going to say, we're desperately trying to not shill our products on the Open Forum; we're really trying to stick to stuff that you can do something with for free. We want to keep this as a sacred place where we don't push products in your face. I can talk about pricing at a very high level right now in a way that's relevant without necessarily shilling any of our products. So I'll just say it, it's going to take me two seconds. We've changed our pricing model recently. Basically we used to think about everything from a volume perspective, like volume of queries. You do a few queries, we're going to charge you a small amount of money; you want to do a ton of queries, we're going to charge you a ton of money, right? And what we basically found is that the amount of value that a customer is going to get from GreyNoise is not necessarily married to the volume of queries that they're using.

We talked to enough smaller organizations that were just more technical and had more advanced stacks that were like, "Look, this is valuable to us, but we're going to have to do 100,000 queries to do this." We wanted to have a pricing model that was actually right-sized, so that we were charging people correctly, and so that we weren't nickel-and-diming people and making them ration their queries to get what they wanted to get, right? We had a lot of conversations with our customers where they would say, "Hey, I would use GreyNoise in these other places, but I don't want to go over my limits." That's not what we want. We want you to get as much value from GreyNoise as humanly possible all the time.

So basically, we've shifted to a model where we have a standard package, and this is public on our website. Our Investigate package is a flat rate, $25,000/year. And then if you want to actually automate basically everything against GreyNoise, and you want your whole organization and all of the different security products that you have to be hooked up into GreyNoise, it's going to be a rate based on the size of your organization. And you can run as many queries as you want. So that's the new way that we're doing pricing.

If you want to hear more about any of that stuff, reach out to any of our salespeople, and we'll get you hooked up with all the info that you want on that. Yeah. That was the end of my shilling, I promise.

Supriya Mazumdar:

I think it's fair, especially considering it was prompted, not unprompted.

Andrew Morris:

Either that or it was Brandon and he planted the question. I thought there was something going on with this person who has a picture of Brandon, but with a fake mustache.

Supriya Mazumdar:

I think we have a question that's going back a little bit to the RIOT data. So how about open DNS recursive resolvers?

Andrew Morris:

I mean, so I think we actually do already have OpenDNS’s primary DNS servers, but I don't know that we have any of the others outside of the primary DNS servers in there. Because I know for OpenDNS, I think we only have two or three IPs for that, and it's just the ones that everybody knows. If there are more servers outside of that, we can look into that.

So one of the things is that the change to RIOT trust levels came somewhere in the middle of the beta, where we were really figuring out what is trustworthy and what is not. And that's why we introduced trust levels, for example, the high level. You know, OpenDNS servers are very trustworthy, right? We know who's attributable for them. However, this whole list of OpenDNS servers, it's kind of a little bit less so, right? So with the advent of trust level 2, and you know, the full release of RIOT, that's something that we can definitely look to add more. And that's similar to the Ubuntu update servers as well, right? There's a couple that are very, very well known, very well used. And then anyone can kind of host and mirror. Although there are things in place to make sure that those aren't bad. We don't want those to be kind of erroneously in the list if they're used out of band for something else as well. And they're using that to build up credibility in a way to hurt your organization through an open-source kind of way. That's really why trust level 2 is there. And we're looking to expand out exactly into that domain. Remy, you have something to add?
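One way to picture how trust levels could feed a triage decision is sketched below. Only the existence of trust level 1 (well-known, clearly attributable) and trust level 2 (somewhat less so) comes from the discussion; the `triage` function, its return labels, and the policy it encodes are assumptions made purely for illustration.

```python
from enum import IntEnum
from typing import Optional


class TrustLevel(IntEnum):
    ONE = 1  # well-known, clearly attributable services
    TWO = 2  # legitimate, but somewhat less certain


def triage(trust_level: Optional[TrustLevel]) -> str:
    """Hypothetical policy: map a RIOT trust level (or None) to an action."""
    if trust_level is None:
        return "investigate"  # not in RIOT: no reason to rule it out
    if trust_level == TrustLevel.ONE:
        return "deprioritize"
    return "review-with-context"  # level 2: keep the label, look a bit closer


decisions = [triage(TrustLevel.ONE), triage(TrustLevel.TWO), triage(None)]
```

A real policy would depend on the organization; the point is only that the two levels can drive different handling.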

Matthew Remacle:

So, open resolvers on the internet. A lot of that depends on the context in which that traffic is seen. We can enrich data for an IP, but it really depends on how that NetFlow data is represented within your organization, outbound or inbound, because open DNS resolvers can be abused for DDoS attacks. And we want to avoid getting into attribution of what it is. But that's certainly something that we can enrich, in terms of providing that information in context so that we can aid in investigations.

Andrew Morris:

What else do we have, Supriya?

Supriya Mazumdar:

Going back to Greg. Your question, did we answer it correctly? I know that there was some confusion between OpenDNS and open resolvers. OK, great.

There's been a few questions about sharing this. Yes, we will be posting this. It'll be part of the community resources page hosted on our documentation. And I'll also be sharing the slides as well so that you can share them with your respective teams.

Nate T:

Can we talk a little bit about Paul's question about DNS? So Paul's question was, “Have you thought about doing anything with DNS? The shared hosting providers sometimes have thousands of domains sharing an IP; it'd be great if you could do something more atomic.” And then Jerry followed up with, “I think Paul's question might translate into saying 200+ DNS entries pointing to this IP, so let's call it a shared hosting IP or some tag like that. That's the logic we see by our investigation of IPs: seeing 200+ in PDNS, it immediately leads to some conclusions.”

So this is actually a really big issue. And this is part of what RIOT seeks to do with CDNs, right? Largely, if you are seeing a vast number of DNS entries associated with an IP, it's probably a CDN. And it's one of those things where there's not much you can do about that at that time. Right. It really works in investigation. And for a junior analyst, it's actually something where they're like, "Oh wow, I found all these malicious domains hooked to one IP," and it's really a shared hosting provider.

So from that perspective, we aim to resolve that kind of problem. And for this problem, I'll give a more concrete example. It's really obvious when you put it into Maltego and you do a DNS enrichment on an IP, and you get enough nodes to crash your desktop. And that's just really unfortunate and feels really, really bad. So the nice thing would be if it checked RIOT first, and was like, "Actually, this is just a CDN." Now, that's certainly possible for something like Cloudflare, or other major CDNs that we have tabs on. But it's actually a very difficult problem with these more low-key shared hosting providers. But we're looking to add those. And if you have particular ones in mind, we're happy to look into them and try to add those. I hope that addresses that a little more thoroughly. Yes, Maltego explodes. This is a problem with any graph visualization tool that uses DNS enrichment on an IP, and also anything that does any kind of graph enrichment. It's a very unfortunate problem caused by CDNs.
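The "check RIOT first" idea, combined with Jerry's 200+ passive DNS heuristic, could be sketched like this. Everything here is hypothetical: the tag names, the helper functions, the reading of "200+" as 200 or more, and the `riot` dictionary are assumptions for illustration rather than an existing GreyNoise or Maltego feature.

```python
# "200+" from the question, read here as 200 or more (an assumption).
SHARED_HOSTING_THRESHOLD = 200


def tag_ip(ip, pdns_domains, riot):
    """Tag an IP before graph enrichment, checking the RIOT stand-in first."""
    if ip in riot:
        return "riot:" + riot[ip]
    if len(pdns_domains) >= SHARED_HOSTING_THRESHOLD:
        return "shared-hosting"
    return "unclassified"


def domains_to_expand(ip, pdns_domains, riot):
    """Return domains worth adding as graph nodes; none for tagged IPs."""
    if tag_ip(ip, pdns_domains, riot) == "unclassified":
        return list(pdns_domains)
    return []  # skip the expansion that would flood the graph


# Made-up documentation-range data standing in for RIOT and passive DNS.
riot = {"192.0.2.10": "Example CDN"}
many = ["site%d.example" % i for i in range(200)]
few = ["a.example", "b.example", "c.example"]

tags = [
    tag_ip("192.0.2.10", many, riot),
    tag_ip("203.0.113.5", many, riot),
    tag_ip("203.0.113.6", few, riot),
]
```

The guard trades completeness for usability: a tagged IP still keeps its label, so an analyst can choose to expand it deliberately.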

Supriya Mazumdar:

Awesome. Thank you, Nate. And I see no more questions in Q&A. Again, like I said, we're more than happy to stick around. We've got another 10 minutes earmarked, I'm happy to end the recording as-is and we can stay on.

Andrew Morris:

I'm happy as a clam if we’ve had a chance to spread the good word about RIOT, answer a couple questions, then I'm more than happy to break it out and let everybody go. I just really appreciate everyone taking the time to come here and meet with us. We'll be doing this again next quarter. So in Q1 we’ll be doing it again and talking about some more stuff that we built and we're working on. But look, from my perspective, the way that I think about our users and our community is remarkably similar to the way an artist thinks about their audience or the way that someone thinks about their patrons of what they do. Without people to listen to your music, then nobody's going to be a great, fantastic artist. And without our users, we're not going to have a product that's fantastic. And so we need you guys, is what I'm getting at. So I really appreciate you taking the time to come here and meet with us, and Supriya thank you for getting all this stuff together and for getting everyone here. I really appreciate it a lot. Thanks everyone so much.

Supriya Mazumdar:

We'll be continuing the chat party on the community Slack, so we'll see you there.

Andrew Morris:

GreyNoise everything! Alright, see you guys.

Greg Wells:

Thanks, everybody.

GreyNoise Open Forum - Nov 2021