#352: Advanced Encryption Strategies for Beginners
AWS Podcast

Full episode transcript -

0:00

This is Episode 352 of the AWS Podcast, released on January 20th, 2020. Welcome to the official AWS Podcast. Welcome back to the AWS Podcast. Simon Elisha here with you. Great to have you back, and I'm joined by a special guest. I'm joined by Ron Reagan, who's a software development engineer for our crypto tools team. But, Ron, I'll let you introduce yourself because you do lots of cool stuff here at Amazon. So welcome to the podcast.

0:31

Thank you very much. So, I'm a software development manager on the AWS Encryption Tools team. I lead our open source core libraries group, so we make products like the AWS Encryption SDK, and our general goal is to make tools that are easy to use and hard to misuse, so that you can encrypt data and not screw it up too much.

0:56

Yeah, kind of important. So I'm going to call that out for our regular listeners: this particular podcast is aimed fairly and squarely at those who are beginners to encryption, security, et cetera. So if you're a super-duper encryption expert, you're not going to find this episode particularly interesting. However, if you are someone who is going, "I need to be secure and encrypted, but I don't know where to start," this is the episode for you. So, Ron, maybe let's start from the customer perspective, sitting there going, "Someone told me I need to encrypt to protect my data. How? What? Why? Where do I start?"

1:32

Yeah, that's a super sensible place to start. And I think that any engineer who handles customer data finds themselves in this situation at some point. And really early on, you need to try to figure out what even is encryption. What does it mean to actually encrypt something? So we can kind of dig into what encryption means, some of the valuable properties that it gets you, and how to keep those true. So if I jump right into it, the first one that you think of is confidentiality. You want to keep your data confidential so that if, let's imagine, someone walks up to the hard drive that your data is sitting on, all they see is a bunch of scrambled bits, and they can't just read it off. So that's kind of where we start: I just want to keep a secret. The next thing that we look at is integrity. This is another property that's very important. We want to make sure that the data hasn't changed in any way or been tampered with.

And there's actually a long list of cryptographic properties that different algorithms achieve, and they all tend to be a little different, which is really intimidating when you're first looking at it, because there's also authenticity. This is very important: making sure that the data is authentic, by which we mean that the person who said they created this data actually did, so that you're talking to the person that you think you're talking to. And as you start investigating this (we're talking mostly about symmetric encryption right now, saving our data at rest), you'll notice that some algorithms give you maybe one of those. Maybe they just give you confidentiality. Maybe they give you confidentiality and integrity, but not authenticity. And so it's tricky to figure out how to pick a good algorithm to actually give you all of those properties.
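To make the integrity and authenticity properties concrete, here is a minimal Python sketch using only the standard library. It does not encrypt anything, so it provides no confidentiality; it only shows how a keyed tag (HMAC-SHA256, a technique chosen here for illustration and not named in the episode) lets a receiver detect tampering or a wrong sender. The key, the record, and the function names are all hypothetical.

```python
import hashlib
import hmac
import secrets


def make_tag(key: bytes, message: bytes) -> bytes:
    """Return an HMAC-SHA256 tag that binds the message to the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


# Hypothetical shared secret and record, for illustration only.
shared_key = secrets.token_bytes(32)
record = b"balance=100"
tag = make_tag(shared_key, record)

untouched_ok = verify_tag(shared_key, record, tag)                # same data, same key
tampered_ok = verify_tag(shared_key, b"balance=900", tag)         # data was changed
wrong_key_ok = verify_tag(secrets.token_bytes(32), record, tag)   # different key holder
```

Only the first check succeeds: a changed record or a different key produces a different tag. Authenticated encryption modes combine this kind of check with confidentiality in a single operation.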

3:32

And this is important because, as Werner Vogels likes to say, you want to dance like no one's watching and encrypt like everyone is.

3:39

Exactly. I have that poster in my

3:42

And I guess one of the things is we've kind of always known about encryption as a concept. But in the past, the overhead of applying it, both from a performance perspective and a management perspective, has been very high. But that's not the case anymore, is it?

3:56

It's not. AWS has done a really good job making it so that if you want to encrypt your data at rest in a large number of our services, you can use something called server-side encryption, which in many cases is actually as easy as checking a box within the AWS console. So that's the default that we recommend starting with: just use server-side encryption, and we'll make sure that the data is encrypted at rest.

4:25

So let's pick away at that a little more. Because if we think about the fact that, you know, I find out I have to encrypt my data, either because someone's told me to or I recognize it's the right thing to do. You know, I don't want anyone to see my data under any circumstances unless they're supposed to. How do I get that encryption? I might know a few older types of encryption, newer types of encryption. But how do I actually start?

4:51

It's a really good question. So let me kind of start... I'm going to jump to the end. There's a bunch of things you can do, but my recommendation is that for most use cases, you should let AWS encrypt your data using a KMS (Key Management Service) master key. This gives you control over the key that we use to encrypt your data, and our servers will make sure that the data is encrypted at rest, which protects you from certain kinds of issues.
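As a rough illustration of that recommendation, the sketch below builds the arguments for an Amazon S3 `PutObject` request that asks for server-side encryption with a KMS key. `ServerSideEncryption` and `SSEKMSKeyId` are real `put_object` parameters in boto3; the bucket name, object key, key ID, and both helper functions are hypothetical. The upload function is defined but not called, since it needs boto3 and AWS credentials.

```python
from typing import Any, Dict


def build_sse_kms_put_args(bucket: str, key: str, body: bytes, kms_key_id: str) -> Dict[str, Any]:
    """Build keyword arguments for an S3 PutObject call that requests SSE-KMS."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ServerSideEncryption": "aws:kms",  # ask S3 to encrypt at rest with a KMS key
        "SSEKMSKeyId": kms_key_id,          # the customer-controlled key to use
    }


def upload_encrypted(args: Dict[str, Any]) -> None:
    """Send the request. Needs boto3 and AWS credentials, so it is not called here."""
    import boto3  # imported lazily so the sketch runs without boto3 installed

    boto3.client("s3").put_object(**args)


# Hypothetical bucket, object key, and KMS key ID.
put_args = build_sse_kms_put_args(
    "example-bucket",
    "reports/2020-01.csv",
    b"id,total\n1,42\n",
    "1234abcd-12ab-34cd-56ef-1234567890ab",
)
```

The application never handles key material in this flow: it only names the key, and the service performs the encryption.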

5:22

So that encryption process, that's really a case where the customer owns the key. So you've got, like, the keys to the door. And it's using the AWS KMS technology and well-established encryption protocols, which we can talk about shortly, to do the encryption. So it's handling that heavy lifting of the encryption. But as a customer, you've got the key.

5:43

Yeah, the way that works is interesting. So the way that it works with KMS in particular is that we run a big fleet of something called hardware security modules, and that sounds fancy, but it's really just a big phrase for a server that encrypts things. So hardware security modules, hereafter called HSMs, are hardened in some way so that they are protected from various types of attacks, and they support something called non-exportable keys. So when a customer creates a customer master key, it gives them the ability to control access to and audit that particular key. And then they can grant access for AWS to use it. And then, of course, if they ever need to revoke that access, it's within the customer's control, so they can revoke all access to a key, which makes the data that is encrypted with that key completely inaccessible.

6:48

And when we talk about encryption, of course, there are different methods of encryption. What are some of the phrases or technologies that people should at least be aware of as to what kind of encryption we use at different points?

7:02

Yeah, so let's start with something you'll hear a lot: symmetric versus asymmetric encryption. This is a common point of confusion for people who are getting into the area. And of course, there's a really complicated, mathematical way to think about this. But I'm just going to say that symmetric encryption uses the same key to encrypt and to decrypt, whereas with asymmetric you get something called a key pair, which has a public and a private key: one is used for encryption and one is used for decryption. So starting off, asymmetric versus symmetric is important to understand. They have some other properties. For example, asymmetric encryption tends to be quite a bit slower in terms of your processor, whereas symmetric encryption is quite a bit faster.

But symmetric encryption suffers from having to share that key. And so what you end up seeing in practice is that usually you'll use an asymmetric algorithm to generate a session key. This is how something like TLS does it when you're calling an HTTPS website. So you use that asymmetric key, which is a little slower, but it allows you to establish a symmetric key over basically an insecure channel. So that's one of the really cool things about all of the crazy math behind asymmetric encryption: you can use it to establish symmetric session keys that you use going forward. So that's one early concept. Another one that I bring up is within symmetric encryption. There's one very, very common block cipher that's called AES, and that's the Advanced Encryption Standard. So as you start digging into encryption, you'll start seeing AES a lot, because it's the most used symmetric encryption algorithm.
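The idea of establishing a symmetric session key over an insecure channel can be sketched with a toy Diffie-Hellman exchange. This is one classic asymmetric technique, picked here for illustration; the episode does not say which algorithm is meant. The prime, the base, and every name below are assumptions chosen for readability, and the group is far too small for real use.

```python
import hashlib
import secrets

# Toy public parameters: a Mersenne prime and a small base.
# Real protocols such as TLS use standardized, much larger groups or elliptic curves.
P = 2**127 - 1
G = 3


def make_keypair():
    """Return (private, public) where public = G^private mod P."""
    private = secrets.randbelow(P - 3) + 2  # a value in [2, P - 2]
    return private, pow(G, private, P)


def derive_session_key(my_private: int, their_public: int) -> bytes:
    """Combine my private value with the peer's public value, then hash to 32 bytes."""
    shared = pow(their_public, my_private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


client_private, client_public = make_keypair()
server_private, server_public = make_keypair()

# Only the two public values ever cross the insecure channel.
client_key = derive_session_key(client_private, server_public)
server_key = derive_session_key(server_private, client_public)
```

Both sides end up with the same 32-byte value without ever sending it, and that value can then serve as the key for a fast symmetric cipher such as AES.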

9:02

And it's interesting how the industry moves from time to time around these algorithms. I mean, in the old days, DES and Triple DES used to be the best thing. But I guess as compute evolves, the encryption algorithms have to get more and more resistant to compute power.

9:18

Exactly. Yeah, DES and Triple DES, like you said, were the standard for a long time. And Triple DES is a bit of a funny story, because what you see with Triple DES is it's just doing DES three times, because we realized that DES was not secure: processing power had advanced to the point that the block and key sizes in DES weren't enough. And so we were able to squeeze a little bit more out of that as a cryptographic community by just doing it a few times, which, as you can imagine, improves security, but it's not exactly super fast. And so AES is now the blessed algorithm that you'll mostly see. In fact, in certain regulatory environments, you have to use AES, so it's very, very common.
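A quick back-of-the-envelope calculation shows why key size matters so much here. The bit strengths are the commonly cited ones (a 56-bit DES key, roughly 112 bits of effective strength for Triple DES, and 128 or 256 bits for AES); the attacker speed is a made-up figure used only to make the comparison concrete.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

# Effective key strengths in bits. Triple DES has a 168-bit key, but
# meet-in-the-middle attacks reduce its effective strength to about 112 bits.
EFFECTIVE_BITS = {"DES": 56, "3DES": 112, "AES-128": 128, "AES-256": 256}

# Hypothetical attacker speed, chosen only for illustration.
GUESSES_PER_SECOND = 10**12


def years_to_search(bits: int, rate: int = GUESSES_PER_SECOND) -> float:
    """Years needed to try every key of the given size at the given rate."""
    return (2**bits) / rate / SECONDS_PER_YEAR


for name, bits in EFFECTIVE_BITS.items():
    print(f"{name:8s} {bits:3d} bits  ~{years_to_search(bits):.3g} years to exhaust")
```

Each extra bit doubles the search space, so at this assumed rate a 56-bit keyspace is exhausted in well under a year, while the larger ones are far out of reach.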

10:14

And so if I'm looking to protect my data, can I just sort of throw it all in a database, make sure that it supports some form of encryption, and call it good?

10:24

You totally can, and you will start hitting issues. And so one of the first issues you'll hit is: okay, great, maybe you generated a key, you encrypted a bunch of data, and you're feeling pretty good because you have a bunch of ciphertext that you can't access without that key. But you're immediately going to be stuck holding this key, and it's kind of hard to put down, is the problem, because whatever server is supposed to access that data is going to need the key. But if you just put the key on the server, then anyone who gets access to the server gets access to the key, and now they have access to the data anyway. And so key management becomes a really tricky problem almost immediately when you start handling sensitive data.

11:11

So what you're telling me is it's more complicated than it might look, which is what it is.

11:16

Yeah, unfortunately it is very complicated.

11:18

And are there certain categories of data that I shouldn't encrypt? You know, there's probably a feeling of, "I should just encrypt everything I store, just encrypt, encrypt, encrypt." What are some of the things I might not choose to encrypt?

11:31

It's a good question. I'd say that you don't need to encrypt anything that you're comfortable being public. If you have your website, for example, maybe you're serving that out of a static S3 bucket. You don't necessarily need to encrypt your public website, because it's right there and anyone can read it anyway. But in general, the question I tell people to ask themselves is: if this data was sitting on a Pastebin somewhere, would that be okay with you? And if the answer is yes, then maybe you can skip encryption.

12:02

So let's then maybe look at one of those operational things that seems to pop up in the news from time to time, which is data in logs. And obviously there's data in logs you want to see, and there's data that's sensitive and not sensitive. What's the best practice? Do you just encrypt the whole log? Do you encrypt the PII in the log? Do you not log that data at all? What's the best practice at the moment?

12:22

That gets really tricky. And I think that if you're doing a lot of logging of data, which is very common, especially in very large systems where perhaps one of the servers closer to the edge of your network is logging perhaps every request, it becomes super important, like you said, that there's a way to not log certain fields. But even configuring it such that you don't log those fields is often not enough, because it means that the data is still on that server somewhere. It also means that you're just one bad config push away from accidentally starting to log that data. And we've seen that at some pretty large scales, like you said, in the news. And so in that case, if there's data that absolutely should not be logged, it's much better to encrypt it.

13:16

And you talked to me about symmetric and asymmetric encryption. We then sort of tangentially touched on server-side encryption. But if there's server-side encryption, that means there's client-side encryption. Tell us how the two kind of complement each other.

13:29

Yeah, exactly, and I'll refer back to your log question. If you are submitting data to a series of servers and you don't have control over all of them, then it can make a lot of sense to use client-side encryption. And so what client-side encryption is, it's exactly what it sounds like. You have some client somewhere. Maybe that client is a server. Maybe it's a user's computer or phone. And it's the act of encrypting the data on that client before you submit it to whatever server you're submitting it to. Now, this gives you end-to-end control over the data, so that you can encrypt it before you make whatever API call you're making, and you can then send it all the way through the system to whatever you're hoping will decrypt it or store it. And you're guaranteed that the middleware, the systems between those two points, can't intercept or tamper with your data. So for the most sensitive data, client-side encryption is often a very good choice, especially if you're using a lot of, perhaps, third-party systems. If you have a very complicated environment where it's difficult to actually control what's being logged where, or you don't trust all of the nodes in your service, then it makes a lot of sense to do client-side encryption.

14:55

Sure. Now, we're going to talk about some of the tools that are available to AWS customers to make it easier to encrypt. But I guess to contextualize what easier looks like, we probably need to talk about what's hard. So what are some of the issues and costs and complexities that traditionally, if you want to roll out encryption across your environment, you'd have to face?

15:13

That's a good question. So let's imagine that you wanted to just do client-side encryption. You're sold. You think your data is very important, and for this particular use case, you think you need to do client-side encryption. There's a whole lot of software you need to write, and it's, unfortunately, software that's very difficult to get correct. It's tricky stuff. So, like we talked about, you need to pick an appropriate block cipher and pick an appropriate mode. You have to generate the right number of keys per piece of data.

We haven't talked about something called data keys yet, but it's a very common pattern where, as opposed to just having one key that encrypts absolutely all of your data, which makes that one key incredibly valuable, you end up using individual data keys for individual pieces of data: objects or messages or whatever makes sense within your application. So there are a lot of details to get right. And that's, of course, very costly, because many teams don't have experts who can kind of hit the ground running with that. So there's a lot of research required, and actually building out those tools is quite expensive.
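The data key pattern can be sketched as follows: each message gets a fresh data key, and only that small key is encrypted ("wrapped") under the master key. To stay within the Python standard library, the sketch uses a toy keystream cipher built from HMAC-SHA256 as a stand-in for a real authenticated cipher such as AES-GCM. It is for illustration only and must not be used to protect real data; all names are hypothetical.

```python
import hashlib
import hmac
import secrets


def toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with an HMAC-SHA256 keystream. Stand-in for a real cipher such as
    AES-GCM; illustration only. The same call both encrypts and decrypts."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt_message(master_key: bytes, plaintext: bytes) -> dict:
    """Encrypt one message under a fresh data key, then wrap that key with the master key."""
    data_key = secrets.token_bytes(32)  # unique per message
    msg_nonce, key_nonce = secrets.token_bytes(16), secrets.token_bytes(16)
    return {
        "ciphertext": toy_cipher(data_key, msg_nonce, plaintext),
        "msg_nonce": msg_nonce,
        "wrapped_key": toy_cipher(master_key, key_nonce, data_key),
        "key_nonce": key_nonce,
    }


def decrypt_message(master_key: bytes, envelope: dict) -> bytes:
    """Unwrap the data key with the master key, then decrypt the message."""
    data_key = toy_cipher(master_key, envelope["key_nonce"], envelope["wrapped_key"])
    return toy_cipher(data_key, envelope["msg_nonce"], envelope["ciphertext"])


# Hypothetical master key held locally; with a managed service the master key
# would stay inside the service and only the wrap/unwrap calls would be remote.
master_key = secrets.token_bytes(32)
first = encrypt_message(master_key, b"order 1001: two widgets")
second = encrypt_message(master_key, b"order 1001: two widgets")
recovered = decrypt_message(master_key, first)
```

Because every message has its own data key, the same plaintext encrypts differently each time, and the master key only ever touches 32-byte data keys rather than the bulk data.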

16:23

And then there's also, I guess, the administrative side of those keys, of doing best practice like rotating keys, so you should change them on a frequent basis. And if you suspect a key is compromised, you definitely need to rotate it. Again, it sounds easy for me to say "rotate the key," but as you add up all the systems using the keys, it just gets more complex very quickly. So what are some of the things that AWS has built for our customers to make it easy to encrypt and to have a life cycle of encryption?

16:54

So within the AWS Cryptography org, we've got a number of web services that you've probably heard of. So these are things like KMS, the Key Management Service. There's AWS Certificate Manager, or ACM. There's CloudHSM, which is our hosted HSM offering. And then we also have a number of client-side tools and libraries, which is what my team specifically works on. So the AWS Encryption SDK is a piece of software that you can use to do client-side encryption, and all of these work with the various AWS services. So early on in the call, you asked me about kind of a high-level recommendation, and using KMS with the other AWS services like S3 and DynamoDB is a really good way to get encryption working very quickly and have server-side encryption that still works well with all of your existing services.

18:01

And that's very good, because it's an easy, check-box type of experience, which is what we want: we want security to be easy and to be done by default. And you did touch on the client side, which I want to dive into a little bit with you, because obviously your team spends a lot of time thinking about it and building for customers around that SDK. And there's always the classic saying in encryption circles of "don't write your own crypto," because it's really hard. And I guess your team is fundamentally involved in writing crypto, or at least leveraging crypto. So tell us a bit about the domain and some of the work that goes into that.

18:37

You know, I'll share that I've worked in a lot of different types of teams, and when building product teams and prototypes, it's really important that you move quickly and get to market quickly. And we work sort of the opposite of that. We work slowly. We work very carefully. We do a lot of security reviews. Our libraries use some advanced security testing techniques like fuzzing and formal verification. So we try really hard to go above and beyond to make sure that the libraries we launch are safe and well vetted. We actually have a team here within AWS Cryptography called the Algorithms Group. The Algorithms Group is run by someone named Matt Campagna. And, as I like to say, this is where we keep all of our fancy PhDs. So we have a whole bunch of mathematicians who studied cryptography very,

very, very deeply, and they know a lot more about it than I do. And so these are the people that we really keep around as the experts. And so when we're developing a new library, including the AWS Encryption SDK, we spend a lot of time sitting down with our team of PhD cryptographers and making sure that the protocols are sane and valid and are going to do what we want. And so I like to think that we really go above and beyond in making sure that the quality bar is very high. It's the kind of thing that would be very expensive for every single software team to have to do on their own. So I'm very happy that we have invested in this area.

20:20

And that SDK, where does it fit? What sort of languages does it support, and when should customers be taking advantage of it?

20:27

So we have that on GitHub. This is an open source library, so anyone can read the code or contribute to it. Right now we have the SDK in JavaScript, Java, C, and Python. And we also have a command line interface that allows you to do debugging very easily, just using a standard Bash set of tooling. So we have those sort of five environments supported right now. And one of the things my team is doing is working on adding as many more languages to be natively supported as we can. But if you use one of those four languages, you can use it today. And if not, you should expect a language to be coming, hopefully soon. And if you don't see the language that you want, please let us know on GitHub so that we can hear about it.

21:17

Exactly. And that SDK is provided free of charge, and it's released under the Apache license. So it makes it easy for people to use security. And I'm sure, Ron, you and the team want to get as much involvement and feedback through the GitHub issues as well.

21:32

Absolutely. We're always happy to hear from people, and our GitHub issues are the right way to do it. We have someone on call who makes sure that they're looking at these things. And we're always happy to hear from customers.

21:44

Fantastic. Ron, it sounds like if you want to encrypt, it's pretty easy to get started, and knowing the fundamentals means your journey can be a little bit smoother. Excellent. Thanks, everyone, for listening. We do love to get your feedback; awspodcast@amazon.com is the place to do that. And until next time,
