ACM.57 How to prevent all manner of injection attacks in a Lambda function and other types of system components
This is a continuation of my series of posts on Automating Cybersecurity Metrics.
In my last post I showed you how an unvalidated Lambda parameter could lead to a cross-site scripting flaw depending on how it is used. You probably want me to tell you now how to fix it, right?
Well, nothing in security is “simple.” There is no single fix. The fix for this problem will depend on many factors that have to do with what languages you are using, where the values entered by your user may end up and what programs are used to view those inputs.
When I started thinking about how to answer the question of how to validate inputs, it was a bit overwhelming because of all the ways I know to abuse inputs passed to programming languages. But I can basically sum it up like this:
Validate everything and only allow exactly what you expect.
I wrote about that in my secure coding series in more detail or I will if I haven’t already. I’m working on another book…
So what do I mean by validate everything? I mean if you’re supposed to get a number, disallow any text. If you’re supposed to get an email as input, only allow a properly formatted email address. If someone is submitting a file, check the file header (the magic bytes) and make sure you’re getting the type of file you expect. For system paths, make sure you’re getting a valid path and only a path that the user is supposed to be accessing. The same goes for domain names, because if you let me pass any domain name into a redirect I can possibly inflict one of the most dangerous attacks available on your systems in AWS, depending on how you have your system configured.
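A minimal sketch of a few of these checks, assuming Python 3 and only the standard library. The helper names (`parse_positive_int`, `is_png`, `resolve_inside`), the range limit, and the choice of PNG as the expected file type are all hypothetical and only there for illustration.

```python
import os

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"  # the first 8 bytes of a PNG file


def parse_positive_int(raw: str, maximum: int = 1000) -> int:
    """Accept only a short run of ASCII digits within the expected range."""
    if not (raw.isascii() and raw.isdigit() and len(raw) <= 10):
        raise ValueError("expected a whole number")
    value = int(raw)
    if not 1 <= value <= maximum:
        raise ValueError("number out of range")
    return value


def is_png(data: bytes) -> bool:
    """Check the file header instead of trusting the file name."""
    return data.startswith(PNG_MAGIC)


def resolve_inside(base_dir: str, user_path: str) -> str:
    """Resolve user_path and refuse anything outside base_dir."""
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, user_path))
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes the allowed directory")
    return candidate
```

Each function rejects by default and accepts only the one shape it expects, which is the idea behind "validate everything." A header check like `is_png` only confirms the first bytes, so it is a starting point rather than proof that a file is safe.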
What does Python have to say about validating inputs and types?
I’ve seen people promoting the EAFP (Easier to Ask for Forgiveness than Permission) method for validating inputs in Python. From the docs…
Surprisingly, I’ve seen some posts on technical forums suggesting that this is Python’s preferred approach. Not if you want a secure application! How will you feel “asking forgiveness” after a data breach? I’d rather not.
Alternatively you can use the LBYL (Look Before You Leap) method.
I am sure the authors didn’t mean the above text to sound like it is discouraging this approach in their warning message, but that’s what it sounds like by saying “you’ll have many if statements” and “you might get race conditions”.
- As for the if statements, we can create some common functions to do type checking which is essentially what I did in some recent Python libraries I wrote to reduce bugs and errors caused by type problems.
- In regards to the multi-threaded statement, you shouldn’t be altering or writing multi-threaded programs at all if you don’t understand how to perform proper locking on values and methods within the program to prevent race conditions and unprotected data that can be operated on by the wrong thread. That is not an issue with the error handling or validation approach — it is an issue with a programmer that doesn’t understand how to write code for multi-threaded programs. And you need to understand if you are changing an application whether it is multi-threaded or not.
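To make the two styles concrete, here is a minimal sketch that converts a string to an integer both ways. The function names and the nine-digit limit are hypothetical. It is only meant to show that "just try it" can accept more than you intended.

```python
def to_int_eafp(raw):
    """EAFP: attempt the conversion and handle the failure afterwards."""
    try:
        return int(raw)
    except (TypeError, ValueError):
        return None


def to_int_lbyl(raw):
    """LBYL: check the type and the exact format first, then convert."""
    if not isinstance(raw, str):
        return None
    if not (raw.isascii() and raw.isdigit()) or len(raw) > 9:
        return None
    return int(raw)


# int() is more forgiving than "digits only": it strips whitespace and
# accepts underscores between digits, so the EAFP version lets these through.
print(to_int_eafp(" 42\n"), to_int_eafp("4_2"))   # 42 42
print(to_int_lbyl(" 42\n"), to_int_lbyl("4_2"))   # None None
```

Neither result is a vulnerability by itself, but the second function allows exactly what is expected and nothing else.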
No type checking in Python
One of the basic things we can do to prevent bugs and validate values is to check that they are the proper type. When I first started parsing JSON in python it drove me nuts because I wasn’t sure when to use a list, a dict or when a string was allowed. OK, maybe I should have read the documentation 🙂 but I just jumped in. I immediately wrote some libraries and type checking to give me an appropriate error message that was easy to understand if I made a mistake.
Python doesn’t have static type checking, but recent versions have the concept of type hints. Will this help us?
Out of the gate at the top of the documentation — kind of.
These are hints, not enforced by Python. You still need to add your own type enforcement on top of this functionality, like I did in my own code.
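A small sketch of what "not enforced" means in practice. The function names are hypothetical. The first function carries hints only; the second adds an explicit runtime check on top of them.

```python
def double(value: int) -> int:
    """The hints say int in, int out, but Python does not enforce them."""
    return value * 2


def double_checked(value: int) -> int:
    """Same hints, plus a runtime check that actually rejects bad input."""
    # bool is a subclass of int, so it is excluded explicitly.
    if not isinstance(value, int) or isinstance(value, bool):
        raise TypeError("value must be an int")
    return value * 2


print(double("ab"))  # prints "abab": the str slipped straight past the hint
```

A separate static checker can flag the bad call before the code runs, but at runtime only the explicit check stops it.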
And as you can see here, now our code is getting a bit more verbose:
It is more complex because we are providing the types the function uses and inputs and returns like the Java code I showed you in this post about which programming language you should choose:
Checking type with type()
One of the functions you can use in Python to check variable types is the type() function. Pass the variable into this function and it will return the type.
In other programming languages you might need to define the type of your variable before you use it. Python just magically guesses the type based on the value you assigned to your variable.
I used the type method to check if the value of a variable met the type restrictions of the method I was going to call before calling it with some common functions.
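A common helper along those lines might look like the sketch below. This is not the code from my libraries; the names `require_type` and `job_names` and the shape of the config dictionary are made up for the example.

```python
def require_type(name, value, expected):
    """Raise an easy-to-read error unless value is exactly the expected type."""
    actual = type(value)
    if actual is not expected:
        raise TypeError(
            f"{name} must be {expected.__name__}, got {actual.__name__}"
        )
    return value


def job_names(config):
    """Pull a list of job names out of parsed JSON, checking types as we go."""
    require_type("config", config, dict)
    jobs = require_type("config['jobs']", config.get("jobs"), list)
    return [require_type("job name", job, str) for job in jobs]
```

Comparing with `type()` demands an exact match, so subclasses are rejected too; `isinstance()` is the looser alternative if subclasses should pass. The error message names the types involved but never includes the value itself.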
Now let’s return to some of my thoughts in the above “which programming language should you choose” post. If you’re going to have to add all this type checking on top of what Python is doing, maybe it’s best to just choose another language. Specifying types in advance of compiling code generally improves performance and is one reason why other languages are often faster than Python. Adding additional lines of code to perform type checking at runtime will only further add to the load. But for now it’s fast and easy for me to do this POC of what I’m trying to build in Python.
Does Type Checking Help Us with Lambda Function Parameters?
Everything passed into our Lambda function is pretty much a string. If we are expecting a number, we need to convert it to a number. Checking that a string is a string doesn’t help us prevent the cross-site scripting flaw in my last post and other types of injection that try to pass in values that will get executed, cause unwanted data dumps, or redirect to invalid locations.
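As a rough illustration, assume the event is a dictionary with a hypothetical `count` key that should hold a small whole number. The type check passes for a script payload just as easily as for a number, so the value check has to do the real work.

```python
def get_count(event):
    """Read a hypothetical 'count' parameter and convert it to an int."""
    raw = event.get("count")
    # A type check alone tells us very little: every string passes it.
    if not isinstance(raw, str):
        raise ValueError("count is required")
    # The value check is what rejects anything that is not 1 to 4 digits.
    if not (raw.isascii() and raw.isdigit() and len(raw) <= 4):
        raise ValueError("count must be a whole number")
    return int(raw)


payload = "<script>alert(1)</script>"
print(isinstance(payload, str))  # True, so "is it a string?" lets it through
```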
We need to check the value of what got passed in and reject invalid values. There are a number of ways to do this and some are better than others.
The worst way to try to prevent cross-site scripting (XSS) flaws
The worst thing you can do to try to prevent cross site scripting flaws is to check for specific characters in your code and change them to something else.
- For example, let’s say you check for an ampersand in your code and you change it to ‘&amp;’. I might try to just encode my characters some other way that your program doesn’t recognize as an ampersand.
- Let’s say you are trying to prevent some kind of code injection and every time you find a single quote in a string I pass in, you put a slash (\) in front of it to try to escape it. I’m just going to manipulate the value to double escape your escape character to get around it.
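Here is a minimal sketch of why character checks fail against double encoding. The filter function is hypothetical and deliberately naive; the decoding uses `urllib.parse.unquote` from the standard library.

```python
from urllib.parse import unquote


def naive_filter(value: str) -> bool:
    """Return True if the value 'looks safe' to a simple character check."""
    return "<" not in value and ">" not in value


payload = "%253Cscript%253E"   # "<script>" URL-encoded twice
once = unquote(payload)        # "%3Cscript%3E": what the checking code sees
twice = unquote(once)          # "<script>": what a later component may see

print(naive_filter(once))      # True, the check passes
print(naive_filter(twice))     # False, but by now the check is behind us
```

The check runs on one representation of the data and the damage happens on another, which is the core problem with trying to spot and replace individual characters.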
You can read more about double escaping and double encoding on the OWASP website:
Use a library that checks for malicious characters for us
The first thing we could try to do is grab a library off the Internet that is designed to check for malicious values and reject them. That’s one approach but it’s not always a good one. On my very first penetration test through 2nd Sight Lab I was testing for injection attacks and I found a cross-site scripting flaw — in a library included in the application that was supposed to be preventing cross site scripting!
If you are using a trusted framework many of them now have built-in protections for cross-site scripting. Here are a couple of examples:
Sometimes the libraries can help, but sometimes the data gets passed around between multiple systems. At the point the library checks the input it looks OK, but by the time the data is transformed and used by another system it causes a problem.
For example, I was testing a site using Microsoft technologies and it appeared they had some sort of malicious character protection with certain types of encoding but at some point I could manipulate some encoding specific to what C# would accept to try to infiltrate the system with a malicious character.
The number one way to prevent SQL Injection
The number one way to prevent and pretty much eliminate SQL injection is to use stored procedures with parameter binding. Do not formulate SQL code in your application. Ever. And I’m sorry if you love your ORM but please, learn SQL. It’s going to help your application run faster too if you’re using a decent database and optimizing your queries. As for the code inside your stored procedures, stop using exec(). That’s it. Problem solved.
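To show what parameter binding buys you, here is a small sketch using Python's built-in `sqlite3` module. SQLite has no stored procedures, so this only illustrates the binding half of the advice; with a database that supports them, the bound call would invoke a stored procedure rather than an inline SELECT. The table, the data, and the function names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (name TEXT, owner TEXT)")
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?)",
    [("BatchJobPOC", "alice"), ("Payroll", "bob")],
)


def find_jobs_unsafe(owner):
    # DON'T do this: the input becomes part of the SQL text itself.
    return conn.execute(
        f"SELECT name FROM jobs WHERE owner = '{owner}'"
    ).fetchall()


def find_jobs_bound(owner):
    # The ? placeholder binds the input as data, never as SQL.
    return conn.execute(
        "SELECT name FROM jobs WHERE owner = ?", (owner,)
    ).fetchall()


attack = "nobody' OR '1'='1"
print(len(find_jobs_unsafe(attack)))  # 2, every row comes back
print(find_jobs_bound(attack))        # [], the input is just an odd name
```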
The best way to prevent DOM XSS is to use Trusted Types
I did a webinar on this subject for IANS customers (one of the few webinars I’ve done for them to date). This documentation will help you implement trusted types.
Use a Content Security Policy (CSP)
A content security policy can help prevent malicious code from executing in web site code in a browser — and do not bypass it!
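As a hedged sketch, assume a function that returns HTML through an API Gateway proxy-style response (a dictionary with `statusCode`, `headers`, and `body`). The function name and the specific policy are only examples; a real policy has to match what the page actually loads.

```python
def build_html_response(body_html: str) -> dict:
    """Wrap HTML in a response that carries a restrictive CSP header."""
    policy = "; ".join([
        "default-src 'self'",                   # only load from our own origin
        "script-src 'self'",                    # no inline or third-party scripts
        "object-src 'none'",                    # no plugins
        "require-trusted-types-for 'script'",   # ask for Trusted Types where supported
    ])
    return {
        "statusCode": 200,
        "headers": {
            "Content-Type": "text/html; charset=utf-8",
            "Content-Security-Policy": policy,
        },
        "body": body_html,
    }
```

Adding sources such as `'unsafe-inline'` to make an error go away is the kind of bypass to avoid.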
Preventing injection using encoding
One of the best ways to disallow an input from getting interpreted as executable code is to properly encode it so that it is not recognized as executable code by the program that is processing the data.
The only challenge here is, what type of encoding are you using and how do the components in your application handle that encoding. Different types of encoding may be applicable depending on the language processing your code. See my comment about the C# issue I ran across above.
Also, if you properly encode your data throughout the application and then decode it to display it in a web browser, and it contains a cross-site scripting payload, you could still have a problem. You need to think about the flow of your data end to end and where your code might end up.
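A short sketch of HTML encoding with the standard library's `html` module, assuming the value is going to land in an HTML page. Other destinations (SQL, shell, JavaScript, URLs) need their own encoding, so this is one case and not a general fix.

```python
import html

user_input = "<script>alert('xss')</script>"

encoded = html.escape(user_input)
print(encoded)
# &lt;script&gt;alert(&#x27;xss&#x27;)&lt;/script&gt;

# Decoding before display brings the original payload straight back.
decoded = html.unescape(encoded)
print(decoded == user_input)  # True
```

The encoded form is displayed by a browser as text. The decoded form is markup again, which is why it matters where in the data flow any decoding happens.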
Use regex or other means to check the format of a value
When you are expecting a phone number, validate the string to make sure it is in the proper format for a phone number. Remember to consider international formats if you need those. If you are expecting an email, validate that the value is formatted as an email address.
Regex or some other form of validating the format of an input can be very tricky to write and sometimes bypassed but it is better than no checking at all.
If I can inject a huge long string I have more options than if you limit me to a few characters. But in some cases, I don’t need much. 🙂
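A rough sketch of format and length checks with the `re` module. The patterns are simplified assumptions, not complete definitions of a valid email address or phone number, and the function names and length limits are hypothetical.

```python
import re

# Deliberately simple patterns; real-world formats are broader than this.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")
PHONE_RE = re.compile(r"\+?[0-9]{7,15}")


def is_valid_email(value: str) -> bool:
    # Length limit first, then the whole string must match the pattern.
    return len(value) <= 254 and EMAIL_RE.fullmatch(value) is not None


def is_valid_phone(value: str) -> bool:
    # Drop common separators, then require digits with an optional leading +.
    digits = re.sub(r"[ ().-]", "", value)
    return len(value) <= 20 and PHONE_RE.fullmatch(digits) is not None
```

`fullmatch` is used instead of `match` with a trailing `$` because `$` also matches just before a final newline, which is one of the small ways a regex check gets bypassed.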
Always validate server-side
I ask my clients when performing penetration tests to turn off rate-limiting because I usually only have a couple of weeks to run my fuzz-testing. I want to find as many vulnerabilities as I can and if I get blocked or time-out repeatedly I might miss vulnerabilities attackers will eventually find.
That’s because attackers are not limited to a 3–4 week penetration test (the type I do). They have all the time in the world to go slow and send a few attacks at a time. However, with no rate limiting at all, an attacker can quickly bombard your site with all manner of attacks and find any bugs they are capable of finding to attack your site.
Using a WAF or other mechanism to apply rate limiting can slow them down. Of course, it won’t stop them, because many attackers use multiple IP addresses and, as mentioned, they will reverse-engineer your rate limiting and slow down their attacks to accommodate it. But you can employ various mechanisms to spot their attacks in your logs if you are watching and defend against them, hopefully before they find a flaw and get too far.
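To illustrate the idea only, here is a tiny in-memory sliding-window limiter. It is not how a WAF works internally and it would not survive across separate Lambda execution environments; the class name, parameters, and client identifier are all assumptions for the sketch.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock                  # injectable so it can be tested
        self.hits = defaultdict(deque)      # client id -> recent timestamps

    def allow(self, client_id):
        now = self.clock()
        recent = self.hits[client_id]
        # Forget requests that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True
```

Because the limit is tracked per client identifier, an attacker who spreads requests over many IP addresses or stays just under the threshold still gets through, which is why watching the logs matters as well.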
Consider your logs
In the XSS flaw I found on an AWS penetration test, the output value was in the logs as I explained at RSA 2020:
Use a whitelist
I explained the concept of whitelists in my book at the bottom of this post. Only allow the exact, expected, valid values. This is the best approach if you can do it. This is what I’m going to do in our Lambda function. I will only execute the code if the value passed in matches one of our defined batch job names.
I’ll return a message like “Invalid Batch Job” to the caller if they send a BatchJobName that does not exist, as this function will only be used internally. If it were exposed to the Internet (something I’m going to show you how to prevent in an upcoming post) then I would probably handle the error message differently. Notice that I am not going to reflect the data the user entered back to it! Please don’t reflect data back to users in error messages.
For the time being, I am testing and don’t even have one batch job yet, so I’ll test for the batch job name “BatchJobPOC”. I’m going to hard code my check into the Lambda code temporarily but eventually we need a better place to put that. We don’t want to redeploy our Lambda function every time we create a new batch job, but this solves the immediate problem so we can continue testing.
Note that I’ve changed the status code returned from my function from 200 to 422. Different HTTP error codes have different meanings, and which one you should use is not always clear cut, but 422 seems appropriate here. The request is well-formed. There’s nothing syntactically wrong with it, but an invalid value got passed in that the system cannot process.
I haven’t tested the above code yet. I also already see room for abstraction. Do you? Follow my GitHub repo for future updates that will include the tested code.
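The whitelist check described in this section can be sketched roughly as follows. This is not the code from the repository, only a minimal illustration. It assumes the job name arrives as a top-level `BatchJobName` key in the event, and the success message and response shape are placeholders.

```python
import json

# Hard-coded for now, as described above; this needs a better home later.
ALLOWED_BATCH_JOBS = frozenset({"BatchJobPOC"})


def lambda_handler(event, context):
    job_name = event.get("BatchJobName") if isinstance(event, dict) else None

    # Only an exact match against the allow list is accepted.
    if not isinstance(job_name, str) or job_name not in ALLOWED_BATCH_JOBS:
        # Fixed message: the submitted value is never reflected back.
        return {"statusCode": 422, "body": json.dumps("Invalid Batch Job")}

    # Placeholder for the real work of kicking off the batch job.
    return {"statusCode": 200, "body": json.dumps("Batch job accepted")}
```

Calling the handler with `{"BatchJobName": "<script>alert(1)</script>"}` returns the same fixed 422 response as any other unknown name, so nothing the caller sent ends up in the output.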
Get a penetration test
Get a penetration test from a qualified penetration tester to help you find any flaws and vulnerabilities you may have missed. If you’re interested in hiring my company, 2nd Sight Lab, the best way to reach me is on LinkedIn: Teri Radichel.
XSS is not your only problem!
There are so many security flaws caused by injection I can’t even begin to tell you about all of them, but they are all basically resolved the same way. Only allow valid inputs. Don’t allow code input by a customer to get executed anywhere. This is easier said than done. Here are a few more resources to help you out:
You can find a lot more types of injection cheatsheets on the OWASP website specific to different types of attacks and programming languages.
Now that we have solved that problem we still have a lot more to do to properly secure our function. Follow for more.
© 2nd Sight Lab 2022