ACM.53 Choosing a programming language for short term and long term projects
This is a continuation of my series of posts on Automating Cybersecurity Metrics.
Sometimes a comment completely derails what I was planning to publish next. This is a topic I was thinking about but hadn’t wanted to stop and write about just yet. Then a person made a comment on LinkedIn about my choice of programming language, asking why I wasn’t choosing an “energy efficient programming language.” The comment piqued my interest and prompted me to write about it sooner rather than later.
There are so many things I need to cover and complete and I hate disrupting the current flow, but this person’s comment should be addressed for multiple reasons. One is the manner in which the person made the comment. The other has to do with world events right now and also the curiosity that too often distracts me. =)
This person is in Europe, and Europe is facing a huge problem with energy due to the war in Ukraine, if you haven’t been following, so energy efficiency may be top of mind. Here in the United States, a law was recently passed that allows people to get all kinds of credits for all manner of energy efficient home upgrades and cars to try to reduce our reliance on non-renewable energy sources.
With the issues going on related to blocking the flow of energy to Germany from Russia, for example, it seems that it would be a good idea to do what we can to conserve and improve our efficiency in use of energy.
Don’t hate on me if you are a US Republican for promoting renewable energy. I am an Independent, have voted for both parties in the past because neither of them gets it right from my point of view, and I hate politics. I’m just trying to solve problems, and cutting down on energy use or moving to renewables seems like a good idea for the short and long term. If you disagree, I'm not here for that. I'm here to address this person's comments as to why I did not choose an "energy efficient programming language." I'll also cover some considerations for choosing a programming language in general.
Energy efficient programming languages
I looked into the concept of energy efficient programming and found a link to a report. I’m not clicking on the actual report just yet because I’m a tad busy and need to evaluate the contents of the files from sites I don’t recognize in a secure environment.
I found a summary on a site I recognize:
Having developed a lot of different applications, and having used tools with wildly different resource requirements when penetration testing, I was curious about the choice of code used for the tests. For example, a penetration test activity like brute forcing passwords will use more CPU power. Testing for DOM XSS may load multiple headless browsers behind the scenes and consume tons of memory. I’m at the point where I use different AWS EC2 configurations for different types of tests.
It looks like they pulled this code from some benchmark tests. This code comes with the following caveats from the developers:
The developers themselves highlight the fact that those doing research should exercise caution when using such microbenchmarks:
My curiosity leads me to the following questions:
- Has the report been peer-reviewed and validated? A developer on one of my teams once told me that he benchmarked and found that Python is faster than Golang, which I was positive was inaccurate. Another developer pointed out the flaw in his logic that skewed the results.
- How much energy is actually saved in a real-world application? Is it significant? I am sure it could be. I am also sure it varies by the architecture of the application. For what I’m implementing currently in Lambda, I really can’t imagine there would be any significant difference in my case, though yours may be different.
- Could the code used in this test be further optimized to reduce energy usage, the same way you optimize programs for memory usage and performance? Perhaps that was taken into consideration in this report. But perhaps a language lower on the list could be optimized for energy efficiency over memory or CPU usage. You’ll have to read it yourself to understand the details if you want to explore that path.
- This test does not take into account the overall application architecture. Sometimes you don’t get a complete picture of an application’s performance until you put all the pieces together and test them as a whole. Although I understand the concept of Big O Notation I never liked dwelling on it too much. (OK honestly, I don’t like it at all. Though I understand the concepts of a hash retrieving data faster I would just like to speak in plain English.) Also, generally once you start writing your code and do an actual POC you will find issues you didn’t think of and get much more accurate performance results that take into account things like network latency and retries.
- Would the energy efficiency change for different applications depending on what type of hardware they run on? For example, maybe Java is more energy efficient on one architecture and Golang is more energy efficient on another.
Although I thought of the above questions, I imagine the energy efficiency list is generally good information with some potential variance in different scenarios. Python is a less efficient language so I would expect that in general, faster languages would be more energy efficient.
But here’s what is also an interesting concept. C and C++ are in the top five for energy, time, and memory. So we should all just switch right now to one of those two languages because they seem like the best. Right? Is that what people are doing?
Choosing a language that helps you prevent security mistakes
Why don’t people use C/C++ for everything? Because it’s very complicated to get your code right and keep it free from errors, that’s why. It’s a super powerful tool that allows you to squeeze every drop out of your compute resources, but that comes with a cost. Don’t make a mistake with pointers. Beware of all kinds of security problems that other languages help protect you from, such as buffer overflows, concurrency mistakes, and format string attacks, to name a few.
I’ve programmed in Java for over 20 years. Java got a bad rap, I believe, due to the browser plugin that was the source of many, many security problems. Although Java has had security issues, many of them with that browser plugin, out of the gate it helped many organizations limit the number of buffer overflows in their code compared to C/C++. That was one of the biggest security problems in code at the time. It also runs on multiple platforms.
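To make the buffer overflow point concrete, here is a minimal Python sketch. The names write_at and buffer are made up for illustration. In C, writing past the end of a fixed-size array is undefined behavior and may silently corrupt adjacent memory. A memory-safe language checks the index at runtime and refuses the write instead.

```python
def write_at(buffer: bytearray, index: int, value: int) -> bool:
    """Try to write one byte and report whether the write was allowed."""
    try:
        buffer[index] = value
        return True
    except IndexError:
        # Python checks bounds on every access, so an out-of-range write
        # raises an exception instead of overwriting neighboring memory.
        return False


buffer = bytearray(8)                      # fixed-size buffer of 8 zero bytes
in_bounds = write_at(buffer, 7, 0x41)      # last valid position
out_of_bounds = write_at(buffer, 8, 0x41)  # one past the end

print(in_bounds, out_of_bounds, len(buffer))  # prints: True False 8
```

The bounds check is not free, which is part of why a language like C can come out ahead on raw speed and energy in benchmarks.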
I have also written about my interest in Golang, a language that can help prevent concurrency problems — which are very hard problems to troubleshoot and can lead to various types of security attacks:
I considered using Golang for my code in these posts and I still might, but why didn’t I start with Golang? It’s a more efficient language so my code will probably run faster (though you should not make assumptions and should always benchmark your code).
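As a rough illustration of benchmarking instead of assuming, here is a minimal sketch using Python’s standard timeit module. The functions and data sizes are made up for this example. It compares a linear scan of a list against a hash lookup in a set, and first checks that both approaches give the same answer, since a benchmark that compares two different computations is the kind of flaw that skews results.

```python
import timeit


def contains_list(items: list, target: int) -> bool:
    """Linear scan: checks elements one by one."""
    return target in items


def contains_set(items: set, target: int) -> bool:
    """Hash lookup: goes straight to the bucket for the target."""
    return target in items


def best_time(func, *args, repeat: int = 5, number: int = 1000) -> float:
    """Run func(*args) `number` times per trial; return the fastest trial in seconds."""
    timer = timeit.Timer(lambda: func(*args))
    return min(timer.repeat(repeat=repeat, number=number))


values = list(range(10_000))
as_list = values
as_set = set(values)
target = 9_999  # last element, the worst case for the linear scan

# Both approaches must agree before the timings mean anything.
assert contains_list(as_list, target) == contains_set(as_set, target)

list_seconds = best_time(contains_list, as_list, target)
set_seconds = best_time(contains_set, as_set, target)
print(f"list: {list_seconds:.6f}s  set: {set_seconds:.6f}s")
```

Taking the minimum of several trials reduces noise from whatever else the machine is doing. The numbers will still vary by hardware and Python version, so treat them as a local measurement, not a general ranking.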
I’m also really curious about Rust.
Code for learning
Why did I choose to start with Python?
Sometimes when trying to teach people something, you choose something that is accessible and easy for them to learn. When I wrote my first automation framework a long time ago, which I greatly improved in my class labs, I wrote it completely with the AWS CLI and Bash, of all things. Why?
Because every security class I took was heavy in bash and command line scripts. My target audience was security professionals, and I figured it would be easier for them to grasp the concepts using bash. It’s not how I would do it or will do it ultimately in this blog series because I think many security pros are past that point now.
But at the time there were no cloud security classes from SANS or anywhere else. I wrote it to show security professionals what is possible in the cloud and how to deploy a security appliance in the cloud at the time when most of them were saying cloud would never happen and it was not secure. If the language was over their heads they wouldn’t use it or see the point. This code isn’t great and was revised for my security classes to avoid certain security flaws with the way it is implemented, but it might help people who are familiar with Bash get used to AWS.
I mean, if you want to be energy efficient and write the most performant code possible, why not use Assembly? I bet it’s energy efficient. People don’t write programs directly in Assembly because it’s hard to read and complicated to do. It would be very error prone and likely many security flaws would exist in beginner code. Could I somehow make Assembly code work for my purposes? Perhaps I could get a bare metal instance. I’m not sure but I don’t really care because I’m not going to do that. Obviously.
I’ve had to deal with Assembly code in reverse engineering malware classes and advanced penetration testing classes. In my case, I can find a lot of bugs without going that deep, but if I ever have time I might get into that more just because it’s interesting and fun. There are so many security problems at higher levels, I figure we can start there for the clients I generally serve.
Python is easy to read and learn. The Python SDK has been around a long time and is probably one of the most fully developed AWS SDKs along with Java. One of the complications for teaching programming these days is all the configuration and metadata surrounding the actual code. That’s one of the nice things about AWS Lambda and Batch. They abstract some of those things away. But it’s still a bit easier to learn programming in Python. Although I highly recommend type-checking, it can be cumbersome for beginners, as is compiling code. I’m slowly introducing things akin to design patterns as we go rather than dumping them on my readers out of the gate.
By the way — I volunteered to help kids learn to program once but they were using some kind of GUI drag and drop thing. I learned to program in TI Basic from a book at the age of 12. I couldn’t do it. I had to bail out. If those concepts are good for very young kids that’s awesome, but I can’t teach programming like that. I want to teach actual code and I think it is possible — if you start with the right programming language and concepts.
Python is widely used. If someone learns Python there’s a good chance that will be applicable for the next security or development-related job to which they apply. Rust and Golang are up and coming. Java and C# would be solid choices. But Python is everywhere.
When people ask me what programming language or technology they should learn, I always tell them: go look at job postings. Find the companies where you want to apply, see what technologies they are using, and learn that. If you wanted to apply at Google perhaps you would learn Golang. If you were going to try to get a job at Facebook you might opt to learn React. Maybe for AWS you would learn Java, but it would depend on which team you wanted to work on at AWS. Python is one of the most widely used languages and will appear in many job postings, so it is a good place to start.
Java would also be a pretty good choice for running applications on AWS, but I don’t want to have to explain all the concepts that I’d need to explain just to get started. I’d have to explain types out of the gate, which I am about to address in relation to a post I just wrote about XSS in a Lambda function (stay tuned), but I wanted to start without them. I’ve addressed types in this blog series on secure software programming.
Java is going to be more verbose and require extra characters. Python is just easier to read and more compact when getting started. Case in point:
public String handleRequest(Map<String,String> event, Context context)
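For comparison, a minimal Python handler might look like the sketch below. The name lambda_handler is only the conventional default, and the "name" key and greeting are made up for illustration. The second function shows that type hints are optional in Python and can be layered on later, which is part of why it is gentler for beginners.

```python
from typing import Dict


def lambda_handler(event, context):
    """Entry point that Lambda calls with the event payload and a context object."""
    name = event.get("name", "world")  # "name" is a hypothetical key in the event
    return f"Hello, {name}"


def typed_handler(event: Dict[str, str], context: object) -> str:
    """Same behavior with optional type hints added."""
    return lambda_handler(event, context)


# Local smoke test. The context argument is unused here, so None stands in for it.
print(lambda_handler({"name": "reader"}, None))  # prints: Hello, reader
```

No class declaration, generics, or compile step is needed before the first line of logic.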
The nice thing about Java in a Lambda function or Batch is that AWS will include the base libraries you need to run in those environments, so I’m guessing it’s not quite as much of a hassle to get configured and set up. But you still have to compile your code, according to this blog post:
But alas, there’s no Java compiler in AWS Lambda, so you have to upload a compiled function.
Compiling is sometimes good in a way and we will explore building and deploying Docker containers in upcoming posts, but for my small Lambda functions and the quick code I need to write, Python seems sufficient for now.
For little security Lambda functions that run only occasionally, I am not sure the energy savings would be worth the time and overhead of switching languages, UNLESS I automate everything and start increasing usage, which I am working towards. Then the picture changes. I suppose if you add my little Python Lambdas running occasionally to everyone else’s it could add up. I’ll let someone else do that calculation.
That said, I wonder if changing to a renewable energy source would be even better than trying to spend a lot of time re-writing all the code in the world. I'm making the same choice in my house. Upgrade my electricity box to support an electric water heater and pay huge amounts for electricity versus installing gas - or is the balance changing now due to the state of the world? Will gas cost more than electric sooner rather than later? I could also install a heat pump water heater (and get a big rebate) but it takes up a huge amount of space. Or what if I just install solar? Then my energy costs reduce dramatically, so the cost of electricity becomes a lower priority part of the equation. These are choices to evaluate to determine the most cost-effective short term and long term solution. I don't have that answer yet. But what if all the data centers in the world switched to alternative energy sources? You can find out what Amazon is doing in the renewable energy space here:
Java would be a faster language in terms of time (and apparently more energy efficient). It’s not stellar on memory, and that’s potentially an issue in Lambda functions, which run with limited resources. Python actually beats Java in the above list when it comes to memory. Memory is the key driver of Lambda performance, and increasing the amount of memory you need increases the cost.
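To show why memory drives cost, here is a minimal sketch of the arithmetic. Lambda bills compute in GB-seconds, meaning configured memory multiplied by execution time. The workload numbers below are made up for illustration, and no real prices are included. Multiply the result by the current per-GB-second rate on the AWS pricing page to get a dollar figure.

```python
def gb_seconds(memory_mb: int, duration_ms: float, invocations: int) -> float:
    """Billed compute in GB-seconds: memory (GB) x duration (s) x invocations."""
    return memory_mb * duration_ms * invocations / (1024 * 1000)


# Hypothetical workload: one million invocations at 200 ms each.
small = gb_seconds(memory_mb=128, duration_ms=200, invocations=1_000_000)
large = gb_seconds(memory_mb=256, duration_ms=200, invocations=1_000_000)

print(f"128 MB: {small:,.0f} GB-seconds")
print(f"256 MB: {large:,.0f} GB-seconds")
print(f"ratio: {large / small:.1f}x")  # same duration, double the memory
```

This assumes the duration stays the same at both memory sizes. Lambda allocates CPU in proportion to memory, so a larger setting may shorten the run time and offset some of the increase. That is another reason to measure, not guess.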
Although Python is better from a memory perspective, I will be honest and tell you I primarily chose it from the learning and speed of development perspective. Java used to be the fastest programming language for me to write but I’m not sure if that is still true. But in general environment setup takes longer. I wrote about that back in 2010 and mind you — that was when I had 12 years less experience than I do today and Java has changed a lot since then:
Here’s the Java SDK if you want to try it out.
As mentioned, I considered using Golang and I still might eventually. If you weed out the list above to what is actually heavily in use today (based on job descriptions and malware reports), Golang makes the top 10 in all three categories, and it is number one for memory consumption.
I mean, I took a college class where we used Pascal, which is number one for memory, but when is the last time you saw it in use in a modern production application? I’m sure some nerdy person out there will tell me there is one, but realistically Pascal is a dead language, like the Latin my mother forced me to take in high school because she said it would help me, though I still don’t really see the point. The one thing I remember is a Latin phrase scribbled in my used Latin book by a former student: semper ubi sub ubi (Always where under where). But I digress.
When I used Golang in the past, it seemed like the methods for managing libraries weren’t fully developed. I found deployments and package management to be cumbersome. I’m sure that’s better now but I haven’t had time to complete my blog post on the topic. There would be some additional overhead for me due to the learning curve and figuring out how I want to organize my code, but I may still write some code in Golang later.
As for Rust, I need to get this series out fast and the key points are the security concepts, not how fast or energy efficient the language is. I don’t know Rust yet and the overhead to learn it in conjunction with what I am already trying to get out would just slow down everything. I basically don’t have time to use Rust immediately. Sometimes that is a valid choice. But Rust is very interesting, as you can see it shows up in all three categories across the board in the list above.
I used .NET extensively over the years (and its precursors, ASP and Visual Basic, before there was .NET, ASPX, or C#). The current implementation of C# is much better than some past iterations. Depending on what constructs you use, there are some protections against memory leaks, like those from leaving database connections and files open, something I saw a lot with Java over the years. I find a lot of security issues when pentesting ASPX for whatever reason. It may be a fine choice. I’m just not using it as I tend to like programming languages that run on Linux. It would be a solid choice for someone already familiar with the language. AWS even has a blog dedicated to C# on AWS:
I wrote about using .NET on Lambda here:
I’m going to stick with languages that work in Lambda for now so I’m restricted to what’s available in those runtimes but once we move into using containers, the sky’s the limit. We can write and test and benchmark all sorts of different code to our heart’s content to see what works best in terms of cost and performance. Unfortunately we won’t be able to see how much energy we have consumed on AWS.
If your priority is that you want to use an energy efficient language from the list above, go for it! You can convert any Python in this blog series to any other language you want. You can even translate this code and architecture to another cloud platform.
The point of this series is: SECURITY. I’m trying to explain and demonstrate how to think about security, a topic I wrote about when I first started teaching my cloud security classes:
In many cybersecurity classes you learn bits and pieces. I’m trying to pull all those bits and pieces together into a complete picture in my posts and code:
And to be honest, I’m trying to get it done fast. This is pretty much like a POC and I’m using whatever is fastest to make my points. But ultimately it may morph to something completely different. For example, I don’t love using all the bash I’m using for deploying code. However, when you start with nothing, you have to start somewhere. I have a vision for how all that code will completely change if I can get it done in time.
This cloud security architecture is still a work in progress at the time of this writing. Follow for updates…
If you liked this story please clap and follow:
Medium: Teri Radichel or Email List: Teri Radichel
Twitter: @teriradichel or @2ndSightLab
Request services via LinkedIn: Teri Radichel or IANS Research
© 2nd Sight Lab 2022