AWS open source news and updates No. 30
July 27th - Instalment #30
Week No.30 and another week of great projects and posts. We have a new open source builder story, the usual selection of projects from robotics to compliance/governance tools, and blog posts that cover topics from machine learning to security and serverless. There is another good selection of case studies and customers who are using open source to make a difference, as well as some events you should check out and add to your diary.
Open source builders - Justin Garrison
Last week I had the chance to chat with Justin Garrison, developer advocate in the containers team at AWS, about his journey in open source. The podcast is certainly worth listening to, as Justin shares some real gems that are relevant to all builders thinking about taking their first steps in open source.
Read more in this post, Podcast - open source builders: Justin Garrison
If you want to take part then please contact me as I am currently organising the recording of the next batch.
Events for your diary
A selection of open source related events happening this week and later in August. If you have any open source events you want me to include, let me know and I will add them.
Happening later this week…
Spack Tutorial on AWS July 28 - July 29, at 3PM BST (4:00PM CEST, 7:00am PST)
Spack is an open source package manager that simplifies building, installing, customising, and sharing HPC software stacks. In recent years, its adoption has grown rapidly: by end-users, by HPC developers, and by the world’s largest HPC centres. Spack is also used to build reproducible scientific workflows in AWS.
This event is broadly targeted at HPC users, developers, and user support teams. There’s something for everyone, from academia to national labs to industry.
Databricks machine learning workshop 30th July, at 3PM BST (4:00PM CEST, 7:00am PST)
A date and time for your diary: on 30th July at 3PM BST (9:00am CDT), Databricks are running a workshop on Unifying Data Pipelines and Machine Learning with Apache Spark™ and Amazon SageMaker. This event will cover:
- building scalable and reliable pipelines for analytics
- a look at Apache Spark and Databricks
- training a model against data and learning best practices for working with ML frameworks (e.g. TensorFlow, XGBoost, Scikit-Learn)
- seeing how to track experiments in MLflow, share projects and deploy models in the cloud with Amazon SageMaker
Cloud Robotics Summit August 18th-19th, starting at 5PM BST (6:00PM CEST, 9:00am PST)
Join technical experts from across the robotics industry for a complimentary educational event. We’ve designed our program to help you learn best practices and the latest technology for robotics application development. Check out our schedule of sessions hosted by AWS Robotics engineers and solutions architects with guest speakers from Open Robotics, ROS-Industrial Consortium, iRobot, and Labrador Robotics.
KubeCon: August 17th, 8:00am PST; August 19th, 9:00am - 5:00pm Asia/Shanghai (APAC edition); August 24th, 9:00am - 5:00pm CEST (EMEA edition)
Start off your KubeCon 2020 with AWS at Container Day on August 17th. In this full-day virtual event, we’ll cover how Amazon EKS makes it easy to deploy, manage, and scale containerised applications using Kubernetes on AWS. Virtual sessions throughout the day will consist of technical deep dives, product demos, and product announcements. The AWS Kubernetes team will be streaming on Twitch all day, ready to answer your questions.
Your feedback matters!
I have put together a short feedback survey, which I would ask you to take - it will take no more than 2 minutes. You can access it here. Many thanks!
Celebrate open source contributors
The articles posted in this series are only possible thanks to contributors and project maintainers and so I would like to shout out and thank those folks who really do power open source and enable us all to build on top of what they have created.
So thank you to Jonathan Rau, Carpe Data, elpy, Sankalp Jonna, Darkbit, Shawn Swyx Wang, Srujan Panuganti, Pahud Hsieh, Matthew Coulter, Tanmoy Sen, Arun Viswanathan, Ken Wu, Steve Gordon, Hakan Ilter, Prasad Rao, Daniel Hochman, Derek Schaller, Daniel Gomez Jaramillo, Denson Pokta, Venkat Subramanian, Jay Yeras, Jack Ellis and Andrew May.
Thank you to my fellow Amazonians for their contributions: Dario La Porta, Nader Dabit, James Saryerwinnie, Eric Johnson, Hussain Karimi, Muhyun Kim, Will Gleave, Fabio Nonato de Paula, Haichen Li, Marty Jiang, Paavan Mistry, Michael Hausenblas, James Bland and Luke Wells.
Make sure you find and follow these builders and keep up to date with their open source projects and contributions.
Latest from open source projects
clutch is an open source project from Lyft that provides an extensible platform for infrastructure management. Clutch provides everything you need to simplify operations and, in turn, improve your developer experience and operational capabilities. It has several out-of-the-box features for managing cloud-native infrastructure, but is flexible enough that it can run wherever you need it. The project is extensible, has been designed so it can easily discover resources through its resolver pattern, and should be easy for you to get up and running quickly.
If you want to know more, check out the announcement post below as well.
Electric Eye 2.0
ElectricEye is an open source project that looks to keep you safe by monitoring your AWS environments, looking at specific configurations and how they match up to best practice. I first talked about this project back in No.12, but last week project maintainer Jonathan Rau announced v2 of this project. Via his post on LinkedIn, Jonathan Rau says about v2:
The new version brings the total check count to nearly 250, adds CLI capability to run ElectricEye practically anywhere you have Python installed and adds an integration into the DisruptOps platform.
Check out the post from Jody Brazil at DisruptOps if you want to know more about this project and some of the folk who are contributing.
Scalambda is an open source project from Carpe Data to make it easier to deploy Lambda functions for Scala projects. Using Scalambda, you can enable developers to easily build and deploy their own Lambda functions (and/or API Gateway instances) with little to no effort or knowledge of AWS required. Make sure you check out this reddit thread too, which provides some additional guidance for optimising cold start times when using Scala.
amazon-keyspaces-toolkit this open source project provides a Docker image for the Cassandra Query Language Shell (CQLSH) to help you use CQLSH with Amazon Keyspaces for functional testing, light operations, and data migration. The container includes configuration settings optimised for Amazon Keyspaces, but will also work with Apache Cassandra clusters. Amazon Keyspaces is a scalable, highly available, and managed Apache Cassandra–compatible database service.
cdwatch is an open source project from elpy that provides a small Python wrapper for AWS CodeDeploy. It will watch a CodeDeploy EC2 deployment until completion, exit appropriately, and provide terminal output similar to what you’d see in the AWS console (including diagnostic information on failure).
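If you are wondering what "watch until completion and exit appropriately" looks like in practice, here is a minimal sketch of the general polling pattern. To be clear, this is not cdwatch's own code: `watch_deployment`, `fake_status_source` and the exit codes are hypothetical names and choices of mine, and the status source is a stand-in for what would, in a real tool, be a call to the CodeDeploy API. The status strings are the ones CodeDeploy reports for deployments.

```python
import time
from typing import Callable, Iterable

# CodeDeploy deployment statuses that mean the deployment has finished.
TERMINAL_STATUSES = {"Succeeded", "Failed", "Stopped"}


def watch_deployment(
    get_status: Callable[[], str],
    poll_seconds: float = 5.0,
    max_polls: int = 120,
) -> int:
    """Poll until the deployment finishes and return a shell-style exit code.

    get_status is a hypothetical callable returning the current status string.
    Returns 0 on success, 1 on failure or stop, 2 if we gave up waiting.
    """
    for attempt in range(1, max_polls + 1):
        status = get_status()
        print(f"[poll {attempt}] deployment status: {status}")
        if status in TERMINAL_STATUSES:
            # Only a successful deployment maps to exit code 0,
            # so scripts and CI jobs can react to anything else.
            return 0 if status == "Succeeded" else 1
        time.sleep(poll_seconds)
    print("gave up waiting for the deployment to finish")
    return 2


def fake_status_source(statuses: Iterable[str]) -> Callable[[], str]:
    """Build a stand-in status source that replays a fixed sequence."""
    iterator = iter(statuses)
    return lambda: next(iterator)


# Replay a deployment that goes through three states and succeeds.
exit_code = watch_deployment(
    fake_status_source(["Created", "InProgress", "Succeeded"]),
    poll_seconds=0,
)
print(f"exit code: {exit_code}")
```

Returning an exit code rather than just printing is what makes a watcher like this useful in a pipeline, as the next step can stop when a deployment fails.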
AWS Check Versions Dashboard
AWS Check Versions Dashboard this open source project provides a nice dashboard into your own AWS environment, giving you insights into some of the open source based services you are running. Want to know what different flavours of Amazon RDS you are running? This project provides you with some Lambda functions that extract information and then present it in a QuickSight dashboard. Expect more updates from the team over the coming months - if there is anything you would like to see added/included, then make sure to raise an issue ticket and the team will then pick it up.
aws-recon is an open source project from Darkbit that provides a way for you to collect information about your AWS environments. The team behind the tool were looking for a way to efficiently collect large amounts of AWS resource attributes, and created this as a result - a multi-threaded AWS inventory collection tool written in Ruby (so you will need Ruby to get this project up and running).
ELSA this open source project from Srujan Panuganti provides everything you need to build a robot that can explore, localise, map simultaneously and act (hence ELSA). If you have a Raspberry Pi and Arduino that you do not know what to do with, then perhaps this is the perfect summer project.
React photo sharing app
Build a Photo Sharing App with React and AWS Amplify is a new workshop from Nader Dabit where you will learn how to build a full stack cloud application with React, GraphQL, & Amplify. This workshop will cover GraphQL API with AWS AppSync, Authentication, Object (image) storage, Authorisation, Hosting and cleaning up the project when you have completed the workshop.
El Chapo is an open source project from Sankalp Jonna that creates a serverless URL shortener. He has created a walkthrough of the project via a blog post, Meet El Chapo - an open source & serverless URL shortener, to help you get your own version of El Chapo running if this is something you need. This project uses Zappa, an open source project that makes it super easy to build and deploy server-less, event-driven Python applications (including, but not limited to, WSGI web apps) on AWS Lambda + API Gateway. Think of it as “serverless” web hosting for your Python apps.
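If you have never looked at how a URL shortener works, here is a minimal sketch of one common approach: store each URL against a numeric id and expose that id as a short base-62 code. I am assuming this scheme purely for illustration and have not checked how El Chapo generates its codes; `encode_id`, `decode_code` and `Shortener` are hypothetical names, and the in-memory list stands in for a real database.

```python
import string
from typing import List

# 0-9, a-z, A-Z: 62 URL-safe characters.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode_id(number: int) -> str:
    """Turn a non-negative integer id into a short base-62 code."""
    if number < 0:
        raise ValueError("id must be non-negative")
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number:
        number, remainder = divmod(number, BASE)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


def decode_code(code: str) -> int:
    """Turn a base-62 code back into the integer id."""
    number = 0
    for character in code:
        number = number * BASE + ALPHABET.index(character)
    return number


class Shortener:
    """In-memory stand-in for the datastore behind a shortener."""

    def __init__(self) -> None:
        self._urls: List[str] = []

    def shorten(self, url: str) -> str:
        self._urls.append(url)
        return encode_id(len(self._urls) - 1)

    def resolve(self, code: str) -> str:
        return self._urls[decode_code(code)]


shortener = Shortener()
code = shortener.shorten("https://example.com/a/very/long/path")
print(code, "->", shortener.resolve(code))
```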
cdk-serverless-lamp this project from Pahud Hsieh is a contender for project of the week for me, as it continues a current run of PHP related projects and posts. This time though, Pahud has created a JSII construct lib to build AWS Serverless LAMP with AWS CDK. What does that mean? Well, it means you can now, with a few lines of code in CDK, automate the deployment of your PHP framework (Laravel).
amplify-vscode-snippets is an open source project from Shawn Swyx Wang for VSCode users that are working with AWS Amplify. Shawn has put this project together with the aim to speed up your development by scaffolding out commonly used code snippets for Amplify CSS, UI Components, API/DataStore calls, and GraphQL Transform directives. Check out the short demo.
Check out his dev.to post introducing this project.
parliament from Duo Labs is an open source project that provides a linter library you can use when reviewing AWS IAM policies. This linter library helps look for problems such as: malformed JSON, missing required elements, incorrect prefix and action names, incorrect resources or conditions for the actions provided, type mismatches and bad policy patterns. I missed this when it was released a few months ago, and found it thanks to this very nice blog post written by Dustin Whited, Analyzing IAM Policies at Scale with Parliament, which provides a nice introduction and walkthrough on how to use it.
RStoolkit is an open source project that provides a set of scripts that will help you ensure that your Amazon Redshift clusters are in optimum health. It is a set of 28 checks that help you with things like integrity, performance and misconfigurations. If you are running Amazon Redshift, you should definitely check these out and see how they can help you.
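To give a feel for what policy linting involves, here is a small standard-library sketch that covers three of the problem types listed above (malformed JSON, missing required elements and actions without a service prefix). It is not parliament's implementation or API: `lint_policy` and the finding labels are hypothetical, and a real linter such as parliament checks far more, including whether actions and resources actually exist.

```python
import json
from typing import List


def lint_policy(policy_text: str) -> List[str]:
    """Return a list of human-readable findings for an IAM policy document."""
    try:
        policy = json.loads(policy_text)
    except json.JSONDecodeError as error:
        return [f"MALFORMED_JSON: {error.msg}"]

    if not isinstance(policy, dict) or "Statement" not in policy:
        return ["MISSING_ELEMENT: Statement"]

    statements = policy["Statement"]
    if isinstance(statements, dict):
        # A single statement may be given as an object instead of a list.
        statements = [statements]

    findings: List[str] = []
    for index, statement in enumerate(statements):
        if not isinstance(statement, dict):
            findings.append(f"statement {index}: must be a JSON object")
            continue
        if statement.get("Effect") not in ("Allow", "Deny"):
            findings.append(f"statement {index}: Effect must be Allow or Deny")
        if "Action" not in statement and "NotAction" not in statement:
            findings.append(f"statement {index}: missing Action")
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        for action in actions:
            # Actions look like "service:Operation"; "*" alone is also valid.
            if action != "*" and ":" not in action:
                findings.append(
                    f"statement {index}: action {action!r} has no service prefix"
                )
    return findings


suspect_policy = """
{
  "Version": "2012-10-17",
  "Statement": [
    {"Effect": "Permit", "Action": "GetObject", "Resource": "*"}
  ]
}
"""
for finding in lint_policy(suspect_policy):
    print(finding)
```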
Fresh blog posts for your reading pleasure
Using the AWS Serverless Application Model (SAM) this post from Andrew May looks at how AWS SAM and AWS SAM CLI compare with the Serverless framework for deploying a .NET Core application. As Andrew notes, many developers are looking at and comparing the different frameworks available when developing their serverless applications, so this walkthrough and side by side comparison may help you as you decide which tool you want to use. The post also serves as a good introduction to AWS SAM and AWS SAM CLI to boot.
A 1 year review of Laravel Vapor in this retrospective from Jack Ellis, he takes a look at what running Laravel Vapor on serverless has meant over the past 12 months. If you are a PHP developer looking to modernise or exploring how to use PHP in the serverless operating model, then this is a great place to start. If that is not enough to entice you to read this post, then maybe this line from the post will:
Getting a project set-up on Vapor is a piece of delicious cake
From how much to how easy and how reliable, this post will tell you everything you need to know. Hopefully this will drive up your viewing numbers Jack and you can reclaim your top spot from Paul.
Reducing your costs with aws-nuke
AWS cost reduction with aws-nuke - in this post from Arpit Jain, you will see how he has used the open source tool called aws-nuke (in No.11 I shared a similar project called cloud-nuke) to clean up resources from AWS accounts. The post talks about the pitfalls of this approach, as does the project - this is a destructive tool, so proceed with caution. There are a number of use cases where a tool like this is helpful, so take a look at the post and the project which provides a lot of detail on how to use it.
WARNING! This tool is destructive, so you should understand what you are trying to achieve before using this tool
ASP.NET Core and AWS Elastic Beanstalk
Deploy web applications with ASP.NET Core and DotVVM on AWS Elastic Beanstalk - nice walkthrough from Daniel Gomez Jaramillo at DotVVM that shows you how to take a sample .NET application and migrate it to AWS Elastic Beanstalk using the AWS Toolkit for Visual Studio and DotVVM. If you are a .NET developer exploring options for how to deploy on AWS, then take a look at this post.
Pronto! Intuit Releases First Open Source Cassandra Cluster Manager - last week I talked about this new project from Intuit, DSE Pronto, which provides an automation suite that you can use to deploy and manage DataStax Cassandra clusters in Amazon Web Services (AWS). This post from Denson Pokta goes into more detail as to what this project is, why they created it and their commitment to open source.
Prometheus and AWS CloudWatch
Prometheus: yet-another-cloudwatch-exporter — collecting AWS CloudWatch metrics this post written by Arseny Zinchenko shows you how to integrate AWS CloudWatch metrics into Prometheus using ‘yet-another-cloudwatch-exporter’ - there have been a few of these walkthroughs over the past few months, and this is another one that will hopefully make it easy for you to implement if this is what you are looking to do.
Introducing RStoolKit — RedShift Cluster Health Check and Optimization in this blog post from Bhuvanesh, he introduces the RStoolkit (See projects above) and shows you how you can use it to run health checks against your Amazon Redshift clusters.
AWS open source posts
Using cost allocation tags with AWS ParallelCluster - Dario La Porta shows you how you can use tagging to manage cost allocation, forecast spending, and set up billing alarms that trigger on defined budget thresholds with your HPC workloads. AWS ParallelCluster is an open source cluster management tool to deploy and manage HPC clusters in the AWS Cloud, and this post also shows how you can analyze usage to reduce cost or optimize price and performance.
Snyk, Atlassian Bitbucket and Amazon EKS
Securing Amazon EKS workloads with Atlassian Bitbucket and Snyk - Jay Yeras, Head of Cloud and Cloud Native Solution Architecture, Snyk, Venkat Subramanian, Group Product Manager, Bitbucket, and James Bland from AWS explain how to ‘shift left’ and the importance of building security into your development workflows, and show you how to set up secure pipelines using Atlassian Bitbucket and Snyk.
Jenkins on Amazon EKS
Deploying Jenkins on Amazon EKS with Amazon EFS - this post from Luke Wells walks you through how to easily deploy Jenkins on Amazon EKS with Amazon EFS. If this is something you are exploring, then take a look - it might save you a lot of time/effort.
Gluon and Apache MXNet
Deploying custom models built with Gluon and Apache MXNet on Amazon SageMaker - Hussain Karimi, Muhyun Kim, and Will Gleave collaborate on this post to show you how you can use Amazon SageMaker to efficiently deploy models trained with open source frameworks such as Gluon and Apache MXNet, and incorporate these within your own web applications.
TensorFlow on Inf1
Deploying TensorFlow OpenPose on AWS Inferentia-based Inf1 instances for significant price performance improvements - Fabio Nonato de Paula and Haichen Li walk you through how to use some open source machine learning tools (TensorFlow and OpenPose) and optimise the deployment of the models produced, using AWS Neuron to fine tune inference performance for AWS Inferentia based instances. You will have to read the post to find out how this resulted in a significant cost saving.
Alexa controlled Robots
Build an Alexa Controlled Robot with AWS RoboMaker - Marty Jiang talks about his project to integrate the Alexa Skills Kit with AWS RoboMaker to provide a way for robot developers and manufacturers to build a natural voice interface for their robots.
CIS Amazon EKS Benchmark
Introducing The CIS Amazon EKS Benchmark - last week I shared a short video from Aqua Security’s Liz Rice showing kube-bench (video is also included in this post). This week we have a post from Paavan Mistry and Michael Hausenblas that talks about a new Center for Internet Security (CIS) benchmark for Amazon Elastic Kubernetes Service (EKS). This new benchmark is optimised to help you accurately assess the security configuration of Amazon EKS clusters, including security assessments for nodes, to help meet security and compliance requirements. As a bonus, they also include links to other CIS benchmark resources for web frameworks and Amazon Linux/Amazon Linux 2.
AWS SAM CLI
The AWS Serverless Application Model CLI is now generally available - Eric Johnson writes about the recent GA of the AWS SAM CLI, something I covered last week. The AWS Serverless Application Model (AWS SAM) is an open-source framework for building serverless applications. Built on AWS CloudFormation, AWS SAM provides shorthand syntax to declare serverless resources. During deployment, AWS SAM transforms the serverless resources into CloudFormation syntax, enabling you to build serverless applications faster. As a companion to AWS SAM, the AWS SAM CLI is a command line tool that operates on AWS SAM templates. It provides developers local tooling to create, develop, debug, and deploy serverless applications and offers a rich set of tools that enable developers to build serverless applications quickly. The post walks you through how to use it.
Configuring custom domain names with AWS Chalice this post from James Saryerwinnie shows you how to associate your own domain name with a REST API, a feature that was added in version 1.16.0 of AWS Chalice. AWS Chalice is a framework for writing serverless applications in Python.
Customer stories and case studies
Announcing Clutch, the Open-source Platform for Infrastructure Tooling - in this post, Daniel Hochman and Derek Schaller announce the official unveiling of Clutch, an extensible platform for infrastructure management which is used by Lyft engineering teams and helps them to build, run and maintain their developer workflows with the appropriate safety mechanisms and controls. This post walks you through the architecture and how this works, the key features and why they felt they needed to create Clutch.
Liberty IT Adopts Serverless Best Practices Using AWS Cloud Development Kit - Matthew Coulter, Lead Technical Architect of Global Risk at Liberty Mutual, provides a rundown of how Liberty took the AWS Cloud Development Kit (CDK), the AWS Well-Architected Framework, and the Serverless Application Lens for the AWS Well-Architected Framework and built a set of CDK patterns to help developers launch serverless resources. I shared Matt’s post last week (here) which is worth reading again. This is a great effort and provides organisations with a way to accelerate their teams building consistent, bar raising infrastructure to support their applications.
Halter and the Cow Collar
The Cow Collar Wearable: How Halter benefits from FreeRTOS - Tanmoy Sen and Arun Viswanathan have put together this post, which appeals to my farming roots back in Galicia (dairy farming and potatoes!). Halter is an agri-tech original equipment manufacturer (OEM) that focuses on cattle herd management. Halter creates GPS enabled, solar powered smart collars for cows. The collar hardware allows farmers to interact with an easy-to-use app to remotely set geographic boundaries for cattle or virtual fences. Farmers use Halter’s system to avoid physically herding cows, maximizing farmer time and productivity.
This post shows how FreeRTOS helped Halter solve a number of challenges when building this solution, and shows you how you could adapt this for your own agri-tech use case.
Caresyntax & Redis
How caresyntax uses managed database services for better surgical outcomes - a guest post from Ken Wu, Chief Technology Officer, and Steve Gordon, Director of Engineering at caresyntax, that looks at how they combine managed open source services from AWS, native AWS services and open source projects such as Redis to create Periop Insight, a surgical data analytics solution for today’s busy perioperative leaders to improve their operating room (OR) performance. It is an enterprise software-as-a-service (SaaS) web application that works on desktops, tablets, and phone form factors. Although many modern data-centric applications deal with extremely large volumes of data, Periop Insight’s challenges stemmed from the complexity of the data, not volume. This is a fascinating read, so check it out.
How Kloia Helped GoDataFeed Modernize Monolithic .NET Applications with AWS Serverless - this post from Hakan Ilter, Cloud and Big Data Consultant at Kloia, and Prasad Rao from AWS walks you through how enterprises can modernise their existing .NET applications, migrating them to serverless operational models by deploying them onto AWS Lambda. This post shows how GoDataFeed were able to transform their legacy .NET Framework monolithic application into a .NET Core-based decoupled architecture, with the help of AWS Partner Kloia.
Amazon Elasticsearch Service now supports open source Elasticsearch 7.7 and its corresponding version of Kibana. This minor release includes bug fixes and enhancements. This release improves cluster stability by significantly reducing the amount of heap memory that is needed to keep Lucene segments open. It also delivers faster results when querying time-based indices by filtering out shards that do not contain documents with relevant timestamps. We also added support for regular expressions in Painless scripts.
Amazon Elasticsearch Service domains running Elasticsearch 7.7 include support for recently released features like custom dictionaries, UltraWarm, and cross-cluster search. The release also includes support for features like anomaly detection, and SQL Workbench that are components of Open Distro for Elasticsearch, an Apache 2.0-licensed distribution of Elasticsearch.
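The shard filtering improvement mentioned above is easier to picture with a small example. The sketch below is a conceptual illustration only and says nothing about how Elasticsearch implements it internally: I am assuming each shard knows the minimum and maximum timestamp of the documents it holds, and `Shard` and `shards_to_query` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Shard:
    name: str
    min_timestamp: int  # earliest document timestamp in the shard
    max_timestamp: int  # latest document timestamp in the shard


def shards_to_query(shards: List[Shard], start: int, end: int) -> List[Shard]:
    """Keep only shards whose time range overlaps the query range [start, end]."""
    return [
        shard
        for shard in shards
        if shard.max_timestamp >= start and shard.min_timestamp <= end
    ]


shards = [
    Shard("logs-day-1", 100, 199),
    Shard("logs-day-2", 200, 299),
    Shard("logs-day-3", 300, 399),
]

# A query for timestamps 250 to 320 never needs to touch logs-day-1.
for shard in shards_to_query(shards, 250, 320):
    print(shard.name)
```

The saving comes from the skipped shards: the more of a time-based index that falls outside the query range, the less work the search has to do.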
CIS Benchmark for Amazon EKS
The new CIS Benchmark for Amazon EKS helps you accurately assess the secure configuration of nodes running as part of your Amazon EKS clusters.
Security is a critical consideration for configuring and maintaining Kubernetes clusters and applications. The Center for Internet Security (CIS) Kubernetes Benchmark provides good practice guidance on security configurations for self-managed Kubernetes clusters, but did not accurately help evaluate the security configuration status for the AWS-managed Kubernetes clusters run by Amazon EKS. Not all of the recommendations from the CIS Kubernetes Benchmark were applicable to EKS clusters as customers are not responsible for configuring or managing the control plane.
Now, the CIS Amazon EKS Benchmark provides accurate guidance for node security configurations for EKS. The benchmark is applicable to EC2 nodes (both managed and self-managed) where you are responsible for security configurations of Kubernetes components. The benchmark provides a standard, community-approved way to ensure that you have configured your Kubernetes cluster and nodes securely when using Amazon EKS.
The CIS Amazon EKS Benchmark consists of four sections: control plane logging configuration, node security configurations, policies, and managed services. The benchmark supports the Kubernetes versions currently available from Amazon EKS (v1.15 - v1.17) and can be run using kube-bench, a standard open source tool for checking configuration using the CIS benchmark on Kubernetes clusters.
AWS Serverless Application Model (SAM) CLI now GA
The AWS Serverless Application Model Command Line Interface (SAM CLI) is now generally available. SAM CLI is a deployment toolkit that also allows you to locally build, test, and debug serverless applications. SAM CLI v1.0.0 is a stable version recommended for building production serverless applications.
Previously, SAM CLI was available in beta, supported by the docker-lambda emulation images developed by Michael Hart (AWS Serverless Hero). Now, v1.0.0 is supported by AWS provided emulation images. This version also includes new build support for custom AWS Lambda runtimes and AWS Lambda Layers.
SAM CLI enables you to easily build serverless applications using a number of commands, including sam init, sam build, and sam deploy. Using sam build, you can compile your application code and dependencies. To compile Custom AWS Lambda runtimes and AWS Lambda Layers, you can include the BuildMethod property in your SAM template under the function or layer resource. The BuildMethod is either an AWS Lambda runtime or Makefile, which defines a set of tasks to be executed.
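To make the BuildMethod property a little more concrete, here is a minimal sketch that builds a SAM template as a Python dictionary and prints it as JSON (SAM templates can be written in JSON as well as YAML). As I understand the SAM documentation, BuildMethod goes in the Metadata section of the function or layer resource. The resource names, paths and handler below are hypothetical placeholders.

```python
import json


def build_template() -> dict:
    """Return a SAM template with one custom runtime function and one layer."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Transform": "AWS::Serverless-2016-10-31",
        "Resources": {
            # Custom runtime: sam build is told to run a Makefile target.
            "CustomRuntimeFunction": {
                "Type": "AWS::Serverless::Function",
                "Properties": {
                    "CodeUri": "custom_runtime_function/",  # hypothetical path
                    "Handler": "bootstrap",  # hypothetical handler
                    "Runtime": "provided",
                },
                "Metadata": {"BuildMethod": "makefile"},
            },
            # Layer: sam build is told which runtime's builder to use.
            "SharedLayer": {
                "Type": "AWS::Serverless::LayerVersion",
                "Properties": {
                    "ContentUri": "shared_layer/",  # hypothetical path
                    "CompatibleRuntimes": ["python3.8"],
                },
                "Metadata": {"BuildMethod": "python3.8"},
            },
        },
    }


template_json = json.dumps(build_template(), indent=2)
print(template_json)
```

You would save the output as your template file and then run sam build against it. For the makefile build method, the Makefile next to the function code is expected to define the build target for that resource, so check the SAM documentation for the exact naming convention before relying on this.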
Amazon EFS CSI driver now GA
The Amazon Elastic File System (EFS) CSI driver is now generally available. The EFS CSI driver makes it simple to configure elastic file storage for both EKS and self-managed Kubernetes clusters running on AWS using standard Kubernetes interfaces. Applications running in Kubernetes can use EFS file systems to share data between pods in a scale-out group, or with other applications running within or outside of Kubernetes. EFS can also help Kubernetes applications be highly available because all data written to EFS is written to multiple AWS Availability Zones. If a Kubernetes pod is terminated and relaunched, the CSI driver will reconnect the EFS file system, even if the pod is relaunched in a different AWS Availability Zone.
With the 1.0 release, the EFS CSI driver now has in-transit encryption enabled by default, helping companies meet their security and compliance goals. Additionally, the driver now supports EFS Access Points, application-specific entry points into an EFS file system that make it easier to share a file system between multiple pods. Access points can enforce a user identity for all file system requests that are made through the access point, and enforce a root directory for each pod.
You might be interested in this…
A choice selection of other posts which whilst not directly related to AWS, cover interesting aspects of open source.
How open source development provides a roadmap for digital trust, security, safety, and virtual work - a look at some of the trends the LF are seeing in open source. I thought this was an interesting comment:
We believe that the broader technology industry can use open source governance models to address more widespread industry challenges that could not be as easily solved with more traditional, proprietary solutions
Managing issues in a large-scale open source project - really liked this post on how the Flutter team triage and deal with issues at scale. Even if you are a small project, understanding how to effectively work with issues is one of the foundations of your open source developer flow, and this post will provide some insights on how to do that well.
Solving technical debt with open source - whilst we often talk about the need to stay close to upstream projects to avoid the burden of managing a fork, in this study with Samsung, it is good to see some real world evidence of this.
Choosing open source as a marketing strategy this is a great post, and provides a good example of the network effects of making your source available: whether that is code that developers can use/modify/run or copy, or the ingredients for beer, open source can be an effective way to increase your reach and build your brand.
Share your open source projects
Do you have some content you want to share with a broader audience? We are always looking for guest content for the AWS Open blog. Please get in touch (via comments below) and I would love to speak with you about what you are doing in open source. We are always looking for interesting new content.
The best submissions will get some AWS Credit codes as a thank you.