Apache Airflow
-
AWS open source newsletter #204
Oct 22, 2024 | 25 minute read
Edition #204 Welcome to issue #204 of the AWS open source newsletter, the newsletter where we try and provide you the best open source on AWS content. Apologies for the long wait since the last edition, I will have to do better. Thanks for the lovely messages and feedback I have received over the past few weeks, this edition is for you! As always, we have more great new projects to check out, which include projects that surface up your AWS costs in Home Assistant, a tool that you can use to ask questions about your code base that uses generative AI, a git large file storage (LFS) extension that lets you use Amazon S3, and a handy network cost calculator.
- oss-newsletter
- aws open source
- Home Assistant
- Godot
- Valkey
- Keycloak
- Apache Airflow
- MWAA
- PostgreSQL
- Deequ
- Kubernetes
- ArgoCD
- OTEL
- Grafana
- Spring Boot
- Amazon Corretto
- ROSA
- OpenShift
- Kubecost
- Amazon Linux 2023
- Karpenter
- MySQL
- MariaDB
- Apache Flink
- Apache Kafka
- OpenZFS
- InfluxDB
- AWS ParallelCluster
- Lustre
- Prometheus
- Finch
- Ubuntu
- Cedar
-
AWS open source newsletter #203
Aug 27, 2024 | 28 minute read
Welcome to the AWS open source newsletter, the newsletter where we try and provide you the best open source on AWS content. As always, more great new projects are featured in this edition, #203. Projects to check out include: how you can proxy OpenAI requests through Amazon Bedrock, security tools that help you stay one step ahead of bad actors, a way of implementing CDK Pipelines in a less opinionated way, a tool that helps you validate your AWS IAM policies, a toolkit to help get you started with good practices when creating CloudFormation templates, some demo code that demonstrates how you can implement zero downtime updates to your applications, as well as some really cool demos and use cases of generative AI in action (too many to mention, so check them all out!)
- oss-newsletter
- aws open source
- Langfuse
- Steampipe
- Ray
- Apache Spark
- AWS Amplify
- Flutter
- Valkey
- O3DE
- AWS CDK
- LangChain
- Kubernetes
- Amazon EKS
- Argo Workflows
- OpenTofu
- Bottlerocket
- Karpenter
- OpenTelemetry
- Apache Flink
- Apache Pinot
- Apache Kafka
- openCypher
- Apache Iceberg
- Apache Airflow
- MWAA
- MySQL
- PostgreSQL
- Deequ
- OCSF
- GraphStorm
- OpenShift
- Amazon EMR
- ActiveMQ
- Red Hat Enterprise Linux
- Cedar
-
AWS open source newsletter #201
Jul 10, 2024 | 28 minute read
Edition #201 Welcome to the AWS open source newsletter, issue #201, your trusted source for the very best open source on AWS content. This week's new projects for you to practice your four freedoms include generative AI infused projects to help you generate your docs, streamline the setting up of your AWS resources, a new experimental framework for building document based workflows, and a cool demo that showcases how you can use generative AI to help translate American Sign Language.
- oss-newsletter
- aws open source
- PHP
- Apache Airflow
- MWAA
- Node.js
- LLRT
- Kubernetes
- Amazon EKS
- Prometheus
- Grafana
- eksctl
- Valkey
- LangChain
- Project Lakechain
- AWS Amplify
- Istio
- Apache Iceberg
- Apache Kafka
- Apache Cassandra
- PyTorch
- Apache httpd
- Babelfish for Aurora PostgreSQL
- Apache Flink
- PostgreSQL
- MySQL
- OpenSearch
- OpenZFS
- Amazon Linux
- FreeRTOS
- RabbitMQ
- AWS ParallelCluster
- Open Container Initiative
- Smithy
- Cedar
- sbt
-
AWS open source newsletter #200
Jun 24, 2024 | 20 minute read
Edition #200 Welcome to a milestone edition of this newsletter, number #200!! Wow, it feels like quite an achievement. Before diving into this newsletter, a big thank you for sticking with me. Time has flown by so quickly, and I am looking forward to the next 100. As I have done in a few of the previous milestone issues, I wanted to share a few interesting stats from sharing open source projects with you over the past few years.
-
AWS open source newsletter #198
May 28, 2024 | 25 minute read
Edition #198 Welcome to issue #198 of the AWS open source newsletter, the newsletter where we try and provide you the best open source on AWS content. In this issue we feature new projects that provide integration of .NET Aspire with AWS resources, an automated data discovery tool to find data in your AWS environments, a tool to help incorporate good practices when building SaaS solutions, a cost allocation dashboard for your Kubernetes workloads, a project that might help you mitigate costs around Internet Gateway, and a few generative AI demos around food, news, and social media which you should definitely check out.
- oss-newsletter
- aws open source
- Aspire
- Kubernetes
- Amazon EKS
- Leapp
- OpenTelemetry
- AWS CDK
- LLRT
- Valkey
- PostgreSQL
- InfluxDB
- High Performance Software Foundation
- Karpenter
- Multus
- Kata
- Grafana
- Prometheus
- Apache Flink
- Zingg
- Apache Hudi
- Apache Iceberg
- MySQL
- Apache Tomcat
- WordPress
- AWS Amplify
- Apache Airflow
- MWAA
- OpenSearch
- Apache Kafka
- Bottlerocket
- Amazon EMR
-
AWS open source newsletter #197
May 13, 2024 | 21 minute read
Edition #197 Welcome to issue #197 of the AWS open source newsletter, the newsletter where we try and provide you the best open source on AWS content. Like in previous editions of this newsletter, we feature new projects for you to practice your four freedoms. We have some great projects, including a sprinkling of repos that look to help you benchmark and assess your generative AI models and agents, a new fruity framework for building document understanding applications, a nice container command line tool that sysadmins will love, a tool to help you migrate your CodeCommit repositories, a really nice application of using generative AI to help automate CVE findings, and a neat generative AI newsletter generation demo.
-
AWS open source newsletter #196
Apr 29, 2024 | 22 minute read
Edition #196 Welcome to issue #196 of the AWS open source newsletter, the newsletter where we try and provide you the best open source on AWS content. As always, more great new projects are featured in this edition of the newsletter, including a link to the Valkey repo, a nice GUI based project to help you build orchestration workflows that uses Apache Airflow under the covers, a tool to help you find signals through the noise of your security logs, a project to help you run serverless tasks in a cron like fashion, a command line runner for Amazon CodeCatalyst, a tool to help you simplify the deployment of Cruise Control on Amazon MSK, a nice Mac client for experimenting with Amazon Bedrock, and some really cool demo apps, the pick of which (for me) is a nice way of surfacing up your Amazon Bedrock models in a way that existing applications that expect an API key can use.
- oss-newsletter
- aws open source
- Valkey
- Apache Airflow
- MWAA
- OpenSearch
- LangChain
- PostgreSQL
- WordPress
- RAGmap
- RAGxplorer
- Cedar
- AWS CDK
- Lambda Web Adapter
- Postfix
- Spring Boot
- Amazon Corretto
- Amazon EKS
- Kubernetes
- Karpenter
- KEDA
- Prometheus
- OPA
- Amazon EMR
- PySpark
- MySQL
- Open JD
- AWS Amplify
- GraphQL
- AWS PDK
- Apache Livy
- Nodestream
-
AWS open source newsletter #195
Apr 15, 2024 | 21 minute read
Edition #195 Welcome to issue #195 of the AWS open source newsletter, the newsletter where we try and provide you the best open source on AWS content. This week I am heading out to Everything Open, and looking forward to meeting the community in Gladstone. I will be talking about Cedar, and showing why it is important and how it works (demo is working lovely now). I am now on the third week of my open source roadshow, which is why I have had to change the publishing of this newsletter to every other week - at least until I get back home.
-
AWS open source newsletter #192
Mar 11, 2024 | 16 minute read
Edition #192 Welcome to issue #192 of the AWS open source newsletter, the newsletter where we try and provide you the best open source on AWS content. As always, this week we start with a round up of some freshly baked new projects for you to practice your four freedoms. A wide variety this week, and we have projects that help you create architecture diagrams from your YAML, visualise and create dashboards for compliance and reporting purposes, a new multi-cloud threat detection tool, a Go implementation of Cedar, an example of load testing your large language models, and more!
-
AWS open source newsletter #191
Mar 4, 2024 | 15 minute read
Edition #191 Welcome to issue #191 of the AWS open source newsletter, the newsletter where we try and provide you the best open source on AWS content. As always, this week we start with a round up of some freshly baked new projects for you to practice your four freedoms. This week we have projects that cover AWS Nitro Enclaves, open source mapping libraries, how to grab secrets into your application configuration files, database performance benchmarking and analysis, improving the logging your applications generate, and a number of very handy tools to help you manage security data from the command line.
-
AWS open source newsletter #190
Feb 26, 2024 | 15 minute read
Edition #190 Welcome to issue #190 of the AWS open source newsletter, the newsletter where we try and provide you the best open source on AWS content. As always, this week we start with a round up of some freshly baked new projects for you to practice your four freedoms. This week we have projects that can help you keep on top of your cost optimisation, a tool to help you automate Well Architected reviews, a tool to help you map out your RDS instances, as well as sample projects and demos.
-
AWS open source newsletter #189
Feb 18, 2024 | 14 minute read
Edition #189 Welcome to issue #189 of the AWS open source newsletter, the newsletter where we search high and low to provide you with the best open source on AWS content. As always, this week we start with a round up of some freshly baked new projects for you to practice your four freedoms. This week we have projects that help you find your RDS instances, automate tasks from your online Chime calls, a very nice visual file browser for your Amazon S3 buckets, a tool to help you track and manage copying files from your S3 storage buckets, an active-active multi region cluster solution for Redis, and more!
-
Using Finch to run Apache Airflow using mwaa-local-runner
Feb 12, 2024 | 9 minute read
I show you how you can use Finch to run Apache Airflow using the mwaa-local-runner tool, and how you can do this for your applications too. As some of you may know, I have been creating content on Apache Airflow for a few years now. One of the open source projects that AWS has produced to make it easier for developers to get started with Apache Airflow is mwaa-local-runner.
-
AWS open source newsletter #188
Feb 12, 2024 | 15 minute read
Edition #188 Welcome to issue #188 of the AWS open source newsletter, the newsletter where we try and provide you the best open source on AWS content. As always, this week we start with a round up of some freshly baked new projects for you to practice your four freedoms. This week we have freshly cut repos that help you migrate your DNS configurations, improve your prompts when working with large language models, a new lightweight JavaScript runtime, some reference code that shows you how you can deploy modern Java applications a number of different ways, and sample repos that show you how you can do remote debugging in Amazon EMR, as well as the usual cool demos that showcase some of the ways you can use generative AI.
-
AWS open source newsletter #187
Feb 4, 2024 | 15 minute read
Edition #187 Welcome to issue #187 of the AWS open source newsletter, the newsletter where we try and provide you the best open source on AWS content. As always, this week we start with a round up of some freshly baked new projects for you to practice your four freedoms. This week we have new projects that help you optimise working with EBS volumes on EC2, a tool to help you document your architectures, a large language model benchmarking tool, a tool to help you optimise your S3 storage files, a data validation framework, and a really nice Java workshop.
-
AWS open source newsletter #185
Jan 22, 2024 | 13 minute read
Edition #185 Welcome to issue #185 of the AWS open source newsletter, the newsletter where we try and provide you the best open source on AWS content. As always, this week we start with a round up of some freshly baked new projects for you to practice your four freedoms. This week we have projects that allow you to export your PartyRock applications, a tool to help you reduce hallucinations in your large language models, a new client for Redis, a tool to help you access the AWS Partner Network, as well as sample projects that look at how you can use large language models to build a new reader and building pipelines using CloudFormation.
-
AWS open source newsletter #183
Jan 8, 2024 | 26 minute read
January 8th, 2024 - Instalment #183 Happy new year and welcome to the first edition of the AWS open source newsletter of 2024, number #183. The big news for 2024 is the move from dev.to to community.aws as the “home” for the AWS open source newsletter, although it will still be posted on dev.to as well. Let me know what you think, community.aws has some top notch content that many readers might not be aware of.
- oss-newsletter
- aws open source
- Cedar
- Projen
- Prometheus
- Grafana
- Kubernetes
- Amazon EKS
- Mage
- OpenSearch
- AWS CDK
- eBPF
- Istio
- Kubecost
- PostgreSQL
- MySQL
- MariaDB
- ActiveMQ
- Apache Airflow
- MWAA
- Apache Spark
- Ray
- AWS Amplify
- Spring Boot
- Amazon EMR
- AWS Neuron
- Apache Cassandra
- OpenTelemetry
- Amazon Linux
- AWS ParallelCluster
- RabbitMQ
-
AWS open source newsletter #180
Nov 20, 2023 | 20 minute read
November 20th, 2023 - Instalment #180 Welcome to #180 of the AWS open source newsletter, the place for all your AWS and open source needs. As we ramp up to re:Invent, it is good to see that pre:Invent is giving us plenty of open source goodies. In this week's newsletter, we have some of those in the way of new projects such as res and aws-iatk, but we also have lots of really great content too.
- oss-newsletter
- aws open source
- Ragna
- ezsmdeploy
- SnapStart
- GraalVM
- Amazon Corretto
- Amazon EMR
- Apache Airflow
- LangChain
- AWS Copilot
- Karpenter
- MWAA
- Grafana
- Prometheus
- Amazon EKS
- Kubernetes
- Apache Flink
- Apache Kafka
- Avro
- Apache Cassandra
- Apache Spark
- Red Hat Linux
- Amazon Linux 2023
- Node.js
- AWS Amplify
- Redis
- MySQL
- PostgreSQL
- eksctl
- MapLibre
- Overture Maps
-
AWS open source newsletter #179
Nov 13, 2023 | 18 minute read
November 13th, 2023 - Instalment #179 Welcome to #179 of the AWS open source newsletter, the place for all your AWS and open source needs. This week's new projects include an open source tool that provides similar capabilities to AWS Control Tower, a tool for enrolling your Mac based EC2 instances into a mobile device management (MDM) solution, a very neat tool to help you compare costs of running your CodePipeline jobs, as well as sample code that shows you how you can use Karpenter to optimise IP address use, examples of using Test Containers with AWS CDK, generative AI technologies such as LangChain, and more.
-
AWS open source newsletter #178
Nov 6, 2023 | 19 minute read
November 6th, 2023 - Instalment #178 Welcome to #178 of the AWS open source newsletter, the place for all your AWS and open source needs. This week we feature more new open source projects for you to practice your four freedoms. We have a useful tool that helps you synchronise your AWS Identity Centre users with the users you provision in your Amazon RDS databases, we share the AWS data solutions framework that helps you build data solutions following opinionated best practices, a resource explorer for your AWS accounts, a guardrails solution for your AWS account, and a couple of demo repositories that take a look at Localstack and RSS.
-
AWS open source newsletter #177
Oct 30, 2023 | 19 minute read
October 30th, 2023 - Instalment #177 Welcome to #177 of the AWS open source newsletter, the Halloween special. You will find no tricks in this edition, only treats, with more new projects for you to check out and content that is a feast for your eyes. This week's new projects include a tool to help you easily deploy vector databases on Kubernetes, an observability toolkit, a tool to help you benchmark network latency, as well as lots of new demos on generative AI.
- oss-newsletter
- aws open source
- Bottlerocket
- KubeArmor
- NGINX
- WordPress
- Milvus
- Falcon-40B
- JupyterHub
- Dask
- Flux GitOps
- Crossplane
- Kubernetes
- Amazon EKS
- Babelfish for Aurora PostgreSQL
- PostgreSQL
- Linux
- Apache Hive
- Apache Spark
- Apache Kafka
- Apache Hudi
- Delta Lake
- Apache Iceberg
- OpenSearch
- Dremio
- OpenShift
- OpenCLIP
- Apache Airflow
- MWAA
- Amazon Corretto
- OpenJDK
- AWS CDK
-
AWS open source newsletter #175
Oct 16, 2023 | 25 minute read
October 16th, 2023 - Instalment #175 Welcome to #175 of the AWS open source newsletter, back after recharging in the wonderful countryside of Yorkshire. I am publishing this week's newsletter from Raleigh, North Carolina. All Things Open is happening this week, and you will catch me at the AWS booth where I will be showing off some cool open source stuff (Cedar, Apache Airflow, and a few others), and I also have a talk on Tuesday.
- oss-newsletter
- aws open source
- Redis
- MariaDB
- PostgreSQL
- Cloud Native Operational Excellence
- CNOE
- Crossplane
- Apache Airflow
- MWAA
- Apache Flink
- Amazon EMR
- Powertools for AWS Lambda
- Kubernetes
- Amazon EKS
- Prometheus
- Grafana
- AWS Distro for OpenTelemetry (ADOT)
- Karpenter
- CoreDNS
- etcd
- Istio
- SUSE
- AWS-LC
- Spring Boot
- SOCI
- Apache Kafka
- Amazon Corretto
- AWS CDK
- Amazon Linux
- Bottlerocket
- cdk8s
- AWS Amplify
- Stable Diffusion
- Next.js
-
AWS open source newsletter #173
Sep 11, 2023 | 22 minute read
September 11th, 2023 - Instalment #173 Welcome to #173 of the AWS open source newsletter, bringing you all the news and latest projects for AWS developers. This week's new projects include a Golang based SDK for kernel eBPF operations, a project that helps you to optimise your network performance, a couple of projects for Apache Flink users, as well as a handful of different tools and demos featuring open source technologies helping to drive innovation in generative AI.
-
AWS open source newsletter #172
Sep 4, 2023 | 21 minute read
September 4th, 2023 - Instalment #172 Welcome to #172 of the AWS open source newsletter, your reliable source for all open source on AWS goodness. What do we have for you this week? Well, more new projects to check out, and plenty of fresh content on the open source projects you all love. We have tools to help you export your DynamoDB tables as csv files, a tool that goes beyond tracking cost and actually shuts down resources to help you manage your AWS budget, a cool dashboard to help you stay on top of your EC2 configurations, a couple of useful utilities to simplify working with files on Amazon S3, and then a sample Cedar project that helps you implement a Lambda authoriser.
-
AWS open source newsletter #170
Aug 21, 2023 | 21 minute read
August 21st, 2023 - Instalment #170 Welcome to edition #170 of the AWS open source newsletter, an oasis of open source goodness that features the latest new projects, essential reading, and must view videos to quench the thirst of every open source developer. In this week's edition we have new projects that help you get on top of your IAM actions, a handy tool for knowing what your current AWS account service limits are from the command line, a tool to help you do database migrations, and some interesting and very detailed reference solutions for gaming, live streaming, and managing/exporting of your Amazon Cognito profiles.
- oss-newsletter
- aws open source
- AWS-LC
- Threat Composer
- AWS CDK
- AWS SAM
- AWS SDK for Java
- GitLab
- GraphQL
- AWS AppSync
- AWS Distro for OpenTelemetry (ADOT)
- PostgreSQL
- Apache Airflow
- SBOM
- Syft
- Apache Hudi
- Apache Iceberg
- Apache Spark
- collectd
- Grafana
- O3DE
- ROS
- Next.js
- OpenZFS
- MWAA
- Cedar
- Powertools for Lambda
-
AWS open source newsletter #169
Aug 14, 2023 | 26 minute read
August 14th, 2023 - Instalment #169 Welcome to #169 of the AWS open source newsletter, featuring the latest and greatest open source news, projects, videos, and community content that you need to know about. Featured in this week's edition we have more great projects, including a new ODBC driver for Amazon Timestream database, a nice tool to simplify your ssh tunnelling, an essential VSCode extension for working with Cedar policies, a couple of projects that help you shift left and validate / monitor your policies, a solution to help you monitor your Apache Kafka environments, as well as some great sample applications.
- oss-newsletter
- aws open source
- OpenSearch
- AWS CDK
- Jupyter AI
- dbt
- Apache Airflow
- Managed Workflows for Apache Airflow
- MWAA
- Cedar
- cfnguard
- Grafana
- Prometheus
- Apache Kafka
- Amazon Timestream
- Open Cybersecurity Schema Framework
- OCSF
- AWS Lambda Web Adapter
- Smithy
- Apache Spark
- Linux
- Amazon Linux
- AWS ParallelCluster
- PostgreSQL
- Spring Boot
- Amazon EKS
- Kubernetes
- Mountpoint for Amazon S3
- MySQL
- Lustre
- OpenZFS
- Redis
- Amazon EMR
- Karpenter
- Seekable OCI
- SOCI
- Firecracker
-
A look at airflowctl, a tool to help you manage Apache Airflow projects
Aug 14, 2023 | 10 minute read
I have written in the past about setting up developer environments and tools when working with Apache Airflow. Today I came across a new tool from Kaxil Naik, Director of Engineering at Astronomer and all round Apache Airflow good guy. Kaxil has put together airflowctl, a command-line tool for managing Apache Airflow™ projects, and making it super easy to get up and running. What does it do? Well, it helps you install and use different versions of Apache Airflow, work with Variables and Connections, provide live logs, and more.
-
AWS open source newsletter #166
Jul 24, 2023 | 22 minute read
July 24th, 2023 - Instalment #166 Welcome to #166 of the AWS open source newsletter. As always, we search high and low for the best and latest open source content, and I think you will love what we have lined up this week. This week's new projects include a library to help you manage and validate your environment variables when working with AWS Lambda, a new Rust based tool for interacting with your S3 buckets, an essential tool to help CDK developers remove a lot of the setup work, and a tool that helps you run Yocto embedded Linux build jobs in AWS.
-
AWS open source newsletter #165
Jul 17, 2023 | 17 minute read
July 17th, 2023 - Instalment #165 Welcome to #165 of the AWS open source newsletter, the only* newsletter that brings you the best and latest open source content. We have some great new projects this week, including a tool for IoT developers to help you validate your SQL statements, a command line interface tool for Amazon Verified Permissions, an Amazon DynamoDB estimation tool, and more. Also featured this week is content on Apache Iceberg, OpenSearch, PostgreSQL, Kubernetes, Powertools for AWS Lambda, Spring Boot, Babelfish for Aurora PostgreSQL, Karpenter, Apollo GraphQL, JupyterHub, dbt, Apache Airflow, Cedar, and Apache Flink.
-
AWS open source newsletter #164
Jul 10, 2023 | 17 minute read
July 10th, 2023 - Instalment #164 Welcome to #164 of the AWS open source newsletter. As always, we search high and low for the best and latest open source content, and I think you will love what we have lined up this week. New projects this week include a tool that helps you implement single table designs easily on Amazon DynamoDB, an experimental project to help you get to grips with Cedar, a comprehensive clickstream analytics project for your applications, web sites, and mobile applications, and some cool projects to help you with edge and hybrid use cases.
- oss-newsletter
- aws open source
- Apache Flink
- Apache Airflow
- Kubernetes
- Amazon EKS
- AWS Lambda Powertools
- Spring Boot
- Linux
- Apache Parquet
- OpenZFS
- OpenSearch
- Mountpoint for Amazon S3
- PostgreSQL
- AWS Amplify
- Next.js
- AWS Distro for OpenTelemetry
- Grafana
- Babelfish for Aurora PostgreSQL
- AWS ParallelCluster
- Consul
- Apache Iceberg
- Cedar
- Steampipe
- VS Code Server
- Nomad
-
AWS open source newsletter #161
Jun 19, 2023 | 19 minute read
June 19th, 2023 - Instalment #161 Welcome to #161 of the AWS open source newsletter, and another week for fresh, new open source projects and code for you to practice your four freedoms. This week's projects include tools that will help you create temporary elevated credentials, a new Java library that provides methods for encrypting and decrypting cryptographic materials, an AWS DynamoDB wrapper for Node/TypeScript developers, and a solution to help you find and visualise data assets.
- oss-newsletter
- aws open source
- Falcon
- AWS CDK
- Keycloak
- Cedar
- FreeRTOS
- Apache Airflow
- MWAA
- Apache Spark
- Amazon EMR
- Apache Hudi
- Apache Iceberg
- Delta Lake
- Apache Flink
- OpenChatkit
- Kubernetes
- Pinniped
- Kubecost
- Karpenter
- ONNX
- Apache Kafka
- Babelfish for Aurora PostgreSQL
- AWS Amplify
- Next.js
- OpenSearch
- Flux
- ArgoCD
- KVM
-
AWS open source newsletter #160
Jun 12, 2023 | 17 minute read
June 12th, 2023 - Instalment #160 Welcome to #160 of the AWS open source newsletter, where we try and share all the important open source news, projects, events, and content that open source builders want. This week we have new projects that include tools to help you build data workflows, Terraform modules to help you incorporate temporary elevated access controls, integrating Tailscale to change your traffic flows, a neat AWS Lambda debugging tool, Go bindings for Cedar, and more.
-
AWS open source newsletter #158
May 30, 2023 | 21 minute read
May 30th, 2023 - Instalment #158 Welcome Hello and welcome to the AWS open source newsletter, #158. I hope some of you were able to catch the last episode of season two of Build on Open Source where we looked at some of the projects featured in this newsletter (specctl, eksdemo, and ec2-spot-placement-score-tracker). As always, we pride ourselves in this newsletter on giving you the newest, shiniest open source projects, and this week we have some really great ones to share with you.
-
AWS open source newsletter #157
May 22, 2023 | 19 minute read
May 22nd, 2023 - Instalment #157 Welcome Hello and welcome to the AWS open source newsletter, #157. Apologies for the lack of newsletter last week, but hopefully this week will make up for that as we have a bumper selection of great open source content for you. This week's new projects include repos that help you get OpenEMR up and running (“host-openemr-on-aws-fargate”), two new security related open source projects that you definitely need to check out (“cedar” and “snapchange”), integration of clickstream analytics using Swift (“clickstream-swift”), deployment of Backstage to serve up access to your AWS resources (“app-development-for-backstage-io-on-aws”), a tool to help you clean up ECS task definitions (“aws-ecs-task-definition-cleanup”) and many more.
-
AWS open source newsletter #156
May 8, 2023 | 21 minute read
May 8th, 2023 - Instalment #156 Welcome Hello and welcome to the AWS open source newsletter, #156, the newsletter that just keeps on giving….in this case, keeps giving you brand new open source projects to practice your four freedoms on. So what do we have for you this week? Coming up later in this newsletter we have projects such as “sustainability-scanner” helps you check your CloudFormation templates against sustainability good practices, “synthtable” helps you create synthetic data for different use cases, “neptune-gremlin-client” a Java based Gremlin client, “s3zipper” a tool to quickly download entire S3 buckets, “chataws” a nice demo of how you can use ChatGPT to aid your AWS deployments, and plenty of other great projects.
-
AWS open source newsletter #154
Apr 24, 2023 | 22 minute read
April 24th, 2023 - Instalment #154 Welcome Hello and welcome to the AWS open source newsletter, #154, the newsletter that just keeps on giving….in this case, keeps giving you brand new open source projects to practice your four freedoms on. We have another great selection of projects for you as always, starting off with “cfn-teleport” an essential CLI tool for CloudFormation users, “aither” an interesting collaborative development tool using virtualised desktops on containers, “tabular-column-semantic-search” a tool to help you find similar types of data in your data lakes, “resource-lister” and “komiser” tools that help you manage your AWS resources, “resource-utilization” helps you track your AWS resource utilisation, “iot-network-traffic-control-and-load-testing-simulator” an interesting load and chaos testing example, and more!
- oss-newsletter
- aws open source
- Apache Oozie
- Apache Airflow
- Deep Java Library
- DJL
- mwaa-local-runner
- MWAA
- Trusted Language Extensions for PostgreSQL
- Supabase
- PostgreSQL
- Jupyter
- Grafana
- Opus
- Papermill
- Apache Spark
- HiveQL
- Amazon EKS
- Kubernetes
- Amazon EMR
- RStudio
- USBGuard
- Amazon Corretto
- AWS Amplify
- Apache Hive Metastore
- LoRaWAN
- gMSA
- Python
- OpenSearch
- AWS Copilot
- Marten
- Flutter
-
Getting mwaa-local-runner up on AWS Cloud9
Apr 17, 2023 | 5 minute read
Here is a quick recipe if you are looking to get mwaa-local-runner up and running on your Cloud9 developer setup. This might not be the most optimised way, so I am very happy to receive suggestions on how to improve this. What I will cover here is how to deploy mwaa-local-runner onto a standard Cloud9 IDE, deployed in a default VPC. Updating my AWS Cloud9 environment The first thing I needed to do was to increase the size of my local disk as Cloud9 only provides 10 GB of storage.
-
Exploring Shell Launch Scripts on Managed Workflows for Apache Airflow (MWAA) and mwaa-local-runner
Apr 17, 2023 | 9 minute read
Managed Workflows for Apache Airflow (MWAA) recently launched a new feature that a lot of folk had been asking for, which was the ability to add additional libraries, binaries, or environment variables when launching Airflow workers. If you missed the announcement, Amazon MWAA now supports Shell Launch Scripts. This new capability allows you to easily do this by creating a script and then configuring your MWAA environments to use that script during the start-up phase.
-
AWS open source newsletter #153
Apr 17, 2023 | 19 minute read
April 17th, 2023 - Instalment #153 Welcome Hello and welcome to the AWS open source newsletter, #153 as featured in the latest episode of Build on Open Source (S2E5). We have lots of great projects for you this week, with a strong ChatGPT influence. “pg_gpt”, “cw-logs-insights-gpt”, and “aiws” all integrate ChatGPT to help you do different things on AWS, “semantic-search-aws-docs” is a very interesting demo on how to build a more coherent search for your documentation, “aws-chime-chat-demo” a very nice demo using the Chime SDK, “ckia” is an open source AWS Trusted Advisor tool, “AWS_ED” helps you keep your local IP in sync with external DNS records, “cfnctl” provides a Terraform like cli experience to CloudFormation, and more!
-
AWS open source newsletter #152
Apr 10, 2023 | 16 minute read
April 10th, 2023 - Instalment #152 Welcome Hello and welcome to the AWS open source newsletter, #152 an Easter special. This week sees more great new projects including, “redshift-test-drive” a set of essential tools for Amazon Redshift users, “simple-database-archival-solution” a nice tool to help you archive your data, “attribution-gen” a Go tool to help you build open source attribution documents, “aws-glue-data-catalog-federation” a library to help you federate your Glue catalog, “subnet-utilization-monitor-for-amazon-vpc” a handy tool to keep on top of your IP address allocation, “AlexaGPT” a demo of integrating Alexa with you know what, and more!
-
Working with Managed Workflows for Apache Airflow (MWAA) and Amazon Redshift
Apr 7, 2023 | 19 minute read
Working with Managed Workflows for Apache Airflow (MWAA) and Amazon Redshift I was recently looking at some Stack Overflow questions from the AWS Collective and saw a number of folk having questions about the integration between Amazon Redshift and Managed Workflows for Apache Airflow (MWAA). I thought I would put together a quick post that might help folk address what I saw were some of the common challenges. There is some code that accompanies this post, which you can find at the GitHub repository cdk-mwaa-redshift.
-
AWS open source newsletter #151
Apr 3, 2023 | 22 minute read
April 3rd, 2023 - Instalment #151 Welcome Hello and welcome to the AWS open source newsletter, #151. This week sees more great new projects, including those covered in the latest episode of Build on Open Source, such as “ec2-former2” a way to host this great project to reverse engineer your CloudFormation templates, “protonizer” a cli tool for those using AWS Proton, “fortuna”, a library for Uncertainty Quantification, “aws-resilience-hub-tools” a set of tools and scripts for working with the AWS Resilience Hub, “jenkins-unity-build-on-aws” a nice reference solution for those needing to build Unity projects, “amazon-cognito-passwordless-auth” a nice demo of how to do authentication sans password, and more.
-
AWS open source newsletter #150
Mar 27, 2023 | 12 minute read
March 27th, 2023 - Instalment #150 Welcome Hello and welcome to a milestone edition of the AWS open source newsletter, #150. Over two hundred thousand words later, thousands of contributors, hundreds of new open source projects, I hope this newsletter brings as much joy for readers as it does for me to put this together. Thank you all for your amazing support so far. What do we have in store for you this week?
-
Self managed Apache Airflow with Data on EKS
Mar 22, 2023 | 11 minute read
I have written in the past about how you can get started with Apache Airflow using the AWS managed service, Managed Workflows for Apache Airflow. But what if you want to self manage Apache Airflow? When I speak with developers, there are sometimes reasons why a managed service might not fit their needs. Some of the common things that come up include: you need an increased level of access or a greater level of control over the configuration of Apache Airflow; you need the very latest versions or features of Apache Airflow; or you need to run workflows that use more resources than managed services provide (for example, significant compute). Total Cost of Ownership: One thing to consider when assessing managed vs self managed is the cost of the managed service against the total costs of you having to do the same thing.
-
AWS open source newsletter #149
Mar 19, 2023 | 14 minute read
March 20th, 2023 - Instalment #149 Welcome Hello and welcome to edition #149 of the AWS open source newsletter, the only newsletter on the planet that serves you up a weekly dose of the freshest, latest open source projects on AWS. I hope some of you were able to catch this episode reviewed on our last Build on Open Source livestream. If not, you can catch the replay here. This week we have projects such as “mountpoint-s3”, “s3-access-for-squash”, and “amazon-s3-tar-tool” which provide some useful tools for managing your files on S3, “aws-serverless-ai-stories” a creative masterclass in storytelling, “earthquake-notifier” a serverless solution to keep you alerted and ready, and more!
-
VSCode and Apache Airflow
Feb 20, 2023 | 8 minute read
VSCode and Apache Airflow In this short post, I wanted to highlight how you can use a VSCode plugin to work with a locally running instance of Apache Airflow to improve the developer experience. This post was inspired by a tweet from Kaxil Naik, who was asking what features developers are looking for when using VSCode and PyCharm with Apache Airflow. In this post I will show you how you can configure mwaa-local-runner, an open source project that provides you with an easy way to get a local Apache Airflow environment up and running (that is, configuration wise, aligned to the Amazon Managed Workflows for Apache Airflow service, MWAA), together with some VSCode plugins.
-
AWS open source newsletter #145
Feb 20, 2023 | 24 minute read
Feb 20th, 2023 - Instalment #145 Welcome to edition #145 of the AWS open source newsletter. I hope some of you were able to catch the new Build on Open Source show we live streamed last Friday. You can catch up and replay the session by clicking on this link, where we went over a number of projects from this and a few previous newsletters, and we had special guest Valter who walked us through his project terraform-dev-containers.
-
Configuring the KubernetesPodOperator on Managed Workflows for Apache Airflow (MWAA) - non OIDC Amazon EKS Clusters
Jan 26, 2023 | 5 minute read
Configuring the KubernetesPodOperator on Managed Workflows for Apache Airflow (MWAA) - non OIDC Amazon EKS Clusters Today I came across an interesting question around the use of the KubernetesPodOperator working on EKS Clusters where you have not configured OIDC. They had followed my blog post, and when it came to running the DAG, they got the following error: [2023-01-26, 13:03:18 UTC] {{kubernetes_pod.py:566}} INFO - Creating pod mwaa-pod-test.0ab20a7075b84175b2a9a3fe32796f53 with labels: {'dag_id': 'kubernetes_pod_example_iam_authenticator', 'task_id': 'pod-task', 'execution_date': '2023-01-26T130310.
-
AWS open source newsletter #142
Jan 23, 2023 | 14 minute read
January 23rd, 2023 - Instalment #142 Welcome Welcome to edition #142 of the AWS open source newsletter. We have another great round up of new projects for you to get stuck into. Here is just a taste of some of the projects, kicking off with “sls-mentor” a new tool to help you assess your serverless applications, “subnet-watcher”, a tool to help you monitor your IP addresses, “aws-cdk-web-administered-apps” a very nice reference solution for applications that have a user and admin component, “serverless-newsletter-app” if you are looking for a newsletter solution and want to host your own, look here first, “aws-iot-with-privatelink” shows you how you can use private networks for your IoT traffic, “emr-spark-benchmark” a benchmarking tool for assessing your Amazon EMR environments, and “update-aws-ip-ranges” to keep automatically updated on Amazon’s IP address ranges.
-
AWS open source newsletter #141
Jan 15, 2023 | 13 minute read
January 16th, 2023 - Instalment #141 Welcome Welcome to the AWS open source newsletter of 2023, edition #141. This week we have more new projects for you to practice your four freedoms, including “distributed-compute-on-aws-with-cross-regional-dask”, a solution to simplify distributed compute using Dask, “amazon-emr-serverless-image-cli” a tool to verify your Amazon EMR custom container images, “serverless-run-watch” a tool to help accelerate your local development if you are using the Serverless Framework, “aws-sso-auto-expand-accounts” a quick browser extension for those using AWS SSO, “basti” a cool Bastion Host alternative, “klotho” generate cloud native code from your code, “amazon-route53-hosted-zone-sync” a nice solution for hybrid DNS use cases, and many more.
-
Running the KubernetesPodOperator in different AWS accounts when using Amazon Managed Workflows for Apache Airflow v2.x
Jan 9, 2023 | 18 minute read
Running KubernetesPodOperator in different AWS accounts Update, August 14th: I wanted to update to a newer version of MWAA, so I have tested the original blog post against EKS 1.24 and MWAA version 2.4.3. I also had a few messages about whether this would work across different AWS regions. The good news is that it does. I have also put together a repo for this here. I thought that I would also check/update that it works for newer versions of MWAA, so I had 2.
-
AWS open source newsletter #140
Jan 9, 2023 | 19 minute read
January 9th, 2023 - Instalment #140 Welcome Happy New Year and welcome to the first AWS open source newsletter of 2023, edition #140. If you have not already checked it out, I put together a short retrospective summary of 2022 in the post, AWS open source newsletter - 2022 in review. There are some interesting facts and figures in there. I am also taking time to collect feedback from readers to help shape where this newsletter goes in 2023.
- oss-newsletter
- aws open source
- MySQL
- PostgreSQL
- Next.js
- AWS SDK for Java
- AWS ParallelCluster
- Docker
- MariaDB
- Amazon EKS
- Kubernetes
- MQTT
- ArgoCD
- AWS Distro for OpenTelemetry
- Prometheus
- DAMON
- Crossplane
- Log4Shell
- .NET
- Apache Spark
- Apache Kafka
- Apache Flink
- Apache Pinot
- Apache Superset
- Apache NiFi
- Delta Lake
- OpenShift
- Redis
- Red Hat Enterprise Linux
- SUSE Linux Enterprise Server
- Ubuntu Pro
- AWS Copilot
- RabbitMQ
- Apache Airflow
- Rust
- Terraform
- Amazon EMR
- Apache ShardingSphere-Proxy
-
AWS open source news and updates #137
Nov 25, 2022 | 30 minute read
November 25th, 2022 - Instalment #137 Welcome Welcome to the AWS open source newsletter, edition #137. As it is re:Invent next week, I will be publishing the newsletter early as I am heading out on Monday. I will be in Las Vegas talking with open source Builders, hanging out on the Open Source Kiosk in the AWS Village, and doing some talks. If you are coming, I would love to meet some of you, so get in touch.
- oss-newsletter
- aws open source
- re:Invent
- GraphQL
- Grafana
- Prometheus
- AWS Toolkits for JetBrains
- AWS Toolkits for VS Code
- AWS Amplify
- NodeJS
- MariaDB
- PostgreSQL
- Flutter
- React
- Apache Iceberg
- Apache Airflow
- Apache Flink
- Apache ShardingSphere
- AutoGluon
- AWS ParallelCluster
- Kubeflow
- NGINX
- Finch
- Amazon EMR
- Trino
- Apache Hudi
- O3DE
- Apache Kafka
- OpenSearch
- MLFlow
-
AWS open source news and updates #136
Nov 21, 2022 | 20 minute read
November 21st, 2022 - Instalment #136 Welcome Welcome to the AWS open source newsletter, edition #136, as featured on the latest episode of Build on Open Source. This week we feature new projects including “dynamoit” a JavaFX gui for Amazon DynamoDB, “building-apache-kafka-connectors”, “msk-config-providers”, and “msk-serverless-data-pipeline” projects to help make your life easier when working with Apache Kafka, “stowrs-to-s3” a tool for working with STOWRS data on AWS, “aws-device-lobby” a tool to make onboarding devices into AWS IoT Core easier, “aws-graviton-run-confidential-ml-workloads-using-nitro-enclaves” a nice example of how you can do Confidential Computing for machine learning use cases, “aws-hpc-builder” a tool to help you manage your open source HPC tools, and many more.
-
AWS open source news and updates #129
Sep 30, 2022 | 22 minute read
September 30th, 2022 - Instalment #129 Welcome Welcome to the AWS open source newsletter, edition #129. We have loads of great new projects this week, with plenty of variety to keep you all interested. We have “aws-ecr-cleaner”, a great tool to help you manage your container images, “dotnet-lambda-sql-server-proxy” that shows you how you can use RDS Proxy with SQL Server and why, “minecraft-server-dashboard” perfect for those running their own minecraft servers, “YATAS” and “aws-security-survival-kit” for those working on security and governance, “aws-lambda-handler-cookbook” useful recipes to get you going, “autonomous-driving-data-framework” for folks working in the automotive space, and many more - make sure you check all the projects out.
- oss-newsletter
- aws open source
- Apache TinkerPop
- Gremlin
- AWS CDK
- Apache Hudi
- Apache Iceberg
- Delta Lake
- Apache Cassandra
- GraphQL
- Hive
- Apache Spark
- Apache Kafka Streams
- Apache Flink
- Apache Pinot
- Apache Superset
- Apache Airflow
- Keycloak
- Kubeapps
- Babelfish for PostgreSQL
- Linux
- Ubuntu
- PHP
- MySQL
- Bottlerocket
-
AWS open source news and updates #128
Sep 23, 2022 | 15 minute read
September 23rd, 2022 - Instalment #128 Welcome Welcome to the AWS open source newsletter, edition #128. I hope some of you were able to catch Derek and myself sharing a peek at this edition, and enjoyed our special guest, Gethin Webster as he walked us through the open source Cloudscape project. If you want to catch up on that event, check out the video here. This week's new open source projects include “Guardian”, a command line tool that produces nice reports on your AWS environments, “cdk-scheduler”, a new construct that helps you schedule your CDK deployments, “terraform-iam-policy-validator” a script that helps you validate your Terraform scripts, “aws-cdk-golden-ami-pipeline” an example of how to build an automated pipeline to build Amazon Machine Images (AMIs), and many more.
-
AWS open source news and updates #120
Jul 15, 2022 | 16 minute read
July 15th, 2022 - Instalment #120 Welcome to regular and new readers alike, to the AWS open source newsletter episode #120. The observant will have noticed that it has been two weeks since the last newsletter. We all need some time off, and I spent most of the time cycling around the hills of a new cycle route called King Alfred's Way - highly recommended. There have been some great new projects created over the past couple of weeks, and of course I have you covered and have shared them below.
-
AWS open source news and updates #119
Jul 1, 2022 | 15 minute read
July 1st, 2022 - Instalment #119 Welcome to regular and new readers alike, to the AWS open source newsletter episode #119. This week we feature more new open source projects, such as “cdk-bill-bot”, a tool that can help you reduce AWS bill surprises, “steampipe-mod-aws-perimeter” helps you look for resources that are publicly accessible, “aws-cloudformation-diagrams” is a nice visualisation tool for CloudFormation users, “aws-swagger-ui” a project to help you set up Swagger UI for API Gateway, “kinesis-hot-shard-advisor” a handy tool that helps you identify whether you have hot key or hot shard issues on your Kinesis data streams, and many more.
-
AWS open source news and updates #118
Jun 24, 2022 | 14 minute read
June 24th, 2022 - Instalment #118 Welcome to regular and new readers alike, to the AWS open source newsletter episode #118. This week we feature more new open source projects, such as “seed-farmer”, an orchestration tool modelled after GitOps deployments, “aws-proton-plugins-for-backstage” Backstage plugins for interacting with AWS Proton, “dcv-gnome-shell-extension” is a GNOME Shell extension to provide functionalities required by NICE DCV, “simpleiot-arduino” an Arduino library to integrate with the SimpleIOT framework, “event-driven-weather-forecasts” an event driven weather forecasting demo, and many more.
-
AWS open source news and updates #110
Apr 29, 2022 | 14 minute read
April 29th, 2022 - Instalment #110 Newsletter #110. Welcome to edition #110 of the AWS open source newsletter. It has been a busy week, with the AWS Summit London happening this week (where I was lucky enough to do a session on Apache Airflow) meaning I am publishing this a little later than I had planned. We have more great new projects this week, including a project that helps make it easier to deploy your static and dynamic applications, a tool that provides help in managing the long term health of your AWS Data Lake, a cool project to help you replicate data from a Kinesis Data Stream across regions, a nice CloudWatch dashboard widget that summarises your CloudFormation stacks, and many more - so check them out.
-
AWS open source news and updates #107
Apr 4, 2022 | 17 minute read
April 4th, 2022 - Instalment #107 Newsletter #107. Welcome to edition #107 of the AWS open source newsletter, and we have a bumper edition this week packed with more great new open source projects and content for you to consume. Topics featured this week include optimising open source big data tools, developer tooling, case studies and we even have some great open source content for .NET Core developers. This week's projects include a really handy browser plugin called “aws-search-extension”, that lets you search and find developer information from the AWS docs, a tool that will help you detect whether you have configured or are using dockershim in your Kubernetes clusters, a library to help you integrate Amazon Cognito in your Laravel PHP applications, and plenty more developer tools and sample projects.
-
AWS open source news and updates #105
Mar 20, 2022 | 15 minute read
March 21st, 2022 - Instalment #105 Newsletter #105. Welcome to edition #105 of the AWS open source news and updates, where we bring you the latest open source projects, posts, events, and much more. This week's new projects include the latest work in progress from AWS Hero Ian Mckay, “iamfast” is an AWS IAM policy generation tool that is in early stages but promises to be very useful indeed. “iasql-engine” is a tool that models cloud infrastructure as data, “ssm-patch-portal” provides a nice GUI front end to simplify patching with AWS Systems Manager, a new crowdsourced guide that contains learning resources for AWS, a business intelligence platform built using open source technologies from the NHS, and many more.
-
AWS open source news and updates #104
Mar 14, 2022 | 17 minute read
March 14th, 2022 - Instalment #104 Newsletter #104. Welcome to #104 of the AWS open source news and updates newsletter, bringing you the latest updates from around the AWS and Communities. This week we have yet more great new open source projects, including a Deno runtime for your Lambda functions, data lineage and data testing tools, a performance testing tool for Apache Kafka, an ELT tool for Amazon Redshift, an Amazon S3 archive tool, and many more.
-
Contributing to the Apache Airflow project - Part Two
Mar 11, 2022 | 11 minute read
This is the second and concluding post providing an overview of the experience and journey contributing to the Apache Airflow project. You can catch Part One here. Contributing to Apache Airflow - Part Deux In Part One of this series, we took our first steps in contributing to the Apache Airflow project. With a little bit more knowledge and experience, our first interactions with the Airflow community, we are ready to start exploring how the code works and see how we might go about fixing this.
-
Orchestrating hybrid workflows using Amazon Managed Workflows for Apache Airflow (MWAA)
Mar 7, 2022 | 46 minute read
Using Apache Airflow to orchestrate hybrid workflows In some recent discussions with customers, a recurring topic was how open source is increasingly being used as a common mechanism to help build re-usable solutions that can protect investments in engineering and development time and skills, and that work across on premises and Cloud environments. In 2021 my most viewed blog post talked about how you can build and deploy containerised applications, anywhere (Cloud, your data centre, other Clouds) and on anything (Intel and Arm).
-
AWS open source news and updates #103
Mar 7, 2022 | 15 minute read
March 7th, 2022 - Instalment #103 Newsletter #103. Welcome to edition #103 of the AWS open source news and updates. This week's featured new open source projects include botocove (a decorator that helps you run your functions across your AWS accounts easily), functionless (a TypeScript plugin that transforms TypeScript code into Service-to-Service integrations), replibyte (a tool to replicate your PostgreSQL data), aws-security-bulletin-alert (notifies you of new AWS Security Bulletins and sends out E-Mail notifications via Amazon SES), and many more.
-
AWS open source news and updates #102
Feb 28, 2022 | 13 minute read
Feb 28th, 2022 - Instalment #102 Newsletter #102. Welcome to edition #102 of the AWS open source news and updates newsletter, and this week we have a super collection of new open source projects that I am really excited to share. First up we have the AWS DataOps Development Kit, which uses AWS CDK under the covers, and is an open source development framework to help you build data workflows. Threatmapper is an open source cloud native security observability platform, which looks easy to use and has some good visualisations.
-
AWS open source news and updates #101
Feb 21, 2022 | 12 minute read
Feb 21st, 2022 - Instalment #101 Newsletter #101. There is nothing basic and fundamental about edition 101 of the AWS open source newsletter, with another great round up of new open source projects including eks-creation-engine from the folks at Lightspin helping you all to stay safer with this handy tool you should check out, idp-scim-sync to help users of AWS SSO who want to synchronise with their Google Workspace Directory, typecart an analysis tool for proof evolution and many other great projects and sample code.
-
AWS open source news and updates #100
Feb 14, 2022 | 12 minute read
Feb 14th, 2022 - Instalment #100 Newsletter #100. Happy Valentines everyone, and welcome to this landmark 100th edition of this newsletter. This week we celebrate the love that many builders have for open source with more great new open source projects and content. Cuddle up to new projects that will help you build scalable systems, simplify your work with Amazon DynamoDB, integrate your .NET applications with OpenSearch, keep on top of your VPC networks, and more.
-
Contributing to the Apache Airflow project - Part One
Feb 10, 2022 | 18 minute read
Contributing to Apache Airflow Introduction In this series of posts, I am going to share what I learn as I embark on my first upstream contribution to the Apache Airflow project. The purpose is to show you how typical open source projects like Apache Airflow work, how you engage with the community to orchestrate change and hopefully inspire more people to contribute to this open source project. I will post regular updates as a series of posts, as the journey unfolds.
-
AWS open source news and updates #99
Feb 7, 2022 | 12 minute read
Feb 7th, 2022 - Instalment #99 Newsletter #99. While Nena gave you 99 red balloons, I give you the latest version of the AWS open source newsletter. This week we feature more great new open source projects including a project to help you with drift detection in your CloudFormation stacks, new Terraform modules, an open-source Prometheus exporter, some AWS CDK resources and sample projects and more. This week's AWS and Community posts cover PostgreSQL, Apache Airflow, AWS CDK, Redis, GraphQL, Apollo GraphQL, Kubernetes, Amazon EKS and more.
-
AWS open source news and updates #98
Jan 29, 2022 | 14 minute read
Jan 31st, 2022 - Instalment #98 Newsletter #98. Welcome to another edition of AWS open source news and updates, featuring more new open source projects. This week, these include eventbridge-assistant (a VSCode plugin to help you whilst you are developing with Amazon EventBridge), stratus-red-team (a tool you can use to emulate offensive attack techniques), critter (AWS Config rule integration testing), syne-tune-s3-transfer (an example of how to apply the distributed parameter search library to optimise download performance), karpenter-terraform (a Terraform module to help you automate deployment of Karpenter), and a couple of super interesting open source solutions covering last mile delivery and software defined radio.
-
AWS open source news and updates #97
Jan 22, 2022 | 12 minute read
Jan 22nd, 2022 - Instalment #97 Newsletter #97. Welcome to another edition of the AWS open source newsletter, packed with more great new open source projects, content, and events. This week, we have new projects that help you improve security by de-obfuscating strings, a library to help you automate the configuration of your build pipelines, a new Terraform module, a nice new VSCode plugin that will help you when working with IAM, and several more.
-
Setting up MWAA to use a KMS key
Dec 14, 2021 | 6 minute read
Introduction In a previous post, I shared how you can use AWS CDK to provision your Apache Airflow environments using the Managed Workflows for Apache Airflow service (MWAA). I was contacted this week by Michael Grabenstein, who flagged an issue with the code in that post. The post used code that configured a KMS key for the MWAA environment, but when trying to deploy the app it would fail with the following error:
-
Integrating Amazon Timestream in your Amazon Managed Workflows for Apache Airflow v2.x
Sep 23, 2021 | 28 minute read
Integrating with Amazon Timestream in your Apache Airflow DAGs Amazon Timestream is a fast, scalable, and serverless time series database service perfect for use cases that generate huge amounts of events per day, optimised to make it faster and more cost effective than using relational databases. I have been playing around with Amazon Timestream to prepare for a talk I am doing with some colleagues, and wanted to see how I could integrate it with other AWS services in the context of leveraging some of the key capabilities of Amazon Timestream.
-
Reading and writing data across different AWS accounts with Amazon Managed Workflows for Apache Airflow v2.x
Sep 7, 2021 | 13 minute read
Reading and writing data across different AWS accounts in your Apache Airflow DAGs As regular readers will know, I sometimes lurk in the Apache Airflow Slack channel to see what is going on. If you are new to Apache Airflow, or want to get a deeper understanding then I highly recommend spending some time here. The community is super welcoming and eager to help new participants. It was during a recent session that I came across an interesting problem that one of the builders was having, which was how to access (read/write) data in an S3 bucket which was in a different account to the one hosting Amazon Managed Workflows for Apache Airflow (MWAA).
-
Working with parameters and variables in Amazon Managed Workflows for Apache Airflow
Jul 27, 2021 | 36 minute read
Maximising the re-use of your DAGs in MWAA During some recent conversations with customers, one of the topics that they were interested in was how to create re-usable, parameterised Apache Airflow workflows (DAGs) that could be executed dynamically through the use of variables and/or parameters (either submitted via the UI or the command line). This makes a lot of sense, as you may find that you repeat similar tasks in your workflows, and so this approach allows you to maximise the re-use of that work.
-
Working with Amazon EKS and Amazon Managed Workflows for Apache Airflow v2.x
Jun 10, 2021 | 11 minute read
Introduction The Apache Airflow slack channel is a vibrant community of open source builders that is a great source of feedback, knowledge and answers to problems and use cases you might have when trying to do stuff with Apache Airflow. This week I picked up on someone seeing errors with Amazon EKS, and so I thought what better time to try out the new Apache Airflow 2.x version that was recently launched in Amazon Managed Workflows for Apache Airflow (MWAA).
-
Working with the RedshiftToS3Transfer operator and Amazon Managed Workflows for Apache Airflow
May 15, 2021 | 18 minute read
Introduction Inspired by a recent conversation within the Apache Airflow open source slack community, I decided to channel the inner terrier within me to tackle this particular issue, around getting an Apache Airflow operator (the protagonist for this post) to work. I found the perfect catalyst in the way of the original launch post of Amazon Managed Workflows for Apache Airflow (MWAA). As is often the way, diving into that post (creating a workflow to take some source files, transform them and then move them into Amazon Redshift) led me down some unexpected paths to here, this post.
-
Using AWS CDK to deploy your Amazon Managed Workflows for Apache Airflow environment
Apr 28, 2021 | 11 minute read
Update: I am grateful to Michael Grabenstein for spotting some mistakes in the original post/code. I hope these have now been rectified in this post. Using AWS CDK to deploy your Amazon Managed Workflows for Apache Airflow environment What better way to celebrate CDK Day than to return to a previous blog where I wrote about automating the installation and configuration of Amazon Managed Workflows for Apache Airflow (MWAA), and take a look at doing the same thing but this time using AWS CDK.
-
Automating your ELT Workflows with Managed Workflows for Apache Airflow - Part Two
Apr 21, 2021 | 17 minute read
Part Two - Automating Amazon EMR In Part One, we automated an example ELT workflow on Amazon Athena using Apache Airflow. In this post, Part Two, we will do the same thing but automate the same example ELT workflow using Amazon EMR. Make sure you recap the setup from Part One. All the code so you can reproduce this yourself can be found in the GitHub repository here. Automating Amazon EMR
-
Automating your ELT Workflows with Managed Workflows for Apache Airflow - Part One
Apr 21, 2021 | 18 minute read
update: I have changed the post to use standard Apache Airflow variables rather than using AWS Secrets Manager. Part One - Automating Amazon Athena As part of an upcoming DevDay event, I have been working on how you can use Apache Airflow to help automate your Extract, Load and Transform (ELT) Workflows. Amazon Athena and Amazon EMR are two AWS services that help customers who have existing SQL skills/expertise and are looking at tools such as Presto or Apache Hive when undertaking those transformations.
-
Monitoring and logging with Amazon Managed Workflows for Apache Airflow
Feb 9, 2021 | 12 minute read
Part of a series of posts to support an up-coming online event, the Innovate AI/ML on February 24th, from 9:00am GMT - you can sign up here Part 1 - Installation and configuration of Managed Workflows for Apache Airflow Part 2 - Working with Permissions Part 3 - Accessing Amazon Managed Workflows for Apache Airflow Part 4 - Interacting with Amazon Managed Workflows for Apache Airflow via the command line Part 5 - A simple CI/CD system for your development workflow Part 6 - Monitoring and logging <- this post Part 7 - Automating a simple AI/ML pipeline with Apache Airflow In this post I will be covering Part 6, where to find logs to help you understand and troubleshoot your Apache Airflow workflows, and how you can monitor your Apache Airflow environments.
-
A simple CI/CD system for your Amazon Managed Workflows for Apache Airflow development workflow
Feb 3, 2021 | 16 minute read
updated Feb 19th Part of a series of posts to support an up-coming online event, the Innovate AI/ML on February 24th, from 9:00am GMT - you can sign up here Part 1 - Installation and configuration of Managed Workflows for Apache Airflow Part 2 - Working with Permissions Part 3 - Accessing Amazon Managed Workflows for Apache Airflow Part 4 - Interacting with Amazon Managed Workflows for Apache Airflow via the command line Part 5 - A simple CI/CD system for your development workflow <- this post Part 6 - Monitoring and logging Part 7 - Automating a simple AI/ML pipeline with Apache Airflow In this post I will be covering Part 5, how you can setup a very simple CI/CD setup to enable faster development of your Apache Airflow DAGs.
-
Interacting with Amazon Managed Workflows for Apache Airflow via the command line
Feb 1, 2021 | 12 minute read
Part of a series of posts to support an up-coming online event, the Innovate AI/ML on February 24th, from 9:00am GMT - you can sign up here Part 1 - Installation and configuration of Managed Workflows for Apache Airflow Part 2 - Working with Permissions Part 3 - Accessing Amazon Managed Workflows for Apache Airflow environments Part 4 - Interacting with Amazon Managed Workflows for Apache Airflow via the command line <- this post Part 5 - A simple CI/CD system for your development workflow Part 6 - Monitoring and logging Part 7 - Automating a simple AI/ML pipeline with Apache Airflow In this post I will be covering Part 4, how you can interact with and access Apache Airflow via the command line.
-
Accessing your Amazon Managed Workflows for Apache Airflow environments
Jan 28, 2021 | 8 minute read
Part of a series of posts to support an upcoming online event, Innovate AI/ML, on February 24th from 9:00am GMT - you can sign up here. Part 1 - Installation and configuration of Managed Workflows for Apache Airflow; Part 2 - Working with Permissions; Part 3 - Accessing Amazon Managed Workflows for Apache Airflow environments <- this post; Part 4 - Interacting with Amazon Managed Workflows for Apache Airflow via the command line; Part 5 - A simple CI/CD system for your development workflow; Part 6 - Monitoring and logging; Part 7 - Automating a simple AI/ML pipeline with Apache Airflow. In this post I will be covering Part 3: how you can access and interact with your Apache Airflow environments.
-
Working with permissions in Amazon Managed Workflows for Apache Airflow
Jan 27, 2021 | 10 minute read
Part of a series of posts to support an upcoming online event, Innovate AI/ML, on February 24th from 9:00am GMT - you can sign up here. Part 1 - Installation and configuration of Managed Workflows for Apache Airflow; Part 2 - Working with Permissions <- this post; Part 3 - Accessing Amazon Managed Workflows for Apache Airflow environments; Part 4 - Interacting with Amazon Managed Workflows for Apache Airflow via the command line; Part 5 - A simple CI/CD system for your development workflow; Part 6 - Monitoring and logging; Part 7 - Automating a simple AI/ML pipeline with Apache Airflow. In this post I will be covering Part 2: how to control access to Apache Airflow following best practices such as default no access and least privilege.
-
Automating the installation and configuration of Amazon Managed Workflows for Apache Airflow
Jan 26, 2021 | 15 minute read
Updated August 25th: thanks to Philip T for spotting a typo in the CloudFormation code below - it is ok in the GitHub repo, and I have now fixed it below. Part of a series of posts to support an upcoming online event, Innovate AI/ML, on February 24th from 9:00am GMT - you can sign up here. Part 1 - Installation and configuration of Managed Workflows for Apache Airflow <- this post; Part 2 - Working with Permissions; Part 3 - Accessing Amazon Managed Workflows for Apache Airflow environments; Part 4 - Interacting with Amazon Managed Workflows for Apache Airflow via the command line; Part 5 - A simple CI/CD system for your development workflow; Part 6 - Monitoring and logging; Part 7 - Automating a simple AI/ML pipeline with Apache Airflow. In this post I will be covering Part 1: automating the installation and configuration of Managed Workflows for Apache Airflow (MWAA).