MWAA
-
AWS open source newsletter #204
Oct 22, 2024 | 25 minute read
Edition #204 Welcome to issue #204 of the AWS open source newsletter, the newsletter where we try and provide you the best open source on AWS content. Apologies for the long wait since the last edition; I will have to do better. Thanks for the lovely messages and feedback I have received over the past few weeks. This edition is for you! As always, we have more great new projects to check out, which include projects that surface your AWS costs in Home Assistant, a generative AI tool that you can use to ask questions about your code base, a git large file storage (LFS) extension that lets you use Amazon S3, and a handy network cost calculator.
- oss-newsletter
- aws open source
- Home Assistant
- Godot
- Valkey
- Keycloak
- Apache Airflow
- MWAA
- PostgreSQL
- Deequ
- Kubernetes
- ArgoCD
- OTEL
- Grafana
- Spring Boot
- Amazon Corretto
- ROSA
- OpenShift
- Kubecost
- Amazon Linux 2023
- Karpenter
- MySQL
- MariaDB
- Apache Flink
- Apache Kafka
- OpenZFS
- InfluxDB
- AWS ParallelCluster
- Lustre
- Prometheus
- Finch
- Ubuntu
- Cedar
-
AWS open source newsletter #203
Aug 27, 2024 | 28 minute read
Welcome to the AWS open source newsletter, the newsletter where we try and provide you the best open source on AWS content. As always, more great new projects are featured in this edition, #203. Projects to check out include: how you can proxy OpenAI requests through Amazon Bedrock, security tools that help you stay one step ahead of bad actors, a way of implementing CDK Pipelines in a less opinionated way, a tool that helps you validate your AWS IAM policies, a toolkit to help get you started with good practices when creating CloudFormation templates, some demo code that demonstrates how you can implement zero downtime updates to your applications, as well as some really cool demos and use cases of generative AI in action (too many to mention, so check them all out!
- oss-newsletter
- aws open source
- Langfuse
- Steampipe
- Ray
- Apache Spark
- AWS Amplify
- Flutter
- Valkey
- O3DE
- AWS CDK
- LangChain
- Kubernetes
- Amazon EKS
- Argo Workflows
- OpenTofu
- Bottlerocket
- Karpenter
- OpenTelemetry
- Apache Flink
- Apache Pinot
- Apache Kafka
- openCypher
- Apache Iceberg
- Apache Airflow
- MWAA
- MySQL
- PostgreSQL
- Deequ
- OCSF
- GraphStorm
- OpenShift
- Amazon EMR
- ActiveMQ
- Red Hat Enterprise Linux
- Cedar
-
AWS open source newsletter #201
Jul 10, 2024 | 28 minute read
Edition #201 Welcome to the AWS open source newsletter, issue #201, your trusted source for the very best open source on AWS content. This week's new projects for you to practice your four freedoms include generative AI infused projects to help you generate your docs, streamline the setting up of your AWS resources, a new experimental framework for building document based workflows, and a cool demo that showcases how you can use generative AI to help translate American Sign Language.
- oss-newsletter
- aws open source
- PHP
- Apache Airflow
- MWAA
- Node.js
- LLRT
- Kubernetes
- Amazon EKS
- Prometheus
- Grafana
- eksctl
- Valkey
- LangChain
- Project Lakechain
- AWS Amplify
- Istio
- Apache Iceberg
- Apache Kafka
- Apache Cassandra
- PyTorch
- Apache httpd
- Babelfish for Aurora PostgreSQL
- Apache Flink
- PostgreSQL
- MySQL
- OpenSearch
- OpenZFS
- Amazon Linux
- FreeRTOS
- RabbitMQ
- AWS ParallelCluster
- Open Container Initiative
- Smithy
- Cedar
- sbt
-
AWS open source newsletter #200
Jun 24, 2024 | 20 minute read
Edition #200 Welcome to a milestone edition of this newsletter, number #200!! Wow, it feels like quite an achievement. Before diving into this newsletter, a big thank you for sticking with me. Time has flown by so quickly, and I am looking forward to the next 100. As I have done in a few of the previous milestone issues, I wanted to share a few interesting stats from sharing open source projects with you over the past few years.
-
AWS open source newsletter #198
May 28, 2024 | 25 minute read
Edition #198 Welcome to issue #198 of the AWS open source newsletter, the newsletter where we try and provide you the best open source on AWS content. In this issue we feature new projects that provide integration of .NET Aspire with AWS resources, an automated data discovery tool to find data in your AWS environments, a tool to help incorporate good practices when building SaaS solutions, a cost allocation dashboard for your Kubernetes workloads, a project that might help you mitigate costs around Internet Gateway, and a few generative AI demos around food, news, and social media which you should definitely check out.
- oss-newsletter
- aws open source
- Aspire
- Kubernetes
- Amazon EKS
- Leapp
- OpenTelemetry
- AWS CDK
- LLRT
- Valkey
- PostgreSQL
- InfluxDB
- High Performance Software Foundation
- Karpenter
- Multus
- Kata
- Grafana
- Prometheus
- Apache Flink
- Zingg
- Apache Hudi
- Apache Iceberg
- MySQL
- Apache Tomcat
- WordPress
- AWS Amplify
- Apache Airflow
- MWAA
- OpenSearch
- Apache Kafka
- Bottlerocket
- Amazon EMR
-
AWS open source newsletter #197
May 13, 2024 | 21 minute read
Edition #197 Welcome to issue #197 of the AWS open source newsletter, the newsletter where we try and provide you the best open source on AWS content. Like in previous editions of this newsletter, we feature new projects for you to practice your four freedoms. We have some great projects, including a sprinkling of repos that look to help you benchmark and assess your generative AI models and agents, a new fruity framework for building document understanding applications, a nice container command line tool that sysadmins will love, a tool to help you migrate your CodeCommit repositories, a really nice application of using generative AI to help automate CVE findings, and a neat generative AI newsletter generation demo.
-
AWS open source newsletter #196
Apr 29, 2024 | 22 minute read
Edition #196 Welcome to issue #196 of the AWS open source newsletter, the newsletter where we try and provide you the best open source on AWS content. As always, more great new projects are featured in this edition of the newsletter, including a link to the Valkey repo, a nice GUI based project to help you build orchestration workflows that uses Apache Airflow under the covers, a tool to help you find signals through the noise of your security logs, a project to help you run serverless tasks in a cron like fashion, a command line runner for Amazon CodeCatalyst, a tool to help you simplify the deployment of Cruise Control on Amazon MSK, a nice Mac client for experimenting with Amazon Bedrock, and some really cool demo apps, the pick of which (for me) is a nice way of surfacing up your Amazon Bedrock models in a way that existing applications that expect an API key can use.
- oss-newsletter
- aws open source
- Valkey
- Apache Airflow
- MWAA
- OpenSearch
- LangChain
- PostgreSQL
- WordPress
- RAGmap
- RAGxplorer
- Cedar
- AWS CDK
- Lambda Web Adapter
- Postfix
- Spring Boot
- Amazon Corretto
- Amazon EKS
- Kubernetes
- Karpenter
- KEDA
- Prometheus
- OPA
- Amazon EMR
- PySpark
- MySQL
- Open JD
- AWS Amplify
- GraphQL
- AWS PDK
- Apache Livy
- Nodestream
-
AWS open source newsletter #195
Apr 15, 2024 | 21 minute read
Edition #195 Welcome to issue #195 of the AWS open source newsletter, the newsletter where we try and provide you the best open source on AWS content. This week I am heading out to Everything Open, and looking forward to meeting the community in Gladstone. I will be talking about Cedar, and showing why it is important and how it works (demo is working lovely now). I am now on the third week of my open source roadshow, which is why I have had to change the publishing of this newsletter to every other week - at least until I get back home.
-
AWS open source newsletter #187
Feb 4, 2024 | 15 minute read
Edition #187 Welcome to issue #187 of the AWS open source newsletter, the newsletter where we try and provide you the best open source on AWS content. As always, this week we start with a round up of some freshly baked new projects for you to practice your four freedoms. This week we have new projects that help you optimise working with EBS volumes on EC2, a tool to help you document your architectures, a large language model benchmarking tool, a tool to help you optimise your S3 storage files, a data validation framework, and a really nice Java workshop.
-
AWS open source newsletter #185
Jan 22, 2024 | 13 minute read
Edition #185 Welcome to issue #185 of the AWS open source newsletter, the newsletter where we try and provide you the best open source on AWS content. As always, this week we start with a round up of some freshly baked new projects for you to practice your four freedoms. This week we have projects that allow you to export your PartyRock applications, a tool to help you reduce hallucinations in your large language models, a new client for Redis, a tool to help you access the AWS Partner Network, as well as sample projects that look at how you can use large language models to build a new reader and building pipelines using CloudFormation.
-
AWS open source newsletter #183
Jan 8, 2024 | 26 minute read
January 8th, 2024 - Instalment #183 Happy new year and welcome to the first edition of the AWS open source newsletter of 2024, number #183. The big news for 2024 is the move from dev.to to community.aws as the “home” for the AWS open source newsletter, although it will still be posted on dev.to as well. Let me know what you think, community.aws has some top notch content that many readers might not be aware of.
- oss-newsletter
- aws open source
- Cedar
- Projen
- Prometheus
- Grafana
- Kubernetes
- Amazon EKS
- Mage
- OpenSearch
- AWS CDK
- eBPF
- Istio
- Kubecost
- PostgreSQL
- MySQL
- MariaDB
- ActiveMQ
- Apache Airflow
- MWAA
- Apache Spark
- Ray
- AWS Amplify
- Spring Boot
- Amazon EMR
- AWS Neuron
- Apache Cassandra
- OpenTelemetry
- Amazon Linux
- AWS ParallelCluster
- RabbitMQ
-
AWS open source newsletter #180
Nov 20, 2023 | 20 minute read
November 20th, 2023 - Instalment #180 Welcome to #180 of the AWS open source newsletter, the place for all your AWS and open source needs. As we ramp up to re:Invent, it is good to see that pre:Invent is giving us plenty of open source goodies. In this week's newsletter, we have some of those in the way of new projects such as res and aws-iatk, but we also have lots of really great content too.
- oss-newsletter
- aws open source
- Ragna
- ezsmdeploy
- SnapStart
- GraalVM
- Amazon Corretto
- Amazon EMR
- Apache Airflow
- LangChain
- AWS Copilot
- Karpenter
- MWAA
- Grafana
- Prometheus
- Amazon EKS
- Kubernetes
- Apache Flink
- Apache Kafka
- Avro
- Apache Cassandra
- Apache Spark
- Red Hat Linux
- Amazon Linux 2023
- Node.js
- AWS Amplify
- Redis
- MySQL
- PostgreSQL
- eksctl
- MapLibre
- Overture Maps
-
AWS open source newsletter #178
Nov 6, 2023 | 19 minute read
November 6th, 2023 - Instalment #178 Welcome to #178 of the AWS open source newsletter, the place for all your AWS and open source needs. This week we feature more new open source projects for you to practice your four freedoms. We have a useful tool that helps you synchronise your AWS IAM Identity Center users with the users you provision in your Amazon RDS databases, we share the AWS data solutions framework that helps you build data solutions following opinionated best practices, a resource explorer for your AWS accounts, a guardrails solution for your AWS account, and a couple of demo repositories that take a look at LocalStack and RSS.
-
AWS open source newsletter #177
Oct 30, 2023 | 19 minute read
October 30th, 2023 - Instalment #177 Welcome to #177 of the AWS open source newsletter, the Halloween special. You will find no tricks in this edition, only treats, with more new projects for you to check out and content that is a feast for your eyes. This week's new projects include a tool to help you easily deploy vector databases on Kubernetes, an observability toolkit, a tool to help you benchmark network latency, as well as lots of new demos on generative AI.
- oss-newsletter
- aws open source
- Bottlerocket
- KubeArmor
- NGINX
- WordPress
- Milvus
- Falcon-40B
- JupyterHub
- Dask
- Flux GitOps
- Crossplane
- Kubernetes
- Amazon EKS
- Babelfish for Aurora PostgreSQL
- PostgreSQL
- Linux
- Apache Hive
- Apache Spark
- Apache Kafka
- Apache Hudi
- Delta Lake
- Apache Iceberg
- OpenSearch
- Dremio
- OpenShift
- OpenCLIP
- Apache Airflow
- MWAA
- Amazon Corretto
- OpenJDK
- AWS CDK
-
AWS open source newsletter #175
Oct 16, 2023 | 25 minute read
October 16th, 2023 - Instalment #175 Welcome to #175 of the AWS open source newsletter, back after recharging in the wonderful countryside of Yorkshire. I am publishing this week's newsletter from Raleigh, North Carolina. All Things Open is happening this week, and you will catch me at the AWS booth where I will be showing off some cool open source stuff (Cedar, Apache Airflow, and a few others), and I also have a talk on Tuesday.
- oss-newsletter
- aws open source
- Redis
- MariaDB
- PostgreSQL
- Cloud Native Operational Excellence
- CNOE
- Crossplane
- Apache Airflow
- MWAA
- Apache Flink
- Amazon EMR
- Powertools for AWS Lambda
- Kubernetes
- Amazon EKS
- Prometheus
- Grafana
- AWS Distro for OpenTelemetry (ADOT)
- Karpenter
- CoreDNS
- etcd
- Istio
- SUSE
- AWS-LC
- Spring Boot
- SOCI
- Apache Kafka
- Amazon Corretto
- AWS CDK
- Amazon Linux
- Bottlerocket
- cdk8s
- AWS Amplify
- Stable Diffusion
- Next.js
-
AWS open source newsletter #170
Aug 21, 2023 | 21 minute read
August 21st, 2023 - Instalment #170 Welcome to edition #170 of the AWS open source newsletter, an oasis of open source goodness that features the latest new projects, essential reading, and must-view videos to quench the thirst of every open source developer. In this week's edition we have new projects that help you get on top of your IAM actions, a handy tool for knowing what your current AWS account service limits are from the command line, a tool to help you do database migrations, and some interesting and very detailed reference solutions for gaming, live streaming, and managing/exporting of your Amazon Cognito profiles.
- oss-newsletter
- aws open source
- AWS-LC
- Threat Composer
- AWS CDK
- AWS SAM
- AWS SDK for Java
- GitLab
- GraphQL
- AWS AppSync
- AWS Distro for OpenTelemetry (ADOT)
- PostgreSQL
- Apache Airflow
- SBOM
- Syft
- Apache Hudi
- Apache Iceberg
- Apache Spark
- collectd
- Grafana
- O3DE
- ROS
- Next.js
- OpenZFS
- MWAA
- Cedar
- Powertools for AWS Lambda
-
AWS open source newsletter #169
Aug 14, 2023 | 26 minute read
August 14th, 2023 - Instalment #169 Welcome to #169 of the AWS open source newsletter, featuring the latest and greatest open source news, projects, videos, and community content that you need to know about. Featured in this week's edition we have more great projects, including a new ODBC driver for Amazon Timestream database, a nice tool to simplify your ssh tunnelling, an essential VSCode extension for working with Cedar policies, a couple of projects that help you shift left and validate / monitor your policies, a solution to help you monitor your Apache Kafka environments, as well as some great sample applications.
- oss-newsletter
- aws open source
- OpenSearch
- AWS CDK
- Jupyter AI
- dbt
- Apache Airflow
- Managed Workflows for Apache Airflow
- MWAA
- Cedar
- cfnguard
- Grafana
- Prometheus
- Apache Kafka
- Amazon Timestream
- Open Cybersecurity Schema Framework
- OCSF
- AWS Lambda Web Adapter
- Smithy
- Apache Spark
- Linux
- Amazon Linux
- AWS ParallelCluster
- PostgreSQL
- Spring Boot
- Amazon EKS
- Kubernetes
- Mountpoint for Amazon S3
- MySQL
- Lustre
- OpenZFS
- Redis
- Amazon EMR
- Karpenter
- Seekable OCI
- SOCI
- Firecracker
-
AWS open source newsletter #166
Jul 24, 2023 | 22 minute read
July 24th, 2023 - Instalment #166 Welcome to #166 of the AWS open source newsletter. As always, we search high and low for the best and latest open source content, and I think you will love what we have lined up this week. This week's new projects include a library to help you manage and validate your environment variables when working with AWS Lambda, a new Rust based tool for interacting with your S3 buckets, an essential tool to help CDK developers remove a lot of the setup work, and a tool that helps you run Yocto embedded Linux build jobs in AWS.
-
AWS open source newsletter #165
Jul 17, 2023 | 17 minute read
July 17th, 2023 - Instalment #165 Welcome to #165 of the AWS open source newsletter, the only* newsletter that brings you the best and latest open source content. We have some great new projects this week, including a tool for IoT developers to help you validate your SQL statements, a command line interface tool for Amazon Verified Permissions, an Amazon DynamoDB estimation tool, and more. Also featured this week is content on Apache Iceberg, OpenSearch, PostgreSQL, Kubernetes, Powertools for AWS Lambda, Spring Boot, Babelfish for Aurora PostgreSQL, Karpenter, Apollo GraphQL, JupyterHub, dbt, Apache Airflow, Cedar, and Apache Flink.
-
AWS open source newsletter #161
Jun 19, 2023 | 19 minute read
June 19th, 2023 - Instalment #161 Welcome to #161 of the AWS open source newsletter, and another week for fresh, new open source projects and code for you to practice your four freedoms. This week's projects include tools that will help you create temporary elevated credentials, a new Java library that provides methods for encrypting and decrypting cryptographic materials, an Amazon DynamoDB wrapper for Node/TypeScript developers, and a solution to help you find and visualise data assets.
- oss-newsletter
- aws open source
- Falcon
- AWS CDK
- Keycloak
- Cedar
- FreeRTOS
- Apache Airflow
- MWAA
- Apache Spark
- Amazon EMR
- Apache Hudi
- Apache Iceberg
- Delta Lake
- Apache Flink
- OpenChatKit
- Kubernetes
- Pinniped
- Kubecost
- Karpenter
- ONNX
- Apache Kafka
- Babelfish for Aurora PostgreSQL
- AWS Amplify
- Next.js
- OpenSearch
- Flux
- ArgoCD
- KVM
-
AWS open source newsletter #160
Jun 12, 2023 | 17 minute read
June 12th, 2023 - Instalment #160 Welcome to #160 of the AWS open source newsletter, where we try and share all the important open source news, projects, events, and content that open source builders want. This week we have new projects that include tools to help you build data workflows, Terraform modules to help you incorporate temporary elevated access controls, integrating Tailscale to change your traffic flows, a neat AWS Lambda debugging tool, Go bindings for Cedar, and more.
-
AWS open source newsletter #154
Apr 24, 2023 | 22 minute read
April 24th, 2023 - Instalment #154 Hello and welcome to the AWS open source newsletter, #154, the newsletter that just keeps on giving….in this case, keeps giving you brand new open source projects to practice your four freedoms on. We have another great selection of projects for you as always, starting off with “cfn-teleport” an essential CLI tool for CloudFormation users, “aither” an interesting collaborative development tool using virtualised desktops on containers, “tabular-column-semantic-search” a tool to help you find similar types of data in your data lakes, “resource-lister” and “komiser” tools that help you manage your AWS resources, “resource-utilization” helps you track your AWS resource utilisation, “iot-network-traffic-control-and-load-testing-simulator” an interesting load and chaos testing example, and more!
- oss-newsletter
- aws open source
- Apache Oozie
- Apache Airflow
- Deep Java Library
- DJL
- mwaa-local-runner
- MWAA
- Trusted Language Extensions for PostgreSQL
- Supabase
- PostgreSQL
- Jupyter
- Grafana
- Opus
- Papermill
- Apache Spark
- HiveQL
- Amazon EKS
- Kubernetes
- Amazon EMR
- RStudio
- USBGuard
- Amazon Corretto
- AWS Amplify
- Apache Hive Metastore
- LoRaWAN
- gMSA
- Python
- OpenSearch
- AWS Copilot
- Marten
- Flutter
-
Getting mwaa-local-runner up on AWS Cloud9
Apr 17, 2023 | 5 minute read
Here is a quick recipe if you are looking to get mwaa-local-runner up and running on your Cloud9 developer setup. This might not be the most optimised way, so I am very happy to receive suggestions on how to improve this. What I will cover here is how to deploy mwaa-local-runner onto a standard Cloud9 IDE, deployed in a default VPC. Updating my AWS Cloud9 environment The first thing I needed to do was to increase the size of my local disk as Cloud9 only provides 10 GB of storage.
-
Exploring Shell Launch Scripts on Managed Workflows for Apache Airflow (MWAA) and mwaa-local-runner
Apr 17, 2023 | 9 minute read
Managed Workflows for Apache Airflow (MWAA) recently launched a new feature that a lot of folk had been asking for, which was the ability to add additional libraries, binaries, or environment variables when launching Airflow workers. If you missed the announcement, Amazon MWAA now supports Shell Launch Scripts. This new capability allows you to easily do this by creating a script and then configuring your MWAA environments to use that script during the start-up phase.
-
AWS open source newsletter #153
Apr 17, 2023 | 19 minute read
April 17th, 2023 - Instalment #153 Hello and welcome to the AWS open source newsletter, #153, as featured in the latest episode of Build on Open Source (S2E5). We have lots of great projects for you this week, with a strong ChatGPT influence. “pg_gpt”, “cw-logs-insights-gpt”, and “aiws” all integrate ChatGPT to help you do different things on AWS, “semantic-search-aws-docs” is a very interesting demo on how to build a more coherent search for your documentation, “aws-chime-chat-demo” a very nice demo using the Chime SDK, “ckia” is an open source AWS Trusted Advisor tool, “AWS_ED” helps you keep your local IP in sync with external DNS records, “cfnctl” provides a Terraform-like CLI experience to CloudFormation, and more!
-
AWS open source newsletter #152
Apr 10, 2023 | 16 minute read
April 10th, 2023 - Instalment #152 Hello and welcome to the AWS open source newsletter, #152, an Easter special. This week sees more great new projects, including “redshift-test-drive” a set of essential tools for Amazon Redshift users, “simple-database-archival-solution” a nice tool to help you archive your data, “attribution-gen” a Go tool to help you build open source attribution documents, “aws-glue-data-catalog-federation” a library to help you federate your Glue catalog, “subnet-utilization-monitor-for-amazon-vpc” a handy tool to keep on top of your IP address allocation, “AlexaGPT” a demo of integrating Alexa with you know what, and more!
-
Working with Managed Workflows for Apache Airflow (MWAA) and Amazon Redshift
Apr 7, 2023 | 19 minute read
I was recently looking at some Stack Overflow questions from the AWS Collective and saw a number of folk having questions about the integration between Amazon Redshift and Managed Workflows for Apache Airflow (MWAA). I thought I would put together a quick post that might help folk address what I saw as some of the common challenges. There is some code that accompanies this post, which you can find at the GitHub repository cdk-mwaa-redshift.
-
Configuring the KubernetesPodOperator on Managed Workflows for Apache Airflow (MWAA) - non OIDC Amazon EKS Clusters
Jan 26, 2023 | 5 minute read
Today I came across an interesting question around the use of the KubernetesPodOperator working on EKS Clusters where you have not configured OIDC. They had followed my blog post, and when it came to running the DAG, they got the following error: [2023-01-26, 13:03:18 UTC] {{kubernetes_pod.py:566}} INFO - Creating pod mwaa-pod-test.0ab20a7075b84175b2a9a3fe32796f53 with labels: {'dag_id': 'kubernetes_pod_example_iam_authenticator', 'task_id': 'pod-task', 'execution_date': '2023-01-26T130310.
-
AWS open source newsletter #142
Jan 23, 2023 | 14 minute read
January 23rd, 2023 - Instalment #142 Welcome to edition #142 of the AWS open source newsletter. We have another great round up of new projects for you to get stuck into. Here is just a taste of some of the projects, kicking off with “sls-mentor” a new tool to help you assess your serverless applications, “subnet-watcher”, a tool to help you monitor your IP addresses, “aws-cdk-web-administered-apps” a very nice reference solution for applications that have a user and admin component, “serverless-newsletter-app” if you are looking for a newsletter solution and want to host your own, look here first, “aws-iot-with-privatelink” shows you how you use private networks for your IoT traffic, “emr-spark-benchmark” benchmarking tool for assessing your Amazon EMR environments, and “update-aws-ip-ranges” keep automatically updated on Amazon’s IP address ranges.
-
AWS open source newsletter #141
Jan 15, 2023 | 13 minute read
January 16th, 2023 - Instalment #141 Welcome to the AWS open source newsletter of 2023, edition #141. This week we have more new projects for you to practice your four freedoms, including “distributed-compute-on-aws-with-cross-regional-dask”, a solution to simplify distributed compute using Dask, “amazon-emr-serverless-image-cli” a tool to verify your Amazon EMR custom container images, “serverless-run-watch” a tool to help accelerate your local development if you are using the Serverless Framework, “aws-sso-auto-expand-accounts” a quick browser extension for those using AWS SSO, “basti” a cool Bastion Host alternative, “klotho” generate cloud native code from your code, “amazon-route53-hosted-zone-sync” a nice solution for hybrid DNS use cases, and many more.
-
Running the KubernetesPodOperator in different AWS accounts when using Amazon Managed Workflows for Apache Airflow v2.x
Jan 9, 2023 | 18 minute read
Running KubernetesPodOperator in different AWS accounts Update, August 14th: I wanted to update to a newer version of MWAA, so I have tested the original blog post against EKS 1.24 and MWAA version 2.4.3. I also had a few messages about whether this would work across different AWS regions. The good news is that it does. I have also put together a repo for this here. I thought that I would also check/update that it works for newer versions of MWAA, so I had 2.
-
AWS open source news and updates #107
Apr 4, 2022 | 17 minute read
April 4th, 2022 - Instalment #107 Newsletter #107. Welcome to edition #107 of the AWS open source newsletter, and we have a bumper edition this week packed with more great new open source projects and content for you to consume. Topics featured this week include optimising open source big data tools, developer tooling, case studies and we even have some great open source content for .NET Core developers. This week's projects include a really handy browser plugin called “aws-search-extension”, that lets you search and find developer information from the AWS docs, a tool that will help you detect whether you have configured or are using dockershim in your Kubernetes clusters, a library to help you integrate Amazon Cognito in your Laravel PHP applications, and plenty more developer tools and sample projects.
-
Orchestrating hybrid workflows using Amazon Managed Workflows for Apache Airflow (MWAA)
Mar 7, 2022 | 46 minute read
Using Apache Airflow to orchestrate hybrid workflows In some recent discussions with customers, a recurring topic was how open source is increasingly being used as a common mechanism to help build re-usable solutions that protect investments in engineering and development time and skills, and that work across on-premises and Cloud environments. In 2021 my most viewed blog post talked about how you can build and deploy containerised applications, anywhere (Cloud, your data centre, other Clouds) and on anything (Intel and Arm).
-
AWS open source news and updates #100
Feb 14, 2022 | 12 minute read
Feb 14th, 2022 - Instalment #100 Newsletter #100. Happy Valentine's everyone, and welcome to this landmark 100th edition of this newsletter. This week we celebrate the love that many builders have for open source with more great new open source projects and content. Cuddle up to new projects that will help you build scalable systems, simplify your work with Amazon DynamoDB, integrate your .NET applications with OpenSearch, keep on top of your VPC networks, and more.
-
AWS open source news and updates #97
Jan 22, 2022 | 12 minute read
Jan 22nd, 2022 - Instalment #97 Newsletter #97. Welcome to another edition of the AWS open source newsletter, packed with more great new open source projects, content, and events. This week, we have new projects that help you improve security by de-obfuscating strings, a library to help you automate the configuration of your build pipelines, a new Terraform module, a nice new VSCode plugin that will help you when working with IAM, and several more.
-
Setting up MWAA to use a KMS key
Dec 14, 2021 | 6 minute read
Introduction In a previous post, I shared how you can use AWS CDK to provision your Apache Airflow environments using the Managed Workflows for Apache Airflow service (MWAA). I was contacted this week by Michael Grabenstein, who flagged an issue with the code in that post. The post used code that configured a KMS key for the MWAA environment, but when trying to deploy the app it would fail with the following error:
-
Integrating Amazon Timestream in your Amazon Managed Workflows for Apache Airflow v2.x
Sep 23, 2021 | 28 minute read
Integrating with Amazon Timestream in your Apache Airflow DAGs Amazon Timestream is a fast, scalable, and serverless time series database service perfect for use cases that generate huge amounts of events per day, optimised to make it faster and more cost effective than using relational databases. I have been playing around with Amazon Timestream to prepare for a talk I am doing with some colleagues, and wanted to see how I could integrate it with other AWS services in the context of leveraging some of the key capabilities of Amazon Timestream.
-
Reading and writing data across different AWS accounts with Amazon Managed Workflows for Apache Airflow v2.x
Sep 7, 2021 | 13 minute read
Reading and writing data across different AWS accounts in your Apache Airflow DAGs As regular readers will know, I sometimes lurk in the Apache Airflow Slack channel to see what is going on. If you are new to Apache Airflow, or want to get a deeper understanding then I highly recommend spending some time here. The community is super welcoming and eager to help new participants. It was during a recent session I came across an interesting problem that one of the builders was having, which was how to access (read/write) data in an S3 bucket which was in a different account to the one hosting Amazon Managed Workflows for Apache Airflow (MWAA).
-
Working with parameters and variables in Amazon Managed Workflows for Apache Airflow
Jul 27, 2021 | 36 minute read
Maximising the re-use of your DAGs in MWAA During some recent conversations with customers, one of the topics that they were interested in was how to create re-usable, parameterised Apache Airflow workflows (DAGs) that could be executed dynamically through the use of variables and/or parameters (either submitted via the UI or the command line). This makes a lot of sense, as you may find that you repeat similar tasks in your workflows, and so this approach allows you to maximise the re-use of that work.
-
Working with the RedshiftToS3Transfer operator and Amazon Managed Workflows for Apache Airflow
May 15, 2021 | 18 minute read
Introduction Inspired by a recent conversation within the Apache Airflow open source Slack community, I decided to channel the inner terrier within me to tackle this particular issue, around getting an Apache Airflow operator (the protagonist for this post) to work. I found the perfect catalyst in the way of the original launch post of Amazon Managed Workflows for Apache Airflow (MWAA). As is often the way, diving into that post (creating a workflow to take some source files, transform them and then move them into Amazon Redshift) led me down some unexpected paths to here, this post.
-
Using AWS CDK to deploy your Amazon Managed Workflows for Apache Airflow environment
Apr 28, 2021 | 11 minute read
update I am grateful to Michael Grabenstein for spotting some mistakes in the original post/code. I hope these have now been rectified in this post. Using AWS CDK to deploy your Amazon Managed Workflows for Apache Airflow environment What better way to celebrate CDK Day than to return to a previous blog where I wrote about automating the installation and configuration of Amazon Managed Workflows for Apache Airflow (MWAA), and take a look at doing the same thing but this time using AWS CDK.
-
Automating your ELT Workflows with Managed Workflows for Apache Airflow - Part Two
Apr 21, 2021 | 17 minute read
Part Two - Automating Amazon EMR In Part One, we automated an example ELT workflow on Amazon Athena using Apache Airflow. In this post, Part Two, we will automate the same example ELT workflow using Amazon EMR. Make sure you recap the setup from Part One. All the code you need to reproduce this yourself can be found in the GitHub repository here. Automating Amazon EMR
-
Automating your ELT Workflows with Managed Workflows for Apache Airflow - Part One
Apr 21, 2021 | 18 minute read
update: I have changed the post to use standard Apache Airflow variables rather than using AWS Secrets Manager. Part One - Automating Amazon Athena As part of an upcoming DevDay event, I have been working on how you can use Apache Airflow to help automate your Extract, Load and Transform (ELT) Workflows. Amazon Athena and Amazon EMR are two AWS services that help customers who have existing SQL skills/expertise and are looking at tools such as Presto or Apache Hive when undertaking those transformations.
-
Monitoring and logging with Amazon Managed Workflows for Apache Airflow
Feb 9, 2021 | 12 minute read
Part of a series of posts to support an up-coming online event, the Innovate AI/ML on February 24th, from 9:00am GMT - you can sign up here Part 1 - Installation and configuration of Managed Workflows for Apache Airflow Part 2 - Working with Permissions Part 3 - Accessing Amazon Managed Workflows for Apache Airflow Part 4 - Interacting with Amazon Managed Workflows for Apache Airflow via the command line Part 5 - A simple CI/CD system for your development workflow Part 6 - Monitoring and logging <- this post Part 7 - Automating a simple AI/ML pipeline with Apache Airflow In this post I will be covering Part 6, where to find logs to help you understand and troubleshoot your Apache Airflow workflows, and how you can monitor your Apache Airflow environments.
-
A simple CI/CD system for your Amazon Managed Workflows for Apache Airflow development workflow
Feb 3, 2021 | 16 minute read
updated Feb 19th Part of a series of posts to support an up-coming online event, the Innovate AI/ML on February 24th, from 9:00am GMT - you can sign up here Part 1 - Installation and configuration of Managed Workflows for Apache Airflow Part 2 - Working with Permissions Part 3 - Accessing Amazon Managed Workflows for Apache Airflow Part 4 - Interacting with Amazon Managed Workflows for Apache Airflow via the command line Part 5 - A simple CI/CD system for your development workflow <- this post Part 6 - Monitoring and logging Part 7 - Automating a simple AI/ML pipeline with Apache Airflow In this post I will be covering Part 5, how you can set up a very simple CI/CD system to enable faster development of your Apache Airflow DAGs.
-
Interacting with Amazon Managed Workflows for Apache Airflow via the command line
Feb 1, 2021 | 12 minute read
Part of a series of posts to support an up-coming online event, the Innovate AI/ML on February 24th, from 9:00am GMT - you can sign up here Part 1 - Installation and configuration of Managed Workflows for Apache Airflow Part 2 - Working with Permissions Part 3 - Accessing Amazon Managed Workflows for Apache Airflow environments Part 4 - Interacting with Amazon Managed Workflows for Apache Airflow via the command line <- this post Part 5 - A simple CI/CD system for your development workflow Part 6 - Monitoring and logging Part 7 - Automating a simple AI/ML pipeline with Apache Airflow In this post I will be covering Part 4, how you can interact with and access Apache Airflow via the command line.
-
Accessing your Amazon Managed Workflows for Apache Airflow environments
Jan 28, 2021 | 8 minute read
Part of a series of posts to support an up-coming online event, the Innovate AI/ML on February 24th, from 9:00am GMT - you can sign up here Part 1 - Installation and configuration of Managed Workflows for Apache Airflow Part 2 - Working with Permissions Part 3 - Accessing Amazon Managed Workflows for Apache Airflow environments <- this post Part 4 - Interacting with Amazon Managed Workflows for Apache Airflow via the command line Part 5 - A simple CI/CD system for your development workflow Part 6 - Monitoring and logging Part 7 - Automating a simple AI/ML pipeline with Apache Airflow In this post I will be covering Part 3, how you can interact with and access your Apache Airflow environments.
-
Working with permissions in Amazon Managed Workflows for Apache Airflow
Jan 27, 2021 | 10 minute read
Part of a series of posts to support an up-coming online event, the Innovate AI/ML on February 24th, from 9:00am GMT - you can sign up here Part 1 - Installation and configuration of Managed Workflows for Apache Airflow Part 2 - Working with Permissions <- this post Part 3 - Accessing Amazon Managed Workflows for Apache Airflow environments Part 4 - Interacting with Amazon Managed Workflows for Apache Airflow via the command line Part 5 - A simple CI/CD system for your development workflow Part 6 - Monitoring and logging Part 7 - Automating a simple AI/ML pipeline with Apache Airflow In this post I will be covering Part 2, how to ensure that you control access to Apache Airflow following best practices such as default no access/least privilege.
-
Automating the installation and configuration of Amazon Managed Workflows for Apache Airflow
Jan 26, 2021 | 15 minute read
updated, August 25th Thanks to Philip T for spotting a typo in the CloudFormation code below - it is ok in the GitHub repo, but I have fixed it now below. Part of a series of posts to support an up-coming online event, the Innovate AI/ML on February 24th, from 9:00am GMT - you can sign up here Part 1 - Installation and configuration of Managed Workflows for Apache Airflow <- this post Part 2 - Working with Permissions Part 3 - Accessing Amazon Managed Workflows for Apache Airflow environments Part 4 - Interacting with Amazon Managed Workflows for Apache Airflow via the command line Part 5 - A simple CI/CD system for your development workflow Part 6 - Monitoring and logging Part 7 - Automating a simple AI/ML pipeline with Apache Airflow In this post I will be covering Part 1, automating the installation and configuration of Managed Workflows for Apache Airflow (MWAA).