A collection of configuration templates for AWS Glue and AWS Glue DataBrew resources, along with security controls for monitoring and protecting AWS Glue configuration, including Config Rules, CloudWatch Alarms, EventBridge Rules, and IAM policies.
This template creates a Glue table and a Kinesis Firehose delivery stream. The Glue table is used to store data with specific columns and partition keys, while the Kinesis Firehose delivery stream is used to deliver data to an S3 bucket in a specific format.
This template creates an on-demand trigger that starts one job.
This template creates a scheduled trigger that runs every two hours and triggers two jobs. It declares an argument for prod-job3.
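
As a rough illustration of the scheduled trigger described above, a minimal sketch might look like the following. The trigger name and the job name `prod-job2` are assumptions for illustration; only `prod-job3` comes from the description, and the argument shown is just one plausible choice.

```yaml
Resources:
  ScheduledJobTrigger:
    Type: AWS::Glue::Trigger
    Properties:
      Name: example-scheduled-trigger   # hypothetical name
      Type: SCHEDULED
      # Fires at minute 0 of every second hour
      Schedule: cron(0 */2 * * ? *)
      Actions:
        - JobName: prod-job2            # assumed name of the first job
        - JobName: prod-job3
          Arguments:
            # Assumed argument; the actual template may pass a different one
            '--job-bookmark-option': job-bookmark-enable
```

Both jobs are assumed to exist already; the trigger only references them by name.
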
This template creates a conditional trigger that starts a job after another job's run completes successfully.
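
A conditional trigger of this kind could be sketched roughly as below. The job names `prod-job1` and `prod-job2` and the trigger name are placeholders, not names taken from the template itself.

```yaml
Resources:
  ConditionalJobTrigger:
    Type: AWS::Glue::Trigger
    Properties:
      Name: example-conditional-trigger  # hypothetical name
      Type: CONDITIONAL
      StartOnCreation: true              # activate the trigger as soon as it is created
      Actions:
        - JobName: prod-job2             # assumed job to start
      Predicate:
        Conditions:
          # Fire only when the watched job run finishes in the SUCCEEDED state
          - LogicalOperator: EQUALS
            JobName: prod-job1           # assumed job to watch
            State: SUCCEEDED
```
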
This CloudFormation template creates an AWS Glue Crawler that crawls an S3 bucket and updates a database with the crawled data. It also creates the necessary IAM roles and policies for the crawler to function properly.
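
The crawler setup described above might be sketched as follows. This is a minimal example under stated assumptions: the `BucketName` parameter, the database name, the crawl path, and the schema change behaviors are all illustrative choices, not values confirmed by the template.

```yaml
Parameters:
  BucketName:
    Type: String                        # assumed parameter: bucket to crawl

Resources:
  CrawlerRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: glue.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - !Sub arn:${AWS::Partition}:iam::aws:policy/service-role/AWSGlueServiceRole
      Policies:
        - PolicyName: ReadCrawlTarget
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action:
                  - s3:GetObject
                  - s3:ListBucket
                Resource:
                  - !Sub arn:${AWS::Partition}:s3:::${BucketName}
                  - !Sub arn:${AWS::Partition}:s3:::${BucketName}/*

  CrawlerDatabase:
    Type: AWS::Glue::Database
    Properties:
      CatalogId: !Ref AWS::AccountId
      DatabaseInput:
        Name: example_db                # hypothetical database name

  Crawler:
    Type: AWS::Glue::Crawler
    Properties:
      Role: !GetAtt CrawlerRole.Arn
      DatabaseName: !Ref CrawlerDatabase
      Targets:
        S3Targets:
          - Path: !Sub s3://${BucketName}/
      SchemaChangePolicy:
        UpdateBehavior: UPDATE_IN_DATABASE
        DeleteBehavior: LOG
```
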
This template creates an AWS Glue job with an associated role. The job is configured with a command to run a Glue ETL script located in an Amazon S3 bucket.
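
A job with an associated role could look roughly like this sketch. The script location, Glue version, and resource names are assumptions; the role shown also omits the S3 read permissions the script bucket would need in practice.

```yaml
Parameters:
  ScriptLocation:
    Type: String                        # assumed parameter, e.g. an s3:// URI to the ETL script

Resources:
  JobRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: glue.amazonaws.com
            Action: sts:AssumeRole
      ManagedPolicyArns:
        - !Sub arn:${AWS::Partition}:iam::aws:policy/service-role/AWSGlueServiceRole
      # Note: add a policy granting read access to the script bucket

  EtlJob:
    Type: AWS::Glue::Job
    Properties:
      Name: example-etl-job             # hypothetical name
      Role: !GetAtt JobRole.Arn
      GlueVersion: "4.0"                # assumed version
      Command:
        Name: glueetl                   # Spark ETL job type
        PythonVersion: "3"
        ScriptLocation: !Ref ScriptLocation
```
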
Sets up an AWS Glue Catalog Table optimized for Athena, using Parquet format with SNAPPY compression, stored in S3.
Defines and uses a custom connector for AWS Glue to connect to Snowflake using a JDBC driver stored in S3.
This template generates a Python script for AWS Glue using a Directed Acyclic Graph (DAG) to define data flow between nodes.
This template generates Scala code for AWS Glue using a Directed Acyclic Graph (DAG) to define data flow between nodes.
Defines and uses a custom connector for AWS Glue, specifically for a Snowflake JDBC connection, utilizing AWS Secrets Manager for credentials.
Defines an AWS Glue Crawler for catalog targets with schema change policies.
Configures encryption settings for AWS Glue Data Catalog, including connection password encryption and encryption at rest using AWS KMS.
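
As an illustration of these encryption settings, a minimal sketch might be the following. The `KmsKeyArn` parameter is an assumption; the actual template may create its own key or use separate keys for the two settings.

```yaml
Parameters:
  KmsKeyArn:
    Type: String                        # assumed parameter: ARN of an existing KMS key

Resources:
  CatalogEncryption:
    Type: AWS::Glue::DataCatalogEncryptionSettings
    Properties:
      CatalogId: !Ref AWS::AccountId
      DataCatalogEncryptionSettings:
        # Encrypt connection passwords stored in the catalog
        ConnectionPasswordEncryption:
          ReturnConnectionPasswordEncrypted: true
          KmsKeyId: !Ref KmsKeyArn
        # Encrypt catalog metadata at rest with the same key
        EncryptionAtRest:
          CatalogEncryptionMode: SSE-KMS
          SseAwsKmsKeyId: !Ref KmsKeyArn
```
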
Creates an AWS Glue Data Quality Ruleset targeting a specific database and table with rules for data completeness.
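
A completeness ruleset of this kind might be sketched as below. The database, table, and column names are hypothetical, and the rule is written in Data Quality Definition Language (DQDL) using `IsComplete` as one plausible completeness check.

```yaml
Resources:
  CompletenessRuleset:
    Type: AWS::Glue::DataQualityRuleset
    Properties:
      Name: example-completeness-ruleset   # hypothetical name
      Description: Checks that key columns contain no null values
      # DQDL: each IsComplete rule passes only if the column has no nulls
      Ruleset: 'Rules = [ IsComplete "order_id", IsComplete "customer_id" ]'
      TargetTable:
        DatabaseName: example_db           # assumed database
        TableName: example_table           # assumed table
```
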
Configures an AWS Glue job to use Ray framework with specific worker type and Python version.
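
For the Ray configuration described above, a hedged sketch could look like this. The runtime string, worker count, and both parameters are assumptions; check the values supported in your region before relying on them.

```yaml
Parameters:
  JobRoleArn:
    Type: String                        # assumed parameter: existing Glue job role
  ScriptLocation:
    Type: String                        # assumed parameter: s3:// URI of the Ray script

Resources:
  RayJob:
    Type: AWS::Glue::Job
    Properties:
      Name: example-ray-job             # hypothetical name
      Role: !Ref JobRoleArn
      GlueVersion: "4.0"
      WorkerType: Z.2X                  # worker type used by Ray jobs
      NumberOfWorkers: 2                # assumed worker count
      Command:
        Name: glueray                   # selects the Ray job type
        PythonVersion: "3.9"
        Runtime: Ray2.4                 # assumed Ray runtime
        ScriptLocation: !Ref ScriptLocation
```
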
This template sets up an AWS Glue ML Transform for finding matches, along with a Glue Catalog Database and a Glue Catalog Table configured with various properties including storage, serialization, and partitioning.
This template creates an AWS Glue catalog database and table, and defines a partition index on the table.
This template sets up a Glue resource policy allowing the creation of tables across all Glue resources in the specified AWS account and region.
Creates an AWS Glue Schema with a specified name, registry ARN, data format, compatibility setting, and schema definition.
Creates an AWS Glue catalog database and a user-defined function within that database.
Creates an AWS Glue workflow with two triggers and two jobs, forming a simple directed acyclic graph (DAG).
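
The two-trigger, two-job workflow could be wired together roughly as follows. To keep the sketch short, the jobs are assumed to exist already and are passed in by name; the actual template defines them itself, and all names here are placeholders.

```yaml
Parameters:
  FirstJobName:
    Type: String                        # assumed: name of an existing Glue job
  SecondJobName:
    Type: String                        # assumed: name of an existing Glue job

Resources:
  Workflow:
    Type: AWS::Glue::Workflow
    Properties:
      Name: example-workflow            # hypothetical name

  # Entry point of the DAG: starts the first job when the workflow is run
  StartTrigger:
    Type: AWS::Glue::Trigger
    Properties:
      Name: example-workflow-start
      Type: ON_DEMAND
      WorkflowName: !Ref Workflow
      Actions:
        - JobName: !Ref FirstJobName

  # Edge of the DAG: starts the second job after the first succeeds
  ChainTrigger:
    Type: AWS::Glue::Trigger
    Properties:
      Name: example-workflow-chain
      Type: CONDITIONAL
      StartOnCreation: true
      WorkflowName: !Ref Workflow
      Actions:
        - JobName: !Ref SecondJobName
      Predicate:
        Conditions:
          - LogicalOperator: EQUALS
            JobName: !Ref FirstJobName
            State: SUCCEEDED
```
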
This template creates a new schedule for one or more AWS Glue DataBrew jobs. The schedule can be set to run at a specific date and time or at regular intervals.
This template creates a new AWS Glue DataBrew transformation recipe with the specified name, description, steps, condition expressions, and tags.
This template creates a new AWS Glue DataBrew project with the specified properties. The project will have a name, recipe name, dataset name, role ARN, and a sample size and type. It also includes tags for additional metadata.
This template creates a DataBrew profile job with the specified properties. The job is of type PROFILE and has a name, dataset name, role ARN, job sample mode, output location, and tags.
This template creates a new DataBrew dataset with the specified properties. The dataset is created with a name, input location in an S3 bucket, format options, and optional tags.
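
A DataBrew dataset of this shape might be sketched as below. The bucket, key, CSV format options, and tag values are illustrative assumptions rather than the template's actual settings.

```yaml
Parameters:
  InputBucket:
    Type: String                        # assumed parameter: bucket holding the input file

Resources:
  Dataset:
    Type: AWS::DataBrew::Dataset
    Properties:
      Name: example-dataset             # hypothetical name
      Input:
        S3InputDefinition:
          Bucket: !Ref InputBucket
          Key: input/data.csv           # assumed object key
      Format: CSV
      FormatOptions:
        Csv:
          Delimiter: ","
          HeaderRow: true               # first row holds column names
      Tags:
        - Key: project                  # optional, illustrative tag
          Value: example
```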