AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy to prepare and load data for analytics. With AWS Glue, users can automate time-consuming data preparation tasks, making it easier to organize, understand, and analyze their data.

AWS Glue discovers and catalogs metadata about your data in a centralized metadata repository known as the AWS Glue Data Catalog. This makes your data readily searchable and available for ETL.
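As a rough illustration of what "searchable" can mean in practice, the sketch below maps each table in a Data Catalog database to its column names. It assumes a client that behaves like the boto3 Glue client (`boto3.client("glue")`) and uses its `get_tables` paginator. The function name `summarize_tables` and the database name `sales_db` are hypothetical, chosen only for this example.

```python
from typing import Any, Dict, List


def summarize_tables(glue_client: Any, database: str) -> Dict[str, List[str]]:
    """Map each table name in `database` to the names of its columns.

    `glue_client` is assumed to behave like boto3.client("glue").
    """
    summary: Dict[str, List[str]] = {}
    # GetTables is paginated, so walk every page of results.
    paginator = glue_client.get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database):
        for table in page.get("TableList", []):
            # Tables without a storage descriptor yield an empty column list.
            columns = table.get("StorageDescriptor", {}).get("Columns", [])
            summary[table["Name"]] = [col["Name"] for col in columns]
    return summary
```

With AWS credentials configured, a call might look like `summarize_tables(boto3.client("glue"), "sales_db")`. Passing the client in as a parameter lets the function be exercised with a stand-in object when no AWS account is available.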

Key features of AWS Glue include:

AWS Glue Data Catalog

The AWS Glue Data Catalog is a fully managed metadata repository that is compatible with the Apache Hive Metastore. It lets users store, annotate, and share metadata in the AWS Cloud in a simple, scalable, and secure manner. Its key features include:

The AWS Glue Data Catalog integrates with Amazon Kinesis Data Firehose: a delivery stream can use a table schema stored in the Data Catalog to convert incoming records to a columnar format such as Apache Parquet or Apache ORC before writing them to Amazon S3. A crawler can then infer the schema of the delivered data and store it in the Data Catalog. Once the metadata is stored, services such as Amazon Athena and Amazon Redshift Spectrum can use it to query the data in Amazon S3.
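To make the querying step concrete, here is a minimal sketch of submitting an Athena query against a database registered in the Data Catalog. It assumes a client that behaves like the boto3 Athena client and uses its `start_query_execution` call. The helper names `build_athena_query_request` and `start_query`, as well as any database, table, or bucket names passed to them, are hypothetical.

```python
from typing import Any, Dict


def build_athena_query_request(database: str, sql: str, output_s3_uri: str) -> Dict[str, Any]:
    """Build keyword arguments for Athena's StartQueryExecution call."""
    if not output_s3_uri.startswith("s3://"):
        raise ValueError("output_s3_uri must be an s3:// URI")
    return {
        "QueryString": sql,
        # The database named here is one registered in the Glue Data Catalog.
        "QueryExecutionContext": {"Database": database},
        # Athena writes query results to this S3 location.
        "ResultConfiguration": {"OutputLocation": output_s3_uri},
    }


def start_query(athena_client: Any, request: Dict[str, Any]) -> str:
    """Submit the query and return its execution ID.

    `athena_client` is assumed to behave like boto3.client("athena").
    """
    response = athena_client.start_query_execution(**request)
    return response["QueryExecutionId"]
```

The query runs asynchronously, so the returned execution ID would then be used to poll for completion and fetch results. Separating request construction from submission keeps the request shape easy to inspect without calling AWS.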

ML Transforms

Development Endpoint

Crawlers

DataBrew