til.duyet.net
Today I Learned

Tools

  • Dremio - The Data Lake Engine - https://www.dremio.com

  • Apache Airflow - Programmatically author, schedule, and monitor workflows - https://airflow.apache.org

  • Prefect - The New Standard in Dataflow Automation - https://www.prefect.io

  • Flyte - Develop, execute, and monitor distributed workflows reliably at scale - https://github.com/lyft/flyte

  • Amundsen - A metadata-driven application that improves the productivity of data analysts, data scientists, and engineers when interacting with data - https://github.com/lyft/amundsen


Last updated 10 months ago