ateneva/dbt-data-transformations


Project Setup


This project sets up the data modelling and day-to-day operations of the theLook e-commerce DWH, leveraging:

  • dbt-core

  • BigQuery

  • Cloud Composer

  • Google Cloud Provider for Terraform


Data Modelling Principles & Guidelines

The DWH transformations of theLook e-commerce data were architected under the following principles and guidelines:

Be Analyst Friendly

  • Analysts shouldn't have to do multiple joins to retrieve meaningful data

Be Subject-Oriented

  • Tables are organized around major topics of interest, such as customers, products, orders

  • Each subject represents One-Big-Table with nested arrays and structs

    • Child objects should never be orphans
    • Child objects are always queried within the context of their parent object
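The One-Big-Table shape can be sketched in plain Python (a stand-in for BigQuery's ARRAY/STRUCT columns; the order and line-item fields below are hypothetical, not taken from this project):

```python
# Illustrative only: one "orders" OBT record, with the customer as a
# nested struct and line items as a nested array of structs.
order = {
    "order_id": 1001,
    "customer": {"customer_id": 7, "name": "Ada"},   # nested struct
    "line_items": [                                   # nested array of structs
        {"product_id": 1, "quantity": 2, "price": 9.99},
        {"product_id": 5, "quantity": 1, "price": 24.50},
    ],
}


def order_total(order: dict) -> float:
    """Sum line-item revenue for a single parent order.

    Children are only ever read through their parent, so they can
    never be orphaned: unnesting happens in the context of one order.
    """
    return sum(item["quantity"] * item["price"] for item in order["line_items"])
```

In BigQuery the equivalent query would UNNEST `line_items` scoped to each order row, so an analyst never joins a separate line-items table back to its parent.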

Be Relevant

  • Data should reflect how the underlying platform currently functions

  • Data should reflect the topics of interest to the business

Be Cost Efficient

  • Only process pieces of information that have changed

  • Avoid scanning too much data per run
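These two bullets describe the incremental, high-watermark pattern (dbt provides this via incremental models); here is a minimal Python sketch of the underlying idea, with illustrative column names:

```python
# Illustrative only: instead of rescanning the full source on every run,
# keep the timestamp of the last processed change (the high watermark)
# and pick up only rows that changed after it.
from datetime import datetime


def rows_to_process(source_rows, last_watermark):
    """Return only rows updated after the previous run's watermark."""
    return [r for r in source_rows if r["updated_at"] > last_watermark]


rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 5)},
    {"id": 3, "updated_at": datetime(2024, 1, 9)},
]

# Only ids 2 and 3 changed since the last run, so only they are processed.
changed = rows_to_process(rows, last_watermark=datetime(2024, 1, 3))
```

In an incremental dbt model the same filter appears as a WHERE clause against `{{ this }}`, which keeps the amount of data scanned per run small.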

Be Easy to Maintain

  • Backfilling historical data should be possible via the scheduled run without the need for extra code adjustments

  • Changes in data should be easy to trace and audit

Avoid complex dependencies

  • Processing by topic instead of a monolithic schedule of all topics together

Enforcing Code Quality

The following linters are in place:

  • SQL linting with custom configuration for .sqlfluff

  • YAML linting with custom configuration for .yamllint

  • Python linting with default configuration via pylint

  • Markdown linting with default configuration via pymarkdownlint

SQL Linting

To check whether your SQL is compliant with the defined standard, you can run the following commands:

# lint a specific file
sqlfluff lint path/to/file.sql

# lint a directory of SQL files
sqlfluff lint directory/of/sql/files

# let the linter fix your code
sqlfluff fix folder/model.sql
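For a dbt-on-BigQuery project, a custom `.sqlfluff` typically pins at least the dialect and templater. A minimal sketch of what such a configuration might contain (the actual rules in this repo's file may differ):

```ini
[sqlfluff]
dialect = bigquery
templater = dbt
max_line_length = 100
```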

YAML Linting

To check whether your YAML files are compliant, you can run the following commands:

# check which files will be linted by default
yamllint --list-files .

# lint a specific file
yamllint my_file.yml

# or lint every YAML file in the current directory
yamllint .
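The yamllint behaviour is driven by a `.yamllint` file at the repo root; a minimal sketch of such a custom configuration (this repo's actual rule set may differ):

```yaml
# .yamllint
extends: default

rules:
  line-length:
    max: 120
  document-start: disable
```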

Pre-commit hooks have been set up in this repo to check for and fix:

  • missing newlines at end of file
  • trailing whitespace
  • violations of SQL standards
  • errors in YAML syntax

dbt pre-commit hooks have also been set up.

Hence, when working with the repo, make sure you have pre-commit installed so that the hooks run on every commit:

# install the githook scripts
pre-commit install

# run against all existing files
pre-commit run --all-files
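The checks listed above map onto a `.pre-commit-config.yaml` roughly like the following sketch (hook revisions are illustrative; the repo's actual pins and hook list may differ):

```yaml
# .pre-commit-config.yaml — illustrative sketch
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: end-of-file-fixer      # missing newline at end of file
      - id: trailing-whitespace    # trailing whitespace
  - repo: https://github.com/sqlfluff/sqlfluff
    rev: 2.3.5
    hooks:
      - id: sqlfluff-lint          # violations of SQL standards
  - repo: https://github.com/adrienverge/yamllint
    rev: v1.33.0
    hooks:
      - id: yamllint               # errors in YAML syntax
```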

Setting up Local Testing Environments
