ateneva/dbt-data-transformations


Project Setup


This project sets up the data modelling and day-to-day operations of the theLook e-commerce DWH, leveraging:

  • dbt-core

  • BigQuery

  • Cloud Composer

  • Google Cloud Provider for Terraform


Data Modelling Principles & Guidelines

The DWH transformations of theLook e-commerce data were architected under the following principles and guidelines:

Be Analyst Friendly

  • Analysts shouldn't have to do multiple joins to retrieve meaningful data

Be Subject-Oriented

  • Tables are organized around major topics of interest, such as customers, products, orders

  • Each subject represents One-Big-Table with nested arrays and structs

    • Child objects should never be orphans
    • Child objects are always queried within the context of their parent object
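The One-Big-Table shape can be sketched in plain Python (a stand-in for BigQuery's ARRAY/STRUCT columns; the order and line-item fields below are hypothetical, not taken from this project):

```python
# Illustrative only: one "orders" OBT record, with the customer as a
# nested struct and line items as a nested array of structs.
order = {
    "order_id": 1001,
    "customer": {"customer_id": 7, "name": "Ada"},   # nested struct
    "line_items": [                                   # nested array of structs
        {"product_id": 1, "quantity": 2, "price": 9.99},
        {"product_id": 5, "quantity": 1, "price": 24.50},
    ],
}


def order_total(order: dict) -> float:
    """Sum line-item revenue for a single parent order.

    Children are only ever read through their parent, so they can
    never be orphaned: unnesting happens in the context of one order.
    """
    return sum(item["quantity"] * item["price"] for item in order["line_items"])
```

In BigQuery the equivalent query would UNNEST `line_items` scoped to each order row, so an analyst never joins a separate line-items table back to its parent.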

Be Relevant

  • Data should reflect how the underlying platform currently functions

  • Data should reflect the topics of interest to the business

Be Cost Efficient

  • Only process pieces of information that have changed

  • Avoid scanning too much data per run
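These two bullets describe the incremental, high-watermark pattern (dbt provides this via incremental models); here is a minimal Python sketch of the underlying idea, with illustrative column names:

```python
# Illustrative only: instead of rescanning the full source on every run,
# keep the timestamp of the last processed change (the high watermark)
# and pick up only rows that changed after it.
from datetime import datetime


def rows_to_process(source_rows, last_watermark):
    """Return only rows updated after the previous run's watermark."""
    return [r for r in source_rows if r["updated_at"] > last_watermark]


rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 5)},
    {"id": 3, "updated_at": datetime(2024, 1, 9)},
]

# Only ids 2 and 3 changed since the last run, so only they are processed.
changed = rows_to_process(rows, last_watermark=datetime(2024, 1, 3))
```

In an incremental dbt model the same filter appears as a WHERE clause against `{{ this }}`, which keeps the amount of data scanned per run small.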

Be Easy to Maintain

  • Backfilling historical data should be possible via the scheduled run without the need for extra code adjustments

  • Changes in data should be easy to trace and audit

Avoid complex dependencies

  • Processing by topic instead of a monolithic schedule of all topics together

Enforcing Code Quality

The following linters are in place:

  • SQL linting with custom configuration for .sqlfluff

  • YAML linting with custom configuration for .yamllint

  • Python linting with default configuration via pylint

  • Markdown linting with default configuration via pymarkdownlint

SQL Linting

To check whether your SQL is compliant with the defined standard, you can run the following commands:

# lint a specific file
sqlfluff lint path/to/file.sql

# lint a directory of SQL files
sqlfluff lint directory/of/sql/files

# let the linter fix your code
sqlfluff fix folder/model.sql
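For a dbt-on-BigQuery project, a custom `.sqlfluff` typically pins at least the dialect and templater. A minimal sketch of what such a configuration might contain (the actual rules in this repo's file may differ):

```ini
[sqlfluff]
dialect = bigquery
templater = dbt
max_line_length = 100
```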

YAML Linting

To check whether your YAML files are compliant, you can run the following commands:

# check which files will be linted by default
yamllint --list-files .

# lint a specific file
yamllint my_file.yml

# or lint every YAML file in the current directory
yamllint .
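The yamllint behaviour is driven by a `.yamllint` file at the repo root; a minimal sketch of such a custom configuration (this repo's actual rule set may differ):

```yaml
# .yamllint
extends: default

rules:
  line-length:
    max: 120
  document-start: disable
```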

Pre-commit hooks have been set up in this repo to check for and fix:

  • missing newlines at end of file
  • trailing whitespace
  • violations of SQL standards
  • errors in YAML syntax

dbt pre-commit hooks have also been set up.

Hence, when working with the repo, make sure you have pre-commit installed so that the hooks run on every commit:

# install the githook scripts
pre-commit install

# run against all existing files
pre-commit run --all-files
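The checks listed above map onto a `.pre-commit-config.yaml` roughly like the following sketch (hook revisions are illustrative; the repo's actual pins and hook list may differ):

```yaml
# .pre-commit-config.yaml — illustrative sketch
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.5.0
    hooks:
      - id: end-of-file-fixer      # missing newline at end of file
      - id: trailing-whitespace    # trailing whitespace
  - repo: https://github.com/sqlfluff/sqlfluff
    rev: 2.3.5
    hooks:
      - id: sqlfluff-lint          # violations of SQL standards
  - repo: https://github.com/adrienverge/yamllint
    rev: v1.33.0
    hooks:
      - id: yamllint               # errors in YAML syntax
```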

Setting up Local Testing Environments
