
Build your first ETL pipeline

In this tutorial, you'll build a full ETL pipeline with Dagster that:

  • Imports data into DuckDB using Sling
  • Transforms data into reports with dbt
  • Runs scheduled reports automatically
  • Generates one-time reports on demand
  • Visualizes the data with Evidence

You will learn to:

  • Set up a Dagster project with the recommended project structure
  • Integrate with other tools
  • Create and materialize assets and dependencies
  • Ensure data quality with asset checks
  • Create and materialize partitioned assets
  • Automate the pipeline
  • Create and materialize assets with sensors

Prerequisites

To follow the steps in this guide, you'll need:

  • Python 3.9+ installed on your system. Refer to the Installation guide for more information.
  • Familiarity with Python and SQL.
  • An understanding of data pipelines and the extract, transform, and load (ETL) process.
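
If you want to confirm which Python version is installed, you can check it from your terminal (use python3 if that is how Python is invoked on your system):

    python --version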

Set up your Dagster project

  1. Open your terminal and scaffold a new project with uv:

    uvx create-dagster project etl_tutorial
  2. Change into the project directory:

    cd etl_tutorial
  3. Activate the project virtual environment:

    source .venv/bin/activate

  4. To make sure Dagster and its dependencies were installed correctly, start the Dagster webserver:

    dg dev

     In your browser, navigate to http://127.0.0.1:3000 to open the Dagster UI.
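
For reference, the project scaffolded in step 1 should look roughly like the following. Treat this as a sketch; the exact files can vary slightly between create-dagster versions:

    etl_tutorial/
    ├── pyproject.toml
    ├── src/
    │   └── etl_tutorial/
    │       ├── definitions.py
    │       └── defs/
    │           └── __init__.py
    └── tests/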

At this point, the project is empty, but you will add to it throughout the rest of this tutorial.
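
As a preview of the asset code you will write, a Dagster asset is a decorated Python function. The sketch below is illustrative only: the file path and asset name are placeholders, and it assumes the scaffolded project auto-loads definitions placed under the defs package:

    # src/etl_tutorial/defs/assets.py (illustrative path)
    import dagster as dg

    @dg.asset
    def example_asset():
        """A placeholder asset; later sections add the real ETL assets."""
        ...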

Next steps