Build your first ETL pipeline
In this tutorial, you'll build a full ETL pipeline with Dagster that:
- Imports data into DuckDB using Sling
- Transforms data into reports with dbt
- Runs scheduled reports automatically
- Generates one-time reports on demand
- Visualizes the data with Evidence
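The stages above follow the classic extract, transform, load pattern. As a conceptual sketch only (using the standard library's sqlite3 in place of DuckDB, and plain functions standing in for Sling and dbt), the shape of the pipeline looks like this:

```python
import sqlite3

# Conceptual ETL sketch. In the tutorial itself, Sling handles
# extract/load into DuckDB and dbt handles the transform step.

def extract():
    # Pretend this came from a CSV file or an API.
    return [("2024-01-01", 120.0), ("2024-01-02", 80.5)]

def load(conn, rows):
    conn.execute("CREATE TABLE IF NOT EXISTS sales (day TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

def transform(conn):
    # A "report": total sales, like a dbt model would produce.
    (total,) = conn.execute("SELECT SUM(amount) FROM sales").fetchone()
    return total

conn = sqlite3.connect(":memory:")
load(conn, extract())
print(transform(conn))  # → 200.5
```

Dagster's job is to model each of these stages as an asset, wire up their dependencies, and run them on a schedule or on demand.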
You will learn to:
- Set up a Dagster project with the recommended project structure
- Integrate Dagster with other tools
- Create and materialize assets and dependencies
- Ensure data quality with asset checks
- Create and materialize partitioned assets
- Automate the pipeline
- Create and materialize assets with sensors
Prerequisites
To follow the steps in this guide, you'll need:
- Python 3.9+ installed on your system (refer to the Installation guide for more information)
- Familiarity with Python and SQL
- A basic understanding of data pipelines and the extract, transform, and load (ETL) process
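To confirm that your interpreter meets the version requirement, you can run a quick check (a minimal sketch using only the standard library):

```python
import sys

# Dagster requires Python 3.9 or newer; fail fast if the
# active interpreter is older.
assert sys.version_info >= (3, 9), (
    f"Python 3.9+ required, found {sys.version.split()[0]}"
)
print("Python version OK:", sys.version.split()[0])
```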
Set up your Dagster project
- Open your terminal and scaffold a new project with uv:
  uvx create-dagster project etl_tutorial
- Change into the project directory:
  cd etl_tutorial
- Activate the project virtual environment:
  - MacOS: source .venv/bin/activate
  - Windows: .venv\Scripts\activate
- To make sure Dagster and its dependencies were installed correctly, start the Dagster webserver:
  dg dev
  In your browser, navigate to http://127.0.0.1:3000.
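Putting the steps together, the full setup sequence looks like this (a sketch for macOS/Linux; Windows uses the activation command noted above):

```shell
# Sketch of the full setup sequence (macOS/Linux).
uvx create-dagster project etl_tutorial  # scaffold the project with uv
cd etl_tutorial
source .venv/bin/activate                # on Windows: .venv\Scripts\activate
dg dev                                   # start the webserver, then open http://127.0.0.1:3000
```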
At this point, the project is empty, but we will add to it throughout the tutorial.
Next steps
- Continue this tutorial by creating and materializing assets