Structuring your Dagster project
To learn how to create a new Dagster project, see Creating a new Dagster project.
There are many ways to structure your Dagster project, and it can be difficult to know where to start. In this guide, we will walk you through our recommendations for organizing your Dagster project.
Your initial project structure
When you first scaffold your project using the create-dagster
CLI, it looks like the following:
$ create-dagster project my-project
- uv
- pip
.
└── my-project
├── pyproject.toml
├── src
│ └── my_project
│ ├── __init__.py
│ ├── definitions.py
│ └── defs
│ └── __init__.py
├── tests
│ └── __init__.py
└── uv.lock
.
└── my-project
├── pyproject.toml
├── src
│ └── my_project
│ ├── __init__.py
│ ├── definitions.py
│ └── defs
│ └── __init__.py
└── tests
└── __init__.py
This is a great structure as you are first getting started, however, as you begin to introduce more assets, jobs, resources, sensors, and utility code, you may find that your Python files are growing too large to manage.
Restructure your project
There are several paradigms in which you can structure your project. Choosing one of these structures is often personal preference, and influenced by how you and your team members operate. This guide will outline three possible project structures:
Option 1: Structured by technology
Data engineers often have a strong understanding of the underlying technologies that are used in their data pipelines. Because of that, it's often beneficial to organize your project by technology. This enables engineers to easily navigate the code base and locate files pertaining to the specific technology.
Within the technology modules, submodules can be created to further organize your code.
- uv
- pip
.
├── pyproject.toml
├── src
│ └── my_project
│ ├── __init__.py
│ ├── definitions.py
│ └── defs
│ ├── __init__.py
│ ├── dbt