Print-ready cheat sheet

Every prompt, in order

Print this page (Cmd+P) and bring it to the lab. Every dbt Wizard prompt from every scenario, copy-paste ready, in the order you'll run them. Scenarios 2 and 4 use placeholders - the path-substitution tables are at the start of each section.


Table of contents

Start here: Onboarding - First Week at The Builder Depot - 7 steps
Scenario 1: Inventory Misallocation - Find missing inventory - 6 steps
Scenario 2: Extending a Model - Add a new source - 6 steps - 3 paths
Scenario 3: Marketing Segments - VIP/big-spender segmentation - 7 steps
Scenario 4: Broken Source Rename - Column rename, blast radius - 5 steps + terminal - 3 paths

Onboarding

Onboarding - First Week at The Builder Depot

A seven-step workflow that collapses the typical two-week "where is everything?" onboarding into a single guided session, ending with a working model the user wrote themselves, previewed but not materialized.

  1. Step 1 - Project summary
    Summarize what this dbt project does. What are the main subject areas and how is the project organized?
    Wizard exercises: project-level status, repo summarization
  2. Step 2 - Inventory the models
    List the staging, intermediate, and mart models. Group them by domain.
    Wizard exercises: search, folder and tag grouping
  3. Step 3 - Deep-dive on one mart
    Show me the lineage, grain, and key columns for the orders mart model.
    Wizard exercises: describe, lineage
  4. Step 4 - Look at the actual data
    Show me a 10-row sample of the orders mart and the distinct values in the order_status column.
    Wizard exercises: warehouse (live sample of the mart), select distinct against status
  5. Step 5 - Tests and failure signal
    What tests and contracts are defined on the orders model? Are any currently failing?
    Wizard exercises: test introspection, run-results parsing
  6. Step 6 - Ship something small
    Create a new mart model called orders_by_week that aggregates orders to the week grain with order count, gross revenue, and distinct customers.
    Wizard exercises: file edits, convention-aware model generation
  7. Step 7 - Compile and preview (no materialization)
    Compile and preview orders_by_week. Don't materialize it.
    Wizard exercises: dbt_compile, dbt_show
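For reference, a minimal sketch of what a convention-following orders_by_week model might look like. The model name comes from the prompt; the upstream column names (ordered_at, order_total, customer_id) are assumptions, not guaranteed lab values:

```sql
-- models/marts/orders_by_week.sql (sketch; column names are assumptions)
select
    date_trunc('week', ordered_at)  as order_week,
    count(*)                        as order_count,
    sum(order_total)                as gross_revenue,
    count(distinct customer_id)     as distinct_customers
from {{ ref('orders') }}
group by 1
```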
Scenario 1

Scenario 1 - Inventory Misallocation Investigation

A six-step workflow that turns a stakeholder question - "where did the missing inventory go?" - into a materialized dbt model naming the stores with inventory counts above or below the expected shipment-plan quantity.

  1. Step 1 - Discovery
    Find the models in this project related to inventory, stores, items, and shipments.
    Wizard exercises: status, search
  2. Step 2 - Schema Understanding
    For those models, show the grain, key columns, and how they join together.
    Wizard exercises: describe, lineage
  3. Step 3 - Data Inspection
    Check [specific item] shipments and inventory. Expected quantity is [N] per store. Show the per-store expected quantity, actual inventory, variance quantity, and variance direction, ordered by absolute variance desc, warehouse_id.
    Wizard exercises: warehouse, dbt_show on a targeted slice (no full scans, no materialization). Substitute the item and the expected per-store quantity.
  4. Step 4 - Model Creation
    Create a dbt model named inventory_shipment_variance that lists every store where actual inventory for that item differs from the expected per-store quantity. Include store name, city or region, item name, actual inventory count, expected count, variance quantity, and a variance direction showing over-count or under-count.
    Wizard exercises: file edits, model generation
  5. Step 5 - Safe Preview
    Compile the model and preview the first 20 rows using deterministic ordering. Order inventory variances by abs(variance_quantity) desc, warehouse_id, product_id.
    Wizard exercises: dbt_compile, dbt_show
  6. Step 6 - Materialize
    Before materializing, confirm the active dbt target, the dev schema, and permission to create the model. Then materialize inventory_shipment_variance into my dev schema as a table. For this timed lab, skip extended verification after the successful compile and deterministic preview. Materialize only after the expected rows appear.
    Wizard exercises: dbt_run against the user's dev schema (dev_lab_user_N)
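The shape of the variance model the Wizard should produce can be sketched as follows. The upstream model names (shipment_plan, inventory) and join keys here are assumptions for illustration; the lab's actual staging and intermediate names will differ:

```sql
-- models/marts/inventory_shipment_variance.sql
-- Sketch only: upstream model and key names are assumptions.
with expected as (
    select store_id, item_id, expected_quantity
    from {{ ref('shipment_plan') }}       -- hypothetical upstream model
),
actual as (
    select store_id, item_id, sum(quantity) as actual_quantity
    from {{ ref('inventory') }}           -- hypothetical upstream model
    group by 1, 2
)
select
    e.store_id,
    e.item_id,
    a.actual_quantity,
    e.expected_quantity,
    a.actual_quantity - e.expected_quantity as variance_quantity,
    case
        when a.actual_quantity > e.expected_quantity then 'over-count'
        else 'under-count'
    end as variance_direction
from expected e
join actual a using (store_id, item_id)
where a.actual_quantity <> e.expected_quantity
```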
Scenario 2 - Choose your path

Scenario 2 - Extending an Existing Model with a New Source

A six-step workflow for wiring a new Fivetran-synced source into an existing intermediate model without breaking the downstream consumers that already depend on it.

Path substitution - fill the placeholders below with your path's values:

Path A - Customer 360
  [TARGET_MODEL] = int_customer_order_summary
  [NEW_SOURCE]   = retail.RET_TICKETS
  [ENTITY]       = customer
  [NEW_COLUMNS]  = open_tickets_count, last_ticket_status, last_ticket_opened_at
Path B - Operations
  [TARGET_MODEL] = int_orders_enriched
  [NEW_SOURCE]   = retail.RET_TICKETS
  [ENTITY]       = order
  [NEW_COLUMNS]  = ticket_count, has_open_ticket_flag, last_ticket_status
Path C - Merchandising
  [TARGET_MODEL] = int_product_sales_summary
  [NEW_SOURCE]   = retail.RET_PRODUCT_REVIEWS
  [ENTITY]       = product
  [NEW_COLUMNS]  = avg_rating, review_count, low_rating_count
  1. Step 1 - Locate the target model
    Find [TARGET_MODEL] in this project. Show me what it currently produces, its grain, and which models depend on it downstream.
    Wizard exercises: search, describe, lineage
  2. Step 2 - Discover the unused source
    Find every source in this project related to [ENTITY] that [TARGET_MODEL] does NOT currently reference. I want to know what data is sitting in our warehouse that we're not using yet.
    Wizard exercises: status, search, source-vs-model cross-referencing
  3. Step 3 - Validate the join
    Describe the schema of [NEW_SOURCE]. Show me the columns, their types, the grain, and which column joins back to [ENTITY].
    Run a quick check: count rows in [NEW_SOURCE], count distinct join keys, and count how many of those keys match an [ENTITY] already in [TARGET_MODEL]. Tell me whether the grain is one-to-one or one-to-many.
    Wizard exercises: describe, warehouse, join-key inspection
  4. Step 4 - Modify the existing model
    Update [TARGET_MODEL] to add [NEW_COLUMNS] from [NEW_SOURCE]. Use a LEFT JOIN so [ENTITY] rows without a match still appear, and aggregate [NEW_SOURCE] to one-row-per-[ENTITY] before joining if its grain is many-to-one. Preserve every column the model currently emits - only add new columns at the end.
    Wizard exercises: file edits on the existing model file (not a new file)
  5. Step 5 - Compile downstream + safe preview
    Compile [TARGET_MODEL] and every downstream model that depends on it. Then preview 20 rows of [TARGET_MODEL] ordered deterministically. Do not materialize anything.
    Wizard exercises: dbt_compile across the lineage, dbt_show on the target
  6. Step 6 - Materialize
    Materialize [TARGET_MODEL] into my dev schema. Skip the verification pass - the preview and downstream compile already confirmed the output.
    Wizard exercises: dbt_run against the user's dev schema
Path notes: Path A and Path B both use RET_TICKETS - that's intentional. The lesson is that "new source" means new to THIS model, not new to the project.
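The aggregate-before-join pattern from Step 4 can be sketched like this. Path B's placeholders are filled in for concreteness, but the join key (order_id here) and the target model's existing column list are assumptions about the lab project:

```sql
-- Sketch: collapse the many-to-one source to the [ENTITY] grain first,
-- then LEFT JOIN so [ENTITY] rows without a match still appear.
with ticket_agg as (
    select
        order_id,                              -- assumed join key back to [ENTITY]
        count(*) as ticket_count
    from {{ source('retail', 'RET_TICKETS') }}
    group by 1
)
select
    t.*,                                       -- preserve every existing column
    coalesce(a.ticket_count, 0) as ticket_count  -- new columns go at the end
from {{ ref('int_orders_enriched') }} t        -- your [TARGET_MODEL]
left join ticket_agg a
    on t.order_id = a.order_id
```

The coalesce matters: with a LEFT JOIN, unmatched rows carry NULL in the new columns, and downstream consumers usually want 0 for a count.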
Scenario 3

Scenario 3 - High-Value Customer Segmentation

A seven-step workflow that turns "which customers should Marketing target?" into a reusable activity layer plus a segment model, materialized into the user's dev schema.

  1. Step 1 - Discovery
    Find the models related to customers, stores, orders, order lines, products, and categories.
    Wizard exercises: status, search
  2. Step 2 - Schema Understanding
    Show the grain and joins for those models.
    Wizard exercises: describe, lineage
  3. Step 3 - Data Inspection
    Check recent order dates and category values needed for a 180-day segmentation model.
    Wizard exercises: warehouse, dbt_show on the rolling-window anchor and distinct category values
  4. Step 4 - Activity Layer Model
    Create a 180-day customer activity model by store.
    Wizard exercises: file edits, model creation. Reusable intermediate model with customer x store grain (customer x store x category for category counts).
  5. Step 5 - Segment Model
    Create a segment model for VIPs, big spenders, and category-loyal customers, built on top of the activity model.
    Wizard exercises: file edits, model creation on top of the activity layer. Segment thresholds: VIP (avg > $100 AND count >= 3), big spender (max > $300), category-loyal (category count >= 10).
  6. Step 6 - Safe Preview
    Compile and preview the segment model. Exclude customers with no segment.
    Wizard exercises: dbt_compile, dbt_show
  7. Step 7 - Materialize
    Materialize the segment model into my dev schema. Skip the verification pass - the preview already confirmed the output.
    Wizard exercises: dbt_run against the user's dev schema (dev_lab_user_N)
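The segment thresholds from Step 5 translate into a CASE expression along these lines. The activity-layer model name (customer_store_activity) and its column names are assumptions for illustration, not the lab's actual names:

```sql
-- Sketch of the segment model, built on a hypothetical activity layer.
with segmented as (
    select
        customer_id,
        store_id,
        case
            when avg_order_value > 100 and order_count >= 3 then 'VIP'
            when max_order_value > 300                      then 'big spender'
            when top_category_order_count >= 10             then 'category-loyal'
        end as segment
    from {{ ref('customer_store_activity') }}  -- hypothetical Step 4 model
)
select *
from segmented
where segment is not null   -- "exclude customers with no segment"
```

Note that CASE takes the first matching branch, so a customer who qualifies as both VIP and big spender lands in 'VIP'; if the lab wants overlapping segments, one boolean flag column per segment is the alternative design.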
Scenario 4 - Choose your path

Scenario 4 - Broken Model from a Source Column Rename

A five-step workflow that turns a red dbt run ("Column 'X' does not exist in source") into a fixed, re-running pipeline. Reproduce the failure, map the blast radius, apply an alias-preserving fix, re-run to green.

Path substitution - fill the placeholders below with your path's values:

Path A - Products
  [BROKEN_MODEL] = stg_products
  [OLD_COLUMN]   = brand
  [NEW_COLUMN]   = brand_name
  Source table:    retail.RET_PRODUCTS
  Blast radius:    ~8 downstream files (4 intermediates, 4 marts)
Path B - Orders
  [BROKEN_MODEL] = stg_orders
  [OLD_COLUMN]   = status
  [NEW_COLUMN]   = order_status
  Source table:    retail.RET_ORDERS
  Blast radius:    ~7 downstream files + YAML tests
Path C - Customers
  [BROKEN_MODEL] = stg_customers
  [OLD_COLUMN]   = customer_type
  [NEW_COLUMN]   = segment
  Source table:    retail.RET_CUSTOMERS
  Blast radius:    5 downstream files + YAML tests
  1. Step 0 (pre-step) - Reproduce the failure (TERMINAL)
    Run in your terminal, NOT inside dbt Wizard. From the project root:
    dbt run --select [BROKEN_MODEL]+
    You will see a compile or runtime error referencing [OLD_COLUMN]. Capture the exact error text - Step 1 uses it.
  2. Step 1 - Explain the failure
    My dbt run just failed. Read the most recent run results and tell me which model failed, what the error was, and which upstream source or column the error references.
    Wizard exercises: status, dbt_show against run-results, error parsing
  3. Step 2 - Compare model code against current source schema
    Describe the current schema of the upstream source table that [BROKEN_MODEL] reads from. List every column that exists today.
    Wizard exercises: describe, warehouse. Read the live column list from Snowflake and lay it next to what the model is asking for.
  4. Step 3 - Blast-radius check
    Show me every model, source definition, and test in this project that references the column [OLD_COLUMN]. I need a complete blast-radius list before I change anything.
    Wizard exercises: search plus lineage. If the count is suspiciously low, follow up with: "Are you searching tests and YAML files too? Make sure _staging__models.yml is in the result."
  5. Step 4 - Apply the fix, preserve the contract
    Update [BROKEN_MODEL] and every other file you just listed to use [NEW_COLUMN] instead of [OLD_COLUMN]. Keep the downstream column alias the same so consumers of these models don't break - only the source-side reference should change.
    Wizard exercises: file edits across multiple files. The staging model becomes select [NEW_COLUMN] as [OLD_COLUMN], not select [NEW_COLUMN].
  6. Step 5 - Compile, preview, and re-run (TERMINAL re-run)
    Compile [BROKEN_MODEL] and every downstream model you just edited, then preview the first 10 rows of [BROKEN_MODEL] ordered deterministically. Do not materialize anything yet.
    Wizard exercises: dbt_compile, dbt_show. Then run the actual re-build from your terminal:
    Run in your terminal, NOT inside dbt Wizard.
    dbt run --select [BROKEN_MODEL]+
    A green run here closes the loop on the original failure from Step 0.
Path notes: All three paths are runnable simultaneously across the room. The instructor pre-applies the column rename to each cohort's source table before the lab.
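The alias-preserving fix from Step 4 can be sketched with Path A's values (the other paths follow the same shape; the surrounding staging-model SQL and the product_id column are illustrative):

```sql
-- Before (broken): the warehouse column is now brand_name, so this fails
select
    product_id,
    brand
from {{ source('retail', 'RET_PRODUCTS') }}

-- After (alias-preserving): read the new source column, keep emitting the
-- old name so every downstream reference to "brand" still works
select
    product_id,
    brand_name as brand
from {{ source('retail', 'RET_PRODUCTS') }}
```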
Reference

Glossary

Terms that appear across the prompts and dbt Wizard responses.

dbt_show
The dbt Wizard tool that runs a compiled model's SQL and returns a sample of rows. Does not materialize anything in the warehouse. Use this to preview output before committing to a build.
dbt_compile
Renders a model's Jinja and ref() calls into the exact SQL that would be sent to the warehouse, without executing it. Catches reference errors, missing columns, and contract violations before any data is touched.
lineage
The directed graph of dependencies between sources and models. Upstream lineage shows what feeds a model; downstream lineage shows what depends on it. Knowing both is the difference between a safe edit and a broken dashboard.
describe
Returns the columns, grain, tests, and contracts of a model or source. The "what is this thing?" question, answered with metadata rather than guesswork.
search
Finds models, sources, columns, or tests by keyword or domain across the project. The grep replacement that understands dbt structure.
status
Project-level summary: which models exist, which ran last, which tests are passing or failing, which sources are configured. The orientation tool used at the start of every scenario.
warehouse
The live database connection (Snowflake in this lab). When dbt Wizard reads from "the warehouse," it is running a query against the real tables, not the dbt manifest.
ref()
The dbt Jinja function that points one model at another. Using ref('orders') instead of a hardcoded table name is what gives dbt its lineage graph and its environment portability.
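The difference is easiest to see side by side; the hardcoded schema below is a hypothetical example:

```sql
-- Hardcoded: invisible to the lineage graph and pinned to one environment
select * from analytics.orders

-- ref(): dbt resolves the relation for the active target and records the
-- dependency edge, so lineage and dev/prod portability come for free
select * from {{ ref('orders') }}
```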
grain
The unit of one row in a table. "One row per order," "one row per customer per day," "one row per order line." Naming the grain out loud before writing a join is the single highest-leverage habit in analytics engineering.
blast radius
The set of files a change will affect. For a source column rename, the blast radius is every staging model, intermediate model, mart, YAML test, and downstream consumer that references the old name. Map it before you edit.
materialize
The act of building a model into the warehouse as a table or view. The opposite of "preview only." In the HOL, you materialize into your personal dev_lab_user_N schema, which is dropped after the lab.