A six-step workflow for the most common real-world analytics-engineering task: an existing intermediate model is missing a domain, and a Fivetran-synced source already in the warehouse can fill the gap. Wire it in without breaking the downstream consumers.
Path A - extend int_customer_order_summary with retail.RET_TICKETS for the VP of Customer Success.
Path B - extend int_orders_enriched with retail.RET_TICKETS at the order grain for the Director of Operations. Same source as Path A, different model and grain on purpose.
Path C - extend int_product_sales_summary with retail.RET_PRODUCT_REVIEWS for the Head of Merchandising.
A stakeholder has a follow-up question that the current model can't answer. The data they need is already arriving in Snowflake via Fivetran. It just isn't connected to the model they're staring at.
The job is to wire the new source into the existing intermediate model without breaking the downstream consumers that already depend on it. You'll find the unused source, validate the join's grain and coverage before writing any SQL, modify the existing model with an alias-safe left-joined extension, and confirm the entire downstream lineage still compiles. All in minutes.
Pick one path before Step 1. The prompt flow is identical across paths - only the placeholders change. The room can run all three simultaneously.
Path A
Stakeholder: VP of Customer Success
Ask: When we segment customers for renewal outreach, can we deprioritize customers with open complaints?
Target model: int_customer_order_summary
New source: retail.RET_TICKETS
New columns: open_tickets_count, last_ticket_status, last_ticket_opened_at

Path B
Stakeholder: Director of Operations
Ask: For every order in the enriched view, can we see how many support tickets were opened against it so we can flag problematic orders for postmortem?
Target model: int_orders_enriched
New source: retail.RET_TICKETS (at order grain)
New columns: ticket_count, has_open_ticket_flag, last_ticket_status

Path C
Stakeholder: Head of Merchandising
Ask: In product performance, can we surface average customer rating and review volume so we can see which top sellers are also well-rated?
Target model: int_product_sales_summary
New source: retail.RET_PRODUCT_REVIEWS
New columns: avg_rating, review_count, low_rating_count (1-2 stars)

Note that Path B reuses the same source as Path A (RET_TICKETS) but at the order grain. That's the point - the source is "new to this model," not new to the project.
The prompts below use placeholders. Substitute the values for the path you picked - the reference card sits underneath.
Start from the model we're being asked to extend, not from a blank file. Knowing the downstream consumers up front is what separates a safe edit from a Slack thread about a broken dashboard.
Find [TARGET_MODEL] in this project. Show me what it currently produces, its grain, and which models depend on it downstream.
Then ask what unused sources are available.
Find every source in this project related to [ENTITY] that [TARGET_MODEL] does NOT currently reference. I want to know what data is sitting in our warehouse that we're not using yet.
dbt Wizard surfaces a Fivetran-synced source that's already configured in the project but isn't connected to the target model. No need to spelunk through _staging__sources.yml or guess what's been added since the last time anyone looked.
Describe the schema of [NEW_SOURCE]. Show me the columns, their types, the grain, and which column joins back to [ENTITY].
This is the step that prevents the most common silent failure of cross-domain joins: a key that looks right by name but doesn't behave right by data. Verify grain, coverage, and join-key match before any SQL is touched.
Run a quick check: count rows in [NEW_SOURCE], count distinct join keys, and count how many of those keys match an [ENTITY] already in [TARGET_MODEL]. Tell me whether the grain is one-to-one or one-to-many.
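The check behind that prompt typically looks something like the sketch below, written here with Path A's names. The join key customer_id and the relation names are assumptions - substitute whatever Step 2 actually revealed for your path.

```sql
-- Sketch of the Step 3 validation (Path A names; customer_id is assumed).
select
    count(*)                        as ticket_rows,
    count(distinct t.customer_id)   as distinct_join_keys,
    count(distinct c.customer_id)   as keys_matched_in_model
from retail.RET_TICKETS t
left join {{ ref('int_customer_order_summary') }} c
    on t.customer_id = c.customer_id
-- ticket_rows > distinct_join_keys  => one-to-many: aggregate before joining
-- ticket_rows = distinct_join_keys  => one-to-one: a direct left join is safe
-- keys_matched_in_model << distinct_join_keys => coverage gap: investigate first
```

The three counts answer the three questions in order: grain, key uniqueness, and coverage against the target model.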
If the grain turns out to be many rows per [ENTITY], decide with dbt Wizard whether to aggregate (open tickets count, review count) or pick the latest row by timestamp (last ticket status). Name the aggregation choice out loud before Step 4.
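On Snowflake, the "latest row by timestamp" choice is usually a window function that keeps one row per entity. A sketch with Path A's names - the column names ticket_status and ticket_opened_at are assumptions:

```sql
-- One row per customer: the most recently opened ticket wins.
-- Column names are assumed; check them against the Step 2 schema output.
select
    customer_id,
    ticket_status    as last_ticket_status,
    ticket_opened_at as last_ticket_opened_at
from retail.RET_TICKETS
qualify row_number() over (
    partition by customer_id
    order by ticket_opened_at desc
) = 1
```

Aggregates (count, max) and a latest-row pick can live in the same pre-join CTE, as long as everything collapses to one row per [ENTITY] before the join.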
Edit the existing model file, not a new file. The skill being demonstrated is safe modification of an existing asset, not another bolt-on intermediate model.
Update [TARGET_MODEL] to add [NEW_COLUMNS] from [NEW_SOURCE]. Use a LEFT JOIN so [ENTITY] rows without a match still appear, and aggregate [NEW_SOURCE] to one-row-per-[ENTITY] before joining if its grain is many-to-one. Preserve every column the model currently emits - only add new columns at the end.
Use a LEFT JOIN, never an INNER JOIN (so [ENTITY] rows without a match still appear). Preserve existing columns (downstream consumers select specific columns - renaming or removing them is a contract break).
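The edit usually takes the shape below - a hedged sketch of Path A's int_customer_order_summary after the change. The CTE names and the stand-in for the model's current logic are assumptions; last_ticket_status is omitted here because it needs a latest-row pick rather than an aggregate.

```sql
-- Sketch only: existing logic stays untouched; the new source is aggregated
-- to one row per customer, then left-joined so unmatched customers survive.
with existing as (
    -- ... the model's current SELECT, unchanged ...
    select * from {{ ref('stg_customers') }}  -- placeholder for existing logic
),

tickets as (
    select
        customer_id,
        count_if(ticket_status = 'open') as open_tickets_count,
        max(ticket_opened_at)            as last_ticket_opened_at
    from {{ source('retail', 'RET_TICKETS') }}
    group by customer_id
)

select
    existing.*,                      -- every current column, untouched
    tickets.open_tickets_count,      -- new columns appended at the end
    tickets.last_ticket_opened_at
from existing
left join tickets
    on existing.customer_id = tickets.customer_id
```

existing.* plus appended columns is what makes the edit alias-safe: nothing a downstream consumer selects by name moves, changes type, or disappears.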
Compiling the downstream models too is the compile-time guarantee that we haven't broken any consumer that depends on a column that no longer exists or has changed type. Small cost, high catch rate.
Compile [TARGET_MODEL] and every downstream model that depends on it. Then preview 20 rows of [TARGET_MODEL] ordered deterministically. Do not materialize anything.
The row count of [TARGET_MODEL] is unchanged. Every downstream model compiled without error.
Build into your dev schema. Step 5 already validated output and downstream compile, so the verification pass is scoped out for this timed lab.
Materialize [TARGET_MODEL] into my dev schema. Skip the verification pass - the preview and downstream compile already confirmed the output.
Path A: int_customer_order_summary + retail.RET_TICKETS -> open_tickets_count, last_ticket_status, last_ticket_opened_at
Path B: int_orders_enriched + retail.RET_TICKETS -> ticket_count, has_open_ticket_flag, last_ticket_status
Path C: int_product_sales_summary + retail.RET_PRODUCT_REVIEWS -> avg_rating, review_count, low_rating_count

Done: [TARGET_MODEL] now emits [NEW_COLUMNS] from [NEW_SOURCE] in your dev schema. The existing column contract is preserved, downstream models still compile, and the row count is unchanged from the pre-edit baseline.
The canonical scenario-2 skill definition with all three paths and instructor notes.
One-page lab handout with every prompt in this scenario, formatted for print.
High-value customer segmentation. Build a two-model design: reusable activity layer plus a segment model for Marketing.