Scenario 3

High-Value Customer Segmentation

A seven-step workflow that turns "which customers should Marketing target?" into a reusable activity layer plus a segment model, materialized into your dev schema. Two-model design on purpose: the activity layer is testable and reusable so downstream work does not re-derive the same aggregates.

Watch the campaign-audience workflow play through end-to-end: seven steps that build a reusable activity layer plus the segment model, in roughly two minutes.

Why

What you're solving

Marketing asks: which customers should we target? You'll build a reusable activity layer plus a segment model on top of it, instead of writing a one-off query that nobody can re-use.

The two-model design is the point. Churn models, RFM models, executive dashboards - all of them can sit on the activity layer instead of re-deriving the same aggregates in five different places. That's the difference between an analytics-engineering practice and a pile of one-off queries. By the end you'll have a campaign audience that lives as a queryable, versioned table - not a screenshot of a query somebody ran once.

Stakeholder ask. Marketing wants a targeted high-value audience based on recent purchase behavior by store.

Segment Definitions

Three canonical segments

Customers can belong to multiple segments. The model tags or unions rows so a single customer can appear with multiple segment_name values - that's the business definition, not a bug.

VIP

High frequency, high value

avg_transaction_value > $100 AND transaction_count >= 3

The customer who shows up regularly and spends meaningfully every time.

Big spender

At least one large transaction

max_transaction_value > $300

A single transaction over $300 puts them in this segment - independent of frequency.

Category-loyal

Deep in one category

category_transaction_count >= 10 for any single category

The customer who keeps coming back for the same category - perfect for targeted merch campaigns.
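Taken together, the three definitions amount to a union of three filters over the activity layer. A minimal sketch, assuming hypothetical models named customer_store_activity (customer x store grain) and customer_store_category_activity (customer x store x category grain) with the column names used above - your lab's actual names will come from discovery:

```sql
-- Hypothetical sketch only: model and column names are assumptions.
with vip as (
    select customer_id, store_name, 'vip' as segment_name
    from {{ ref('customer_store_activity') }}
    where avg_transaction_value > 100 and transaction_count >= 3
),

big_spender as (
    select customer_id, store_name, 'big_spender' as segment_name
    from {{ ref('customer_store_activity') }}
    where max_transaction_value > 300
),

category_loyal as (
    select distinct customer_id, store_name, 'category_loyal' as segment_name
    from {{ ref('customer_store_category_activity') }}
    where category_transaction_count >= 10
)

-- union all, not union: a customer who qualifies for several segments
-- keeps one row per segment_name - the business definition, not a bug.
select * from vip
union all
select * from big_spender
union all
select * from category_loyal
```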

The Prompt Flow

What you'll type, in order

Copy each prompt as written (recommended) or rephrase in your own words. Run the steps in order; do not skip ahead.

  1. Discovery

    Six entity types - wider than the other scenarios - which is exactly why discovery has to happen before any SQL.

    Find the models related to customers, stores, orders, order lines, products, and categories.

    Two domains are commonly missed: a dedicated categories model (sometimes category lives only as a column on products) and a separate order lines model (sometimes folded into orders). Name either before moving on.
    Exercises: status, search
  2. Schema Understanding

    Check three things when the response returns: where category actually lives (product level, order-line level, or order header), the grain of orders versus order lines, and whether store is on the order, on the customer, or on both.

    Show the grain and joins for those models.
    Exercises: describe, lineage
  3. Data Inspection

    Two silent bugs to watch for: a stale max(order_date) that quietly shrinks the 180-day window, and a category field with nulls, whitespace, or mixed case that breaks the category-loyal logic later.

    Check recent order dates and category values needed for a 180-day segmentation model.
    If the max order date is more than a few days old, decide whether to anchor the window on current_date or on max(order_date). If categories are dirty, decide on exclude-versus-coalesce now.
    Exercises: warehouse, dbt_show on rolling-window anchor and distinct categories
  4. Activity Layer Model

    The reusable intermediate model - not the segment model yet. Per-customer x store aggregates over the trailing 180 days: transaction_count, avg_transaction_value, max_transaction_value, and category_transaction_count at customer x store x category grain.

    Create a 180-day customer activity model by store.
    Building this as its own model is the design choice that pays back for years. Verify the 180-day window is applied consistently and that the grain is customer x store (with customer x store x category for the category counts).
    Exercises: file edits, model creation
  5. Segment Model

    Built on top of the activity layer. Expected columns: customer identifier, store_name, segment_name, transaction_count, avg_transaction_value, max_transaction_value, category, category_transaction_count. Verify the multi-segment logic and confirm the thresholds match the definitions above.

    Create a segment model for VIPs, big spenders, and category-loyal customers, built on top of the activity model.
    Exercises: file edits, model creation on top of activity layer
  6. Safe Preview

    The SQL compiles, sample rows render, nothing lands in the warehouse yet. The "exclude no segment" filter doubles as a data-quality check: if a large share of customers fall out, the thresholds or the date window need a second look before Marketing ever sees the list.

    Compile and preview the segment model. Exclude customers with no segment.
    Confirm multi-segment customers appear on multiple rows. Flag any segment that comes back suspiciously empty (zero VIPs is almost always a threshold or join issue). Eyeball whether the audience size is plausible for a Marketing campaign.
    Exercises: dbt_compile, dbt_show
  7. Materialize

    dbt Wizard runs the build against your dev schema (dev_lab_user_N). The instructor drops dev schemas after the lab via a cleanup script, so this build is safe and reversible.

    Materialize the segment model into my dev schema. Skip the verification pass - the preview already confirmed the output.
    The "skip the verification pass" instruction is deliberate: the Step 6 preview already validated the output, and re-running a full verification pass would burn roughly 10% of a 20-minute lab on duplicate work.
    Exercises: dbt_run against dev schema
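The Step 3 inspection boils down to two small warehouse queries. A sketch under assumptions - the orders and order_lines relation names and the Postgres-style date arithmetic are placeholders for whatever your warehouse actually uses:

```sql
-- Freshness: a stale max(order_date) silently shrinks the 180-day window.
select max(order_date)                as latest_order,
       current_date - max(order_date) as days_stale
from orders;

-- Category hygiene: nulls, stray whitespace, or mixed case here
-- will break the category-loyal counts later.
select category, count(*) as n
from order_lines
group by category
order by n desc;
```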
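For orientation, the Step 4 activity layer might compile to roughly the following. A sketch only, with assumed relation and column names (stg_orders, order_total), anchored on current_date - swap in max(order_date) if Step 3 showed stale data:

```sql
-- Hypothetical activity layer; relation and column names are assumptions.
with recent_orders as (
    select customer_id, store_name, order_id, order_total
    from {{ ref('stg_orders') }}
    where order_date >= current_date - interval '180 days'
)

-- Grain: one row per customer x store over the trailing 180 days.
select customer_id,
       store_name,
       count(distinct order_id) as transaction_count,
       avg(order_total)         as avg_transaction_value,
       max(order_total)         as max_transaction_value
from recent_orders
group by customer_id, store_name
```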
What you'll have at the end

Final artifact

Two models in your dev schema. A reusable activity layer (customer x store, 180-day aggregates) plus a segment model on top tagging VIPs, big spenders, and category-loyal customers. The campaign audience now lives as a queryable, versioned table - the foundation that future churn, RFM, and dashboard work can sit on without re-deriving the same aggregates.
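Because the audience is now a table, pulling a campaign list is a plain query. For example, assuming the segment model landed as high_value_segments in your dev schema and 'Downtown' is a store in your data (both names are placeholders):

```sql
-- VIPs for one store; multi-segment customers appear once per segment_name.
select customer_id, store_name, segment_name, avg_transaction_value
from dev_lab_user_n.high_value_segments
where segment_name = 'vip'
  and store_name = 'Downtown'
order by avg_transaction_value desc;
```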
References + Next

Keep going

GitHub

Open SKILL.md

The canonical scenario-3 skill definition, prompts, and instructor notes.

Reference

Printable prompt sheet

One-page lab handout with every prompt in this scenario, formatted for print.

Next

Scenario 4

Broken model from a source column rename. Choose-your-path: products, orders, or customers.