Personal shopper agent
Pick the highest-value action at every tick when the goal isn’t a
fixed terminal state. This is the canonical use case for utility-AI
planning, and LangGOAP exposes it through UtilityStrategy plus the
special NirvanaGoal.
The scenario is a household with a $50 weekly grocery budget: the pantry is low on milk and bread, and the local store occasionally runs a flash sale. The agent should buy the most valuable applicable item right now — tick by tick — until either the needs are covered or no purchase is worth the money any more.
Three primitives at work:
NirvanaGoal: the planner is never “done”; the loop ends only when no applicable action remains.
UtilityStrategy: at each tick, filter to applicable actions and pick the one with the highest net_value(state) = utility(state) - cost(state).
Context-aware utility: the flash-sale item carries a callable utility that spikes when sale_active=True, naturally re-ordering the agent’s priorities.
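The selection rule can be sketched in plain Python without LangGOAP. This is an illustration only, not the library's implementation: the dict-based action records and the helper names resolve, net_value, is_applicable and pick_best are hypothetical, and the prices mirror the notebook's defaults.

```python
from typing import Any, Callable, Optional, Union

State = dict[str, Any]
Number = Union[float, Callable[[State], float]]


def resolve(value: Number, state: State) -> float:
    """Evaluate a constant or a state-dependent callable against the state."""
    return float(value(state)) if callable(value) else float(value)


def net_value(action: dict[str, Any], state: State) -> float:
    """net_value(state) = utility(state) - cost(state)."""
    return resolve(action["utility"], state) - resolve(action["cost"], state)


def is_applicable(action: dict[str, Any], state: State) -> bool:
    """An action applies when every precondition matches the state."""
    return all(state.get(k) == v for k, v in action["preconditions"].items())


def pick_best(actions: list[dict[str, Any]], state: State) -> Optional[dict[str, Any]]:
    """Filter to applicable actions, then take the highest net value."""
    candidates = [a for a in actions if is_applicable(a, state)]
    if not candidates:
        return None  # nothing left to do: the loop would stop here
    return max(candidates, key=lambda a: net_value(a, state))


toy_actions = [
    {"name": "buy_milk", "preconditions": {"need_milk": True},
     "utility": 10.0, "cost": 4.0},
    {"name": "buy_bread", "preconditions": {"need_bread": True},
     "utility": 8.0, "cost": 3.0},
    {"name": "buy_sale_item", "preconditions": {"sale_active": True},
     "utility": lambda s: 50.0 if s.get("sale_active") else 0.5, "cost": 7.0},
]

no_sale = {"need_milk": True, "need_bread": True}
on_sale = {**no_sale, "sale_active": True}

print(pick_best(toy_actions, no_sale)["name"])  # buy_milk (net 6 beats net 5)
print(pick_best(toy_actions, on_sale)["name"])  # buy_sale_item (net 43)
```

The only difference between the two calls is one key in the state, which is what "context-aware" means here: the ranking is recomputed from the current state on every tick.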
A fixed-goal A* baseline at the end of the notebook makes the behavioural difference explicit.
Backed by tests/integration/test_utility_planner_loop.py.
Setup: define the shopper’s actions
from typing import Any

from langgoap import (
    ActionSpec,
    GoapGraph,
    GoalPolicy,
    GoalSpec,
    NirvanaGoal,
    ReplanStrategy,
    UtilityStrategy,
)


def buy_action(item: str, eff_key: str, default_price: float) -> Any:
    def execute(ws: dict[str, Any]) -> dict[str, Any]:
        spent = float(ws.get('spent', 0.0))
        price = float(ws.get(f'price_{item}', default_price))
        return {eff_key: True, 'spent': spent + price}

    return execute


# Three purchases. Two are everyday staples; the third is a deal-of-the-day
# whose utility depends on whether the store is currently running a sale.
actions = [
    ActionSpec(
        name='buy_milk',
        preconditions={'need_milk': True},
        effects={'milk': True, 'spent': 0.0},
        execute=buy_action('milk', 'milk', 4.0),
        cost=lambda ws: float(ws.get('price_milk', 4.0)),
        utility=10.0,
        can_rerun=False,  # buy each item at most once per shopping run
    ),
    ActionSpec(
        name='buy_bread',
        preconditions={'need_bread': True},
        effects={'bread': True, 'spent': 0.0},
        execute=buy_action('bread', 'bread', 3.0),
        cost=lambda ws: float(ws.get('price_bread', 3.0)),
        utility=8.0,
        can_rerun=False,
    ),
    ActionSpec(
        name='buy_sale_item',
        preconditions={'sale_active': True},
        effects={'sale_item': True, 'spent': 0.0},
        execute=buy_action('sale', 'sale_item', 7.0),
        cost=lambda ws: float(ws.get('price_sale', 7.0)),
        # Utility is 50 during a sale, near-zero off-sale — context-aware.
        utility=lambda ws: 50.0 if ws.get('sale_active') else 0.5,
        can_rerun=False,
    ),
]
print(f'Set up {len(actions)} purchase options.')
Set up 3 purchase options.
Scenario A: baseline (no sale)
The household needs milk and bread. No sale is active, so the sale-item action is excluded by precondition. Net values:
| Action | Utility | Cost | Net value |
|---|---|---|---|
| buy_milk | 10 | 4 | 6 |
| buy_bread | 8 | 3 | 5 |
Milk wins on the first tick; bread on the second. We use
ReplanStrategy.EVERY_ACTION so the planner runs again after every
purchase — utility planning is fundamentally iterative.
graph = GoapGraph(actions, strategy=UtilityStrategy())
result_a = graph.invoke(
    goal=GoalSpec(
        conditions={'milk': True, 'bread': True},
        policy=GoalPolicy(replan_strategy=ReplanStrategy.EVERY_ACTION),
    ),
    world_state={'need_milk': True, 'need_bread': True},
)
purchases = [
    h.action_name for h in result_a['execution_history'] if h.success
]
print(f"Status: {result_a['status']}")
print(f'Purchases: {purchases}')
print(f"Spent: ${result_a['world_state']['spent']:.2f}")
Status: goal_achieved
Purchases: ['buy_milk', 'buy_bread']
Spent: $7.00
Scenario B: flash sale activates
The store now has a sale on a high-value item. The sale-item’s utility callable returns 50 instead of 0.5. Net values change completely:
| Action | Utility | Cost | Net value |
|---|---|---|---|
| buy_sale_item | 50 | 7 | 43 |
| buy_milk | 10 | 4 | 6 |
| buy_bread | 8 | 3 | 5 |
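We switch the goal to NirvanaGoal: the household has no fixed
completion criterion and keeps shopping until no purchase is
applicable any more (every item bought, sale exhausted). Because a
NirvanaGoal is never satisfied, the run ends with status no_plan and
a planner warning once the actions run out; in this scenario that
marks normal termination, not a failure.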
result_b = graph.invoke(
    goal=NirvanaGoal(policy=GoalPolicy(replan_strategy=ReplanStrategy.EVERY_ACTION)),
    world_state={
        'need_milk': True,
        'need_bread': True,
        'sale_active': True,
    },
)
purchases = [
    h.action_name for h in result_b['execution_history'] if h.success
]
print(f"Status: {result_b['status']}")
print(f'Purchases: {purchases}')
print(f"Spent: ${result_b['world_state']['spent']:.2f}")
print()
print('Note the sale item went FIRST — the agent re-prioritised in real time.')
A* found no plan for goal {} — The goal conditions appear theoretically reachable but A* found no valid action ordering. This may indicate circular precondition dependencies or conflicting action effects. Review action preconditions and effects for internal consistency.
Status: no_plan
Purchases: ['buy_sale_item', 'buy_milk', 'buy_bread']
Spent: $14.00
Note the sale item went FIRST — the agent re-prioritised in real time.
Contrast: A* fixed-goal planning would just minimise cost
If we hand the same actions to the default A* planner with a fixed goal (“end with milk and bread”), it doesn’t care about utility — it minimises total cost. The sale item is excluded because it’s not needed to satisfy the goal.
default_graph = GoapGraph(actions)  # no strategy = default A*
result_default = default_graph.invoke(
    goal=GoalSpec(conditions={'milk': True, 'bread': True}),
    world_state={
        'need_milk': True,
        'need_bread': True,
        'sale_active': True,
    },
)
purchases = [
    h.action_name for h in result_default['execution_history'] if h.success
]
print(f"Status: {result_default['status']}")
print(f'Purchases: {purchases} (no sale_item — irrelevant to the goal)')
print(f"Spent: ${result_default['world_state']['spent']:.2f}")
Status: goal_achieved
Purchases: ['buy_bread', 'buy_milk'] (no sale_item — irrelevant to the goal)
Spent: $7.00
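To see why a cost-minimising planner leaves the sale item out, the sketch below brute-forces the cheapest set of purchases that covers the goal. It is a stand-in for the idea, not A* and not LangGOAP code: the catalog dict, goal_facts and cheapest_cover are hypothetical names, and each action is reduced to a fixed price plus the facts it makes true.

```python
from itertools import combinations

# Hypothetical flat model: action name -> (price, facts it makes true).
catalog = {
    "buy_milk": (4.0, {"milk"}),
    "buy_bread": (3.0, {"bread"}),
    "buy_sale_item": (7.0, {"sale_item"}),
}
goal_facts = {"milk", "bread"}


def cheapest_cover(catalog, goal_facts):
    """Return (total_cost, action_names) for the cheapest subset that
    makes every goal fact true, or None if no subset does."""
    best = None
    names = sorted(catalog)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            achieved = set().union(*(catalog[n][1] for n in subset))
            if goal_facts <= achieved:
                cost = sum(catalog[n][0] for n in subset)
                if best is None or cost < best[0]:
                    best = (cost, subset)
    return best


print(cheapest_cover(catalog, goal_facts))
# (7.0, ('buy_bread', 'buy_milk'))
```

Utility never enters the comparison. Adding buy_sale_item still satisfies the goal but raises the total to 14, so a cost-only objective rejects it regardless of how valuable the sale is.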
When to reach for utility planning
Utility planning shines when:
The agent should react to changing context rather than chasing a fixed completion criterion.
“Best next move” is well-defined but the long-term plan isn’t.
Some actions get more valuable in certain states (callable utility).
Pair UtilityStrategy with NirvanaGoal for always-on agents, or with
a regular GoalSpec when you want “act greedily until the goal happens
to be satisfied.” Set can_rerun=False on actions that should fire at
most once per session — _plan_core auto-blacklists them after
successful execution so the loop terminates cleanly.
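The effect of can_rerun=False on termination can be illustrated with a small stand-alone loop. This is a sketch under assumptions, not the _plan_core logic: run_greedy, shop and the done set are hypothetical, and the toy actions copy one detail from the notebook, namely that buying milk never clears need_milk, so the precondition stays true after the purchase.

```python
from typing import Any


def run_greedy(actions: list[dict[str, Any]], state: dict[str, Any],
               max_ticks: int = 10) -> list[str]:
    """Each tick, run the applicable action with the highest net value."""
    state = dict(state)
    done: set[str] = set()    # one-shot actions that already ran
    history: list[str] = []
    for _ in range(max_ticks):
        candidates = [
            a for a in actions
            if a["name"] not in done
            and all(state.get(k) == v for k, v in a["preconditions"].items())
        ]
        if not candidates:
            break             # nothing applicable: clean termination
        best = max(candidates, key=lambda a: a["utility"] - a["cost"])
        state.update(best["effects"])
        history.append(best["name"])
        if not best["can_rerun"]:
            done.add(best["name"])
    return history


def shop(can_rerun: bool) -> list[dict[str, Any]]:
    """Build the two staple purchases with the given rerun setting."""
    return [
        {"name": "buy_milk", "preconditions": {"need_milk": True},
         "effects": {"milk": True}, "utility": 10.0, "cost": 4.0,
         "can_rerun": can_rerun},
        {"name": "buy_bread", "preconditions": {"need_bread": True},
         "effects": {"bread": True}, "utility": 8.0, "cost": 3.0,
         "can_rerun": can_rerun},
    ]


start = {"need_milk": True, "need_bread": True}
print(run_greedy(shop(can_rerun=False), start))      # ['buy_milk', 'buy_bread']
print(len(run_greedy(shop(can_rerun=True), start)))  # 10: only the tick cap stops it
```

With reruns allowed, the loop buys milk on every tick because milk always has the best net value and its precondition never changes. An alternative design would be to have each purchase clear its need_* flag in its effects.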
See also langgoap/planner/utility.py for the source and the
tests/integration/test_utility_planner_loop.py integration tests for
more scenarios.
Next steps
The integration test test_utility_planner_loop.py exercises the same scenario with deterministic assertions on every tick.
Combine UtilityStrategy with MaxActionsPolicy for a hard cap on shopper iterations; see basics/termination_policies.ipynb.
For a fixed-goal counterpart on the same domain, swap in GoalSpec(conditions=...) plus AStarStrategy.