Personal shopper agent

Pick the highest-value action at every tick when the goal isn’t a fixed terminal state. This is the canonical use case for utility-AI planning, and LangGOAP exposes it through UtilityStrategy plus the special NirvanaGoal.

The scenario is a household with a $50 weekly grocery budget: the pantry is low on milk and bread, and the local store occasionally runs a flash sale. At each tick the agent should buy the most valuable applicable item, until either the needs are covered or no purchase is worth the money any more. The code below records the running total in spent; it does not enforce the $50 cap.

Three primitives at work:

  • NirvanaGoal — the planner is never “done”; the loop ends only when no applicable action remains.

  • UtilityStrategy — at each tick, filter to applicable actions and pick the one with the highest net_value(state) = utility(state) - cost(state).

  • Context-aware utility — the flash-sale item carries a callable utility that spikes when sale_active=True, naturally re-ordering the agent’s priorities.
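The selection rule in the list above can be sketched in plain Python without LangGOAP. This is a minimal illustration, not the library's implementation: the dict layout and the helper names resolve, is_applicable, and pick_best are hypothetical. It shows a single tick: filter to applicable actions, evaluate utility and cost (constant or callable) against the current state, and return the action with the highest net value.

```python
from typing import Any, Callable, Optional, Union

State = dict[str, Any]
Number = Union[float, Callable[[State], float]]


def resolve(value: Number, state: State) -> float:
    """Evaluate a constant or a state-dependent (callable) number."""
    return float(value(state)) if callable(value) else float(value)


def is_applicable(preconditions: State, state: State) -> bool:
    """An action applies when every precondition matches the state."""
    return all(state.get(k) == v for k, v in preconditions.items())


def pick_best(actions: list[dict[str, Any]], state: State) -> Optional[str]:
    """Return the applicable action with the highest utility - cost, if any."""
    candidates = [a for a in actions if is_applicable(a["pre"], state)]
    if not candidates:
        return None
    best = max(
        candidates,
        key=lambda a: resolve(a["utility"], state) - resolve(a["cost"], state),
    )
    return best["name"]


# Hypothetical stand-ins for the three purchases used in this notebook.
ACTIONS = [
    {"name": "buy_milk", "pre": {"need_milk": True}, "utility": 10.0, "cost": 4.0},
    {"name": "buy_bread", "pre": {"need_bread": True}, "utility": 8.0, "cost": 3.0},
    {
        "name": "buy_sale_item",
        "pre": {"sale_active": True},
        "utility": lambda ws: 50.0 if ws.get("sale_active") else 0.5,
        "cost": 7.0,
    },
]

no_sale = {"need_milk": True, "need_bread": True}
with_sale = {**no_sale, "sale_active": True}
print(pick_best(ACTIONS, no_sale))    # buy_milk (net 6 beats bread's 5)
print(pick_best(ACTIONS, with_sale))  # buy_sale_item (net 43)
```

Keeping utility and cost as "constant or callable" means the ranking is recomputed from the current state on every call, which is what lets a flag such as sale_active reorder priorities.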

A fixed-goal A* baseline at the end of the notebook makes the behavioural difference explicit.

Backed by tests/integration/test_utility_planner_loop.py.

Setup: define the shopper’s actions

from typing import Any

from langgoap import (
    ActionSpec,
    GoapGraph,
    GoalPolicy,
    GoalSpec,
    NirvanaGoal,
    ReplanStrategy,
    UtilityStrategy,
)


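def buy_action(item: str, eff_key: str, default_price: float) -> Any:
    """Build the execute callback for one purchase."""

    def execute(ws: dict[str, Any]) -> dict[str, Any]:
        # Running total so far; 0.0 on the first purchase.
        spent = float(ws.get('spent', 0.0))
        # A price_<item> key in the world state overrides the default price.
        price = float(ws.get(f'price_{item}', default_price))
        # Mark the item as bought and add its price to the total.
        return {eff_key: True, 'spent': spent + price}

    return execute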


# Three purchases.  Two are everyday staples; the third is a deal-of-the-day
# whose utility depends on whether the store is currently running a sale.
actions = [
    ActionSpec(
        name='buy_milk',
        preconditions={'need_milk': True},
        effects={'milk': True, 'spent': 0.0},
        execute=buy_action('milk', 'milk', 4.0),
        cost=lambda ws: float(ws.get('price_milk', 4.0)),
        utility=10.0,
        can_rerun=False,  # buy each item at most once per shopping run
    ),
    ActionSpec(
        name='buy_bread',
        preconditions={'need_bread': True},
        effects={'bread': True, 'spent': 0.0},
        execute=buy_action('bread', 'bread', 3.0),
        cost=lambda ws: float(ws.get('price_bread', 3.0)),
        utility=8.0,
        can_rerun=False,
    ),
    ActionSpec(
        name='buy_sale_item',
        preconditions={'sale_active': True},
        effects={'sale_item': True, 'spent': 0.0},
        execute=buy_action('sale', 'sale_item', 7.0),
        cost=lambda ws: float(ws.get('price_sale', 7.0)),
        # Utility is 50 during a sale, near-zero off-sale — context-aware.
        utility=lambda ws: 50.0 if ws.get('sale_active') else 0.5,
        can_rerun=False,
    ),
]
print(f'Set up {len(actions)} purchase options.')
Set up 3 purchase options.

Scenario A: baseline (no sale)

The household needs milk and bread. No sale is active, so the sale-item action is excluded by precondition. Net values:

Action      Utility   Cost   Net value
buy_milk    10        4      6
buy_bread   8         3      5

Milk wins on the first tick; bread on the second. We use ReplanStrategy.EVERY_ACTION so the planner runs again after every purchase — utility planning is fundamentally iterative.

graph = GoapGraph(actions, strategy=UtilityStrategy())

result_a = graph.invoke(
    goal=GoalSpec(
        conditions={'milk': True, 'bread': True},
        policy=GoalPolicy(replan_strategy=ReplanStrategy.EVERY_ACTION),
    ),
    world_state={'need_milk': True, 'need_bread': True},
)

purchases = [
    h.action_name for h in result_a['execution_history'] if h.success
]
print(f"Status:    {result_a['status']}")
print(f'Purchases: {purchases}')
print(f"Spent:     ${result_a['world_state']['spent']:.2f}")
Status:    goal_achieved
Purchases: ['buy_milk', 'buy_bread']
Spent:     $7.00

Scenario B: flash sale activates

The store now has a sale on a high-value item. The sale-item’s utility callable returns 50 instead of 0.5. Net values change completely:

Action          Utility   Cost   Net value
buy_sale_item   50        7      43
buy_milk        10        4      6
buy_bread       8         3      5

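We switch the goal to NirvanaGoal: the household has no fixed completion criterion and keeps shopping until no purchase is applicable any more (every item bought, sale exhausted). Because a NirvanaGoal is never satisfied, the run below ends with status no_plan once the applicable actions run out. Here that is the expected way for the loop to stop, not a failure, even though the accompanying log line reads like an error.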

result_b = graph.invoke(
    goal=NirvanaGoal(policy=GoalPolicy(replan_strategy=ReplanStrategy.EVERY_ACTION)),
    world_state={
        'need_milk': True,
        'need_bread': True,
        'sale_active': True,
    },
)

purchases = [
    h.action_name for h in result_b['execution_history'] if h.success
]
print(f"Status:    {result_b['status']}")
print(f'Purchases: {purchases}')
print(f"Spent:     ${result_b['world_state']['spent']:.2f}")
print()
print('Note the sale item went FIRST — the agent re-prioritised in real time.')
A* found no plan for goal {} — The goal conditions appear theoretically reachable but A* found no valid action ordering. This may indicate circular precondition dependencies or conflicting action effects. Review action preconditions and effects for internal consistency.
Status:    no_plan
Purchases: ['buy_sale_item', 'buy_milk', 'buy_bread']
Spent:     $14.00

Note the sale item went FIRST — the agent re-prioritised in real time.
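The tick-by-tick behaviour of Scenarios A and B can be reproduced with a small stand-alone loop. This is a toy model under stated assumptions, not LangGOAP's planner: CATALOG and run_ticks are hypothetical names, utilities and prices are the constants from the tables, and each action is allowed to fire once. The loop has no goal check, mirroring the "never done" idea: it stops only when nothing applicable is left.

```python
from typing import Any

State = dict[str, Any]

# Hypothetical catalog: name -> (precondition key, effect key, utility, price).
# Values are taken from the net-value tables in this notebook.
CATALOG = {
    "buy_milk": ("need_milk", "milk", 10.0, 4.0),
    "buy_bread": ("need_bread", "bread", 8.0, 3.0),
    "buy_sale_item": ("sale_active", "sale_item", 50.0, 7.0),
}


def run_ticks(state: State) -> tuple[list[str], State]:
    """Buy the best applicable item each tick until nothing is applicable."""
    state = {"spent": 0.0, **state}
    bought: list[str] = []  # once-only actions that have already fired
    while True:
        candidates = [
            name
            for name, (pre, _, _, _) in CATALOG.items()
            if state.get(pre) is True and name not in bought
        ]
        if not candidates:
            return bought, state
        # Rank by net value = utility - price.
        best = max(candidates, key=lambda n: CATALOG[n][2] - CATALOG[n][3])
        _, effect, _, price = CATALOG[best]
        state[effect] = True
        state["spent"] += price
        bought.append(best)


order, final = run_ticks(
    {"need_milk": True, "need_bread": True, "sale_active": True}
)
print(order, final["spent"])
# ['buy_sale_item', 'buy_milk', 'buy_bread'] 14.0
```

Dropping sale_active from the initial state gives the Scenario A order, buy_milk then buy_bread, for a total of 7.0.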

Contrast: A* fixed-goal planning would just minimise cost

If we hand the same actions to the default A* planner with a fixed goal (“end with milk and bread”), it doesn’t care about utility — it minimises total cost. The sale item is excluded because it’s not needed to satisfy the goal.

default_graph = GoapGraph(actions)  # no strategy = default A*

result_default = default_graph.invoke(
    goal=GoalSpec(conditions={'milk': True, 'bread': True}),
    world_state={
        'need_milk': True,
        'need_bread': True,
        'sale_active': True,
    },
)

purchases = [
    h.action_name for h in result_default['execution_history'] if h.success
]
print(f"Status:    {result_default['status']}")
print(f'Purchases: {purchases}  (no sale_item — irrelevant to the goal)')
print(f"Spent:     ${result_default['world_state']['spent']:.2f}")
Status:    goal_achieved
Purchases: ['buy_bread', 'buy_milk']  (no sale_item — irrelevant to the goal)
Spent:     $7.00
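To see why a cost-minimising planner skips the sale item, consider the sketch below. It is a brute-force search over action subsets rather than A*, and ACTIONS and cheapest_plan are hypothetical names. Enumerating subsets is enough here because no action depends on another action's effects. Utilities are left out on purpose: only cost and goal coverage matter.

```python
from itertools import combinations

# Hypothetical table: name -> (precondition key, effect key, cost).
ACTIONS = {
    "buy_milk": ("need_milk", "milk", 4.0),
    "buy_bread": ("need_bread", "bread", 3.0),
    "buy_sale_item": ("sale_active", "sale_item", 7.0),
}


def cheapest_plan(
    state: dict[str, bool], goal: set[str]
) -> tuple[float, tuple[str, ...]]:
    """Return (cost, actions) for the cheapest action set covering the goal."""
    usable = [n for n, (pre, _, _) in ACTIONS.items() if state.get(pre)]
    best = None
    for size in range(len(usable) + 1):
        for combo in combinations(usable, size):
            effects = {ACTIONS[n][1] for n in combo}
            if goal <= effects:  # every goal key is produced by some action
                cost = sum(ACTIONS[n][2] for n in combo)
                if best is None or cost < best[0]:
                    best = (cost, combo)
    if best is None:
        raise ValueError("goal unreachable with the applicable actions")
    return best


state = {"need_milk": True, "need_bread": True, "sale_active": True}
print(cheapest_plan(state, {"milk", "bread"}))
# (7.0, ('buy_milk', 'buy_bread'))
```

Adding buy_sale_item would raise the cost to 14.0 without satisfying any extra goal condition, so it never appears in the cheapest plan. The sketch does not model execution order, which is why its tuple order need not match the run above.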

When to reach for utility planning

Utility planning shines when:

  • The agent should react to changing context rather than chasing a fixed completion criterion.

  • “Best next move” is well-defined but the long-term plan isn’t.

  • Some actions get more valuable in certain states (callable utility).

Pair UtilityStrategy with NirvanaGoal for always-on agents, or with a regular GoalSpec when you want “act greedily until the goal happens to be satisfied.” Set can_rerun=False on actions that should fire at most once per session — _plan_core auto-blacklists them after successful execution so the loop terminates cleanly.
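The effect of the once-only flag can be illustrated with a toy loop. This models the idea only, not _plan_core itself: shop, once_only, and max_ticks are hypothetical names. In this notebook's actions a purchase never clears need_milk or need_bread, so the highest-ranked action stays applicable after it fires. Without a once-only rule the loop would keep choosing it until an external cap stops it.

```python
def shop(once_only: bool, max_ticks: int = 5) -> list[str]:
    """Greedy loop over two staples; max_ticks is a safety cap for the demo."""
    # name -> (precondition key, net value); purchases never clear need_* flags.
    net = {"buy_milk": ("need_milk", 6.0), "buy_bread": ("need_bread", 5.0)}
    state = {"need_milk": True, "need_bread": True}
    done: set[str] = set()  # actions that have already fired
    history: list[str] = []
    for _ in range(max_ticks):
        candidates = [
            n
            for n, (pre, _) in net.items()
            if state[pre] and not (once_only and n in done)
        ]
        if not candidates:
            break
        best = max(candidates, key=lambda n: net[n][1])
        done.add(best)
        history.append(best)
    return history


print(shop(once_only=True))
# ['buy_milk', 'buy_bread']
print(shop(once_only=False))
# ['buy_milk', 'buy_milk', 'buy_milk', 'buy_milk', 'buy_milk']
```

With once_only=True the loop ends by itself after two ticks. With once_only=False only the cap ends it, which is the situation a hard limit on iterations guards against.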

See also langgoap/planner/utility.py for the source and the tests/integration/test_utility_planner_loop.py integration tests for more scenarios.

Next steps

  • The integration test test_utility_planner_loop.py exercises the same scenario with deterministic assertions on every tick.

  • Combine UtilityStrategy with MaxActionsPolicy for a hard cap on shopper iterations — see basics/termination_policies.ipynb.

  • For a fixed-goal counterpart on the same domain, swap in GoalSpec(conditions=...) plus AStarStrategy.