Time/Date Hallucinations?

NickyDigital · 2026-05-16

Anyone else seeing agents hallucinate dates?

Pattern: agents generate artifacts (issue titles, brief filenames, scheduled work) with wrong dates — sometimes future-dated, sometimes stuck on training cutoff, sometimes UTC-vs-local timezone drift. Worst case in ALE: CoS produced 5 "Daily Brief" duplicates in 8 hours, each guessing a different date for the same actual day.

Root cause looks like LLMs guessing dates instead of checking. They pull from issue.createdat (UTC), document content, prior comments, or training data: anywhere except an actual date call.

My fix attempt: added an agent SOUL/AGENTS rule forcing a TZ=America/New_York date shell call before any date reference, plus an idempotency rule (search before create, one artifact per date+slot). Just deployed, so it's too early to know if it sticks.
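For anyone who wants the same rule outside of shell, here is a minimal Python sketch of "read the clock in a fixed zone instead of guessing". The names FLEET_TZ, local_date, and today are made up for illustration, and it assumes the system has IANA timezone data available.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Assumption: the fleet's "business day" is defined in this zone.
FLEET_TZ = ZoneInfo("America/New_York")


def local_date(instant_utc: datetime, tz: ZoneInfo = FLEET_TZ) -> str:
    """Return the calendar date (YYYY-MM-DD) of an aware instant as seen in tz."""
    if instant_utc.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return instant_utc.astimezone(tz).date().isoformat()


def today(tz: ZoneInfo = FLEET_TZ) -> str:
    """Today's date in the fleet zone, taken from the system clock."""
    return local_date(datetime.now(timezone.utc), tz)


# An issue created at 02:30 UTC on the 17th is still the evening of the 16th
# in New York, which is exactly the UTC-vs-local drift described above.
created_at = datetime(2026, 5, 17, 2, 30, tzinfo=timezone.utc)
print(created_at.date().isoformat(), "in UTC vs", local_date(created_at), "local")
```

The point of passing the instant in explicitly is that the same helper works for "now" and for a server timestamp pulled off an issue.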

Questions:

1. Has anyone else hit this on Paperclip? How frequently?

2. What worked to fix it? Per-agent rule, system-level constraint, post-creation lint?

3. Anyone tried injecting a "today's date" header into every agent run via the heartbeat context, so agents don't have to remember to check?

4. Is there a Paperclip-native primitive for "currenttime" we should be using instead of shell?

5. Anyone solved it at the framework level (Manifest? opencode adapter?) vs per-company instructions?

Concrete cost: my fleet wastes ~30% of CoS heartbeat cycles on duplicate-artifact generation. Plus Truth Gate risk: a Daily Brief titled with a future date is a false claim about when work happened.

Answers

NickyDigital · 2026-05-18

Aron — quick report on what landed.

Diagnosis was 80% right, one correction: I'd conflated two artifact types. There are two "Daily Brief"s on ALE — the routine-driven Daily AI News pipeline (CPO → Tavily → Substack draft), and the heartbeat-generated CoS internal Daily Brief (founder status digest). The 5-in-8h dupe loop was the CoS one — no routine owns it. Your structural fix applied to the routine; the dupes needed the agent-side idempotency rule.

What's holding:

- CoS Daily Brief Idempotency rule (search by date+slot before create, locked title format): exactly one brief for today, zero dupes since deploy. Stuck so far.

- Daily AI News routine: already had correct ET TZ + CPO assignee. Past 7 runs failed (CPO was paused). Today's 11:00 ET fire succeeded (6.5 min run, clean execution).

Mass-pause correction: ALE's wasn't April 28 — it was 2026-05-15 05:49 UTC. 15 routines updated to identical microsecond = single bulk SQL transaction. No activitylog trail, no recall. Proceeding with staggered revival anyway, watching for re-trip.

Unexpected side-wins: while inspecting CPO config I recovered the Substack URL (was missing from project docs but lived in adapterconfig.env) and triggered a 60-agent fleetwide MANIFESTAPIKEY plain→secretref migration. Worth flagging for anyone else doing similar hygiene: the ppinjectmanifestenv trigger checks env.MANIFESTAPIKEY.value for emptiness and re-injects plain — it doesn't recognize type:'secretref' (which has no value field), so migrations get silently reverted until you patch the trigger.

Next: Wave 1 revival = 5 CPO-driven dailies after TZ correction (UTC→ET). 2 routines blocked on Affiliate Marketing Agent unpause (separate spec work).

Will flag again after Wave 1.

Aron Prins · 2026-05-18

Great progress on the diagnosis — and these are exactly the right follow-ups. Going through them in order.

1. Reviving a long-paused routine. catchUpPolicy = skipmissed is the right primary safety net and it does what it says — when the routine flips active and the scheduler computes the next fire, missed windows are dropped, not enqueued. Per the Heartbeats & Routines (/docs/projects-workflow/routines) doc, "the defaults (coalesce if active, skip missed) are what you want: no pile-ups, no surprise flood of work after a restart." Two things to do before you flip the switch, in this order:

- Unpause the assignee agent first, routine second. If you activate the routine while the Content Pipeline Orchestrator is still paused, the first cron fire will land on a paused agent and stall on assignment delivery.

- Don't archive instead of pause. The lifecycle is active ↔ paused, plus a one-way archived. Archived routines do not fire and cannot be reactivated (see Routine Lifecycle (/docs/api/routines)), so keep these in paused, not archived, while you're staging the revival.

routineruns rows are pure history and don't need clearing — they don't participate in the next-fire decision. The trigger's cron + timezone is the only thing the scheduler reads. Same for the agent's runtimestate — it's heartbeat bookkeeping, not scheduling input.

2. Idempotency at the API layer — partial yes. This was worth digging into properly. Three layers to keep separate:

- Routine runs do support idempotency. Both routineruns.idempotencyKey (indexed against triggerId) and a server-computed dispatchFingerprint exist on the schema, and the Manual Run endpoint (/docs/api/routines#manual-run) POST /api/routines/{routineId}/run accepts idempotencyKey directly. This is what stops two concurrent fires of the same routine from producing duplicate runs.

- Scheduled fires use the fingerprint automatically. You don't need to set anything for cron-driven runs; the server's dispatchFingerprint covers the "two parallel ticks for the same window" race.

- Issue creation from a routine does not take a content-level idempotency key like daily-brief-2026-05-17. The idempotency is on the routine run, not on the resulting issue's content. So if your concern is "never two Daily Brief issues with the same date slug, even from non-routine sources", you still want either (a) a uniqueness check inside the assignee agent before it writes the artifact, or (b) a DB-level unique constraint on a custom field. The Issues API does carry idempotencyKey elsewhere (on /issues/{id}/interactions for confirmations), but not on issue create.

For your fix: once the routine + assignee are revived, the routine-run idempotency takes care of "two cron ticks fire the same window." The CoS no longer creating Daily Briefs (see #4 below) takes care of the other source of duplicates. Between those two, you shouldn't need a content-level slug at all.

3. TZ change while paused — safe, no revision bump needed. Triggers have their own endpoint (PATCH /api/routine-triggers/{triggerId}) and don't require baseRevisionId the way the routine-level update does. Routine revisions exist (routinerevisions table, with baseRevisionId optimistic concurrency on PATCH /api/routines/{routineId}), but they snapshot routine fields plus safe trigger metadata — not trigger edits themselves. Order I'd do it:

1. Keep the routine paused.

2. PATCH /api/routine-triggers/{triggerId} with the new timezone (and cronExpression if you want to retune).

3. Reload the trigger and confirm the Next: countdown shows your expected local America/New_York time.

4. Then flip the routine to active.

No need to wait for paused → active to mint a revision.

Aron Prins · 2026-05-17

Great write-up — this is a real failure mode and your fix is in the right direction. Going through your five questions:

1. Frequency. Common enough to come up, but the duplicate-Daily-Brief pattern is usually a combination of date-guessing and missing idempotency, not date-guessing alone. Each ingredient is fixable; together they produce the 30%-overhead bug you're seeing.

2. What works. Your shell date rule is the most reliable per-agent fix. Two complementary patterns worth layering on:

- Anchor on a server-authoritative timestamp. Every issue the API returns has createdAt/updatedAt in ISO 8601 UTC; for any artifact derived from a task, that is the canonical "when did this happen," not the LLM's intuition. For genuinely "now," date -u +%Y-%m-%dT%H:%M:%SZ in the harness wins.

- Hard idempotency keys, not just search-before-create. For routine-style outputs, compute a deterministic slug (daily-brief-{YYYY-MM-DD} in a fixed TZ) and treat the API as a UPSERT on that slug. Even if the LLM picks the right date, two parallel wakes can race; the idempotency key is what stops the duplicate. Search-before-create is a soft form of this and is racy on its own.
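To make the slug-as-idempotency-key idea concrete, here is a rough Python sketch. ArtifactStore is a hypothetical in-memory stand-in, not a Paperclip API; the real equivalent would be whatever write path or unique constraint backs artifact creation. The lock is what makes check-and-write atomic, which is the property plain search-before-create lacks.

```python
import threading
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def brief_slug(instant_utc: datetime, tz_name: str = "America/New_York") -> str:
    """Deterministic key: one slug per calendar day in a fixed zone."""
    day = instant_utc.astimezone(ZoneInfo(tz_name)).date().isoformat()
    return f"daily-brief-{day}"


class ArtifactStore:
    """Hypothetical in-memory store keyed by slug (illustration only)."""

    def __init__(self) -> None:
        self._by_slug = {}
        self._lock = threading.Lock()

    def upsert(self, slug: str, body: str):
        """Create the artifact if the slug is new, else update it in place.

        Returns (artifact, created). The lookup and the write happen under
        one lock, so two racing callers cannot both create.
        """
        with self._lock:
            existing = self._by_slug.get(slug)
            if existing is not None:
                existing["body"] = body
                return existing, False
            artifact = {"slug": slug, "body": body}
            self._by_slug[slug] = artifact
            return artifact, True

    def count(self) -> int:
        return len(self._by_slug)


store = ArtifactStore()
morning = datetime(2026, 5, 16, 14, 0, tzinfo=timezone.utc)  # 10:00 in New York
late = datetime(2026, 5, 17, 2, 0, tzinfo=timezone.utc)      # 22:00 same local day
store.upsert(brief_slug(morning), "first draft")
store.upsert(brief_slug(late), "revised draft")
print(store.count(), "artifact(s) for", brief_slug(late))
```

Both wakes land on the same slug even though their UTC dates differ, so the second call updates instead of creating.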

3. Heartbeat-injected "today's date" header. You can do something close to this today via adapterConfig.promptTemplate on the agent — it's an undocumented field but it's the seam for prepending stable text at the start of every wake. Caveat: promptTemplate is static — it doesn't interpolate live shell. The clean way is to wrap the adapter's pre-run with a small hook (or use the process adapter) that writes a HEARTBEAT.md into the working directory each wake with the current timestamps, and reference that file in the template. There's an open RFC ([#206](https://github.com/paperclipai/paperclip/issues/206)) on formalising a HEARTBEAT.md convention for exactly this kind of per-run context.
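A pre-run hook along those lines could be as small as the Python sketch below. The file layout and the function names render_heartbeat and write_heartbeat are assumptions for illustration; nothing here is a Paperclip-defined format, and how the hook gets wired into the adapter is left open.

```python
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional
from zoneinfo import ZoneInfo


def render_heartbeat(now: datetime, tz_name: str = "America/New_York") -> str:
    """Build the file body. The layout is illustrative, not an official format."""
    now_utc = now.astimezone(timezone.utc)
    local = now_utc.astimezone(ZoneInfo(tz_name))
    return (
        "# HEARTBEAT\n"
        f"now_utc: {now_utc.strftime('%Y-%m-%dT%H:%M:%SZ')}\n"
        f"now_local: {local.isoformat(timespec='seconds')} ({tz_name})\n"
        f"today_local: {local.date().isoformat()}\n"
    )


def write_heartbeat(workdir: Path, now: Optional[datetime] = None) -> Path:
    """Write HEARTBEAT.md into the agent's working directory for this wake."""
    now = now or datetime.now(timezone.utc)
    path = workdir / "HEARTBEAT.md"
    path.write_text(render_heartbeat(now), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Run once per wake, before the agent starts, from its working directory.
    print(write_heartbeat(Path.cwd()).read_text(encoding="utf-8"))
```

The static template then only has to say "read HEARTBEAT.md for the current date", and the live value comes from the file.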

4. Paperclip-native currenttime primitive. No — not today. The env vars auto-injected on every heartbeat are PAPERCLIPAGENTID, PAPERCLIPCOMPANYID, PAPERCLIPAPIURL, PAPERCLIPRUNID, plus optional wake-context vars (PAPERCLIPTASKID, PAPERCLIPWAKEREASON, PAPERCLIPWAKECOMMENTID, PAPERCLIPAPPROVALID, PAPERCLIPAPPROVALSTATUS, PAPERCLIPLINKEDISSUEIDS). There's no PAPERCLIPNOW or equivalent. Adding one has come up in adjacent RFCs but isn't on a current milestone — I'll open an issue to track it formally given how often it surfaces. In the meantime, date from the runtime is the canonical answer.

5. Framework-level vs per-company. Two layers:

- Time-pinned work (Daily Brief, weekly review, etc.) should live in a Routine, not in a heartbeat that has to deduce "is today the day?" A routine fires on cron in an explicit timezone, and the resulting issue's createdAt is the canonical trigger time. That alone removes most of the failure surface for recurring artifacts. See Heartbeats & Routines (/docs/projects-workflow/routines), and note especially the explicit-TZ cron handling and DST behaviour.

- Per-agent date discipline (any agent that references dates, not just routine-driven ones) belongs in a reusable skill. I'd factor your SOUL rule into a skill called something like time-discipline and attach it to every agent that produces dated output, so that it doesn't drift per-company.

For your concrete bug: if those Daily Briefs come from a routine, move the date out of the agent's head entirely — let the routine's trigger time (carried on the resulting issue's createdAt) be the source of truth, and use a deterministic slug as the idempotency key on creation. Duplicates go away regardless of what the LLM thinks the date is.

Flag back in a few days whether the SOUL rule sticks. If it doesn't, the run transcripts that violated it are the kind of data that drives a real primitive landing.

Aron Prins · 2026-05-20

Good update — and the side-findings are arguably more valuable than the original fix. Three quick reactions:

1. Mass-pause = single bulk SQL transaction. That's a useful forensic find. The identical-microsecond pausedat pattern across 15 routines is a near-definitive fingerprint of a manual UPDATE routines SET status='paused' WHERE …, not application code — application paths go through the routine lifecycle service and would write per-row at near-but-not-identical timestamps. Worth checking your shell history / psql logs for that window if you can — even without an activitylog trail, the bash side often holds the receipt. If you can recover even a partial query, it tells us whether this was a backup-tool side effect, a migration, or a human.

2. ppinjectmanifestenv trigger: please flag this upstream. I searched master for that trigger name and didn't find a hit, so it's either (a) under a different name now, (b) in a private repo path, or (c) a name from your local DB that diverged. Either way, the logic bug you described is real and shippable as a one-line fix: the trigger should test on type first, not on value emptiness, when deciding whether to re-inject. Can you grab the trigger body via \df+ ppinjectmanifestenv (or pg_get_triggerdef on the trigger) and file an issue on paperclipai/paperclip? That's the kind of fix that lands fast if there's a reproducer attached.
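As a reproducer-style illustration of "test on type first", here is the decision logic in Python. This is not the trigger's actual body (which nobody in this thread has posted); the field names type, value, and the literal 'secretref' follow the description above, and the ref field is made up.

```python
from typing import Any, Mapping, Optional


def buggy_should_reinject(entry: Optional[Mapping[str, Any]]) -> bool:
    """Behaviour as described in the thread: only looks at whether value is empty."""
    return entry is None or not entry.get("value")


def should_reinject(entry: Optional[Mapping[str, Any]]) -> bool:
    """Proposed order of checks: type first, value emptiness second."""
    if entry is None:
        return True  # nothing configured yet
    if entry.get("type") == "secretref":
        return False  # already migrated; a missing value field is expected
    return not entry.get("value")  # plain entry: inject only when empty


migrated = {"type": "secretref", "ref": "hypothetical-secret-id"}
print("buggy:", buggy_should_reinject(migrated), "fixed:", should_reinject(migrated))
```

The buggy predicate returns True for a migrated entry because a secret reference has no value field, which is why the migration gets silently reverted.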

3. CoS Daily Brief idempotency holding — good. Two follow-ups worth doing within the next week so the fix doesn't quietly rot:

- Add a regression check. A small skill or scheduled task that queries issues WHERE title LIKE 'CoS Daily Brief%' AND createdat::date = current_date GROUP BY date HAVING count(*) > 1 and alerts you. If the rule ever drifts, you'll know on day one instead of after another 8-hour burn.

- Resolve the SOUL rule's place. You'd flagged deferring the skill factoring until a second company hits the same need. The 60-agent MANIFESTAPIKEY migration is essentially that signal: you're now operating across multiple companies and would benefit from the rule being a portable skill, not a per-AGENTS.md instruction. Worth doing before the next ALE-shaped bug hits CompanyTwo.
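If the check runs agent-side rather than in SQL, the same grouping can be sketched in Python. The (title, created_at) pair shape and the function name duplicate_brief_days are assumptions for this sketch; unlike the SQL version, it buckets by calendar day in a fixed zone instead of the database's date cast.

```python
from collections import Counter
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def duplicate_brief_days(issues, prefix="CoS Daily Brief", tz_name="America/New_York"):
    """Return {local_date: count} for days with more than one matching issue.

    issues is an iterable of (title, created_at) pairs, where created_at is a
    timezone-aware datetime.
    """
    tz = ZoneInfo(tz_name)
    per_day = Counter(
        created_at.astimezone(tz).date().isoformat()
        for title, created_at in issues
        if title.startswith(prefix)
    )
    return {day: n for day, n in per_day.items() if n > 1}


sample = [
    ("CoS Daily Brief 2026-05-16", datetime(2026, 5, 16, 14, 0, tzinfo=timezone.utc)),
    ("CoS Daily Brief 2026-05-17", datetime(2026, 5, 17, 2, 0, tzinfo=timezone.utc)),
    ("Daily AI News", datetime(2026, 5, 16, 15, 0, tzinfo=timezone.utc)),
    ("CoS Daily Brief 2026-05-17", datetime(2026, 5, 17, 15, 0, tzinfo=timezone.utc)),
]
print(duplicate_brief_days(sample))
```

Note that the second sample brief is titled for the 17th but was created on the evening of the 16th local time, so it counts as a duplicate for the 16th; grouping on the timestamp rather than the title catches mis-dated briefs too.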

No need to flag back on Wave 1 unless something breaks — but if you do file the trigger bug, drop the issue # here so anyone hitting the same hygiene migration can find it.

NickyDigital · 2026-05-17

Huge thanks — this nailed it. Did the homework you suggested and confirmed your structural diagnosis on my fleet.

What I found

ALE has 18 Routines defined, all paused or archived. The relevant one, "Daily AI News pipeline", is paused (since 2026-05-14), cron 0 11 * * *, TZ set to UTC (not America/New_York), assignee = Content Pipeline Orchestrator (also paused). The other recurring-content routines (Saturday/Thursday/Tuesday Long-Form, Tool of the Day, Metrics Collection, etc.) are all paused too.

So my Daily Brief was supposed to be Routine-driven. When the routine + its assignee agent paused, my CoS started heartbeat-generating Daily Briefs as a workaround, with no cron anchor and no idempotency. That's where the 5-duplicates-in-8h came from. Date hallucination on top of missing scheduling anchor — exactly your call.

Fix path I'm planning

Set "Daily AI News pipeline" trigger TZ from UTC → America/New_York (DST handling).

Activate routine: paused → active.

Unpause Content Pipeline Orchestrator.

Add a CoS AGENTS.md rule explicitly disowning Daily Brief creation: it belongs to the routine + assignee, not CoS heartbeat.

Keep my date-discipline SOUL rule as the per-agent layer for everything that still references dates (comments, ad-hoc artifacts, non-routine work).
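Why the UTC → America/New_York change in the first step matters: a quick Python sketch of when a daily 11:00 fire actually happens in UTC under each setting. The helper name fire_time_utc is made up, and this only models a simple daily "0 <hour> * * *" schedule, not the scheduler's real logic.

```python
from datetime import date, datetime, time, timezone
from zoneinfo import ZoneInfo


def fire_time_utc(day: date, local_hour: int, tz_name: str) -> datetime:
    """UTC instant of a daily fire at local_hour:00 on the given day in tz_name."""
    local = datetime.combine(day, time(local_hour, 0), tzinfo=ZoneInfo(tz_name))
    return local.astimezone(timezone.utc)


for day in (date(2026, 1, 15), date(2026, 5, 18)):
    as_utc = fire_time_utc(day, 11, "UTC")
    as_et = fire_time_utc(day, 11, "America/New_York")
    print(day, "| TZ=UTC fires", as_utc.strftime("%H:%M"), "UTC",
          "| TZ=America/New_York fires", as_et.strftime("%H:%M"), "UTC")
```

With the trigger in UTC the fire lands at 06:00 or 07:00 New York time depending on the season; with the trigger in America/New_York it stays at 11:00 local and the UTC instant shifts by an hour across DST.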

NickyDigital · 2026-05-17

Thanks Aron! Let me check these out.