Date: May 19, 2026
Methodology: Every claim backed by grep evidence or
file quotes. Uncertainty stated explicitly.
Pre-step: ls -la scripts/ (Python
scripts only, 24 total):
audit-phantom-drafts.py backlog-dashboard.py backlog-to-zoho.py
classify-verticals.py ct-zoho-pipeline.py draft-backlog.py
draft-random-50.py enrich-backlog.py haberdasher.py
harvest.py historical-sweep.py law-firm-pipeline.py
lead-heartbeat.py lead-tracker.py mark-called.py
morning-brief.py recalculate-priority.py rollback.py
route-planner.py sales-brief-generator.py seed-planter.py
shepherd-to-zoho.py zoho-enrich-phones.py zoho-push.py
Data/config files excluded: .env.zoho,
backlog-zoho-state.json,
ct-city-county-map.json,
enhanced_shell_patterns.json,
enrichment-checkpoint.json,
equipment-mapping.json,
gardener-checkpoint.json, gardener-staging-*,
historical-staging-*, push-cooldown.json,
push-log.json, random-50-leads.json,
seed-planter-*.csv/json.
First 5 lines (scripts/harvest.py):
#!/usr/bin/env python3
"""
Lead Harvest - Daily territory sweep for CT SoS filings
Pulls recent business filings, scores them, adds to cumulative backlog.
Purpose: Daily CT SoS sweep and scoring pipeline
Invocation evidence:
$ grep -rn "harvest.py" scripts/ lib/ --include="*.py" | grep -v __pycache__
scripts/morning-brief.py: subprocess.run([sys.executable, "scripts/harvest.py", "--days", str(args.days)], check=True)
Invoked by: morning-brief.py (subprocess), operator
manually.
Imports from lib:
from lib.enrichment import load_backlog, save_backlog, get_backlog_path
from lib.scoring import score_nurture_lead, score_location, score_name, score_email_domain
from lib.patterns import proper_case_name, clean_phone, is_shell_lead
Classification: Active (core daily pipeline entry point)
First 5 lines (scripts/enrich-backlog.py):
#!/usr/bin/env python3
"""Enrich Backlog — layered enrichment pipeline for cumulative backlog leads.
Runs 4 enrichment layers on every lead in cumulative-backlog.json that
hasn't already been enriched:
Purpose: Multi-phase enrichment pipeline (domain, competitor, Brave, equipment)
Invocation evidence:
$ grep -rn "enrich-backlog.py" scripts/ lib/ --include="*.py" | grep -v __pycache__
scripts/morning-brief.py: subprocess.run([sys.executable, "scripts/enrich-backlog.py"], check=True)
Invoked by: morning-brief.py (subprocess), operator
manually.
Imports from lib:
from lib.enrichment import (
enrich_leads_parallel, load_backlog, save_backlog,
get_enrichment_phase, mark_enrichment_phase,
check_website_exists, domain_enrich_lead,
competitor_check_lead, brave_enrich_lead,
get_equipment_context
)
from lib.scoring import calculate_priority
Classification: Active (core enrichment pipeline)
First 5 lines (scripts/draft-backlog.py):
#!/usr/bin/env python3
"""Draft Backlog — LLM-generated outreach emails for all backlog leads.
Runs the full llm drafter (lib/drafter.py) against every lead in
cumulative-backlog.json that doesn't already have a draft, persisting
Purpose: LLM email drafting for backlog leads
Invocation evidence:
$ grep -rn "draft-backlog.py" scripts/ lib/ --include="*.py" | grep -v __pycache__
scripts/morning-brief.py: subprocess.run([sys.executable, "scripts/draft-backlog.py", "--limit", str(args.limit)], check=True)
Invoked by: morning-brief.py (subprocess), operator
manually.
Imports from lib:
from lib.drafter import draft_email, build_context_object, has_substance_context
from lib.enrichment import load_backlog, save_backlog
from lib.scoring import calculate_priority
Classification: Active (core drafting pipeline)
First 5 lines (scripts/morning-brief.py):
"""Morning Brief — daily prioritized contact list with LLM-generated emails.
Single command for OpenClaw: python3 morning-brief.py
Produces a priority-ranked list of leads ready for contact today, with
Purpose: Orchestrates full pipeline: harvest → enrich → draft → dashboard
Invocation evidence: No other script calls morning-brief.py. Operator invoked, likely via cron.
Imports from lib:
from lib.dashboard import generate_dashboard
from lib.enrichment import load_backlog, save_backlog
Classification: Active (main orchestrator, calls harvest/enrich/draft via subprocess)
First 5 lines (scripts/zoho-push.py):
#!/usr/bin/env python3
"""Zoho Push — manual-trigger push of drafted backlog leads to Zoho CRM.
Gatekept: only pushes leads that have LLM email drafts (draft_subject +
draft_body). Sends Email_Draft_Subject and Email_Draft_Body2 fields so
Purpose: Manual push of drafted leads to Zoho CRM with confirm gate
Invocation evidence: Operator invoked. No other script calls it.
Imports from lib:
from lib.zoho import Zoho
from lib.lifecycle import get_historical_context
from lib.enrichment import load_backlog, save_backlog
Classification: Active (primary Zoho push with cooldown guardrail)
First 5 lines (scripts/historical-sweep.py):
#!/usr/bin/env python3
"""Historical Sweep — CT SoS filings from historical date ranges.
Fetches business registrations from 45/60/75+ months ago in 1/3/7-day windows,
scores and filters them through the existing pipeline, and appends qualifying
Purpose: Bulk fetch of older filings for equipment refresh targeting
Invocation evidence: Operator invoked. No other script calls it.
Imports from lib:
from lib.enrichment import load_backlog, save_backlog, get_backlog_path
from lib.scoring import score_nurture_lead, score_location, score_name, score_email_domain
from lib.historical import is_historical, tag_source, milestone_age
Classification: Active (historical pipeline entry point)
| Script | First 5 lines (truncated) | Purpose | Classification |
|---|---|---|---|
| audit-phantom-drafts.py | #!/usr/bin/env python3 / """Audit phantom drafts... | Detects/clears phantom drafts from GLM-5.1 bug | Active utility |
| backlog-dashboard.py | #!/usr/bin/env python3 / """Backlog Dashboard Generator | Generates HTML dashboard from backlog | Active utility |
| backlog-to-zoho.py | #!/usr/bin/env python3 / """Backlog → Zoho CRM Push | Batch push with hat assignments | Active (may overlap zoho-push) |
| classify-verticals.py | #!/usr/bin/env python3 / """Classify Zoho leads into 9-vertical | LLM vertical classification for Zoho | Active utility |
| ct-zoho-pipeline.py | #!/usr/bin/env python3 / """CT Business Registration → Zoho CRM Pipeline | Direct CT→Zoho, no scoring | Active specialized |
| draft-random-50.py | #!/usr/bin/env python3 | Draft 50 random leads for testing | Active testing |
| haberdasher.py | #!/usr/bin/env python3 | Assigns NAICS-based hats | Vestigial (no caller found) |
| law-firm-pipeline.py | #!/usr/bin/env python3 | Specialized law firm pipeline | Active specialized |
| lead-heartbeat.py | #!/usr/bin/env python3 | Detects dormant leads emerging online | Active utility |
| lead-tracker.py | #!/usr/bin/env python3 | Lead lifecycle CLI | Active utility |
| mark-called.py | #!/usr/bin/env python3 | Marks leads called in Zoho | Active utility |
| recalculate-priority.py | #!/usr/bin/env python3 | One-shot priority recalculation | Active utility |
| rollback.py | #!/usr/bin/env python3 | Zoho push audit/undo | Active utility |
| route-planner.py | #!/usr/bin/env python3 | Geographic clustering for sales routes | Active utility |
| sales-brief-generator.py | #!/usr/bin/env python3 | Printable markdown sales briefs | Active utility |
| seed-planter.py | #!/usr/bin/env python3 | Template-based drafting (v2) | Vestigial (template system retired) |
| shepherd-to-zoho.py | #!/usr/bin/env python3 | Church/religious org pipeline | Active specialized |
| zoho-enrich-phones.py | #!/usr/bin/env python3 | Phone enrichment for existing Zoho leads | Active utility |

Pre-step: ls -la lib/ (14 Python
files):
backlog_dashboard.py brave_enrich.py config.py dashboard.py
drafter.py enrichment.py historical.py lifecycle.py
patterns.py scoring.py verticals.py webapp.py zoho.py __init__.py
First 5 lines (lib/scoring.py):
"""Unified scoring pipeline for the Gardener system.
Integrates the 100-point nurture scoring from ct-seed-planter.py with the
email domain scoring system that was documented in gardener.json but never
implemented in code.
Last 3 lines:
"readiness_weight": round(weight, 2),
"readiness_signals": signals,
}
Public functions:
def score_naics(naics_code, naics_desc, tiers=None):
def score_name(name, shell_patterns=None, naics_code=""):
def score_email_domain(email, cfg=None):
def score_nurture_lead(name, city, naics_raw, is_shell, filing_date, email, ...):
def calculate_readiness(lead, gardener_cfg=None):
def calculate_priority(lead, gardener_cfg=None):
Imported by: 15 scripts (audit-phantom-drafts, backlog-to-zoho, draft-backlog, enrich-backlog, haberdasher, harvest, historical-sweep, mark-called, morning-brief, recalculate-priority, rollback, route-planner, sales-brief-generator, seed-planter, shepherd-to-zoho) + drafter.py + enrichment.py.
Internal deps:
from lib.config import load_config,
from lib.patterns import is_pllc_fast_track, is_shell_lead
Status: Active — core scoring engine, most-imported module.
First 5 lines (lib/drafter.py):
"""LLM-powered email drafter with full context injection.
Takes every signal the Gardener collects about a lead (score breakdown, domain
tier, outreach window, timing signals, website status, agent clusters) and
feeds them into a structured Featherless prompt to generate personalized,
context-aware outreach that no template-based system can match.
Last 3 lines:
f"I help companies set up their office equipment. If any of your clients need copiers, "
f"printers, or document solutions, I'd be happy to help.\n\n{signature}"}
Public functions:
def build_context_object(lead):
def build_draft_prompt(ctx):
def build_historical_prompt(ctx):
def draft_email(lead, model=None, temperature=0.7):
def draft_batch(leads, model=None, temperature=0.7, max_concurrent=4):
def draft_and_attach(leads, model=None, temperature=0.7):
def why_this_now(lead, model=None):
def generate_agent_referral_email(agent_name, leads, model=None):
Imported by: draft-backlog.py, draft-random-50.py, historical-sweep.py, audit-phantom-drafts.py.
Internal deps:
from lib.config import load_config, get_template_route, get_llm_config,
from lib.scoring import calculate_priority,
from lib.enrichment import get_equipment_context
Status: Active — primary drafting module.
SURPRISE:
generate_agent_referral_email() is defined here but I could
not find any caller. why_this_now() likewise — grep found
no callers outside the module itself. These may be dead code within an
active module.
First 5 lines (lib/enrichment.py):
"""Enrichment pipeline for the Gardener system.
Provides free domain-based enrichment (extract domain from email, HEAD-check
website, scrape contact pages for phone numbers) and wraps the existing
Google Places enrichment.
Last 3 lines:
}
except Exception:
return {"phone": "", "website": "", "address": "", "google_match": False}
Public functions:
def extract_domain_from_email(email):
def check_website_exists(domain, timeout=3):
def scrape_contact_page_for_phone(domain, timeout=5):
def enrich_lead_from_domain(email, timeout=5):
def competitor_check(domain, timeout=5):
def enrich_leads_parallel(leads, max_workers=10, timeout=5, progress_callback=None):
def enrich_with_google_places(business_name, city):
Imported by: 17 scripts (essentially everything that touches the backlog).
Internal deps:
from lib.config import load_config,
from lib.brave_enrich import brave_search
Status: Active — core enrichment module.
NOTE: The function signatures differ from what
AGENTS.md documents. AGENTS.md lists
enrich_leads_parallel(leads, phase, config=None, max_workers=8, timeout=120)
but the actual code has
enrich_leads_parallel(leads, max_workers=10, timeout=5, progress_callback=None).
The docs are stale relative to the code.
First 5 lines (lib/brave_enrich.py):
"""Brave Search API enrichment for the Gardener pipeline.
Surfaces business listings, contact info, descriptions, and county data
from Brave Search results. Reuses the same API patterns proven in
route-planner.py (same endpoint, same auth header, same county cache).
Last 3 lines:
"county": county,
"brave_results_count": len(biz_results),
}
Public functions: brave_search(),
brave_enrich_lead(), extract_business_info()
(not verified — I did not grep for def lines in this module, stating
uncertainty).
Imported by: lib/enrichment.py
only.
Status: Active
First 5 lines (lib/lifecycle.py):
"""Lifecycle tracking and relationship intelligence for the Gardener system.
New v2 features:
- Outreach windows: When to contact based on days since filing
- Formation timing signals: Tax season, lease cycles, day-of-week patterns
Last 3 lines:
"needs_follow_up": needs_followup,
"total": sum(stages.values()),
}
SURPRISE: I could not find any caller for
get_agent_clusters(). Grep returned no matches. This
function appears dead within an otherwise active module.
Imported by: backlog-to-zoho.py, ct-zoho-pipeline.py, mark-called.py, seed-planter.py, shepherd-to-zoho.py, zoho-push.py, drafter.py.
Status: Partial — outreach windows and formation timing are active; agent clustering appears dead.
First 5 lines (lib/historical.py):
"""Historical lead routing — milestone calculation and pipeline integration.
Used by the historical sweep pipeline for leads filed 90+ days ago.
The primary talk-track is equipment lifecycle + lease-expiry timing.
The milestone (years in business) is the door opener. The rental offer
Last 3 lines:
ms = milestone_age(fd)
return ms.get("milestone") == target_years
Imported by: historical-sweep.py, draft-backlog.py.
Status: Active
First 5 lines (lib/patterns.py):
"""Name pattern detection helpers for the Gardener scoring pipeline.
Extracted from ct-seed-planter.py. Handles PLLC fast-track detection,
name case fixing, and shell detection.
"""
Last 3 lines:
shell_patterns = load_shell_patterns()
from .scoring import score_name
return score_name(name, shell_patterns) <= -25
Imported by: harvest.py, historical-sweep.py, scoring.py.
Status: Active
First 5 lines (lib/verticals.py):
"""Vertical taxonomy for Zoho lead classification — v2.
Two-field, two-tier classification system:
Business_Vertical — 9 top-level verticals
Vertical_Segment — ~50 granular sub-category segments
Imported by: classify-verticals.py only.
Status: Active
First 5 lines (lib/zoho.py):
"""Unified Zoho CRM integration module.
One implementation used by all Gardener scripts. Eliminates copy-pasted
auth logic from 6 files.
Imported by: 7 scripts (backlog-to-zoho, ct-zoho-pipeline, law-firm-pipeline, mark-called, shepherd-to-zoho, zoho-enrich-phones, zoho-push).
Status: Active
First 5 lines (lib/config.py):
"""Unified configuration loader for the Gardener system.
Loads gardener.json, enhanced_shell_patterns.json, and ct-city-county-map.json
from the scripts/ directory. All paths resolve via SCRIPT_ROOT which is the
directory containing this lib/ module.
Last 3 lines:
if code and code in em.get("codes", {}):
return em["codes"][code]
return em.get("fallback", {})
Public functions: load_config(),
load_tiers(), load_shell_patterns(),
load_equipment_mapping(), get_llm_config(),
get_template_route(),
get_equipment_for_naics(),
update_pipeline_status(), get_known_cities(),
get_pllc_fast_track(),
get_contact_info_bonus(),
get_formation_timing(),
get_lifecycle_config(),
get_scoring_pipeline(), get_recency_bonus(),
get_agent_clustering(),
get_formation_signals(),
get_daily_territory_scan(),
get_location_quality()
Imported by: Every lib module.
Status: Active — but
get_template_route() is dead (see section 2.2), and
get_agent_clustering() likely dead (no callers found for
agent clustering).
Dashboard and web modules (backlog_dashboard.py, dashboard.py, webapp.py): all Active. backlog_dashboard.py generates the Gentelella-themed dashboard; dashboard.py generates the neo-brutalist morning brief; webapp.py serves them via Flask. webapp.py has no importers (standalone entry point).
Step 1: Empirical field list — All 99 unique keys
found across 2,099 leads in cumulative-backlog.json:
accountnumber, agent_address, agent_name, annual_report_due_date,
appearance_count, began_transacting_in_ct, billing_unit, billingcity,
billingcountry, billingpostalcode, billingstate, billingstreet,
brave_descriptions, brave_phone, brave_results_count, brave_summary,
brave_website, business_email_address, business_name_in_state_country,
business_type, call_count, called, category_survey_email_address,
citizenship, city, competitor_brands_found, competitor_displacement,
competitor_summary, country_formation, county, create_dt,
date_of_organization_meeting, date_registration, domain, domain_phone,
draft_body, draft_subject, email, enrichment_date, enrichment_method,
entity_type, equipment, filing_date, first_seen, followup_reason,
formation_place, hat_assignment, hat_name, historical_needs_followup,
id, is_shell, last_seen, mailing_address, mailing_jurisdiction,
mailing_jurisdiction_1, mailing_jurisdiction_2, mailing_jurisdiction_3,
mailing_jurisdiction_4, mailing_jurisdiction_address,
mailing_jurisdiction_country, minority_owned_organization, naics,
naics_code, naics_score, name, name_score, needs_redraft,
office_in_jurisdiction_country, office_jurisdiction, office_jurisdiction_1,
office_jurisdiction_2, office_jurisdiction_3, office_jurisdiction_4,
office_jurisdiction_address, org_owned_by_person_s_with,
organization_is_lgbtqi_owned, original_push_date, outreach, phone,
priority, pushed_to_zoho, readiness_signals, readiness_weight,
redraft_reason, score, score_history, source, state,
state_or_territory_formation, status, sub_status, tier,
total_authorized_shares, vertical, veteran_owned_organization,
website_exists, website_url, woman_owned_organization, zoho_id
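The empirical key list above can be reproduced in a few lines. This is a minimal sketch, assuming cumulative-backlog.json deserializes to a list of lead dicts (the on-disk layout is not shown in this audit); collect_field_counts and load_leads are hypothetical helpers, not functions from lib/.

```python
import json
from collections import Counter
from pathlib import Path


def collect_field_counts(leads):
    """Count how many leads carry each key (expects an iterable of dicts)."""
    counts = Counter()
    for lead in leads:
        counts.update(lead.keys())
    return counts


def load_leads(path):
    """Load the backlog file; the list-of-dicts layout is an assumption."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Tiny in-memory sample standing in for the real backlog.
sample = [
    {"id": "1", "name": "Acme LLC", "phone": "860-555-0100"},
    {"id": "2", "name": "Birch PLLC", "brave_phone": "203-555-0101"},
]
counts = collect_field_counts(sample)
unique_keys = sorted(counts)  # alphabetical list of every key seen
```

Keeping the per-key counts, rather than only the set of keys, also shows which fields appear on only a handful of leads, which is a useful first hint when looking for write-only or dead fields.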
Step 2: Field-by-field classification (key fields only — full 99-field audit would run 30+ pages):
| Field | Write sites (grep) | Read sites (grep) | Classification |
|---|---|---|---|
| id | harvest.py, enrichment.py | backlog-to-zoho.py, enrichment.py, zoho.py, many more | Live |
| name | harvest.py, historical-sweep.py | scoring.py, drafter.py, many more | Live |
| score | harvest.py, historical-sweep.py, scoring.py | drafter.py, dashboard.py, backlog_dashboard.py | Live |
| naics_score | scoring.py | backlog_dashboard.py | Live |
| name_score | scoring.py | backlog_dashboard.py | Live |
| tier | scoring.py, harvest.py | backlog_dashboard.py, seed-planter.py | Live |
| is_shell | scoring.py, harvest.py | draft-backlog.py, enrich-backlog.py | Live |
| priority | scoring.py (calculate_priority) | backlog_dashboard.py, draft-backlog.py | Live |
| readiness_weight | scoring.py | backlog_dashboard.py | Live |
| readiness_signals | scoring.py | backlog_dashboard.py | Live |
| draft_subject | drafter.py | zoho-push.py, backlog_dashboard.py, audit-phantom-drafts.py | Live |
| draft_body | drafter.py | zoho-push.py, backlog_dashboard.py, audit-phantom-drafts.py | Live |
| brave_summary | brave_enrich.py (via enrichment.py) | drafter.py (build_context_object) | Live |
| brave_phone | brave_enrich.py (via enrichment.py) | scoring.py (calculate_readiness) | Live |
| brave_website | brave_enrich.py | EVIDENCE NOT AVAILABLE (could not confirm active reader) | Likely live |
| brave_descriptions | brave_enrich.py | EVIDENCE NOT AVAILABLE | Likely write-only |
| brave_results_count | brave_enrich.py | EVIDENCE NOT AVAILABLE | Likely write-only |
| equipment | enrichment.py (get_equipment_context) | drafter.py | Live |
| domain | enrichment.py | scoring.py (score_email_domain), drafter.py | Live |
| domain_phone | enrichment.py | scoring.py (calculate_readiness) | Live |
| website_exists | enrichment.py | drafter.py (build_context_object) | Live |
| website_url | enrichment.py | backlog_dashboard.py | Live |
| phone | harvest.py, enrichment.py | scoring.py (calculate_readiness), drafter.py | Live |
| email | harvest.py, enrichment.py | scoring.py, drafter.py, zoho-push.py | Live |
| city | harvest.py | scoring.py (score_location), drafter.py | Live |
| county | brave_enrich.py | backlog_dashboard.py, route-planner.py | Live |
| pushed_to_zoho | zoho-push.py, backlog-to-zoho.py | backlog_dashboard.py, zoho-push.py | Live |
| zoho_id | zoho-push.py, backlog-to-zoho.py | zoho-push.py, mark-called.py | Live |
| source | draft-backlog.py (historical tag) | drafter.py (prompt routing) | Live |
| hat_assignment | haberdasher.py | backlog-to-zoho.py, backlog_dashboard.py, zoho-push.py | Write-mostly: written by vestigial haberdasher, still read by Zoho push |
| hat_name | haberdasher.py | backlog-to-zoho.py | Write-mostly: same as hat_assignment |
| needs_redraft | audit-phantom-drafts.py | draft-backlog.py (skip check) | Live (phantom audit) |
| redraft_reason | audit-phantom-drafts.py | draft-backlog.py | Live (phantom audit) |
| called | mark-called.py | backlog_dashboard.py | Live |
| call_count | mark-called.py | backlog_dashboard.py | Live |
| vertical | classify-verticals.py | backlog-to-zoho.py | Live |
| competitor_displacement | enrichment.py | scoring.py (calculate_readiness) | Live |
| competitor_summary | enrichment.py | backlog_dashboard.py | Live |
| competitor_brands_found | enrichment.py | EVIDENCE NOT AVAILABLE | Likely write-only |
| appearance_count | harvest.py | backlog_dashboard.py | Live |
| score_history | harvest.py | EVIDENCE NOT AVAILABLE | Likely write-only |
| enrichment_date | enrichment.py | EVIDENCE NOT AVAILABLE | Likely write-only |
| enrichment_method | enrichment.py | EVIDENCE NOT AVAILABLE | Likely write-only |
| outreach | lifecycle.py | EVIDENCE NOT AVAILABLE | Likely write-only |
| historical_needs_followup | draft-backlog.py | EVIDENCE NOT AVAILABLE | Likely write-only |
| followup_reason | EVIDENCE NOT AVAILABLE | EVIDENCE NOT AVAILABLE | Dead: could not confirm any reader or writer |
| sub_status | EVIDENCE NOT AVAILABLE | EVIDENCE NOT AVAILABLE | Dead: could not confirm any reader or writer |
| category_survey_email_address | CT SoS data | EVIDENCE NOT AVAILABLE | Dead: raw data field, never used |
| total_authorized_shares | CT SoS data | EVIDENCE NOT AVAILABLE | Dead: raw data field, never used |
| date_of_organization_meeting | CT SoS data | EVIDENCE NOT AVAILABLE | Dead: raw data field, never used |
| country_formation | CT SoS data | EVIDENCE NOT AVAILABLE | Dead: raw data field, never used |
| original_push_date | EVIDENCE NOT AVAILABLE | EVIDENCE NOT AVAILABLE | Dead: could not confirm any reader or writer |

Step 3: Fields in code but not in first lead’s keys (added later in pipeline):
brave_summary, brave_phone, brave_website, brave_descriptions, brave_results_count,
competitor_brands_found, competitor_displacement, competitor_summary, county,
domain, domain_phone, draft_body, draft_subject, email, enrichment_date,
enrichment_method, equipment, filing_date, hat_assignment, hat_name,
historical_needs_followup, needs_redraft, outreach, phone, priority,
pushed_to_zoho, readiness_signals, readiness_weight, redraft_reason,
score_history, source, vertical, website_exists, website_url, zoho_id,
city, entity_type, agent_name, agent_address
Line count: 1,976 lines
(wc -l scripts/gardener.json).
Top-level keys (24 total):
| Key | Approx lines | Purpose | Read by | Status |
|---|---|---|---|---|
| _meta | 1-17 | Config metadata | Nobody specifically | Live (loaded as part of full config) |
| version | ~18 | Config version | Nobody | Dead |
| pllc_fast_track | 18-132 | PLLC detection rules | lib/scoring.py:197, lib/patterns.py:105 | Live |
| scoring_pipeline | 133-147 | Scoring pipeline config | Nobody; get_scoring_pipeline() exists in config.py but I found no callers | Dead or unused |
| recency_bonus | 148-158 | Recency scoring windows | lib/config.py:169 (get_recency_bonus) | Live |
| location_quality | 159-249 | Location scoring tiers | lib/config.py:105 | Live |
| contact_info_bonus | 159-249 | Email domain scoring | lib/config.py:124,130,137, lib/enrichment.py:33 | Live |
| tiers | 250-1282 | NAICS tier scoring (197 codes) | lib/scoring.py:40, scripts/seed-planter.py:91,636, lib/config.py:200 | Live, but 197 template_route sub-fields are dead |
| keyword_fallback | 1283-1382 | Keyword-based scoring fallback | lib/scoring.py:47 | Live |
| name_penalty_patterns | 1383-1506 | Shell company detection | lib/scoring.py:72,105,260, scripts/harvest.py:144 | Live |
| name_bonus_patterns | 1507-1592 | Professional name bonuses | lib/scoring.py:105,260 | Live |
| scoring_rules | 1593-1603 | Scoring thresholds | lib/scoring.py:102,126 | Live |
| formation_signals | 1604-1621 | Formation timing signals | lib/config.py:184 (get_formation_signals); I found no callers of this function | Dead or unused |
| daily_territory_scan | 1622-1659 | Daily scan config | lib/config.py:190, scripts/harvest.py:210, scripts/historical-sweep.py:321, scripts/seed-planter.py:635 | Live |
| lifecycle_tracking | 1660-1692 | Lifecycle tracking | Nobody; I found no callers | Dead |
| route_planner | 1693-1739 | Route planning config | Nobody; I found no callers | Dead |
| known_cities | 1740-1833 | CT city list | lib/config.py:65 (get_known_cities) | Live |
| branding | 1834-1841 | Email signature | lib/drafter.py:35 | Live |
| push_guardrails | 1842-1845 | Zoho push limits | scripts/zoho-push.py:255 | Live |
| lifecycle | 1846-1890 | Outreach window config | lib/config.py:145 | Live |
| formation_timing | 1891-1919 | Formation timing context | lib/config.py:151, lib/drafter.py:167 | Live |
| llm | ~1920 | LLM model config | lib/config.py:163, lib/drafter.py | Live |
| agent_clustering | ~1940 | Agent clustering config | lib/config.py:157 (get_agent_clustering); I found no callers of get_agent_clusters | Dead |
| brave_search | ~1950 | Brave API config | lib/brave_enrich.py:39 | Live |

SECURITY ISSUE: llm.api_key and
brave_search.api_key are stored in plaintext in
gardener.json. These should be environment variables.
Dead nested fields: All 197
template_route entries within tiers are effectively dead.
They are read only by get_template_route(), which is imported by
drafter.py and seed-planter.py, and the template drafting system is
retired. The value is still passed through
build_context_object() at drafter.py:119, but I could not confirm
that any prompt text uses it.
Featherless API: Called in
lib/drafter.py:_call_featherless(). Model: DeepSeek-V3.1
for drafting (config llm.models.draft). Auth: API key from
llm.api_key field in gardener.json ([REDACTED — value
present in config but not reproduced here]). Live.
Brave Search API: Called in
lib/brave_enrich.py:brave_search(). Auth: API key from
brave_search.api_key field in gardener.json ([REDACTED]).
Live.
CT SoS Data API (Socrata): Called in
scripts/harvest.py and
scripts/historical-sweep.py via urllib.request
to data.ct.gov. Auth: Public API (no key needed, uses
X-App-Token: DEMO_KEY). Live.
Zoho CRM v8: Called in lib/zoho.py.
Auth: OAuth2 via credentials in scripts/.env.zoho
([REDACTED]). Live.
N8N Webhook: Found at
scripts/seed-planter.py:50:
WEBHOOK_URL = "https://workflows.residentliberal.com/webhook/jXjTXfBO3qsMMgtH/webhook/qualify-lead"
This is a hardcoded URL in the vestigial seed-planter.py. Dead: seed-planter is retired.
Google Places: Referenced in
lib/enrichment.py:enrich_with_google_places() but I could
not determine if this is actively called from any pipeline entry point.
Unknown.
A lead enters via scripts/harvest.py pulling CT SoS
filings. Harvest scores and merges into
cumulative-backlog.json.
scripts/enrich-backlog.py runs 4 enrichment layers
(domain/phone, competitor, Brave, equipment) in parallel.
scripts/draft-backlog.py calls lib/drafter.py
which builds context via build_context_object() then calls
Featherless API for LLM drafts. The operator reviews via
scripts/backlog-dashboard.py HTML.
scripts/zoho-push.py pushes with a confirm gate and
cooldown guardrail.
The happy path is: harvest → enrich → draft → review → push.
scripts/morning-brief.py orchestrates harvest + enrich +
draft in one run via subprocess calls.
The historical variant enters via
scripts/historical-sweep.py which fetches older filings,
then feeds the same enrich → draft pipeline but with historical
prompts.
The direct variant is scripts/ct-zoho-pipeline.py which
goes CT SoS → Zoho directly, skipping scoring, enrichment, and drafting.
Decision points: shell detection (score_name() <= -25
→ excluded), draft existence check (idempotent), Zoho confirm gate
(operator must confirm), cooldown guardrail (8h between pushes).
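The decision points can be expressed as small predicates. This is a sketch only: the -25 threshold, the 8-hour cooldown, and the draft_subject/draft_body requirement come from this audit, while the function names and the way the last push time is passed in are assumptions (the format of push-cooldown.json was not inspected).

```python
from datetime import datetime, timedelta

COOLDOWN = timedelta(hours=8)  # cooldown window stated in the audit
SHELL_THRESHOLD = -25          # score_name() <= -25 marks a shell


def is_excluded_as_shell(name_score):
    """Shell gate: leads at or below the threshold are excluded."""
    return name_score <= SHELL_THRESHOLD


def cooldown_remaining(last_push, now):
    """Time left before the next push is allowed (zero if none)."""
    if last_push is None:
        return timedelta(0)
    return max((last_push + COOLDOWN) - now, timedelta(0))


def may_push(lead, last_push, now, confirmed):
    """All three push gates: draft exists, operator confirmed, cooldown over."""
    has_draft = bool(lead.get("draft_subject")) and bool(lead.get("draft_body"))
    return has_draft and confirmed and cooldown_remaining(last_push, now) == timedelta(0)


# Example values for illustration.
now = datetime(2026, 5, 19, 12, 0)
drafted = {"draft_subject": "Welcome", "draft_body": "Hello"}
three_hours_ago = now - timedelta(hours=3)
```

Passing `now` in explicitly, instead of calling `datetime.now()` inside the helpers, keeps the gates deterministic and easy to test.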
Scoring engine (lib/scoring.py): The
100-point system with NAICS tiers, PLLC fast-track, recency, location,
contact, and name scoring is well-structured and widely used. The recent
calculate_priority() addition combining score with
readiness weight (phone +0.50, custom email +0.15, etc.) is a clean
separation of quality vs. reachability.
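To make the quality versus reachability split concrete, here is a hedged sketch. Only the two weights (phone +0.50, custom email +0.15), the three phone fields, and the readiness_weight/readiness_signals output keys come from this audit. The free-mail list, the function names, and above all the way score and weight are combined are assumptions; the real calculate_priority() formula was not reproduced here.

```python
# Illustrative list; the real definition of "custom email" is not shown in the audit.
FREE_MAIL_DOMAINS = {"gmail.com", "yahoo.com", "outlook.com", "hotmail.com"}

# The two weights named in the audit; the remaining signals are omitted.
READINESS_WEIGHTS = {"phone": 0.50, "custom_email": 0.15}


def readiness_sketch(lead):
    """Sum reachability signals into a weight (partial, illustrative)."""
    signals = []
    weight = 0.0
    if any(lead.get(key) for key in ("phone", "domain_phone", "brave_phone")):
        weight += READINESS_WEIGHTS["phone"]
        signals.append("phone")
    email = (lead.get("email") or "").lower()
    if "@" in email and email.rsplit("@", 1)[1] not in FREE_MAIL_DOMAINS:
        weight += READINESS_WEIGHTS["custom_email"]
        signals.append("custom_email")
    return {"readiness_weight": round(weight, 2), "readiness_signals": signals}


def priority_sketch(lead):
    """ASSUMED combination: quality score scaled by (1 + readiness weight)."""
    readiness = readiness_sketch(lead)
    return round(lead.get("score", 0) * (1 + readiness["readiness_weight"]), 2)


example = {"score": 60, "brave_phone": "203-555-0101", "email": "owner@acmelaw.example"}
example_readiness = readiness_sketch(example)
example_priority = priority_sketch(example)
```

Whatever the real formula is, keeping the score untouched and layering reachability on top means a lead's quality can be audited independently of whether it is currently contactable.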
Substance injection
(lib/drafter.py:build_context_object()): The recent
addition of brave_summary,
equipment_talk_track, and
equipment_typical_volume into LLM prompts is a meaningful
improvement. The has_substance_context() check allows
conditional prompt construction.
Parallel enrichment
(lib/enrichment.py:enrich_leads_parallel()): The parallel
processing with per-lead timeouts works. Checkpointing after every 10
leads per phase provides crash recovery.
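The parallel-plus-checkpoint pattern can be sketched with the standard library. This is not the code in lib/enrichment.py: the enrich_one and save_checkpoint callables, the enrichment_error key, and the function name are hypothetical, and the real enrich_leads_parallel() has a different signature. Only the defaults (10 workers, 5-second timeout) and the every-10-leads cadence are taken from the audit.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

CHECKPOINT_EVERY = 10  # cadence described in the audit


def enrich_parallel_sketch(leads, enrich_one, save_checkpoint, max_workers=10, timeout=5):
    """Enrich leads concurrently, saving progress every CHECKPOINT_EVERY leads."""
    done = 0
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(enrich_one, lead, timeout): lead for lead in leads}
        for future in as_completed(futures):
            lead = futures[future]
            try:
                lead.update(future.result())
            except Exception as exc:  # one failing lead should not stop the batch
                lead["enrichment_error"] = str(exc)  # hypothetical field name
            done += 1
            if done % CHECKPOINT_EVERY == 0:
                save_checkpoint(leads)
    if done % CHECKPOINT_EVERY:
        save_checkpoint(leads)  # final partial batch
    return leads


# Demo with a fake enricher and an in-memory checkpoint log.
checkpoint_calls = []
demo_leads = [{"id": str(i)} for i in range(25)]
enrich_parallel_sketch(
    demo_leads,
    enrich_one=lambda lead, timeout: {"website_exists": True},
    save_checkpoint=lambda leads: checkpoint_calls.append(len(leads)),
)
```

Results are merged into the lead dicts on the main thread only, so the worker threads never mutate shared state.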
Single source of truth: The
cumulative-backlog.json pattern is simple and works,
despite the lack of locking.
Historical pipeline (lib/historical.py
+ scripts/historical-sweep.py): The milestone math
(3/5/7/10 year ±90 days) is clean and the prompt routing between
new-business and historical is well-structured.
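The milestone window can be illustrated as follows. This is a sketch of the arithmetic described above (3/5/7/10-year anniversaries, ±90 days), not the actual milestone_age() implementation: nearest_milestone, anniversary, and the days_from_anniversary key are hypothetical names. Only the "milestone" key matches what the quoted lines of lib/historical.py read.

```python
from datetime import date

MILESTONE_YEARS = (3, 5, 7, 10)
WINDOW_DAYS = 90


def anniversary(filing_date, years):
    """Return the filing anniversary, mapping Feb 29 to Feb 28 when needed."""
    try:
        return filing_date.replace(year=filing_date.year + years)
    except ValueError:  # Feb 29 filing, non-leap target year
        return filing_date.replace(year=filing_date.year + years, day=28)


def nearest_milestone(filing_date, today):
    """Find the milestone whose anniversary is within WINDOW_DAYS of today."""
    for years in MILESTONE_YEARS:
        offset = (today - anniversary(filing_date, years)).days
        if abs(offset) <= WINDOW_DAYS:
            return {"milestone": years, "days_from_anniversary": offset}
    return {"milestone": None, "days_from_anniversary": None}


in_window = nearest_milestone(date(2021, 5, 1), date(2026, 5, 19))
out_of_window = nearest_milestone(date(2022, 1, 1), date(2026, 5, 19))
```

Because the milestones are at least two years apart and the window is 90 days either side, at most one milestone can match a given date, so returning the first hit is safe.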
Haberdashery system
(scripts/haberdasher.py): No active script calls this.
hat_assignment and hat_name are written only
by haberdasher.py. However, they are still read by
backlog-to-zoho.py (8 references) and
zoho-push.py (1 reference) and
backlog_dashboard.py (1 reference). The Zoho push writes
Hat_Assignment to CRM. Not fully dead —
the Zoho push still sends hat data. This is a zombie: the writer is
retired but the reader is still active.
Template routing (template_route in
config): 197 NAICS codes have template_route fields.
lib/config.py:get_template_route() exists and is imported
by lib/drafter.py:18 and called at
lib/drafter.py:119. The value is passed through to
build_context_object() as
ctx["template_route"] and then included in the context dict
at drafter.py:502. Not fully dead — still flows through
the drafter, but I could not determine if any prompt text actually uses
it. The template drafter (seed-planter.py) is retired, so
the field serves no purpose in the LLM path.
get_agent_clusters() in
lib/lifecycle.py: Function is defined but grep found zero
callers. Dead code within an active module.
why_this_now() in
lib/drafter.py: Function is defined (line 528) but grep
found zero callers outside the module. Dead code within an
active module.
generate_agent_referral_email() in
lib/drafter.py: Function is defined (line 559) but grep
found zero callers. Dead code within an active
module.
N8N webhook in
scripts/seed-planter.py:50: Hardcoded URL
https://workflows.residentliberal.com/webhook/....
Dead — seed-planter is retired, and this points to an
external service that may or may not still exist.
Config sections with no callers:
lifecycle_tracking, route_planner,
scoring_pipeline, agent_clustering,
formation_signals, version. Three of these
(scoring_pipeline, agent_clustering, formation_signals) have accessor
functions in config.py, but the accessors themselves have no callers.
I found no reader at all for the other three.
Multiple Zoho push paths: Four scripts push to Zoho
with different logic:
- zoho-push.py: individual, confirm gate, cooldown, historical fields
- backlog-to-zoho.py: batch, hat assignments, rollback logging
- ct-zoho-pipeline.py: direct, no scoring/enrichment
- shepherd-to-zoho.py: churches/religious only

These are not simple wrappers: each has its own field mapping and push logic. The Zoho field mapping is duplicated across all four.
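One way to remove the duplication would be a single mapping table that every push script imports. The sketch below is hypothetical: ZOHO_FIELD_MAP and build_zoho_payload do not exist in the codebase, and the table lists only the three Zoho field names that appear in this audit (Email_Draft_Subject, Email_Draft_Body2, Hat_Assignment). The pairing of each with a backlog key is inferred from the names.

```python
# Backlog key -> Zoho field. Only Zoho field names mentioned in the audit are
# listed; a real table would carry the full mapping.
ZOHO_FIELD_MAP = {
    "draft_subject": "Email_Draft_Subject",
    "draft_body": "Email_Draft_Body2",
    "hat_assignment": "Hat_Assignment",
}


def build_zoho_payload(lead, extra_fields=None):
    """Translate a backlog lead into Zoho fields, skipping empty values."""
    payload = {
        zoho_field: lead[key]
        for key, zoho_field in ZOHO_FIELD_MAP.items()
        if lead.get(key) not in (None, "")
    }
    # Script-specific fields (for example historical context) layer on top.
    payload.update(extra_fields or {})
    return payload


example_payload = build_zoho_payload(
    {"draft_subject": "Hi", "draft_body": "Body text", "hat_assignment": ""}
)
```

Each script would keep its own gating logic (confirm gate, batching, vertical filter) and pass anything unusual through extra_fields, so the shared table stays the only place where field names are spelled out.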
score_nurture_lead() parameter interface vs
calculate_priority() lead dict:
score_nurture_lead() takes individual parameters (name,
city, naics_raw, is_shell, filing_date, email…) while
calculate_priority() takes a lead dict. This is a genuine
interface inconsistency — score_nurture_lead is the old
API, calculate_priority is the new one.
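A thin adapter is one possible way to bridge the two styles without rewriting the scorer. The sketch below is an assumption-laden illustration: score_lead_dict is a hypothetical name, fake_scorer stands in for score_nurture_lead() (whose full parameter list is elided in this audit), and the mapping of the naics key to naics_raw is a guess.

```python
def score_lead_dict(lead, scorer):
    """Call a parameter-style scorer with values pulled from a lead dict."""
    return scorer(
        name=lead.get("name", ""),
        city=lead.get("city", ""),
        naics_raw=lead.get("naics", ""),  # key-to-parameter mapping is assumed
        is_shell=bool(lead.get("is_shell", False)),
        filing_date=lead.get("filing_date", ""),
        email=lead.get("email", ""),
    )


def fake_scorer(name, city, naics_raw, is_shell, filing_date, email):
    """Stand-in for score_nurture_lead(); the real scoring rules are not shown."""
    if is_shell:
        return 0
    return 10 + (5 if email else 0)
```

With an adapter like this, callers can hold a lead dict everywhere and the unpacking lives in exactly one place.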
enrich_leads_parallel() doc mismatch:
AGENTS.md documents this as
enrich_leads_parallel(leads, phase, config=None, max_workers=8, timeout=120)
but the actual signature is
enrich_leads_parallel(leads, max_workers=10, timeout=5, progress_callback=None).
The phase parameter doesn’t exist in the actual code.
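Drift like this can be caught mechanically with inspect.signature from the standard library. In the sketch below, the stub reproduces the signature this audit found in the code, and the documented dict reproduces what AGENTS.md claims; signature_drift is a hypothetical helper, not something in the repo.

```python
import inspect

REQUIRED = inspect.Parameter.empty  # marker for "no default"


def signature_drift(func, documented):
    """Compare documented {name: default} pairs against the live signature."""
    actual = {
        name: param.default
        for name, param in inspect.signature(func).parameters.items()
    }
    shared = set(documented) & set(actual)
    return {
        "missing_in_code": sorted(set(documented) - set(actual)),
        "undocumented": sorted(set(actual) - set(documented)),
        "default_mismatch": sorted(n for n in shared if documented[n] != actual[n]),
    }


# Stub carrying the signature the audit found in lib/enrichment.py.
def enrich_leads_parallel(leads, max_workers=10, timeout=5, progress_callback=None):
    return leads


# Signature as documented in AGENTS.md, per the audit.
documented_in_agents_md = {
    "leads": REQUIRED,
    "phase": REQUIRED,
    "config": None,
    "max_workers": 8,
    "timeout": 120,
}
drift = signature_drift(enrich_leads_parallel, documented_in_agents_md)
```

Run against the real function in a test, a check of this kind would fail as soon as the docs and the code disagree again.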
morning-brief.py is the main
orchestrator: The name suggests a report, but it actually runs
the full pipeline (harvest → enrich → draft) via subprocess calls. A new
operator would not guess this is the primary entry point.
seed-planter.py sounds active but is
retired: The name doesn’t indicate it’s vestigial. It’s also
34KB — the largest script — which makes the codebase feel bigger than
its active portion.
Phone field proliferation: phone (from
CT SoS), domain_phone (from website scraping),
brave_phone (from Brave Search). The
calculate_readiness() function checks all three, but
there’s no canonical phone field or deduplication logic.
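A canonical phone value could be derived from the three fields with a fixed precedence and a normalization step. The sketch below is one possible shape, not existing code: the precedence order, the US-only 10-digit rule, and the function names are all assumptions. The second return value corresponds to the idea of recording where a number came from.

```python
import re
from typing import Any, Dict, List, Optional, Tuple

# Assumed precedence; the source does not say which field is most trustworthy.
PHONE_FIELDS = ("phone", "domain_phone", "brave_phone")


def normalize_phone(raw: Optional[str]) -> Optional[str]:
    """Reduce a US phone number to 10 digits, or None if it does not look valid."""
    if not raw:
        return None
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # strip the country code
    return digits if len(digits) == 10 else None


def canonical_phone(lead: Dict[str, Any]) -> Tuple[Optional[str], List[str]]:
    """Pick the first valid number by precedence and list every field that agrees with it."""
    normalized = {field: normalize_phone(lead.get(field)) for field in PHONE_FIELDS}
    best = next((number for number in normalized.values() if number), None)
    sources = [field for field, number in normalized.items() if number and number == best]
    return best, sources


lead = {"phone": "", "domain_phone": "(860) 555-0100", "brave_phone": "1-860-555-0100"}
print(canonical_phone(lead))
```

Two sources that normalize to the same digits are treated as one number with two sources, which is the deduplication the current fields lack.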
lib/enrichment.py does too much: It
handles domain enrichment, competitor checking, Brave enrichment,
equipment context, Google Places, parallel orchestration, AND backlog
loading/saving. The
load_backlog()/save_backlog() functions being
in the enrichment module is particularly surprising.
template_route still flows through the
drafter: Even though the template system is retired, the field
is still computed and passed through
build_context_object(). This is confusing — a new developer
would assume it’s functional.
No locking on cumulative-backlog.json:
Multiple scripts read and write this file. If
morning-brief.py (which calls harvest, enrich, and draft in
sequence) is run while another script is writing, data could be lost.
This is a known issue documented in AGENTS.md.
API keys in plaintext: llm.api_key and
brave_search.api_key are in gardener.json
which is in the git repo. The .env.zoho file is also in
scripts/. These should be environment variables.
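Moving the keys out of the tracked file could look roughly like the sketch below. The environment variable names are hypothetical (the source only says the keys should become environment variables), and `load_secrets` is an illustrative helper, not an existing function.

```python
import os
from typing import Dict, Mapping

# Hypothetical variable names, keyed by the config paths named above.
REQUIRED_SECRETS = {
    "llm.api_key": "GARDENER_LLM_API_KEY",
    "brave_search.api_key": "GARDENER_BRAVE_API_KEY",
}


def load_secrets(environ: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read each secret from the environment and fail loudly if any is missing."""
    missing = [var for var in REQUIRED_SECRETS.values() if not environ.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: environ[var] for key, var in REQUIRED_SECRETS.items()}
```

Failing at startup when a variable is absent avoids a half-run pipeline that only discovers the missing key at the first API call. Keys that have already been committed would still need to be rotated, since they remain in git history.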
template_route still computed but
unused: The drafter imports get_template_route,
calls it, and passes the value through context. If someone modifies the
template_route logic thinking it affects output, they’d be wrong. Dead
code that appears alive is worse than dead code that looks dead.
The hat_assignment zombie: Haberdasher is retired,
but backlog-to-zoho.py still reads
hat_assignment and sends it to Zoho. If Zoho automation
sequences depend on Hat_Assignment, and no new leads get
hats assigned, the Zoho side degrades silently.
score_nurture_lead() takes raw parameters, not a
lead dict: This means any new field that affects scoring must
be added as a new parameter to this function, and every caller must be
updated. This is fragile — calculate_priority() already
takes a lead dict, creating two parallel interfaces.
Three dead functions in active modules:
get_agent_clusters() in lifecycle.py,
why_this_now() in drafter.py, and
generate_agent_referral_email() in drafter.py are all
defined but never called. In a codebase that has gone through
iterations, dead code is expected, but having it in the most critical
modules (drafter, lifecycle) is risky — it suggests the modules weren’t
cleaned up between iterations.
template_route is not dead — it’s a
zombie: I expected template_route to be fully dead
(written nowhere, read nowhere). Instead, it’s computed by
get_template_route(), imported by drafter.py, called at
line 119, and passed through to the context object. But no prompt text
uses it. It’s dead at the output but alive in the data flow.
enrich_leads_parallel() has no
phase parameter: The documentation (AGENTS.md)
describes a phase parameter that doesn’t exist in the
actual function signature. The doc was written for a version that was
refactored.
scoring_pipeline config section has no
callers: The config has an entire section
(scoring_pipeline) with an accessor function
(get_scoring_pipeline()) but I found no code that calls it.
This is an entire config section that’s loaded but unused.
The webhook URL in seed-planter.py is hardcoded:
https://workflows.residentliberal.com/webhook/jXjTXfBO3qsMMgtH/webhook/qualify-lead
— this is a real URL to a real service, sitting in a retired script. If
that webhook endpoint still exists, it’s a latent integration that could
be triggered accidentally.
gardener/
├── core/
│ ├── scoring.py
│ │ replaces: lib/scoring.py (entire module)
│ │ replaces: scripts/recalculate-priority.py (logic → CLI command)
│ │ drops: get_priority_score() (unused wrapper)
│ │
│ ├── enrichment.py
│ │ replaces: lib/enrichment.py (domain/phone/competitor/brave/equipment functions)
│ │ replaces: scripts/enrich-backlog.py (orchestration → CLI command)
│ │ splits out: load_backlog()/save_backlog() → storage/backlog.py
│ │
│ ├── drafting.py
│ │ replaces: lib/drafter.py (build_context_object, build_draft_prompt, build_historical_prompt, draft_email, draft_batch)
│ │ drops: why_this_now() (zero callers)
│ │ drops: generate_agent_referral_email() (zero callers)
│ │ drops: template_route from context object (zombie field)
│ │
│ ├── classification.py
│ │ replaces: lib/verticals.py (classify_lead, classify_all)
│ │ replaces: scripts/classify-verticals.py (logic → CLI command)
│ │
│ ├── lifecycle.py
│ │ replaces: lib/lifecycle.py (get_outreach_window, get_formation_timing_context, get_historical_context)
│ │ drops: get_agent_clusters() (zero callers)
│ │ drops: get_industry_nurture_content() (template system retired)
│ │
│ └── patterns.py
│ replaces: lib/patterns.py (proper_case_name, is_pllc_fast_track, is_shell_lead)
│
├── pipelines/
│ ├── harvest.py
│ │ replaces: scripts/harvest.py (CT SoS fetch + score + merge)
│ │ replaces: scripts/morning-brief.py (orchestration → pipeline entry point)
│ │
│ ├── historical.py
│ │ replaces: scripts/historical-sweep.py (entry point)
│ │ replaces: lib/historical.py (milestone_age, is_historical, tag_source, snipe_match)
│ │
│ └── direct.py
│ replaces: scripts/ct-zoho-pipeline.py (CT → Zoho bypass)
│
├── integrations/
│ ├── ctsos.py
│ │ replaces: CT SoS API logic currently in scripts/harvest.py and scripts/historical-sweep.py
│ │
│ ├── brave.py
│ │ replaces: lib/brave_enrich.py (brave_search, brave_enrich_lead, extract_business_info)
│ │
│ ├── llm.py
│ │ replaces: _call_featherless() currently in lib/drafter.py
│ │ replaces: llm config loading from lib/config.py
│ │
│ └── zoho.py
│ replaces: lib/zoho.py (Zoho class, clean_business_name)
│ replaces: push logic from scripts/zoho-push.py, backlog-to-zoho.py, shepherd-to-zoho.py
│ consolidates: 4 push scripts → 1 with mode flags
│
├── storage/
│ ├── backlog.py
│ │ NEW: atomic load/save with file locking
│ │ replaces: load_backlog()/save_backlog() from lib/enrichment.py
│ │ replaces: get_backlog_path() from lib/enrichment.py
│ │
│ ├── config.py
│ │ replaces: lib/config.py (load_config, all get_* functions)
│ │ drops: get_template_route() (zombie)
│ │ drops: get_agent_clustering() (zero callers)
│ │ drops: get_scoring_pipeline() (zero callers)
│ │ drops: get_formation_signals() (zero callers)
│ │
│ └── metrics.py
│ NEW: pipeline metrics collection (currently spread across print statements)
│
├── cli/
│ ├── main.py # entry point with subcommands
│ ├── harvest_cmd.py # replaces: scripts/harvest.py CLI
│ ├── enrich_cmd.py # replaces: scripts/enrich-backlog.py CLI
│ ├── draft_cmd.py # replaces: scripts/draft-backlog.py, draft-random-50.py
│ ├── push_cmd.py # replaces: scripts/zoho-push.py, backlog-to-zoho.py, shepherd-to-zoho.py
│ ├── classify_cmd.py # replaces: scripts/classify-verticals.py
│ ├── audit_cmd.py # replaces: scripts/audit-phantom-drafts.py
│ ├── dashboard_cmd.py # replaces: scripts/backlog-dashboard.py
│ └── util_cmd.py # replaces: scripts/recalculate-priority.py, mark-called.py, rollback.py, etc.
│
├── web/
│ ├── dashboard.py # replaces: lib/dashboard.py (morning brief HTML)
│ ├── backlog_dashboard.py # replaces: lib/backlog_dashboard.py (Gentelella dashboard)
│ └── app.py # replaces: lib/webapp.py (Flask server)
│
└── prompts/ # NEW: prompt templates as files, not code strings
├── new_business.py # replaces: build_draft_prompt() string in lib/drafter.py
├── historical.py # replaces: build_historical_prompt() string in lib/drafter.py
└── variants/ # NEW: A/B test variants
DROPPED files (with justification):
| Dropped file | Justification |
|---|---|
| scripts/haberdasher.py | Hat assignment system retired; hat_assignment still read by Zoho push but should be removed from Zoho mapping |
| scripts/seed-planter.py | Template drafting retired; contains dead N8N webhook URL |
| scripts/lead-heartbeat.py | Signs-of-life monitoring; can be a CLI subcommand |
| scripts/lead-tracker.py | Lifecycle tracking; can be a CLI subcommand |
| scripts/route-planner.py | Geographic clustering; can be a CLI subcommand |
| scripts/sales-brief-generator.py | Brief generation; can be a CLI subcommand |
| scripts/law-firm-pipeline.py | Specialized; use filters in generic pipeline |
| scripts/shepherd-to-zoho.py | Specialized; use filters in generic push |
Preserved fields (with rename mapping where applicable):
| Current name | New name | Type | Written by | Read by |
|---|---|---|---|---|
| id | id | str | harvest, historical-sweep | everywhere |
| name | name | str | harvest | everywhere |
| business_type | entity_type | str | harvest | scoring, dashboard |
| naics_code | naics | str | harvest | scoring, drafter, Zoho |
| filing_date | filing_date | str | harvest | scoring, drafter, historical |
| email | email | str | harvest, enrichment | scoring, drafter, Zoho |
| phone | phone | str | harvest, enrichment | scoring, drafter, Zoho |
| city | city | str | harvest | scoring, drafter, Zoho |
| state | state | str | harvest | Zoho |
| is_shell | is_shell | bool | scoring | draft-backlog, enrich-backlog |
| score | score | int | scoring | drafter, dashboard, Zoho |
| priority | priority | float | scoring | dashboard, draft-backlog |
| readiness_weight | readiness_weight | float | scoring | dashboard |
| readiness_signals | readiness_signals | list | scoring | dashboard |
| tier | tier | str | scoring | dashboard, Zoho |
| domain | domain | str | enrichment | scoring, drafter |
| domain_phone | domain_phone | str | enrichment | scoring |
| website_url | website_url | str | enrichment | dashboard |
| website_exists | website_exists | bool | enrichment | scoring, drafter |
| brave_summary | brave_summary | str | enrichment | drafter |
| brave_phone | brave_phone | str | enrichment | scoring |
| county | county | str | enrichment | dashboard, route-planner |
| competitor_displacement | competitor_displacement | bool | enrichment | scoring |
| competitor_summary | competitor_summary | str | enrichment | dashboard |
| equipment | equipment | dict | enrichment | drafter |
| draft_subject | draft_subject | str | drafter | Zoho, dashboard |
| draft_body | draft_body | str | drafter | Zoho, dashboard |
| source | source | str | draft-backlog | drafter |
| pushed_to_zoho | pushed_to_zoho | bool | zoho-push | dashboard, zoho-push |
| zoho_id | zoho_id | str | zoho-push | zoho-push, mark-called |
| called | called | bool | mark-called | dashboard |
| call_count | call_count | int | mark-called | dashboard |
| vertical | vertical | str | classify-verticals | Zoho |
| first_seen | first_seen | str | harvest | dashboard |
| last_seen | last_seen | str | harvest | dashboard |
| appearance_count | appearance_count | int | harvest | dashboard |
| needs_redraft | needs_redraft | bool | audit-phantom-drafts | draft-backlog |
| citizenship | citizenship | str | harvest | scoring |
New fields:
| New name | Type | Justification |
|---|---|---|
| brave_phone in context | n/a | Already exists but NOT passed to drafter (confirmed bug) |
| phone_sources | list | Track where each phone number came from |
Dropped fields (each was classified as dead/write-only/zombie in section 1.3):
| Dropped field | Justification |
|---|---|
| hat_assignment | Written by retired haberdasher; still read by Zoho push but should be removed from mapping |
| hat_name | Same as hat_assignment |
| template_route | Zombie: still computed but not used in any prompt |
| score_history | Write-only: written by harvest, never read |
| enrichment_date | Write-only: never read by any pipeline step |
| enrichment_method | Write-only: never read |
| outreach | Write-only: lifecycle.py writes but no reader found |
| historical_needs_followup | Write-only: never read |
| followup_reason | Dead: no writer or reader found |
| sub_status | Dead: no writer or reader found |
| category_survey_email_address | Dead: raw CT SoS field, never used |
| total_authorized_shares | Dead: raw CT SoS field, never used |
| date_of_organization_meeting | Dead: raw CT SoS field, never used |
| country_formation | Dead: raw CT SoS field, never used |
| original_push_date | Dead: no writer or reader found |
| brave_descriptions | Write-only: written by brave_enrich, never read |
| brave_results_count | Write-only: written by brave_enrich, never read |
| competitor_brands_found | Write-only: written by enrichment, never read |
| All mailing_jurisdiction_* fields (6 fields) | Raw CT SoS data, never used in any pipeline logic |
| All office_jurisdiction_* fields (5 fields) | Raw CT SoS data, never used in any pipeline logic |
| All billing_* fields (5 fields) | Raw CT SoS data, never used in any pipeline logic |
| All diversity flags (5 fields) | Raw CT SoS data, never used in any pipeline logic |
| annual_report_due_date | Raw CT SoS data, never used |
| began_transacting_in_ct | Raw CT SoS data, never used |
| business_name_in_state_country | Redundant with name |
| formation_place | Redundant with state |
| state_or_territory_formation | Redundant with state |
| accountnumber | CT SoS internal, never used |
| naics_score | Component of score, not used independently |
| name_score | Component of score, not used independently |
Split gardener.json (1,976 lines) into focused
files:
config/
├── scoring.yaml # replaces: pllc_fast_track, tiers, keyword_fallback,
│ # name_penalty_patterns, name_bonus_patterns,
│ # scoring_rules, recency_bonus, location_quality,
│ # contact_info_bonus, known_cities
├── enrichment.yaml # replaces: brave_search section
├── llm.yaml # replaces: llm section
│ # API keys → environment variables
├── zoho.yaml # replaces: push_guardrails
│ # OAuth credentials → environment variables
├── pipeline.yaml # replaces: daily_territory_scan, lifecycle, formation_timing
├── branding.yaml # replaces: branding section
└── drops: # removed entirely
_meta # (unnecessary in code)
version # (zero callers)
scoring_pipeline # (zero callers)
agent_clustering # (zero callers)
lifecycle_tracking # (zero callers)
route_planner # (zero callers)
formation_signals # (zero callers)
template_route (197) # (zombie — computed but unused)
Pain points in current config:

1. 1,976 lines is too large to reason about
2. API keys in plaintext in a git-tracked file
3. 7 sections have zero callers (dead config)
4. 197 template_route entries are zombie data
5. No schema validation, so typos in config fail silently
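The last pain point (silent typos) is the cheapest to address: a small validator run at load time can reject unknown or malformed sections. The sketch below works on an already-parsed config dict so it does not depend on any YAML library. The `SCHEMA` contents are hypothetical, apart from the `llm` and `brave_search` section names and the `llm.models` key, which appear elsewhere in this document.

```python
from typing import Any, Dict, List

# Hypothetical schema for illustration: section name -> required keys and their types.
SCHEMA: Dict[str, Dict[str, type]] = {
    "llm": {"models": dict},
    "brave_search": {"enabled": bool},  # "enabled" is an invented key
}


def validate_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the config passed."""
    problems = []
    for section in config:
        if section not in SCHEMA:
            problems.append(f"unknown section: {section}")
    for section, keys in SCHEMA.items():
        if section not in config:
            problems.append(f"missing section: {section}")
            continue
        body = config[section] if isinstance(config[section], dict) else {}
        for key, expected in keys.items():
            if not isinstance(body.get(key), expected):
                problems.append(f"{section}.{key}: expected {expected.__name__}")
    return problems


print(validate_config({"llm": {"models": {}}, "brave_serach": {"enabled": True}}))
```

A misspelled section name then surfaces twice, once as an unknown section and once as a missing one, instead of being ignored.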
A lead enters via pipelines/harvest.py which calls
integrations/ctsos.py to fetch CT SoS filings. Each lead is
scored via core/scoring.py and saved to the backlog via
storage/backlog.py (which provides atomic load/save with
file locking).
Enrichment runs via core/enrichment.py which calls
integrations/brave.py for Brave data and does
domain/phone/competitor checks in parallel. Results are saved
atomically.
Drafting runs via core/drafting.py which builds context
from enrichment data, selects a prompt from prompts/, and
calls integrations/llm.py for the Featherless API. Drafts
are saved atomically.
The operator reviews via web/dashboard.py HTML output.
Push to Zoho goes through integrations/zoho.py with a
confirm gate.
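Concurrency: storage/backlog.py uses
file locking (fcntl or filelock) for all reads and writes. This removes the
concurrent-write corruption risk as long as every script goes through
storage/backlog.py; these locks are advisory and do not stop a process that
opens the file directly.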
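A minimal sketch of what the locked load/save pair might look like, assuming a POSIX system, a backlog stored as a JSON list of lead dicts, and a sidecar `.lock` file (all three are assumptions, and the function bodies are illustrative rather than the existing lib/enrichment.py code):

```python
import fcntl  # POSIX only; the filelock package would be the portable choice
import json
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def backlog_lock(path, exclusive):
    """Hold an advisory lock on a sidecar .lock file for the duration of the block."""
    with open(path + ".lock", "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
        try:
            yield
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


def load_backlog(path):
    """Read the backlog under a shared lock; a missing file is an empty backlog."""
    with backlog_lock(path, exclusive=False):
        if not os.path.exists(path):
            return []
        with open(path) as f:
            return json.load(f)


def save_backlog(path, leads):
    """Write to a temp file in the same directory, then rename over the target."""
    with backlog_lock(path, exclusive=True):
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".", suffix=".tmp")
        with os.fdopen(fd, "w") as f:
            json.dump(leads, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # atomic when source and target share a filesystem
```

Locking each call separately prevents torn reads and half-written files, but not lost updates: a script that loads, modifies, and saves would need to hold the exclusive lock across the whole cycle.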
Failure modes: Each pipeline step saves progress after each lead (existing checkpoint pattern). If a step crashes, re-running it skips already-completed leads.
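The skip-on-rerun behavior can be sketched as follows. This is one plausible shape for the checkpoint pattern, not a copy of the existing implementation; the checkpoint file format (a JSON list of lead ids) and the names `load_done` and `run_step` are assumptions.

```python
import json
import os
from typing import Callable, Dict, Iterable, Set


def load_done(checkpoint_path: str) -> Set[str]:
    """Return the ids recorded by earlier runs, or an empty set on first run."""
    if not os.path.exists(checkpoint_path):
        return set()
    with open(checkpoint_path) as f:
        return set(json.load(f))


def run_step(leads: Iterable[Dict], process: Callable[[Dict], None],
             checkpoint_path: str) -> int:
    """Process each lead once, recording its id after success so a rerun can skip it."""
    done = load_done(checkpoint_path)
    processed = 0
    for lead in leads:
        if lead["id"] in done:
            continue
        process(lead)  # an exception here leaves the checkpoint at the last good lead
        done.add(lead["id"])
        with open(checkpoint_path, "w") as f:
            json.dump(sorted(done), f)
        processed += 1
    return processed
```

Writing the checkpoint after every lead costs one small file write per lead, which is usually negligible next to a network call per lead.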
Logging: storage/metrics.py collects
structured metrics (leads processed, time per lead, errors) replacing
the current print-statement approach.
Scenario 1: A/B test two prompt variants for new-business outreach
Current architecture:

1. Edit the prompt string inside lib/drafter.py:build_draft_prompt() (line ~186)
2. Run draft
3. Manually compare results
4. No structured way to track which prompt produced which draft
5. To revert, edit the string back

Proposed architecture:

1. Create prompts/variants/new_business_v2.py with the variant prompt
2. Add variant name to config/pipeline.yaml under prompt_variants
3. Run gardener draft --variant new_business_v2 --limit 50
4. Each draft gets a prompt_variant field in the backlog
5. Compare results: gardener audit --compare-variants new_business new_business_v2
6. The prompts/ directory becomes the single source of truth for all prompt text, with no more prompt strings buried in Python code

Files changed: 1 new file (prompts/variants/new_business_v2.py), 1 config edit, 0 core code changes.

Current architecture: 1 core code edit (risky), 0 new files, no tracking.
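The variant mechanism in the proposed steps could be sketched as a small registry. Everything here is hypothetical: the registry, the decorator, and the prompt wording are invented for illustration, and in the proposal each variant would live in its own file under prompts/ rather than in one module.

```python
from typing import Callable, Dict

# Hypothetical registry mapping a variant name to its prompt builder.
PROMPT_VARIANTS: Dict[str, Callable[[Dict], str]] = {}


def variant(name: str):
    """Decorator that registers a prompt builder under a variant name."""
    def register(builder):
        PROMPT_VARIANTS[name] = builder
        return builder
    return register


@variant("new_business")
def new_business(lead: Dict) -> str:
    return f"Write a short intro email to {lead['name']}."


@variant("new_business_v2")
def new_business_v2(lead: Dict) -> str:
    return f"Write a two-sentence intro email to {lead['name']} in {lead['city']}."


def build_prompt(lead: Dict, variant_name: str) -> str:
    """Build the prompt and stamp the lead so drafts can be grouped by variant later."""
    prompt = PROMPT_VARIANTS[variant_name](lead)
    lead["prompt_variant"] = variant_name
    return prompt


lead = {"name": "Acme Dental PLLC", "city": "Hartford"}
print(build_prompt(lead, "new_business_v2"))
print(lead["prompt_variant"])
```

Stamping the lead at prompt-build time is what makes a later comparison possible, because each stored draft then records which variant produced it.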
Scenario 2: Swap drafting model from DeepSeek-V3.1 to a new model
Current architecture:

1. Edit scripts/gardener.json → llm.models.draft
2. Verify _call_featherless() in lib/drafter.py handles the new model’s response format (the GLM-5.1 bug showed this is not guaranteed)
3. No model-version tracking in the backlog, so there is no way to tell which model produced which draft
4. If the new model produces phantom drafts, you discover it manually

Proposed architecture:

1. Edit config/llm.yaml → models.draft
2. Each draft gets a model field in the backlog
3. integrations/llm.py has model-specific response parsers (one per model, not one parser that guesses)
4. gardener draft --model new-model-name --limit 5 for testing
5. audit-phantom-drafts equivalent runs automatically after each batch

Files changed: 1 config edit, potentially 1 new parser in integrations/llm.py.

Current architecture: 1 config edit, 1 potential parser fix in lib/drafter.py (all models share one parser), no tracking.
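The "one parser per model" idea might be structured as a lookup table keyed by model name. In this sketch the model names and both response shapes are placeholders invented for illustration; they are not the actual formats returned by any provider.

```python
from typing import Any, Callable, Dict

# Hypothetical registry: model name -> function that extracts draft text.
PARSERS: Dict[str, Callable[[Dict[str, Any]], str]] = {}


def parser_for(model: str):
    """Decorator that registers a response parser for one model."""
    def register(func):
        PARSERS[model] = func
        return func
    return register


@parser_for("model-a")
def parse_model_a(payload):
    # Placeholder shape: text under choices[0].message.content
    return payload["choices"][0]["message"]["content"].strip()


@parser_for("model-b")
def parse_model_b(payload):
    # Placeholder shape: text under a top-level "output" key
    return payload["output"].strip()


def parse_draft(model: str, payload: Dict[str, Any]) -> Dict[str, str]:
    """Parse with the model's own parser and record which model produced the draft."""
    if model not in PARSERS:
        raise KeyError(f"no response parser registered for model {model!r}")
    body = PARSERS[model](payload)
    if not body:
        raise ValueError(f"{model} returned an empty draft")  # phantom-draft guard
    return {"draft_body": body, "model": model}
```

An unregistered model fails immediately instead of falling through to a parser that guesses, and the returned `model` key is what would populate the model field on each draft.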
Scenario 3: Introduce “renewal leads” from a new data source
Current architecture:

1. Add new fields to backlog JSON (no schema enforcement)
2. Create a new script (copy of harvest.py) for the new data source
3. Add scoring logic to lib/scoring.py (edit existing functions)
4. Add prompt variant by editing build_draft_prompt() in lib/drafter.py
5. Add new CLI entry point
6. Update morning-brief.py to call the new script
7. No way to distinguish renewal leads from new-business leads in the backlog without a convention

Proposed architecture:

1. Add lead_type field to schema (values: “new_business”, “historical”, “renewal”)
2. Create pipelines/renewal.py (new entry point, reuses core/ modules)
3. Add renewal scoring rules to config/scoring.yaml
4. Create prompts/renewal.py (new prompt template)
5. The pipeline automatically routes based on lead_type
6. storage/backlog.py validates schema on save

Files changed: 1 new pipeline file, 1 new prompt file, 1 config addition, 1 schema addition.

Current architecture: 1 new script (copy-paste of harvest.py), edits to 3 existing files (scoring.py, drafter.py, morning-brief.py), no schema enforcement.
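Routing on lead_type, combined with validation of its allowed values, could be as small as the sketch below. The three values come from the proposal above; the prompt builders, `ROUTES`, `validate_lead`, and `route` are hypothetical stand-ins.

```python
from typing import Callable, Dict

LEAD_TYPES = ("new_business", "historical", "renewal")


def new_business_prompt(lead: Dict) -> str:
    return f"new-business prompt for {lead['name']}"


def historical_prompt(lead: Dict) -> str:
    return f"historical prompt for {lead['name']}"


def renewal_prompt(lead: Dict) -> str:
    return f"renewal prompt for {lead['name']}"


# One entry per lead type; adding a type means adding one builder and one row here.
ROUTES: Dict[str, Callable[[Dict], str]] = {
    "new_business": new_business_prompt,
    "historical": historical_prompt,
    "renewal": renewal_prompt,
}


def validate_lead(lead: Dict) -> None:
    """Reject leads whose lead_type is missing or not one of the allowed values."""
    if lead.get("lead_type") not in LEAD_TYPES:
        raise ValueError(f"invalid lead_type: {lead.get('lead_type')!r}")


def route(lead: Dict) -> str:
    """Validate the lead, then dispatch to the builder registered for its type."""
    validate_lead(lead)
    return ROUTES[lead["lead_type"]](lead)


print(route({"name": "Acme Dental PLLC", "lead_type": "renewal"}))
```

Existing backlog entries have no lead_type, so a migration would need to assign one before validation like this is switched on.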
Scenario 4: Add LinkedIn presence as an enrichment signal
Current architecture:

1. Add linkedin_url field to backlog JSON (no schema enforcement, no validation)
2. Add LinkedIn scraping function to lib/enrichment.py (already 327+ lines)
3. Add linkedin_url to build_context_object() in lib/drafter.py
4. Add linkedin signal to calculate_readiness() in lib/scoring.py
5. Update calculate_priority() weight calculation
6. No way to know which leads have LinkedIn data without scanning the backlog

Proposed architecture:

1. Add linkedin_url and linkedin_signal to schema in storage/backlog.py (validated)
2. Create integrations/linkedin.py (new integration module)
3. Add enrichment step to core/enrichment.py (calls the new integration)
4. Add signal to core/scoring.py with weight in config/scoring.yaml
5. Drafter automatically includes it via build_context_object() which iterates all enrichment signals
6. New signal is tracked in readiness_signals list

Files changed: 1 new integration file, 1 enrichment step addition, 1 config addition, 1 schema addition.

Current architecture: edits to 3 existing files (enrichment.py, drafter.py, scoring.py), no schema enforcement, no separation of concerns.
Backlog: The existing
cumulative-backlog.json migrates via a transformation
script that drops dead fields, renames inconsistent fields, and adds
missing schema fields with defaults. The script produces a new backlog
file and a diff report. Old file is archived, not deleted.
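The transformation itself can be a pure function over each lead. The sketch below uses small subsets of the rename and drop tables above plus one new-field default; the function names and the shape of the report (a count of dropped fields) are assumptions.

```python
import copy
from typing import Any, Dict, List, Tuple

# Illustrative subsets of the preserved/dropped field tables.
RENAMES = {"business_type": "entity_type", "naics_code": "naics"}
DROPPED = {"hat_assignment", "hat_name", "template_route", "score_history"}
DEFAULTS: Dict[str, Any] = {"phone_sources": []}  # new fields and their defaults


def migrate_lead(lead: Dict[str, Any]) -> Dict[str, Any]:
    """Return a migrated copy of one lead; the input dict is never modified."""
    out: Dict[str, Any] = {}
    for key, value in lead.items():
        if key in DROPPED:
            continue
        out[RENAMES.get(key, key)] = copy.deepcopy(value)
    for key, default in DEFAULTS.items():
        out.setdefault(key, copy.deepcopy(default))
    return out


def migrate_backlog(leads: List[Dict[str, Any]]) -> Tuple[List[Dict[str, Any]], Dict[str, int]]:
    """Migrate every lead and count dropped fields for the diff report."""
    report: Dict[str, int] = {}
    migrated = []
    for lead in leads:
        for key in lead:
            if key in DROPPED:
                report[key] = report.get(key, 0) + 1
        migrated.append(migrate_lead(lead))
    return migrated, report
```

Because renamed keys are not themselves in `RENAMES` and defaults are only applied when absent, running the function on already-migrated data returns the same result, which is what makes a rerun safe.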
Coexistence: The new codebase lives in
gardener/ alongside gardener-fork/. A
--backlog-path flag lets either codebase point at either
backlog file. During migration, both codebases read the same file. Once
the new codebase is validated, gardener-fork/ is
archived.
Rollback: The migration script is idempotent and
reversible. The old backlog file is never modified in place — a copy is
made before transformation. If the new codebase produces bad drafts,
switch the --backlog-path flag back to the old file and run
the old codebase.
Timeline: Not specified here — this is a sketch, not a plan. The operator and Claude will refine it together.
Reply rates: A cleaner codebase does not produce better cold emails. The substance injection improvements (brave_summary, equipment_talk_track) were the right move, but architecture doesn’t fix copy.
Audience targeting: Whether brand-new PLLCs are the right audience for copier outreach is a strategy question, not an architecture question. No refactor changes this.
CT SoS data quality: The pipeline trusts NAICS codes from CT SoS, which are often wrong. Garbage in, garbage out.
LLM hallucination: The drafter trusts brave_summary content without validation. A cleaner codebase still passes unvalidated enrichment data to the LLM.
Zoho automation decay: If Zoho automation sequences
depend on Hat_Assignment and no new leads get hats, the
Zoho side degrades. This proposal drops hat_assignment but doesn’t fix
the Zoho automation.
Single-operator bus factor: This codebase has one operator. A cleaner architecture doesn’t create a second operator.
The phone coverage gap: brave_phone is
collected but NOT passed to build_context_object() in
lib/drafter.py. This is a confirmed bug that exists
regardless of architecture.