Plugin ecosystem Round 3 stability and extensibility plan (v0.4.x)
This plan builds on Round 2 (M1-M6). The goal is to move sharelife from feature-complete to operationally stable and easier to evolve at scale.
1. Goals and boundary
1.1 Goals
- Stability under concurrency, failures, and version drift.
- Extensibility without repeated architectural rewrites.
- Alignment with product direction: high-fidelity replication plus strict governance.
1.2 Non-goals
- No full frontend rewrite to a heavy SPA framework in this stage.
- No distributed data layer in this stage.
- No default-open install/approval behavior.
2. Recommendation triage
2.1 Adopt now
- Introduce `Vite` build pipeline while preserving vanilla module boundaries.
- Consolidate UI state updates through one event bus (`EventTarget`).
- Sanitize all user-sourced dynamic HTML with `DOMPurify`.
- Add accessibility baseline: ARIA, keyboard flow, drawer/modal focus trap.
- Move persistence to repository interface + SQLite implementation + JSON fallback.
- Move heavy scan/diff tasks to background execution (`asyncio.to_thread`/workers).
- Strengthen middleware: auth dependency consistency, rate limiting, strict CORS, security headers.
- Add observability: structured logging (`structlog`) and Prometheus metrics export.
- Provide official container path: multi-stage Docker image + compose baseline.
- Add CI gates for coverage, i18n/docs/protocol consistency.
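The background-execution item above can be illustrated with a minimal sketch. It assumes the heavy work is a blocking function; `diff_packs` and `diff_in_background` are hypothetical names, and `difflib` stands in for whatever scan/diff logic the project actually uses.

```python
import asyncio
import difflib


def diff_packs(old_text: str, new_text: str) -> list[str]:
    """Blocking diff; a hypothetical stand-in for a real scan/diff task."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(), new_text.splitlines(), lineterm=""
        )
    )


async def diff_in_background(old_text: str, new_text: str) -> list[str]:
    # asyncio.to_thread runs the blocking call in a worker thread,
    # so the event loop stays free to serve other requests meanwhile.
    return await asyncio.to_thread(diff_packs, old_text, new_text)


async def main() -> list[str]:
    return await diff_in_background("a\nb\n", "a\nc\n")


result = asyncio.run(main())
```

Threads help most when the task blocks on I/O or releases the GIL; CPU-bound pure-Python work may need a process-based worker instead, which is presumably what the "workers" alternative covers.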
2.2 Defer
- PWA offline and install shell.
- Infinite-scroll UX as default interaction model.
- Full Tailwind restyling of existing pages.
- Direct PostgreSQL/`asyncpg` migration before SQLite phase is proven.
2.3 Not adopted in this stage
- Full migration to a heavy SPA stack.
- Relaxing review/install guardrails for speed.
3. Architecture decisions (ADR summary)
ADR-1: Frontend evolution
Decision:
- Keep vanilla JS module responsibilities.
- Add Vite bundling and artifact governance.
- Add centralized event bus for state synchronization.
Reasoning:
- Lowest migration risk with immediate maintainability gains.
- Removes script-global coupling and manual load cost.
- Creates a path for incremental declarative updates later.
ADR-2: Persistence
Decision:
- Standardize repository interfaces for market/profile-pack/audit/trial data.
- Switch default storage to SQLite.
- Keep JSON repository for local fallback and rollback.
Reasoning:
- Better write concurrency and query capability with low migration cost.
- Preserves existing layered architecture.
- Enables staged cutover rather than hard break.
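A minimal sketch of the repository split described in ADR-2, assuming a single key/value style record. `PackRepository`, `SqlitePackRepository`, `JsonPackRepository`, and `make_repository` are illustrative names, not the project's actual classes.

```python
import json
import sqlite3
import tempfile
from pathlib import Path
from typing import Optional, Protocol


class PackRepository(Protocol):
    """Interface the service layer depends on (hypothetical shape)."""

    def save(self, pack_id: str, record: dict) -> None: ...
    def get(self, pack_id: str) -> Optional[dict]: ...


class SqlitePackRepository:
    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS packs "
            "(pack_id TEXT PRIMARY KEY, body TEXT NOT NULL)"
        )

    def save(self, pack_id: str, record: dict) -> None:
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO packs (pack_id, body) VALUES (?, ?)",
                (pack_id, json.dumps(record)),
            )

    def get(self, pack_id: str) -> Optional[dict]:
        row = self._conn.execute(
            "SELECT body FROM packs WHERE pack_id = ?", (pack_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None


class JsonPackRepository:
    """File-backed fallback kept for local use and rollback."""

    def __init__(self, path: Path) -> None:
        self._path = path

    def _load(self) -> dict:
        return json.loads(self._path.read_text()) if self._path.exists() else {}

    def save(self, pack_id: str, record: dict) -> None:
        data = self._load()
        data[pack_id] = record
        self._path.write_text(json.dumps(data))

    def get(self, pack_id: str) -> Optional[dict]:
        return self._load().get(pack_id)


def make_repository(backend: str, json_path: Path) -> PackRepository:
    if backend == "sqlite":
        return SqlitePackRepository()
    if backend == "json":
        return JsonPackRepository(json_path)
    raise ValueError(f"unknown storage backend: {backend}")


# Both backends are exercised through the same interface.
results = {}
with tempfile.TemporaryDirectory() as tmp:
    for backend in ("sqlite", "json"):
        repo = make_repository(backend, Path(tmp) / "packs.json")
        repo.save("pack-1", {"status": "approved"})
        results[backend] = (repo.get("pack-1"), repo.get("missing"))
```

Because callers only see `PackRepository`, a staged cutover reduces to changing the `backend` value, and the same regression suite can run against both implementations.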
ADR-3: Security and observability first
Decision:
- Prioritize auth/rate-limit/security-header/exception hardening.
- Add structured logs and metrics in the same phase.
Reasoning:
- Lowers exposure before feature acceleration.
- Avoids blind operation during scale-up.
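As a rough illustration of structured logs, the sketch below emits one JSON object per event with the `request_id`/`actor`/`route`/`error_code` fields named later in N4. It deliberately uses the standard-library `logging` module rather than `structlog` to stay dependency-free; the formatter class, logger name, and sample values are assumptions.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON line with a fixed field set."""

    FIELDS = ("request_id", "actor", "route", "error_code")

    def format(self, record: logging.LogRecord) -> str:
        payload = {"level": record.levelname, "event": record.getMessage()}
        for name in self.FIELDS:
            # Fields arrive via `extra=`; missing ones are logged as null.
            payload[name] = getattr(record, name, None)
        return json.dumps(payload, sort_keys=True)


stream = io.StringIO()  # stands in for stdout or a log file
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("sharelife.sketch")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.error(
    "request failed",
    extra={
        "request_id": "req-123",
        "actor": "admin",
        "route": "/api/metrics",
        "error_code": "internal_server_error",
    },
)

entry = json.loads(stream.getvalue())
```

Keeping the field set fixed is what makes the logs queryable: every line can be filtered by `request_id` or aggregated by `error_code` without parsing free text.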
4. Milestones (N1-N5)
N1 (consistency closure)
Scope:
- Version consistency across plugin metadata, API metadata, and docs baseline.
- Round 2 status text aligned with actual implementation.
- Canonical error-code documentation alignment.
Acceptance:
- Version drift meta-tests pass.
- Docs status assertions pass.
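One way a version drift meta-test might look, as a sketch only: it extracts the first semantic version from each source and reports a mismatch. The source names and the sample version strings are made up for illustration; a real test would read the plugin metadata, API metadata, and docs files.

```python
import re
from typing import Optional

# First x.y.z version not embedded in a longer digit run.
VERSION_RE = re.compile(r"(?<!\d)\d+\.\d+\.\d+(?!\d)")


def extract_version(text: str) -> Optional[str]:
    match = VERSION_RE.search(text)
    return match.group(0) if match else None


def find_version_drift(sources: dict[str, str]) -> dict[str, Optional[str]]:
    """Return {} when all sources agree, else the version found per source."""
    versions = {name: extract_version(text) for name, text in sources.items()}
    found = set(versions.values())
    if len(found) == 1 and None not in found:
        return {}
    return versions


# Hypothetical inputs; values are placeholders, not the project's versions.
aligned = find_version_drift(
    {
        "plugin_metadata": 'version: "0.4.1"',
        "api_metadata": '{"api_version": "0.4.1"}',
        "docs_baseline": "Docs baseline: v0.4.1",
    }
)
drifted = find_version_drift(
    {
        "plugin_metadata": 'version: "0.4.1"',
        "docs_baseline": "Docs baseline: v0.4.0",
    }
)
```

In CI, the assertion would simply be that `find_version_drift(...)` returns an empty dict, with the returned mapping printed on failure so the stale source is obvious.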
N2 (frontend maintainability and safety)
Scope:
- Vite build integration.
- Event-bus migration for key state flows.
- DOMPurify integration.
- ARIA/keyboard/focus accessibility baseline.
Acceptance:
- `node --test tests/webui/*.js` passes.
- E2E keeps `market -> drawer -> wizard -> compare` chain green.
- XSS regression payloads are sanitized.
N3 (backend persistence and concurrency baseline)
Scope:
- SQLite repository implementation and migration utility.
- Service layer repository abstraction adoption.
- Query indexes on `pack_id/status/risk_level/created_at`.
Acceptance:
- JSON and SQLite backends both pass regression.
- No state corruption in concurrent write scenarios.
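The index and concurrent-write criteria can be sketched as a small self-checking script. The table layout, the WAL journal mode, and the worker counts are assumptions chosen for the illustration; only the four indexed column names come from the plan.

```python
import os
import sqlite3
import tempfile
import threading

SCHEMA = """
CREATE TABLE IF NOT EXISTS packs (
    pack_id    TEXT PRIMARY KEY,  -- primary key gets an automatic index
    status     TEXT NOT NULL,
    risk_level TEXT NOT NULL,
    created_at TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_packs_status ON packs (status);
CREATE INDEX IF NOT EXISTS idx_packs_risk_level ON packs (risk_level);
CREATE INDEX IF NOT EXISTS idx_packs_created_at ON packs (created_at);
"""


def init_db(path: str) -> None:
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")  # assumption: WAL for writers
    conn.executescript(SCHEMA)
    conn.close()


def write_rows(path: str, worker: int, count: int) -> None:
    # One connection per thread; `timeout` makes writers wait on the lock.
    conn = sqlite3.connect(path, timeout=30)
    try:
        for i in range(count):
            with conn:  # one transaction per row
                conn.execute(
                    "INSERT INTO packs VALUES (?, ?, ?, ?)",
                    (f"w{worker}-{i}", "pending", "low", "2026-04-02T00:00:00Z"),
                )
    finally:
        conn.close()


def run_concurrent_writes(workers: int = 4, rows_per_worker: int = 25):
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sketch.db")
        init_db(path)
        threads = [
            threading.Thread(target=write_rows, args=(path, w, rows_per_worker))
            for w in range(workers)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        conn = sqlite3.connect(path)
        try:
            total = conn.execute("SELECT COUNT(*) FROM packs").fetchone()[0]
            indexes = sorted(
                row[0]
                for row in conn.execute(
                    "SELECT name FROM sqlite_master "
                    "WHERE type = 'index' AND name LIKE 'idx_packs_%'"
                )
            )
        finally:
            conn.close()
    return total, indexes


total, indexes = run_concurrent_writes()
```

A lost or duplicated write would show up as a row count different from `workers * rows_per_worker`, which is the property a concurrency regression test would assert.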
N4 (security and observability hardening)
Scope:
- Login/API rate limiting.
- Security headers and strict CORS defaults.
- Structured log schema (`request_id/actor/route/error_code`).
- Metrics endpoint and baseline alert suggestions.
Acceptance:
- Security regression (authz/bruteforce/CORS) passes.
- Failures are diagnosable from logs and metrics.
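To make the login/API rate-limiting scope concrete, here is a sketch of one common approach, a per-key sliding window. The plan does not say which algorithm is used, so the class name, limits, and key format are all assumptions; the clock is injectable so a brute-force regression test can run without sleeping.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most `limit` hits per key within `window_seconds`."""

    def __init__(
        self,
        limit: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self._clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._limit:
            return False  # rejected attempts are not recorded
        hits.append(now)
        return True


class FakeClock:
    """Manually advanced clock for deterministic tests."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now


clock = FakeClock()
limiter = SlidingWindowLimiter(limit=3, window_seconds=60, clock=clock)

attempts = [limiter.allow("login:alice") for _ in range(4)]
other_user = limiter.allow("login:bob")  # separate key, separate budget
clock.now += 61
after_window = limiter.allow("login:alice")
```

An in-memory limiter like this only holds for a single process, which fits the stated non-goal of having no distributed data layer in this stage.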
N5 (delivery and operations)
Scope:
- Official Docker image and compose sample.
- Data volume mount and health checks.
- Deploy and rollback runbook update.
Acceptance:
- Container startup reproduces WebUI and core APIs.
- Docs cover both local and container deployment.
5. Risks and trade-offs
- Build tooling adds frontend complexity. Mitigation: preserve module boundaries and avoid heavy framework adoption.
- Storage migration introduces compatibility risk. Mitigation: migration tooling, dual-path validation, staged cutover.
- Stronger security defaults can reduce local convenience. Mitigation: explicit dev/prod config profiles.
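The dev/prod profile mitigation could take a shape like the following sketch. Field names, values, and the `SHARELIFE_PROFILE` environment variable are hypothetical; the one deliberate choice is that an unset profile resolves to the strict one, in line with the no-default-open stance.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class SecurityProfile:
    cors_origins: tuple
    rate_limit_per_minute: int
    strict_headers: bool


# Hypothetical values for illustration only.
PROFILES = {
    "dev": SecurityProfile(
        cors_origins=("http://localhost:5173",),
        rate_limit_per_minute=600,
        strict_headers=False,
    ),
    "prod": SecurityProfile(
        cors_origins=(),  # no cross-origin access unless explicitly listed
        rate_limit_per_minute=60,
        strict_headers=True,
    ),
}


def load_profile(name=None) -> SecurityProfile:
    # Fall back to the strict profile when nothing is configured.
    resolved = name or os.environ.get("SHARELIFE_PROFILE") or "prod"
    try:
        return PROFILES[resolved]
    except KeyError:
        raise ValueError(f"unknown config profile: {resolved}") from None


dev = load_profile("dev")
prod = load_profile("prod")
```

Failing loudly on an unknown profile name avoids silently running with relaxed settings because of a typo.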
6. Progress snapshot (as of April 2, 2026)
- N1 completed: version/doc consistency tests landed and are in CI.
- N2 completed: event bus, i18n synchronization hardening, accessibility/focus controls, and browser E2E chain are green.
- N3 completed: repository abstraction landed for `MarketService`, `ProfilePackService`, `PreferenceService`, `RetryQueueService`, `TrialService`, `TrialRequestService`, `AuditService`, and `InMemoryNotifier` with JSON/SQLite implementations, SQLite indexes, and legacy migration path.
- N4 completed baseline: login/API rate limiting, strict security headers, request-id tracing, structured request logging, unified web error envelopes (including `internal_server_error` fallback), `/api/metrics` with dedicated auth/rate-limit counters, plus metric path-cardinality guardrails and error-storm scrape regression tests.
- N5 completed baseline: official Dockerfile/compose path, health checks, standalone WebUI runner, and dedicated WebUI observability/rollback runbook for on-call operations.
- N5+ operational closure completed: shipped `docker-compose` observability overlay, automated smoke script (`scripts/smoke_observability_stack.sh`), and a dedicated `ops-smoke` GitHub Actions workflow for scheduled/manual runtime checks with diagnostic artifact upload.
- N5++ diagnostics acceleration completed: smoke diagnostics now generate structured triage markdown/json (`output/ops-smoke/triage.md` + `triage.json`), publish markdown to GitHub Actions Job Summary, and emit signal/action annotations for faster first response.
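The metric path-cardinality guardrail mentioned under N4 can be pictured with a short sketch: raw request paths are mapped onto a small, fixed set of label values before being counted, so scanners or per-ID URLs cannot create unbounded metric series. Apart from `/api/metrics`, the route templates here are hypothetical.

```python
import re
from collections import Counter

# Known routes map to a template; the pack route is a hypothetical example.
ROUTE_TEMPLATES = [
    (re.compile(r"^/api/metrics$"), "/api/metrics"),
    (re.compile(r"^/api/packs/[^/]+$"), "/api/packs/{pack_id}"),
]


def metric_path_label(path: str) -> str:
    """Collapse a raw request path into a bounded metric label."""
    clean = path.split("?", 1)[0]  # query strings never reach the label
    for pattern, template in ROUTE_TEMPLATES:
        if pattern.match(clean):
            return template
    return "other"  # every unknown path shares one bucket


requests_total = Counter()
for raw in [
    "/api/packs/alpha",
    "/api/packs/beta?x=1",
    "/api/metrics",
    "/wp-login.php",
    "/random/123",
]:
    requests_total[metric_path_label(raw)] += 1
```

With this mapping the number of distinct label values is capped at the number of templates plus one, regardless of what an error storm or a crawler sends.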
7. Recommended execution order
- N1 -> N2 -> N3 -> N4 -> N5.
- Resume major feature expansion only after N1-N2 closure.
- Each milestone must ship tests, docs, and rollback notes together.
8. Relation to existing roadmap
- Round 2 capabilities remain unchanged.
- Round 3 focuses on engineering quality and operational readiness.
- Governance invariants stay fixed: dry-run before apply, rollback availability, and auditable actions.
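The three governance invariants can be expressed as a small guard around state changes. This is a sketch under assumptions: `GovernedStore` and its methods are hypothetical, and the real services presumably enforce the same rules with richer data.

```python
import copy
import json
from dataclasses import dataclass, field


@dataclass
class GovernedStore:
    """Hypothetical store enforcing dry-run, rollback, and audit trail."""

    state: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)
    _previewed: set = field(default_factory=set)
    _snapshots: list = field(default_factory=list)

    @staticmethod
    def _key(change: dict) -> str:
        return json.dumps(change, sort_keys=True)

    def dry_run(self, actor: str, change: dict) -> dict:
        # Compute the outcome without touching live state.
        preview = {**self.state, **change}
        self._previewed.add(self._key(change))
        self.audit_log.append(("dry_run", actor, change))
        return preview

    def apply(self, actor: str, change: dict) -> None:
        key = self._key(change)
        if key not in self._previewed:
            raise PermissionError("dry-run required before apply")
        self._previewed.discard(key)
        self._snapshots.append(copy.deepcopy(self.state))  # rollback point
        self.state.update(change)
        self.audit_log.append(("apply", actor, change))

    def rollback(self, actor: str) -> None:
        if not self._snapshots:
            raise RuntimeError("nothing to roll back")
        self.state = self._snapshots.pop()
        self.audit_log.append(("rollback", actor, None))


store = GovernedStore()
change = {"theme": "dark"}

try:
    store.apply("alice", change)  # no dry-run yet, so this is refused
    blocked = False
except PermissionError:
    blocked = True

preview = store.dry_run("alice", change)
store.apply("alice", change)
applied_state = dict(store.state)
store.rollback("alice")
```

Every accepted action appends to `audit_log`, so the sequence of dry-run, apply, and rollback stays reconstructable after the fact.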