Data Flow Overview
Envoy operates through a clear, one-way data flow across three distinct services: Portal, Agent, and Tools. This architectural choice ensures a strong separation of concerns, simplifies the system's logic, and enhances security by limiting direct inter-service dependencies and data exposure.
The fundamental data flow direction is strictly enforced:
Portal (React) → Agent (Python) → Tools (Node/Playwright)
This means that services only initiate communication with the service to their right in the chain.
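The direction rule above can be written down as a small lookup table. This is only an illustrative sketch: the lowercase service keys and the may_call helper are hypothetical names, not part of Envoy's codebase.

```python
# Illustrative sketch of the one-way rule; names here are hypothetical.
# Each service maps to the set of services it is allowed to call.
ALLOWED_CALLS = {
    "portal": {"agent"},  # Portal talks only to the Agent
    "agent": {"tools"},   # Agent talks only to Tools
    "tools": set(),       # Tools never initiates a request
}


def may_call(caller: str, callee: str) -> bool:
    """Return True if `caller` may initiate a request to `callee`."""
    return callee in ALLOWED_CALLS.get(caller, set())
```

Under this table, may_call("portal", "tools") and may_call("tools", "agent") are both False, which matches the isolation described in this section.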
Service Responsibilities in Data Flow
Each service plays a specific role in how data moves through the Envoy system:
- Portal (React)
  - Originator of User Input: Captures all user interactions, such as search queries, review decisions (approve/ignore), and application form edits.
  - Consumer of Processed Data: Displays job listings, application status, generated cover letters, and interactive prompts during the application process.
  - Communicates Exclusively with Agent: The Portal UI is designed to communicate only with the Agent service via its HTTP API. It is unaware of the Tools service's existence.
- Agent (Python)
  - Central Orchestrator: Manages the entire application lifecycle workflow, from job discovery to final submission.
  - AI and Business Logic: Houses the AI models for evaluating job fit, generating cover letters, and handling profile interviewing. Implements application policies and filters.
  - Single Source of Persistence: The Agent is the only service that writes to and reads from the SQLite database, ensuring data integrity and consistency.
  - Gateway to Browser Automation: Translates Agent-level tasks (e.g., "fetch job description," "fill application form") into specific commands for the Tools service.
  - API Provider: Exposes an HTTP API for the Portal and internal background workers.
- Tools (Node/Playwright)
  - Browser Automation Engine: Uses Playwright to interact with external job portals (e.g., SEEK, Indeed, LinkedIn). This includes navigating pages, clicking elements, filling forms, and extracting HTML content.
  - HTML Parsing: Extracts structured data (job details, form fields) from web pages as instructed by the Agent.
  - Stateless Operation: The Tools service keeps no persistent state. It holds browser sessions and scraped data only in memory, which is lost on restart, and it has no direct access to the database.
  - One-Way Communication: Always receives commands from the Agent and returns results. It never initiates callbacks or makes decisions independently.
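As a rough illustration of the Agent's gateway role, the sketch below turns an Agent-level task into a plain request/response command for Tools. The base URL, the build_tools_request helper, and the payload field names are assumptions made for the example. Only the /tools/providers/seek/search path appears in this document.

```python
import json
from urllib.request import Request

# Assumption: the Tools address is not given in this document.
TOOLS_BASE_URL = "http://localhost:4000"


def build_tools_request(path: str, payload: dict) -> Request:
    """Build (but do not send) a JSON POST command for the Tools service."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        TOOLS_BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical payload fields; the document only says "search parameters".
req = build_tools_request(
    "/tools/providers/seek/search",
    {"keywords": "python developer", "location": "Sydney"},
)
```

The request is only constructed here, not sent. A real Agent would send it and wait for the result, since Tools only ever replies to commands and never calls back.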
Key Data Flow Rules
To maintain the system's integrity and simplicity, several strict rules govern data flow:
- Portal Isolation: The Portal UI communicates exclusively with the Agent. It never sends requests directly to the Tools service, nor does it know about the Tools service's existence.
- Tools Unidirectional: The Tools service performs browser operations and returns data to the Agent. It never initiates communication back to the Agent.
- Agent as Data Steward: All persistent data (job listings, application states, profile information, cover letter drafts) is stored in and managed only by the Agent's SQLite database.
- Stateless Tools: The Tools service maintains browser sessions and any intermediary data solely in memory. All state within Tools is transient and is discarded upon service restart or session termination.
Illustrative Data Flows
Let's examine how data flows through the system for key workflows:
1. Job Search
When a user initiates a job search:
- Portal → Agent: The user enters keywords and location in the Portal and clicks "Run Search." The Portal sends a POST /workflows/search request to the Agent, including the search criteria.
- Agent → Tools: The Agent processes the request and dispatches a command to the Tools service (e.g., POST /tools/providers/seek/search) with the search parameters.
- Tools (External Interaction): The Tools service launches a Playwright browser instance, navigates to the specified job board (e.g., SEEK), performs the search, and scrapes relevant job listing data (title, company, location, URL, etc.).
- Tools → Agent: The raw job listing data is returned from Tools to the Agent.
- Agent (Internal Processing & Persistence): The Agent applies policy filters (e.g., blocking internships), enriches the data, and persists new job entries to its SQLite database with a discovered state.
- Agent → Portal: The Agent sends back a response indicating the number of jobs persisted and blocked. The Portal then makes a GET /api/jobs request to retrieve the updated list of jobs, which it displays as job cards.
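The Agent's filter-and-persist step might look roughly like the sketch below. The jobs table layout, the persist_search_results function, and the naive title-substring policy are all assumptions for illustration. The document only states that internships can be blocked and that new jobs are stored in the discovered state.

```python
import sqlite3


def persist_search_results(conn, listings, blocked_terms=("internship",)):
    """Filter raw listings from Tools and store new ones as 'discovered'."""
    # Hypothetical schema; the real table layout is not shown in this document.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "url TEXT PRIMARY KEY, title TEXT, company TEXT, location TEXT, state TEXT)"
    )
    persisted = blocked = 0
    for job in listings:
        # Naive policy filter: block on a substring match in the title.
        if any(term in job["title"].lower() for term in blocked_terms):
            blocked += 1
            continue
        # Skip jobs that are already stored so only new entries are persisted.
        seen = conn.execute(
            "SELECT 1 FROM jobs WHERE url = ?", (job["url"],)
        ).fetchone()
        if seen:
            continue
        conn.execute(
            "INSERT INTO jobs (url, title, company, location, state) "
            "VALUES (?, ?, ?, ?, 'discovered')",
            (job["url"], job["title"], job["company"], job["location"]),
        )
        persisted += 1
    conn.commit()
    return {"persisted": persisted, "blocked": blocked}
```

The returned counts correspond to the "number of jobs persisted and blocked" that the Agent reports back to the Portal.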
2. Application Preparation
When a user selects a job for review:
- Portal → Agent: The user clicks "Review" on a job card. The Portal sends a POST /jobs/{id}/queue request to the Agent.
- Agent (Internal Processing & Persistence): The Agent creates an application placeholder, sets its state to preparing, and enqueues a "prepare" item in its internal work queue. It returns an immediate success response to the Portal.
- Agent (Background Worker) → Tools: A background worker process within the Agent picks up the "prepare" item. It then sends a command to the Tools service to fetch the full job description from the external job board's detail page.
- Tools (External Interaction) → Agent: Tools navigates to the job detail page, extracts the full description, and returns it to the Agent.
- Agent (Internal Processing & Persistence): The Agent's AI component evaluates the candidate's profile against the full job description, generates a personalized cover letter, and produces match evidence. These drafts are then saved to the Agent's SQLite database. The application state is updated to prepared (or unsuitable).
- Portal → Agent: The Portal periodically polls the Agent (e.g., GET /api/applications/{id}) to check the application's status. Once the Agent reports the prepared state, the Portal displays the review panel with the generated cover letter and match breakdown.
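The queue-then-worker pattern in these steps can be sketched as follows. For brevity, an in-memory dict stands in for the SQLite store, FakeTools stands in for the HTTP call to Tools, and evaluate is a hypothetical callable in place of the AI component. None of these names come from Envoy itself.

```python
import queue


class FakeTools:
    """Stand-in for the Tools service, which is really reached over HTTP."""

    def fetch_description(self, job_url):
        return f"Full description for {job_url}"


def queue_job(applications, work_queue, job_id, job_url):
    """Handle the queue request: create a placeholder and return at once."""
    applications[job_id] = {"state": "preparing", "job_url": job_url}
    work_queue.put(("prepare", job_id))
    return {"ok": True}


def run_prepare_worker(applications, work_queue, tools, evaluate):
    """Drain the queue once; a long-running worker would block on get()."""
    while not work_queue.empty():
        kind, job_id = work_queue.get()
        if kind != "prepare":
            continue
        app = applications[job_id]
        description = tools.fetch_description(app["job_url"])
        # `evaluate` returns (suitable, cover_letter) in this sketch.
        suitable, cover_letter = evaluate(description)
        app["cover_letter"] = cover_letter if suitable else None
        app["state"] = "prepared" if suitable else "unsuitable"
```

Because queue_job returns before any browser work happens, the Portal has to poll for the state change, which is why the last step above is a periodic status request.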
3. Applying for a Job
When a user approves an application for submission:
- Portal → Agent: The user clicks "Start Applying" in the Portal. The Portal sends a POST /applications/{id}/apply request to the Agent.
- Agent (Internal Processing & Persistence): The Agent updates the application's state to applying and enqueues an "apply" item in its work queue. It returns an immediate success response to the Portal.
- Agent (Background Worker) → Tools: The Agent's background worker picks up the "apply" item. It orchestrates a series of commands to the Tools service to interact with the external application form. This involves instructing Tools to open the job page, click "Apply," and then progressively inspect and fill out form fields.
- Tools (External Interaction) → Agent: Tools performs the browser actions (navigation, form filling) and returns information about each step, including detected form fields, their labels, and current values.
- Agent (Orchestration & Potential Pause): The Agent uses the information from Tools to determine the next step, using its internal profile data and AI to fill fields. If user input is required (e.g., for external portal questions), the Agent pauses the workflow.
- Agent → Portal: If the workflow pauses for user input or requires authentication, the Agent communicates this state back to the Portal (e.g., as seen in ApplyPage.tsx using startApply and resumeApply). The Portal displays the relevant prompts (e.g., "SEEK login required," or fields for user input).
- Portal → Agent: The user provides input in the Portal. The Portal sends this input (e.g., via resumeApply with approvedValues) back to the Agent.
- Agent (Resumption) → Tools: The Agent receives the user's input, updates its state, and instructs Tools to continue with the application process based on the new information. This cycle continues until the application is submitted.
- Agent (Persistence) → Portal: Throughout the process, the Agent updates the application's state and run logs in SQLite. Upon successful completion or failure, the Agent updates the final state. The Portal displays the final outcome to the user, typically allowing them to navigate back to the queue.
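The pause-and-resume cycle can be illustrated with a single step function. This is a simplified sketch under stated assumptions: the advance_apply function, the awaiting_input state name, and the field labels are hypothetical. Only the applying state and the approvedValues idea come from this document.

```python
def advance_apply(application, detected_fields, profile, approved_values=None):
    """Answer what the profile or the user can supply; pause on the rest."""
    answers = dict(application.get("answers", {}))
    # Values the user approved in the Portal (cf. approvedValues) win.
    answers.update(approved_values or {})
    missing = []
    for label in detected_fields:
        if label in answers:
            continue
        if label in profile:
            answers[label] = profile[label]
        else:
            missing.append(label)
    application["answers"] = answers
    # "awaiting_input" is a hypothetical state name used for the pause.
    application["state"] = "awaiting_input" if missing else "applying"
    return {
        "state": application["state"],
        "needs_input": missing,
        "fill": answers,
    }
```

On the first call, any field the profile cannot answer is returned in needs_input and the workflow pauses. Calling the function again with the user's values clears the list, and the Agent can go back to instructing Tools.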