AI Developer at FM Home Loans, LLC | Findjobs
AI Developer
FM Home Loans, LLC
Location: 🇺🇸 United States
Type: Full-time
Salary: $5.8k–$7.9k
Posted: less than a month ago
Job Description
Job Overview
We are seeking a pragmatic, security-minded AI Developer to design and build an on-premises document analysis and reporting platform for a confidential real estate workflow. Because the documents in scope contain sensitive deal, financial, and legal information, no data may leave our environment. This role is explicitly about building with self-hosted, open-weight models (Gemma, Qwen, Mistral, Llama) served locally through Ollama, rather than commercial LLM APIs. You'll own the full stack: local model serving, RAG pipelines over large mixed-format documents, a FastAPI backend, structured storage, and a hardened Windows deployment. If you are genuinely excited about running powerful models on your own iron and engineering around the tradeoffs that implies, this is the role for you.
Responsibilities
Architect and implement a self-hosted LLM stack using Ollama as the model runtime, selecting and evaluating open-weight models (Gemma, Qwen2.5, Mistral, Llama) against accuracy, latency, and VRAM constraints.
Build a production RAG pipeline (LangChain or LlamaIndex) over a mixed-format document corpus — PDFs including scanned documents requiring OCR, Word, Excel, and images — with chunking, embedding, and retrieval strategies tuned to long, heterogeneous deal documents.
Stand up and operate a local vector database (ChromaDB, Qdrant, or pgvector) alongside PostgreSQL for structured metadata, and MinIO (or equivalent S3-compatible local object storage) for the document store.
Develop a FastAPI backend exposing retrieval, summarization, extraction, and reporting endpoints, with rigorous input validation and audit logging.
Implement authentication and authorization using JWT with role-based access control (RBAC), integrating with Microsoft 365 / Entra ID where appropriate for multi-tenant identity.
Deploy the stack on Windows Server, registering services via NSSM, fronted by Nginx as a reverse proxy with HTTPS termination and sensible security headers.
Design field-extraction and document-classification workflows with confidence scoring and an exception queue, so low-confidence outputs route to human review rather than silently propagating errors.
Establish evaluation harnesses for prompts, retrievers, and end-to-end accuracy, and maintain them as models and documents evolve.
Apply data protection discipline end-to-end: encryption at rest and in transit, least-privilege service accounts, secret management, backup/restore procedures, and clear data-flow documentation suitable for internal and external review.
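To illustrate the confidence-scoring and exception-queue responsibility above, here is a minimal sketch of how extracted fields might be routed. The names (ExtractedField, RoutingResult, route_fields) and the 0.85 threshold are hypothetical choices made for this example, not part of our existing codebase; a real cutoff would be tuned against evaluation data.

```python
from dataclasses import dataclass, field

# Hypothetical cutoff for illustration only; a real value would come from evaluation.
REVIEW_THRESHOLD = 0.85


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # model-reported score in the range 0.0 to 1.0


@dataclass
class RoutingResult:
    accepted: list[ExtractedField] = field(default_factory=list)
    exception_queue: list[ExtractedField] = field(default_factory=list)


def route_fields(fields: list[ExtractedField],
                 threshold: float = REVIEW_THRESHOLD) -> RoutingResult:
    """Accept high-confidence fields; queue everything else for human review."""
    result = RoutingResult()
    for item in fields:
        if item.confidence >= threshold:
            result.accepted.append(item)
        else:
            # Low-confidence output goes to a reviewer instead of propagating silently.
            result.exception_queue.append(item)
    return result
```

In a sketch like this, the important property is that nothing below the threshold reaches downstream reports without a person looking at it first.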
Required Skills
Strong Python, with production experience in FastAPI (or a comparable async Python framework).
Hands-on experience running local LLMs via Ollama, vLLM, llama.cpp, or similar — including quantization tradeoffs, context-window management, and GPU/CPU capacity planning.
Demonstrated work building RAG systems with LangChain or LlamaIndex: chunking strategies, embedding model selection, hybrid retrieval, and reranking.
Practical experience with vector databases (ChromaDB, Qdrant, Weaviate, or pgvector) and with PostgreSQL for relational data.
Document processing: PDF parsing (including scanned PDFs), OCR (Tesseract, PaddleOCR, or commercial equivalents that run locally), and Office format extraction (Word, Excel).
Windows Server administration fundamentals — services via NSSM, PowerShell, IIS or Nginx, certificate management.
Auth patterns: JWT, OAuth2, and RBAC design; familiarity with Microsoft Entra ID / Azure AD integration.
Containerization (Docker) for local development and reproducible deployments.
A security and privacy mindset: comfort talking about threat models, data flows, and what "the data never leaves this box" actually requires in practice.
Nice to Have
Experience with multi-tenant architectures inside a Microsoft 365 / Azure tenant.
Background building internal tools where accuracy, auditability, and human-in-the-loop review matter more than raw throughput.
Familiarity with real estate, lending, legal, or other document-heavy regulated domains.
GPU provisioning and tuning on on-prem hardware.
Prior work replacing or avoiding commercial LLM APIs (OpenAI, Anthropic, Gemini) for compliance or confidentiality reasons.
Pay: $70,000.00