Applied AI & Machine Learning in NYC & Northern NJ
We help businesses in New York City, Northern New Jersey, and across the U.S. apply AI to work that's actually worth automating. Most engagements come down to three things: pulling structure out of unstructured documents, building a chat or search experience that uses your data, or replacing manual judgment in a defined workflow with a measurable model. We will tell you when the trendy answer isn't the right one.
What we actually build.
Document Understanding & Extraction
Contracts, invoices, statements, lab reports, intake forms — turned into structured data your systems can act on. Modern LLM-based extraction with verification, confidence scoring, and human-in-the-loop where it matters.
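The "human-in-the-loop where it matters" part can be sketched as a confidence gate: fields the model extracts with high confidence flow straight through, while uncertain ones are queued for review. The field names, threshold, and scores below are illustrative, not a real client schema.

```python
# Illustrative sketch: route LLM-extracted fields by confidence.
# The 0.85 threshold and field names are assumptions for this example.
REVIEW_THRESHOLD = 0.85

def route_extraction(fields: dict[str, tuple[str, float]]) -> dict:
    """Split extracted fields into auto-accepted values and
    low-confidence ones queued for human review."""
    accepted, needs_review = {}, {}
    for name, (value, confidence) in fields.items():
        if confidence >= REVIEW_THRESHOLD:
            accepted[name] = value
        else:
            needs_review[name] = value
    return {"accepted": accepted, "needs_review": needs_review}

# Example: the invoice total parsed cleanly; the vendor name did not.
result = route_extraction({
    "invoice_total": ("1,240.00", 0.98),
    "vendor_name": ("Acme Hldgs?", 0.62),
})
```

Only the ambiguous field costs a human any time; everything else lands in the downstream system automatically.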
Retrieval & Internal Search (RAG)
Chat or search interfaces that ground their answers in your documents, with citations. For internal knowledge bases, customer support, or domain-specific research workflows.
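The core of grounding with citations is that every retrieved chunk carries an ID the answer can point back to. The sketch below uses naive term overlap so it runs standalone; in a real system the scoring would be embedding similarity (e.g. via pgvector), and the chunk IDs and texts here are made up.

```python
import re

def _tokens(text: str) -> set[str]:
    """Lowercase word tokens, punctuation stripped."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, chunks: dict[str, str], k: int = 2) -> list[tuple[str, str]]:
    """Return the top-k chunks by term overlap with the query.
    Each hit keeps its chunk ID, which becomes the citation."""
    scored = sorted(
        chunks.items(),
        key=lambda item: len(_tokens(query) & _tokens(item[1])),
        reverse=True,
    )
    return scored[:k]

hits = retrieve(
    "what is the refund window",
    {
        "policy-4.2": "Refunds are accepted within a 30 day window of purchase.",
        "faq-1": "Support desk hours: weekdays 9 to 5.",
    },
    k=1,
)
```

The retrieved `(id, text)` pairs are passed to the model with instructions to answer only from them and cite the IDs, which is what makes the answers checkable.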
Agentic Workflows & Automation
Multi-step processes — triage, drafting, follow-up, exception handling — where an LLM orchestrates real actions inside your systems. Built with eval suites and guardrails so behavior stays predictable.
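"Guardrails so behavior stays predictable" can be made concrete with two mechanical limits: an allowlist of tools the model may invoke, and a hard cap on steps. The tool names and the scripted stand-in for the model below are hypothetical.

```python
# Hedged sketch of a guardrailed agent loop. ALLOWED_TOOLS and
# MAX_STEPS are the two guardrails; propose_action stands in for
# the LLM's next-step decision.
ALLOWED_TOOLS = {"lookup_ticket", "draft_reply"}
MAX_STEPS = 5

def run_agent(propose_action, tools: dict) -> list[str]:
    log = []
    for _ in range(MAX_STEPS):          # hard step budget
        action = propose_action(log)
        if action == "done":
            break
        if action not in ALLOWED_TOOLS:  # refuse anything off-allowlist
            log.append(f"blocked:{action}")
            continue
        log.append(tools[action]())
    return log

# Scripted stand-in: look up, attempt a disallowed action, draft, stop.
script = iter(["lookup_ticket", "delete_db", "draft_reply", "done"])
log = run_agent(
    lambda _log: next(script),
    {"lookup_ticket": lambda: "ticket#123", "draft_reply": lambda: "draft-ok"},
)
```

The disallowed call is logged and skipped rather than executed, and the log itself becomes eval input: you can assert on exactly what the agent did.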
Predictive Models & Forecasting
Demand forecasting, churn prediction, propensity models, anomaly detection. Classical ML where it outperforms an LLM — which is more often than the marketing suggests.
The process.
- 01
Problem Framing
We sit with the team doing the work today and figure out what success actually looks like — measurably. If we can't define the metric, we don't take the project.
- 02
Eval Build
Before any model, we build the test set. A few hundred real examples, scored by a domain expert. The eval is what tells us whether the model is good enough — and whether the next iteration is better.
- 03
Iterate
Try the simplest thing that could work. Measure against the eval. Improve the prompt, the retrieval, the model, or the process. Stop when the curve flattens.
- 04
Ship & Monitor
Production deployment with logging, monitoring, and a feedback loop. AI systems drift; we'll tell you when yours is, and what to do about it.
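The loop in steps 02 and 03 reduces to one discipline: score every iteration against the same fixed test set. A minimal sketch, with a stand-in `run_model` and a toy three-example eval:

```python
# Illustrative eval harness: "better" is a number on a fixed test
# set, not an impression from a demo. run_model stands in for
# whatever prompt/model combination is being tested.
def score(run_model, test_set: list[tuple[str, str]]) -> float:
    """Fraction of eval examples where the model output matches
    the domain-expert-approved answer."""
    correct = sum(1 for doc, expected in test_set if run_model(doc) == expected)
    return correct / len(test_set)

# Toy eval set; real ones are a few hundred expert-scored examples.
test_set = [("doc-a", "APPROVE"), ("doc-b", "REJECT"), ("doc-c", "APPROVE")]

baseline = score(lambda doc: "APPROVE", test_set)  # naive first attempt
improved = score(dict(test_set).get, test_set)     # perfect oracle, for illustration
```

An iteration ships only if its score beats the previous one on this same set; when successive scores stop moving, the curve has flattened and the work stops.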
NYC & Northern NJ in person.
AI engagements often involve sensitive data — contracts, customer records, internal documents — and the kickoff conversations work better in person. We're NYC-headquartered and run discovery onsite anywhere in the five boroughs, data review included, under whatever security arrangements you require (NDAs, locked-down environments, on-prem development). For Northern New Jersey clients we schedule discovery and eval-design sessions on the days we're in-state (twice a week). Build and deployment are largely remote afterward. We can also pair onsite with your team during eval design, which is often the highest-leverage time in the whole engagement.
The stack.
- OpenAI
- Anthropic Claude
- Gemini
- Open-Source Models
- LangGraph / Inngest
- pgvector / Pinecone
- Python
- TypeScript
Frontier model APIs (Anthropic, OpenAI, Google) by default; open-source models (Llama, Qwen, DeepSeek) when data residency or cost demands it. RAG infrastructure on Postgres + pgvector for most cases; Pinecone or similar at higher scale. Eval and orchestration in Python or TypeScript depending on your team.
Who we work with.
Firms drowning in documents
Law, insurance, healthcare, financial services. Hours per week spent reading, classifying, and extracting from unstructured text — measurable ROI from automation.
Companies with knowledge in too many places
Internal docs, wikis, Slack history, ticket archives. An internal search or chat experience that uses all of it pays for itself in time saved.
Operators with predictive needs
Demand forecasting for retail and distribution; churn or activation models for SaaS; anomaly detection for operations.
Common questions.
- Should we use OpenAI, Anthropic, or build our own model?
- Almost always start with a frontier API, evaluate honestly, and only consider self-hosted or fine-tuned open-source models if cost, latency, or data residency makes it necessary. We'll write down which it is and why.
- Can our data stay private?
- Yes. Frontier APIs offer enterprise privacy terms (no training on your data). For stricter requirements we use Anthropic on AWS Bedrock, OpenAI via Azure, or self-hosted open-source models in your environment.
- How do we know if it's working?
- We build the eval before we build the model. Every change is measured against the same test set. You see the numbers, not just a demo.
- Will AI replace my employees?
- Not the way the headlines suggest. The work that's actually automatable is the rote document and lookup work that nobody enjoys. The judgment, relationships, and exception-handling stay with people. We design accordingly.
- Can you work onsite in NYC?
- Yes. We're headquartered in NYC, so discovery, data review, and eval-design sessions across the five boroughs are routine. We serve Northern New Jersey clients onsite on the days we're in-state. Build and deployment work is generally remote.
Start a conversation.
Direct reply from the founder. NYC & Northern NJ in person; U.S. clients remotely.
Get in touch →