What proof do you have of ML experience?

The equipment-cluster project is deployed and public: a DINOv2 → UMAP → HDBSCAN → CLIP vision-ML pipeline over 20,000+ images, served through a React viewer. It's a working system, not a notebook.

Which models / providers do you work with?

Provider-agnostic — Claude, OpenAI, and open models via PyTorch / Hugging Face. Embeddings and vision via open_clip, DINOv2, and the usual ecosystem. We pick based on the task and your constraints.

Can you build a RAG system end to end?

Yes — ingestion, chunking, embedding, retrieval, re-ranking, generation, and the evals to know it's actually working. Vector store can be pgvector, FAISS, or a hosted option.
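As a rough illustration of the chunking, embedding, and retrieval stages named above, here is a minimal standard-library sketch. The function names (`chunk`, `embed`, `retrieve`) are hypothetical, and `embed` is a bag-of-words stand-in rather than a learned embedding model; in a real system that call would go to an embedding model and the ranking would typically be delegated to a vector store such as pgvector or FAISS.

```python
import math
import re
from collections import Counter


def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (one simple chunking strategy)."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    # Stop once the remaining words are already covered by the previous window.
    return [" ".join(words[i:i + size]) for i in range(0, max(len(words) - overlap, 1), step)]


def embed(text: str) -> Counter:
    """Stand-in embedding (assumption): token counts, NOT a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

Re-ranking and generation would sit after `retrieve`; they are left out here because they depend on the chosen model and provider.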

Will you tell me if AI is the wrong choice?

Yes. Part of the value is honesty about when a simpler classifier, a SQL query, or no model at all is the better answer. We won't sell you an LLM you don't need.

How is it billed?

Corp-to-corp through Levelbrook LLC: fixed-scope for a defined feature, hourly for ongoing AI work. MSA / SOW / NDA / COI ready on day one.

Python AI & LLM Engineering — RAG, Agents, ML Tooling

Python AI & LLM engineering.

An LLM feature that works in the demo is easy; one that's reliable in production is the actual job. Levelbrook builds RAG pipelines, agents, embeddings, and vector search, plus the evals, retries, and guardrails that make them dependable. The deployed equipment-cluster vision-ML pipeline is proof of the applied-ML side.

RAG & agents
Embeddings · vector search
PyTorch · CLIP · DINOv2
Evals & guardrails

Applied AI with the unglamorous reliability plumbing — not a one-prompt demo.

equipment-cluster

A visual history pipeline for heavy-equipment rental: 20,000+ photos across 40 machines, embedded with DINOv2, reduced with UMAP, clustered with HDBSCAN, and auto-labelled zero-shot with CLIP — then served through a React viewer. A working, deployed Python ML system, not a notebook.

PyTorch · DINOv2
UMAP
scikit-learn HDBSCAN
open_clip
React viewer

    # 20k+ rental photos -> a browsable visual
    # history, clustered by machine & viewpoint.
    import torch, umap
    from sklearn.cluster import HDBSCAN

    feats = dinov2.encode(photos)  # ViT-B/14
    coords = umap.UMAP(n_neighbors=15).fit_transform(feats)
    labels = HDBSCAN(min_cluster_size=12).fit_predict(coords)

    # zero-shot names for each cluster via CLIP
    names = clip.zero_shot(centroids(coords, labels), prompts=EQUIPMENT_VIEWS)
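The cluster-naming step in the snippet above can be sketched in plain Python. This is an illustration under stated assumptions, not the project's actual code: the helper names are hypothetical, the vectors are tiny hand-made lists, and it assumes image and prompt vectors live in one shared embedding space, which is what a CLIP-style model provides. The one library fact it relies on is that HDBSCAN marks noise points with the label -1.

```python
import math
from collections import defaultdict


def centroids(vectors, labels):
    """Mean vector per cluster; label -1 (HDBSCAN noise) is skipped."""
    groups = defaultdict(list)
    for vec, label in zip(vectors, labels):
        if label != -1:
            groups[label].append(vec)
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in groups.items()
    }


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.hypot(*a) * math.hypot(*b)
    return dot / norm if norm else 0.0


def name_clusters(cluster_centroids, prompt_vectors):
    """Give each cluster the prompt whose vector is most similar to its centroid."""
    return {
        label: max(prompt_vectors, key=lambda p: cosine(center, prompt_vectors[p]))
        for label, center in cluster_centroids.items()
    }
```

In practice the prompt vectors would come from a text encoder and the centroids from image embeddings; only the nearest-prompt logic is shown here.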

Applied AI, with the boring parts done

The gap in most AI and LLM work isn't the model — it's everything around it: chunking and retrieval that actually returns the right context, evaluation so you know when a change made things worse, retries and fallbacks for when an API hiccups, and guardrails so the feature behaves. Levelbrook builds Python AI/LLM tooling with that reliability plumbing as a first-class concern, not an afterthought.
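To make "retries and fallbacks" concrete, here is a minimal sketch of the pattern, assuming the caller wraps provider failures in a retryable exception. `TransientError`, `call_with_retries`, and the `primary` / `fallback` callables are hypothetical names for illustration, not part of any provider SDK.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for retryable failures (timeouts, rate limits)."""


def call_with_retries(
    primary: Callable[[], T],
    fallback: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try `primary` up to `attempts` times, then return `fallback()`."""
    for attempt in range(attempts):
        try:
            return primary()
        except TransientError:
            if attempt == attempts - 1:
                break  # out of retries; do not sleep before falling back
            # Exponential backoff (base_delay, 2x, 4x, ...) plus up to 10% jitter.
            sleep(base_delay * (2 ** attempt) * (1 + random.random() * 0.1))
    return fallback()
```

The `sleep` parameter is injectable so tests can run without waiting; the fallback might be a second provider, a cached answer, or a plain "try again later" response, depending on the feature.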

The applied-ML credibility is deployed and public: equipment-cluster (above) embeds 20,000+ images with DINOv2, reduces with UMAP, clusters with HDBSCAN, and applies CLIP zero-shot labels — a real vision-ML pipeline wired into a usable viewer. The same engineering discipline carries into LLM and RAG work.

What we build

RAG pipelines — ingestion, chunking, embedding, hybrid retrieval, and re-ranking that returns context worth generating from.
Agents & tool use — multi-step agent workflows with tool calling, scoped carefully so they're predictable.
Embeddings & vector search — embedding pipelines and vector stores (pgvector, FAISS, hosted) for semantic search and similarity.
Evals & observability — evaluation harnesses and tracing so quality is measured, not guessed.
Applied vision / ML — wiring models like CLIP and DINOv2 into a pipeline that ships, as in equipment-cluster.
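For the evals item in the list above, one of the simplest useful measurements is a retrieval hit rate over a small labelled set. The sketch below is illustrative only: `hit_rate_at_k` and its retriever signature (query and k in, document ids out) are assumptions, and a real harness would track more than one metric.

```python
from typing import Callable, Sequence


def hit_rate_at_k(
    retriever: Callable[[str, int], Sequence[str]],
    cases: Sequence[tuple[str, str]],
    k: int = 3,
) -> float:
    """Fraction of (query, expected_id) cases whose expected id is in the top k."""
    if not cases:
        raise ValueError("need at least one labelled case")
    hits = sum(1 for query, expected_id in cases if expected_id in retriever(query, k))
    return hits / len(cases)
```

Run before and after a change to chunking or retrieval, a number like this is what turns "it feels better" into a regression check.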

Honest about the hype

We'll tell you when an LLM is the wrong tool, when a simpler classifier or a SQL query beats a model, and when a feature isn't ready to ship. Billed corp-to-corp through Levelbrook LLC, scoped as a project or run as ongoing staff augmentation.

From prompt to production.

RAG pipelines

Agents & tool use

Embeddings & search

Evals & vision ML

AI / LLM work, answered.

What proof do you have of ML experience?

Which models / providers do you work with?

Can you build a RAG system end to end?

Will you tell me if AI is the wrong choice?

How is it billed?

Have an AI or LLM feature to ship? Let's make it reliable.