May 1, 2026

ConnorLLM

An AI runtime reliability platform — benchmarking, validation, and observability for production LLM systems.

Most teams shipping LLM agents, RAG pipelines, and tool-calling workflows still can't answer: Did the new model silently break JSON output? Why did latency spike under load? Did hallucinations or structured-output failures regress after a prompt change?

ConnorLLM brings reliability engineering to AI in production: multi-provider benchmarking (via OpenRouter), runtime stress tests (retries, timeouts, fallbacks, queue pressure), structured output validation, regression detection across model versions, and hallucination evaluation.
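
As a rough illustration of the structured output validation mentioned above, here is a minimal Go sketch that parses a model reply strictly and rejects malformed or incomplete JSON. The `ToolCall` schema and the `ValidateToolCall` function are hypothetical names chosen for this example; they are not ConnorLLM's actual API.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"strings"
)

// ToolCall is a hypothetical schema the model is asked to emit as JSON.
type ToolCall struct {
	Name      string            `json:"name"`
	Arguments map[string]string `json:"arguments"`
}

// ValidateToolCall decodes raw model output strictly: unknown fields,
// trailing content, and a missing name are all treated as failures.
func ValidateToolCall(raw string) (ToolCall, error) {
	var tc ToolCall
	dec := json.NewDecoder(strings.NewReader(raw))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&tc); err != nil {
		return tc, fmt.Errorf("invalid JSON output: %w", err)
	}
	// Anything other than end-of-input after the first value is an error.
	if _, err := dec.Token(); err != io.EOF {
		return tc, errors.New("trailing data after JSON object")
	}
	if tc.Name == "" {
		return tc, errors.New("missing required field: name")
	}
	return tc, nil
}

func main() {
	samples := []string{
		`{"name":"search","arguments":{"q":"go"}}`,
		`{"name":"search"} Sure, here you go!`,
		`{"arguments":{}}`,
	}
	for _, s := range samples {
		_, err := ValidateToolCall(s)
		fmt.Printf("%-45s valid=%v\n", s, err == nil)
	}
}
```

Tracking the failure rate of a check like this per model version is one simple way to notice a silent JSON regression after a model or prompt change.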

The stack splits runtime orchestration (Go, Fiber, OpenTelemetry) from evaluation intelligence (Python, FastAPI, sentence-transformers) — so execution stays observable while semantic scoring stays flexible.

It's not a chatbot or a prompt wrapper. It's infrastructure for operating AI systems reliably: TTFT, p95/p99 latency, fallback rates, cost per request, and quality gates before deploy.
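
To make the p95/p99 and quality-gate idea concrete, here is a small sketch in Go. It uses the nearest-rank percentile definition, which is one common choice and an assumption here; the function names and the sample data are illustrative only.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Percentile returns the nearest-rank p-th percentile (0 < p <= 100).
// It returns 0 for an empty slice and does not modify the input.
func Percentile(samples []float64, p float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	if rank < 1 {
		rank = 1
	}
	if rank > len(sorted) {
		rank = len(sorted)
	}
	return sorted[rank-1]
}

// PassesGate reports whether p95 latency stays within budgetMs.
// Note that an empty sample set passes trivially in this sketch.
func PassesGate(latenciesMs []float64, budgetMs float64) bool {
	return Percentile(latenciesMs, 95) <= budgetMs
}

func main() {
	// Hypothetical latencies in milliseconds: 1, 2, ..., 100.
	var lat []float64
	for i := 1; i <= 100; i++ {
		lat = append(lat, float64(i))
	}
	fmt.Println("p95:", Percentile(lat, 95))
	fmt.Println("p99:", Percentile(lat, 99))
	fmt.Println("gate at 90ms:", PassesGate(lat, 90))
}
```

A deploy gate built this way fails the release when the benchmark run's p95 exceeds the agreed budget.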

Long term: an AI Reliability Engineering platform — tracing, replay, benchmark infra, and production validation pipelines alongside tools like Langfuse, Promptfoo, and vLLM — with a sharper focus on runtime reliability and deterministic benchmarking.

AI Infrastructure, Runtime Reliability, Observability, Benchmarking, OpenTelemetry, Go, Python, Structured Output, Regression Testing, Locust
April 1, 2026

InferGate

LLM inference control plane focused on bounded cost, latency, and reliability across stochastic providers.

Production LLM systems have outgrown the traditional API abstraction. Workloads span heterogeneous providers with non-deterministic latency, unstable throughput, variable token pricing, and opaque runtime behavior. InferGate introduces a control-plane layer for operating inference under explicit constraints.

InferGate runs a closed operational loop: ingress normalization, budget estimation and enforcement, adaptive routing, bounded execution, retry and fallback policies, and telemetry aggregation. Each request carries a deadline, budget, context, and trace identity: deterministic orchestration over probabilistic models. Admission control, token-bucket budgets, worker pools with backpressure, adaptive routing, cooperative cancellation, circuit breakers, and tracing all feed POST /infer: constraints in, decision path out.
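
The admission step can be sketched with a token bucket and a context check. This is a minimal illustration under assumptions, not InferGate's implementation: the type and function names, the capacity, and the refill rate are all made up for the example, and the clock is passed in explicitly so the behavior is deterministic.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"math"
	"time"
)

// TokenBucket is a budget that refills at a steady rate up to a fixed capacity.
// It is not safe for concurrent use; a real service would guard it with a mutex.
type TokenBucket struct {
	capacity, tokens, refillPerSec float64
	last                           time.Time
}

// NewTokenBucket returns a full bucket.
func NewTokenBucket(capacity, refillPerSec float64, now time.Time) *TokenBucket {
	return &TokenBucket{capacity: capacity, tokens: capacity, refillPerSec: refillPerSec, last: now}
}

// Take refills the bucket for the elapsed time, then spends cost if available.
func (b *TokenBucket) Take(cost float64, now time.Time) bool {
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens = math.Min(b.capacity, b.tokens+elapsed*b.refillPerSec)
		b.last = now
	}
	if cost > b.tokens {
		return false
	}
	b.tokens -= cost
	return true
}

// ErrOverBudget is returned when the estimated cost does not fit the budget.
var ErrOverBudget = errors.New("over budget")

// Admit rejects a request whose context is already done (deadline passed or
// canceled) or whose estimated cost exceeds the remaining budget.
func Admit(ctx context.Context, b *TokenBucket, cost float64, now time.Time) error {
	if err := ctx.Err(); err != nil {
		return err
	}
	if !b.Take(cost, now) {
		return ErrOverBudget
	}
	return nil
}

// demo drives the bucket with a fixed clock so the outcome is deterministic.
func demo() []bool {
	start := time.Unix(0, 0)
	b := NewTokenBucket(100, 10, start) // 100 tokens, refilling 10 per second
	ctx := context.Background()

	first := Admit(ctx, b, 80, start) == nil                     // 100 -> 20 tokens
	second := Admit(ctx, b, 50, start) == nil                    // only 20 left: rejected
	third := Admit(ctx, b, 50, start.Add(3*time.Second)) == nil  // refilled to 50: admitted
	return []bool{first, second, third}
}

func main() {
	fmt.Println(demo())
}
```

Checking the context before spending budget means a request that has already missed its deadline never consumes tokens.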

Routing optimizes on runtime feedback—p95/p99, TTFT, tokens/sec, retry rate, saturation, cost per request, provider reliability—not offline RLHF. The project separates model-quality work from runtime reliability and infrastructure orchestration: it governs how inference is operated in production, not how weights are trained.
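
One way to picture feedback-driven routing is a weighted score over live provider statistics. The sketch below is an assumption about how such a policy could look, not InferGate's actual routing logic; the provider names, fields, and weights are hypothetical.

```go
package main

import "fmt"

// ProviderStats holds runtime feedback for one provider (hypothetical fields).
type ProviderStats struct {
	Name           string
	P95Ms          float64 // observed p95 latency in milliseconds
	CostPerRequest float64 // average cost per request in USD
	ErrorRate      float64 // fraction of failed requests, 0..1
	Saturated      bool    // true when the provider should not receive traffic
}

// Weights controls how much each signal contributes to the score.
type Weights struct {
	Latency, Cost, Errors float64
}

// score combines the signals; lower is better.
func score(s ProviderStats, w Weights) float64 {
	return w.Latency*s.P95Ms + w.Cost*s.CostPerRequest + w.Errors*s.ErrorRate
}

// Pick returns the lowest-scoring unsaturated provider.
// The boolean is false when no provider is eligible.
func Pick(stats []ProviderStats, w Weights) (ProviderStats, bool) {
	var best ProviderStats
	found := false
	for _, s := range stats {
		if s.Saturated {
			continue
		}
		if !found || score(s, w) < score(best, w) {
			best, found = s, true
		}
	}
	return best, found
}

// sampleStats returns made-up feedback for three hypothetical providers.
func sampleStats() []ProviderStats {
	return []ProviderStats{
		{Name: "provider-a", P95Ms: 800, CostPerRequest: 0.002, ErrorRate: 0.01},
		{Name: "provider-b", P95Ms: 400, CostPerRequest: 0.004, ErrorRate: 0.20},
		{Name: "provider-c", P95Ms: 100, CostPerRequest: 0.001, Saturated: true},
	}
}

func main() {
	balanced := Weights{Latency: 0.001, Cost: 100, Errors: 5}
	if p, ok := Pick(sampleStats(), balanced); ok {
		fmt.Println("balanced weights pick:", p.Name)
	}
	if p, ok := Pick(sampleStats(), Weights{Latency: 1}); ok {
		fmt.Println("latency-only weights pick:", p.Name)
	}
}
```

Because the inputs are runtime measurements rather than model-quality scores, the same policy works regardless of which models sit behind each provider.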

AI Infrastructure, LLM Serving, Control Plane, Go, OpenTelemetry, Distributed Systems, Observability, Reliability
May 12, 2025

Noodl

Started with curiosity. Became a tool for thinking out loud.

NOODL started as a playful experiment — a personal dive into React Flow, Supabase, and Next.js to explore the creative power of graphs and visual thinking.

I didn't plan to build a full platform. I just wanted to see what would happen if I mixed mind-mapping, AI, and a bit of flow-based design — no pressure, just curiosity.

It quickly turned into NOODL: an AI-augmented mind-mapping tool where ideas grow visually, nodes think with you, and collaboration feels natural.

Inspired by tools like Miro and XMind, it's built for creatives, designers, and teams who love structure without rigidity.

Still evolving. Still learning. Still noodling.

Mind-Mapping, Node-based Thinking, Context-Aware, Visual Thinking, Collaboration, React Flow, Supabase, Design Engineering, LLM, OpenAI
March 1, 2025

Gomon

Gomon started as a simple experiment:

I wanted to understand how to monitor Go applications in a clean, efficient, and self-hosted way — without relying on heavy external services.

So I built Gomon, a lightweight monitoring tool that exposes key metrics (latency, memory, goroutines, error rates) through HTTP endpoints, with native Prometheus support and built-in pprof profiling.
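
To give a feel for the approach, here is a minimal sketch of a self-hosted runtime stats endpoint using only the Go standard library. It is not Gomon's code: the `Snapshot` type, the handler name, and the `/debug/stats` path are assumptions for illustration, and the Prometheus export is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	_ "net/http/pprof" // registers /debug/pprof handlers on http.DefaultServeMux
	"runtime"
)

// Snapshot is a small, hypothetical set of runtime metrics.
type Snapshot struct {
	Goroutines     int    `json:"goroutines"`
	HeapAllocBytes uint64 `json:"heap_alloc_bytes"`
	NumGC          uint32 `json:"num_gc"`
}

// TakeSnapshot reads the current goroutine count and memory statistics.
func TakeSnapshot() Snapshot {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return Snapshot{
		Goroutines:     runtime.NumGoroutine(),
		HeapAllocBytes: m.HeapAlloc,
		NumGC:          m.NumGC,
	}
}

// StatsHandler serves the snapshot as JSON.
func StatsHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	if err := json.NewEncoder(w).Encode(TakeSnapshot()); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
	}
}

// probe calls the handler in-process, without opening a network port.
func probe() (int, Snapshot, error) {
	rec := httptest.NewRecorder()
	StatsHandler(rec, httptest.NewRequest(http.MethodGet, "/debug/stats", nil))
	var s Snapshot
	err := json.NewDecoder(rec.Body).Decode(&s)
	return rec.Code, s, err
}

func main() {
	code, s, err := probe()
	fmt.Println("status:", code, "goroutines:", s.Goroutines, "err:", err)
	// To serve it for real, register the handler and listen, for example:
	//   http.HandleFunc("/debug/stats", StatsHandler)
	//   http.ListenAndServe("localhost:6060", nil)
}
```

With the blank `net/http/pprof` import in place, serving `http.DefaultServeMux` also exposes the standard profiling endpoints under `/debug/pprof/`.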

It's minimal, self-contained, and made to help developers gain visibility into their Go apps, fast.

No dashboards. No fluff. Just observability where it matters.

Monitoring, Go, Prometheus, pprof, Self-hosted, Lightweight, Observability, Metrics
February 11, 2025

TerraLambda

TerraLambda started as a hands-on project to explore the world of cloud infrastructure:

I wanted to learn how to work with AWS, write real-world tools in Go, and understand how Terraform and the Cobra CLI could power developer workflows.

What began as a technical curiosity turned into TerraLambda: a lightweight tool to deploy and manage AWS Lambda functions using Go + Terraform, designed to make serverless deployments smoother, more consistent, and easier to automate.

Cloud, Lambda, Serverless, AWS, IAM, Terraform, Orchestration, Deployment
January 9, 2025

Rayon

Rayon began as the final-year project of a designer friend, and as a way for me to dive into speech recognition, audio processing, and natural language understanding through a unified system.

I wanted to understand how to capture voice, transcribe it, analyze the text, and enrich the context with location-based services.

So I built Rayon: a modular platform that records audio, performs speech-to-text, applies NLP for meaning extraction, and uses geocoding to ground interactions in the real world.

It's still experimental, but through it I learned a lot about voice pipelines, context-aware systems, and what it takes to craft intuitive, voice-first experiences.

Speech Recognition, Audio Processing, Noise Reduction, Text Analysis, Location-Based Services, Whisper, NER, dslim/bert-base-NER