NeoCraft Blog

Practical insights on GPU systems, vector search, and enterprise GenAI architecture from Manish Kumar.

AI in Production

Shipping GenAI Features Without Breaking Core Systems

Integration patterns for reliability, governance, and graceful fallback when bringing GenAI into enterprise software.

Data Infrastructure

Scaling Vector Search Beyond a Billion Points

Design principles for shard strategy, hot/cold tiers, and recall-preserving performance at very large scale.

Engineering

How GPU-First Architecture Cuts Model Latency

A practical guide to reducing inference latency through memory-aware design, dynamic batching, and end-to-end profiling.
