Biraj

i sometimes build things from scratch to understand how they work, write blogs, play chess (1900 rapid on lichess), badminton, and read books.

biraj.pub@gmail.com

github.com/biraj21

x.com/biraj21_

lichess.org/@/biraj21

linkedin.com/in/biraj21

work experience


Outspeed sept 2024 - present
- founding engineer sept 2024
  realtime voice AI at Outspeed.
  - developed a WebRTC-based real-time voice AI agent system. used a message-driven Actor Model architecture for the concurrent units (threads) in the code.
  - designed the system to be compatible with OpenAI Realtime API specifications. also built the React SDK to simplify integration for developers.
  - added default system tools to our voice AI infra that enable agents to automatically determine when to skip turns or end conversations.
  - implemented a Voice Activity Detection system using Silero VAD. also implemented semantic VAD, which makes the model less likely to interrupt the user.
  - built real-time log monitoring infrastructure that ships application logs from cloud to Grafana Loki and streams them to frontend via SSE (server-sent events).
  - automated cloud infrastructure deployment by creating Terraform configurations for Nomad cluster on AWS, managing EC2 instances, S3 buckets, Redis ElastiCache, ECR, etc.
  - deployed a text-to-speech model to production.
LaneSquare Technology Pvt. Ltd. feb 2023 - sept 2024
- software developer sept 2023 - sept 2024
  built thinkstack.ai, a platform where businesses can create AI chatbots to automate customer support.
  - implemented Retrieval Augmented Generation (RAG) to enable the agent to answer questions using users' knowledge bases.
  - built a user intent detection system using tool calls that automatically identifies appropriate actions (collecting user details, creating tickets, human handoff) based on user queries, enabling more intelligent conversation routing.
  - developed Thinkstack's full-stack app (app.thinkstack.ai). worked using Node.js, GraphQL, Flask (Python), Redis, Next.js (SSG), Tailwind CSS and AWS (Cognito, S3, API Gateway etc).
  - created a real-time event notification system using pub/sub to send backend events (like website crawling completion) to the frontend, which uses a React context provider i wrote to consume these events and trigger UI updates.
  - built sub-users, roles and permissions management system for the app - both backend and frontend.
  - created Zapier, Slack, WhatsApp, Facebook Messenger and Instagram integrations for Thinkstack. also built its
  - built API key management system for authentication. users can use their API keys to authenticate themselves on Zapier to use Thinkstack integration.
- software developer intern feb 2023 - aug 2023
  migrated Pickcel's backend codebase from Node.js 12 to Node.js 18. also learnt Docker.
  - worked with TypeScript, Node.js, and Mongoose to update and optimize existing REST APIs by making some things concurrent. set up integration testing CI using GitHub Actions.
  - optimized the Dockerfile for our on-premise setup by properly reordering layers, reducing subsequent build times and improving cache utilization. also added Redis to the stack.
  - introduced semantic git commits, coding conventions, and structure, which are now embraced by all developers in the organization.

skills

programming languages: JavaScript, TypeScript, Python, Go, C, Bash
libraries and frameworks: React, Node.js, Express.js, FastAPI, GraphQL
tools and infra: Docker, AWS, Terraform, Firebase, Git, Linux, GitHub Actions
database: MongoDB, SQL, Redis, Firebase Firestore

projects

DotDB: in-browser vector database

a lightweight vector database that runs entirely in the browser. supports cosine similarity search, persistent storage via IndexedDB, and batch operations. uses transformers.js with bge-small-en-v1.5 for generating embeddings. built to understand how vector databases work under the hood - implements brute-force k-NN search with Float32Array for memory efficiency.
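
a minimal sketch of the brute-force search idea, written in Python for brevity rather than the browser JavaScript the project uses. `cosine_similarity`, `top_k` and the sample vectors are made up for illustration and are not DotDB's API; `array("f")` only stands in for Float32Array.

```python
import math
from array import array  # array("f") stores 32-bit floats, loosely like Float32Array


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: a zero vector is similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k):
    """Brute-force k-NN: score every stored vector, keep the k best."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(key, score) for score, key in scored[:k]]


# hypothetical 2-dimensional "embeddings"
store = {
    "cat": array("f", [1.0, 0.0]),
    "kitten": array("f", [0.9, 0.1]),
    "car": array("f", [0.0, 1.0]),
}
results = top_k(array("f", [1.0, 0.0]), store, k=2)
```

every query scans the whole store, so the cost grows linearly with the number of vectors; that is the trade-off brute force makes for exact results and no index to maintain.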

LLM serving engine from scratch

a toy LLM serving engine built from scratch with batched inference and token streaming. implements auto-regressive generation loop, request queueing, and independent sequence completion. supports both streaming and non-streaming responses via FastAPI. built this to understand model serving. work in progress.
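
a rough sketch of what a batched generation loop with request queueing and independent sequence completion can look like. this is not the engine's actual code: `fake_model_step`, `serve`, `EOS` and the countdown "model" are hypothetical stand-ins so the loop runs without a real model.

```python
from collections import deque

EOS = -1  # hypothetical end-of-sequence token id


def fake_model_step(batch):
    """Stand-in for one batched forward pass: one next token per sequence.

    Each sequence counts down from its last token and emits EOS after 1.
    """
    return [seq[-1] - 1 if seq[-1] > 1 else EOS for seq in batch]


def serve(prompts, max_batch_size=2):
    queue = deque(enumerate(prompts))  # waiting requests: (request id, prompt tokens)
    active = {}                        # request id -> prompt + generated tokens
    finished = {}                      # request id -> generated tokens only

    while queue or active:
        # fill free batch slots from the queue
        while queue and len(active) < max_batch_size:
            req_id, prompt = queue.popleft()
            active[req_id] = list(prompt)

        ids = list(active)
        next_tokens = fake_model_step([active[i] for i in ids])

        for req_id, token in zip(ids, next_tokens):
            if token == EOS:
                # this sequence is done; the rest of the batch keeps going
                prompt_len = len(prompts[req_id])
                finished[req_id] = active.pop(req_id)[prompt_len:]
            else:
                # a streaming server would send this token to the client here
                active[req_id].append(token)

    return finished


out = serve([[3], [1], [2]])
```

the second request finishes after the first step, which frees its slot so the third request can join the batch while the first is still generating.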

Ask My AI: Chrome extension

a Chrome extension that brings AI assistants (ChatGPT, Claude, Gemini, Perplexity) into a sidebar. highlight any text and press Cmd/Ctrl+Shift+E to get instant explanations without tab-switching. hit 50+ users and 16+ five-star ratings within days of launch. built it because existing extensions charged money just to wrap AI APIs in a new UI, which seemed pointless when users already pay for ChatGPT/Claude/Gemini subscriptions.
| chrome web store

Obsy: AI observability platform

Obsy is an AI observability tool that provides insights into AI operations. it has a Node.js SDK that automatically instruments OpenAI, Pinecone and Vercel's AI SDK. the dashboard shows LLM traces, with timing and usage metrics for each stage. built this as a hackathon project in 15 hours.

tensor visualizer

a tool that helps you visualize multi-dimensional arrays. uses HTML canvas for rendering the image. i wrote this when i was trying to understand neural networks and tensors in pytorch.

focker: Linux containers

Focker is a toy container runtime written in Go. it is based on Ubuntu 22.04 rootfs. this was an attempt to understand how Docker works internally.

NumPy from scratch in Go

i was implementing Neural Networks from scratch in Go and as a byproduct of it, i created my own version of NumPy for tensor operations like +, -, *, /, matrix multiplication, transpose etc. it also supports broadcasting.
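
for reference, the NumPy-style broadcasting rule can be sketched as a small shape check. this is Python rather than the project's Go, and `broadcast_shape` is a hypothetical helper, not a function from the library.

```python
def broadcast_shape(a, b):
    """Return the broadcast shape of two shapes, or raise ValueError."""
    result = []
    # compare dimensions from the trailing end; a missing dimension counts as 1
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        if da != db and da != 1 and db != 1:
            raise ValueError(f"shapes {a} and {b} are not broadcastable")
        # a dimension of size 1 stretches to match the other one
        result.append(db if da == 1 else da)
    return tuple(reversed(result))


print(broadcast_shape((3, 1), (4,)))  # (3, 4)
```

two dimensions are compatible when they are equal or one of them is 1, which is why adding a `(3, 1)` column to a `(4,)` row gives a `(3, 4)` result.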

AI search

an AI search tool (like Perplexity) that pulls data from the Google Custom Search JSON API and uses Llama 3 to generate answers with citations.

TCP server

a single-threaded TCP server written in C with an event loop using the poll() system call, without any third-party library. the server listens on a port and echoes back received data. i've also written a simple client to test the server.

gomon: nodemon for Go files

a CLI tool to run Go programs in watch mode. to watch for changes, it uses the kqueue() system call on macOS and inotify on Linux. no 3rd-party dependencies.

brainfuck interpreter

a Brainfuck interpreter in C that combines repeated instructions, like +++++ +++++ into { '+' : 10 }, speeding up execution. it can also transpile Brainfuck code to C.
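
a small sketch of the instruction-combining step, in Python instead of C. `compress` is a hypothetical name, and merging only `+ - < >` is an assumption about which instructions are worth combining.

```python
def compress(source):
    """Collapse runs of repeated Brainfuck instructions into (op, count) pairs."""
    repeatable = "+-<>"  # assumption: only these are merged
    ops = []
    for ch in source:
        if ch not in "+-<>.,[]":
            continue  # any other character is a comment in Brainfuck
        if ch in repeatable and ops and ops[-1][0] == ch:
            ops[-1] = (ch, ops[-1][1] + 1)  # extend the current run
        else:
            ops.append((ch, 1))
    return ops


print(compress("+++++ +++++"))  # [('+', 10)]
```

an interpreter can then execute `('+', 10)` as one addition of 10 instead of ten separate increments.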

web wanderer

a multi-threaded web crawler written in Python. uses concurrent.futures.ThreadPoolExecutor and Playwright to crawl and download web pages. it handles dynamically rendered websites, making it capable of extracting content from sites written in React, Vue, etc.

JSON parser

a JSON parser written in Go from scratch. prints error messages with line and column numbers. Phil Eaton's blog post helped me write the lexer and parser. it was fun.
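
a simplified sketch of how a lexer can track line and column for error messages. it is in Python rather than Go, it is not the parser's actual code, and `tokenize` and `JSONLexError` are hypothetical names. it only covers punctuation, strings without escapes, integers and the literals true/false/null.

```python
class JSONLexError(Exception):
    pass


def tokenize(text):
    """Split JSON text into (kind, value, line, column) tokens."""
    tokens = []
    line, col, i = 1, 1, 0
    while i < len(text):
        ch = text[i]
        if ch == "\n":
            line, col, i = line + 1, 1, i + 1  # new line: reset the column
        elif ch in " \t\r":
            col, i = col + 1, i + 1
        elif ch in "{}[]:,":
            tokens.append(("punct", ch, line, col))
            col, i = col + 1, i + 1
        elif ch == '"':
            end = text.find('"', i + 1)  # simplification: no escape handling
            if end == -1:
                raise JSONLexError(f"unterminated string at line {line}, column {col}")
            tokens.append(("string", text[i + 1:end], line, col))
            col, i = col + (end - i + 1), end + 1
        else:
            # read a bare word: a literal or an integer
            j = i
            while j < len(text) and (text[j].isalnum() or text[j] == "-"):
                j += 1
            word = text[i:j]
            digits = word[1:] if word.startswith("-") else word
            if word in ("true", "false", "null"):
                tokens.append(("literal", word, line, col))
            elif digits.isdecimal():
                tokens.append(("number", int(word), line, col))
            else:
                raise JSONLexError(f"unexpected {ch!r} at line {line}, column {col}")
            col, i = col + (j - i), j
    return tokens
```

for example, `tokenize('{"a": 1,\n "b": tru}')` raises an error pointing at line 2, column 7, where the misspelled literal starts.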

texterm

a minimal, nano-like text editor written in C from scratch. i followed the Build Your Own Text Editor article. learnt a lot about C and Linux terminals during this.

findREp

a GUI tool written with Tkinter in Python to find and replace all the matches of a regular expression in files and folders (recursively). i wrote this because i wasn't aware of VS Code's search functionality at the time.