Notebooks — Arshad Kazi

Time-boxed Inquiry

The why

Transfer learning works in brains too. Neuroscience research shows that learning across domains strengthens abstract reasoning. The prefrontal cortex builds schemas: mental frameworks that compress patterns from one domain and apply them to another. The more domains you feed it, the richer the schemas. Convolutions in V1 and convolutions in CNNs are the same idea discovered independently. Attention in the parietal cortex and attention in transformers solve the same resource allocation problem. These are not metaphors. They are convergent solutions.

Leonardo da Vinci studied anatomy to paint better and painted to understand anatomy better. Feynman learned to pick locks to understand information theory. Ramanujan found number theory in temple floor patterns. Cross-domain learning is not a distraction from depth. It is how depth actually works.

The practice

Spaced repetition research shows that retrieval under time pressure consolidates memory faster than passive review. The hippocampus encodes better when the learning session has a clear boundary. One question, one week, one hour per day. The constraint forces active recall. You cannot passively read for sixty minutes and pretend you understood it. You have to write what you know, find the gaps, and come back the next day to fill them.

I am a Leonardo fanboy. Not because I think I am smart like him, because I am really not, but because the man had no formal education past age 14 and still taught himself anatomy, optics, geology, and engineering from whatever books and mentors he could get his hands on. That is the only way I have ever been able to learn anything either. He kept notebooks his entire life: thousands of pages, anatomy next to fluid dynamics next to machine designs, organized not by subject but by date. He called himself "discepolo della esperienza", which means disciple of experience. I think the notebook was never documentation for him. It was the actual thinking tool, the place where looking at something carefully enough turns into understanding it.

Running 1

0 entries · 6 months · whenever it happens

SLAM from Scratch

Building a full SLAM pipeline from scratch. Camera models, feature extraction, matching, motion estimation, mapping, loop closure, graph optimization. Theory and code, no shortcuts.

computer vision · robotics · geometry

Started May 13, 2026

Completed

7 days · 1 hr / day

Overview of Modern Nets

Revisiting transformers, tokenizers, attention, and the GPT family. Intuitive understanding for interviews and curiosity.

Everything builds on the same transformer attention mechanism. From BPE tokenization to multi-head attention to the full GPT family. The ecosystem on top (RAG, LangChain, agents) is plumbing to make it useful. Still want to go deeper on RoPE, SwiGLU, the chain of thought paper, and diffusion models.
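The attention mechanism that everything here builds on can be sketched in a few lines. This is a minimal single-head illustration in plain Python, not code from the notebook: the function names `softmax` and `attention` and the `causal` flag are assumptions made for this example, and real implementations use batched tensor operations and learned projection matrices for Q, K, and V.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V, causal=False):
    """Scaled dot-product attention on lists of row vectors.

    Q and K are n x d lists, V is an n x d_v list. With causal=True,
    position i may only attend to positions <= i (GPT-style masking).
    Returns (outputs, attention_weights).
    """
    d = len(K[0])
    outputs, weights = [], []
    for i, q in enumerate(Q):
        scores = []
        for j, k in enumerate(K):
            if causal and j > i:
                # Masked positions get -inf so softmax assigns them weight 0.
                scores.append(float("-inf"))
            else:
                dot = sum(a * b for a, b in zip(q, k))
                scores.append(dot / math.sqrt(d))
        w = softmax(scores)
        weights.append(w)
        # Each output row is the weighted average of the value vectors.
        outputs.append([
            sum(w[j] * V[j][c] for j in range(len(V)))
            for c in range(len(V[0]))
        ])
    return outputs, weights
```

Calling `attention(X, X, X, causal=True)` on a small list of token vectors `X` gives the masked self-attention pattern used by GPT-style decoders. Leaving `causal=False` gives the bidirectional pattern used by BERT-style encoders.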

The journey

1 The encoder block, BPE, and how tokens are made

2 Multi-head attention and why we use dot products

3 Masking, padding, and BERT vs GPT

4 The transformer architecture, drawn from scratch

5 Pretraining, feedforward, residual connections, and layer norm

6 LLaMA, Mistral, and the road to reasoning models

7 RAG, LangChain, LangGraph, and the LLM tooling landscape

deep learning · transformers · LLMs


4 days · 1 hr / day, weekdays

Search Images with Words

From neuroscience to CLIP to building a working on-device search prototype. How do VLMs bind words to pixels?

Went from the CLIP paper to a working implementation from scratch. Two encoders, one shared embedding space, contrastive loss. The key insight is that aligning text and image representations lets you do zero-shot classification and text-to-image search without task-specific training.
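The shared embedding space and the contrastive loss can be illustrated with a small sketch. This is plain Python under stated assumptions, not the implementation from the experiment: the helper names (`normalize`, `similarity_logits`, `clip_loss`, `zero_shot_probs`) are hypothetical, the embeddings are assumed to be non-zero vectors already produced by the two encoders, and the temperature is fixed at 0.07 here, whereas CLIP learns it during training.

```python
import math

def normalize(v):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def similarity_logits(image_embs, text_embs, temperature=0.07):
    """Cosine similarity of every image with every text, divided by temperature."""
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    return [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
            for i in imgs]

def log_softmax(row):
    """Numerically stable log-softmax over a list of floats."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]

def clip_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric cross-entropy: pair i (image i, text i) is the correct match."""
    logits = similarity_logits(image_embs, text_embs, temperature)
    n = len(logits)
    # Image -> text direction: the correct class for row i is column i.
    loss_i = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text -> image direction: same idea on the transposed matrix.
    columns = [list(col) for col in zip(*logits)]
    loss_t = -sum(log_softmax(columns[i])[i] for i in range(n)) / n
    return (loss_i + loss_t) / 2

def zero_shot_probs(image_emb, label_embs, temperature=0.07):
    """Turn one image's similarities to label-text embeddings into probabilities."""
    logits = similarity_logits([image_emb], label_embs, temperature)[0]
    return [math.exp(x) for x in log_softmax(logits)]
```

The same similarity matrix serves both uses. Training pushes the diagonal (matching pairs) up and everything else down. At inference, `zero_shot_probs` compares one image against text embeddings of candidate labels, which is why no task-specific training is needed.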

The journey

1 CLIP: the paper that married text and images

2 CLIP's contrastive loss: how two encoders learn one space

3 Building CLIP from scratch: reading the paper properly

4 Zero-shot classification: from embeddings to probabilities

computer vision · machine learning · multimodal


7 days · 1 hr / day

Understanding Emotion

What are emotions, really? The neuroscience of feeling, and why an autistic brain might process them differently.

Nobody agrees on what emotions are, but evolution built them for a reason. Basic emotions are universal biology. Higher cognitive emotions like guilt are trust signals. Happiness comes from relationships, not money. Emotions actively distort memory, attention, and judgement. Empathy is not mirroring; it is feeling what others feel. The people we need are those who can manage it.

The journey

1 What even is an emotion?

2 Are guilt and love just layered explanations?

3 Why did evolution bother building emotions?

4 Guilt, cheating, and why monogamy won

5 What is happiness and where does emotional pressure go?

6 Beyond language: the other tools that shape how we feel

7 How emotions control what you think, who you trust, and what you remember

neuroscience · psychology · autism


7 days · 1 hr / day

Understanding the Visual Cortex

How does the brain process what the eye sees? Tracing the path from retina through two parallel streams, the mechanisms at each stage, and the computations behind perception.

The brain doesn't see the world as it is. It builds a model of what the world should be, and checks it against incoming data. The whole journey, from photon hitting the retina to a fully contextualised, emotionally-tagged memory, takes roughly 150 milliseconds.

The journey

1 How does the brain begin processing vision?

2 What mechanisms power each visual stream?

3 How do neurons detect and assemble edges?

4 How do neurons use context from their neighbours?

5 How does the brain perceive depth?

6 How does seeing become knowing?

7 The complete architecture — how vision flows through the brain

neuroscience · perception · visual cortex
