Notebooks
The why
Transfer learning works in brains too. Neuroscience research shows that learning across domains strengthens abstract reasoning. The prefrontal cortex builds schemas, mental frameworks that compress patterns from one domain and apply them to another. The more domains you feed it, the richer the schemas. The local receptive fields of V1 and the convolutions in CNNs are the same idea, and the second was modeled on the first. Attention in the parietal cortex and attention in transformers solve the same resource allocation problem. These are not metaphors. They are the same solutions to the same problems.
Leonardo da Vinci studied anatomy to paint better and painted to understand anatomy better. Feynman picked locks and cracked safes at Los Alamos with the same curiosity he brought to physics. Ramanujan is said to have worked through number theory in chalk on temple floors. Cross-domain learning is not a distraction from depth. It is how depth actually works.
The practice
Research on retrieval practice and spaced repetition shows that actively recalling material consolidates memory better than passively reviewing it. The hippocampus encodes better when the learning session has a clear boundary. So the format is one question, one week, one hour per day. The constraint forces active recall. You cannot passively read for sixty minutes and pretend you understood it. You have to write what you know, find the gaps, and come back the next day to fill them.
I am a Leonardo fanboy. Not because I think I am smart like him, because I am really not, but because the man had no formal education past age 14 and still taught himself anatomy, optics, geology, and engineering from whatever books and mentors he could get his hands on. That is the only way I have ever been able to learn anything either. He kept notebooks his entire life: thousands of pages, anatomy next to fluid dynamics next to machine designs, organized not by subject but by date. He called himself "discepolo della esperienza", which means disciple of experience. I think the notebook was never documentation for him. It was the actual thinking tool, the place where looking at something carefully enough turns into understanding it.
Overview of Modern Nets
Revisiting transformers, tokenizers, attention, and the GPT family. Intuitive understanding for interviews and curiosity.
Everything in the GPT family builds on the same transformer attention mechanism, with BPE tokenization in front and multi-head attention at the core. The ecosystem on top (RAG, LangChain, agents) is plumbing to make it useful. I still want to go deeper on RoPE, SwiGLU, the chain-of-thought paper, and diffusion models.
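The attention step at the center of all this fits in a few lines. What follows is a minimal single-head sketch in plain Python, not the notebook's own code. The function names and the toy vectors are made up for illustration, and a real model would apply learned projection matrices to get queries, keys, and values, and would use a tensor library rather than lists.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values, causal=True):
    """Scaled dot-product attention for one head.

    queries, keys, values: lists of vectors, one vector per token.
    causal=True hides future positions, as in GPT-style decoders.
    """
    d = len(keys[0])
    outputs = []
    for i, q in enumerate(queries):
        # With causal masking, token i only sees tokens 0..i.
        limit = i + 1 if causal else len(keys)
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
            for k in keys[:limit]
        ]
        weights = softmax(scores)
        # Output is the weighted average of the visible value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values[:limit]))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Toy input: three tokens with 2-dimensional embeddings (illustrative numbers).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = attention(x, x, x)
```

Multi-head attention runs several of these in parallel on different projections of the same tokens and concatenates the results.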
Search Images with Words
From neuroscience to CLIP to building a working on-device search prototype. How do VLMs bind words to pixels?
Went from the CLIP paper to a working implementation from scratch. Two encoders, one shared embedding space, contrastive loss. The key insight is that aligning text and image representations lets you do zero-shot classification and text-to-image search without task-specific training.
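The contrastive objective and the search step can be sketched without any encoders at all, by treating the embeddings as given. This is a simplified illustration in plain Python, not the notebook's implementation. The helper names and the toy embeddings are invented here, and the temperature is fixed at 0.07 as an assumption (CLIP learns it during training).

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def log_softmax(row):
    """Numerically stable log-softmax over a list of floats."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def clip_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric cross-entropy over the image-text similarity matrix.

    Pair i (image_embs[i], text_embs[i]) is the only positive for row i
    and for column i; every other pairing in the batch is a negative.
    """
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    n = len(imgs)
    logits = [
        [sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
        for i in imgs
    ]
    # Image-to-text direction: each row should pick its own caption.
    loss_i2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-image direction: each column should pick its own image.
    cols = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_t2i = -sum(log_softmax(cols[j])[j] for j in range(n)) / n
    return (loss_i2t + loss_t2i) / 2


def search(query_emb, image_embs):
    """Rank image indices by cosine similarity to a text query embedding."""
    q = normalize(query_emb)
    sims = [sum(a * b for a, b in zip(q, normalize(v))) for v in image_embs]
    return sorted(range(len(image_embs)), key=lambda i: sims[i], reverse=True)


# Toy embeddings standing in for encoder outputs (illustrative numbers).
images = [[1.0, 0.0], [0.0, 1.0]]
captions = [[0.9, 0.1], [0.1, 0.9]]   # caption i matches image i
shuffled = [[0.1, 0.9], [0.9, 0.1]]   # captions swapped

loss_aligned = clip_loss(images, captions)
loss_shuffled = clip_loss(images, shuffled)
ranking = search([0.9, 0.1], images)
```

The loss is low when matching pairs sit close together in the shared space and high when they do not. Once the space is aligned, search is just a nearest-neighbor lookup with a text embedding as the query.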
Understanding Emotion
What are emotions, really? The neuroscience of feeling, and why an autistic brain might process them differently.
Nobody agrees on what emotions are, but evolution built them for a reason. Basic emotions are universal biology. Higher cognitive emotions like guilt are trust signals. Happiness comes from relationships, not money. Emotions actively distort memory, attention, and judgement. Empathy is not mirroring; it is feeling what others feel. The people we need are those who can manage it.
Understanding the Visual Cortex
How does the brain process what the eye sees? Tracing the path from retina through two parallel streams, the mechanisms at each stage, and the computations behind perception.
The brain doesn't see the world as it is. It builds a model of what the world should be, and checks it against incoming data. The whole journey, from photon hitting the retina to a fully contextualised, emotionally-tagged memory, takes roughly 150 milliseconds.