Reproducing GPT-2 124M: Key Insights from Andrej Karpathy’s 4-Hour Deep Dive
This is a summary of Andrej Karpathy’s video about pre-training a GPT-2 124M parameter model from scratch. Feel free to check it using this…
Sep 10
Multi-Agent System with Crew AI: A Short Course Summary
Please note that this article contains content quoted directly from the short course on multi-agent systems with Crew AI.
Jul 4
Stanford’s CS25: Lecture 2 summary
Lecture 2 of Stanford University’s CS25 V4 Transformers course was delivered by Jason Wei and Hyung Won Chung. Highly recommended to watch…
Jul 2
Stanford’s CS25: Lecture 1 summary
I recently watched the first lecture of Stanford University’s CS25 V4 Transformers course, presented by Div Garg, Steven Feng, Emily…
Jun 11
Visualization of Thought Elicits Spatial Reasoning…
Imagine a scenario where you’re listening to a story. As you follow along, your mind naturally starts to visualize the unseen objects…
May 24
LLMOps Introduction
I recently completed a short course called LLMOps by DeepLearning.AI, in collaboration with Google Cloud, instructed by Erwin Huizenga.
May 12
How to Optimize LLMs for Efficient Serving
We all know that serving LLMs in production is complicated. It requires extensive research and the design of the best architecture and…
May 8
Mastering LLM: A Comprehensive Guide to Legal Language Model Serving
We all know that serving LLMs in production is complicated. It requires extensive research and the design of the best architecture and…
May 2
Mastering Data Preparation: A Comprehensive Guide
We all know that data is the most crucial element in training an AI model. Recently, we noticed the importance of this when smaller…
Apr 25