Bits & Bytes

YUJI

Ben Yu — Sat, 30 Mar 2024 23:09:00 GMT

Welcome fellow food enthusiasts! Today, I'm thrilled to recount my extraordinary dining experience at Yuji, a hidden gem nestled in the heart of Japantown that specializes in Kappo Ryouri, the Japanese art of cutting and cooking, with a focus on seasonal ingredients.

Ankimo tofu, Eggplant with Uni, Simmered Bamboo Shoots and Wakame, Firefly Squid with Mustard Miso Vinegar, Omelette with Caviar

Our evening commenced with a tantalizing array of appetizers, each one a masterpiece in its own right. The Ankimo tofu, delicately crafted from monkfish liver, melted on the palate with its creamy texture, while the Eggplant with Uni offered a delightful fusion of flavors, enhanced by the richness of sea urchin. Simmered Bamboo Shoots and Wakame provided a refreshing contrast, perfectly complemented by the tangy Firefly Squid with Mustard Miso Vinegar. The Omelette with Caviar, a symphony of indulgence, left us yearning for more.

Dobin Mushi

Next up was the Dobin Mushi, a traditional Japanese broth served in a teapot. Infused with fragrant dashi and adorned with tender morsels of seafood and mushrooms, each sip transported us to culinary bliss.

Sashimi - Otoro, Kinmedai, Squid

The Sashimi dish showcased the freshest catch of the day, featuring luscious slices of Otoro, Kinmedai, and Squid. Each bite was a revelation, highlighting the purity of the ingredients and the skill of the chef.

Grilled Chilean Sea Bass w/ Yuzu Kosho Miso Sauce

For our main courses, we savored the Grilled Chilean Sea Bass with Yuzu Kosho Miso Sauce, a harmonious blend of smoky flavors and citrusy notes.

Madai Shabu Shabu

The Madai Shabu Shabu allowed us to cook tender slices of sea bream at our table with the cutest candlelit broth. Definitely the most unique dish of the night.

Chawanmushi with Ikura

The Chawanmushi with Ikura was a delicate custard infused with the essence of dashi, topped with plump salmon roe that burst with briny goodness.

A5 Wagyu Steak!

And who could forget the pièce de résistance – the A5 Wagyu Steak! Each bite of this marbled masterpiece melted in the mouth, leaving a lingering sensation of unparalleled luxury.

Deep Fried Prawns Stuffed with Gingko Cake & Okra

Crab Porridge with Truffle!

The Deep Fried Prawns stuffed with Gingko Cake offered a delightful contrast of textures, with the crispy exterior giving way to a succulent filling. And the Crab Porridge with Truffle elevated comfort food to new heights, with the earthy aroma of truffle permeating every spoonful.

Matcha Crème Brûlée

To conclude our epicurean journey, we indulged in the Matcha Crème Brûlée, a sublime marriage of creamy custard and bitter-sweet matcha, perfectly caramelized to create a crispy topping.

CS-7650 Natural Language Processing

Ben Yu — Tue, 13 Feb 2024 07:38:51 GMT

CS-7650 is the newest machine learning OMSCS course that delves into the intricacies of Natural Language Processing, offering a comprehensive exploration of both foundational concepts and contemporary techniques and a history of how we arrived at Large Language Models.

I was lucky enough to enroll in its second iteration in Fall 2023. At the time of writing, the course consists of 6 coding assignments, 6 quizzes, a final project and two exams. The assignments were pretty standard fare and should be straightforward if you're well-versed in PyTorch. I particularly enjoyed the exam structure: the exams were open book, and we were given almost a week to complete our write-ups before submitting. I much prefer this structure, which better tests your knowledge, over the more traditional closed-book, time-limited exam formats that value rote memorization, anxiety management and reading comprehension more than anything else.

Curriculum:

Foundational Concepts: CS-7650 begins with a solid foundation in neural network basics, ensuring students are well-equipped with the fundamental knowledge required for more advanced topics. Concepts such as tokenization, part-of-speech tagging, and syntactic analysis are covered comprehensively.
Text Classification: Moving into the realm of natural language processing, the course transitions to text classification, exploring techniques for classifying text using both traditional regression approaches and basic neural network models.
Recurrent Neural Networks (RNNs): RNNs, a crucial component in NLP, are extensively covered. The course delves into the architecture of RNNs, LSTMs and Seq2Seq models and their ability to handle sequential data, and applications like language modeling.
Distributional Semantics: The course explores distributional semantics, focusing on representing the meaning of words based on context. Topics include word embeddings, semantic similarity, and methods like Word2Vec and GloVe.
Transformers: The revolutionary transformers take center stage in this section, with an in-depth exploration of attention mechanisms, transformer architecture, and their applications in tasks like sequence-to-sequence models and language understanding.
Machine Translation: The classic problem of machine translation is addressed in the context of modern techniques and models. Approaches to machine translation, neural machine translation, and the application of attention mechanisms are covered.
Current State-of-the-Art NLP Techniques (Meta AI): One of the highlights of CS-7650 is the expertise brought to the table by state-of-the-art researchers in the field today. Several Facebook researchers at FAIR present their current research in Question Answering, Text Summarization, Privacy Preservation and Responsible AI.

Key-Value Memory Networks

My biggest learning from this course was really appreciating how the current state of the art in LLMs with transformers came from an iterative evolution of neural architectures from the NLP research community. For the final project, we were challenged to design and optimize a key-value based memory network. It was both interesting to see how we could implement such a basic concept from computer science within a differentiable neural network, and insightful to see how this concept of memory would eventually evolve into attention mechanisms in later state of the art architectures. This was one of the more challenging assignments I had to complete within OMSCS, ranking up there with training MARL agents in Reinforcement Learning.
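To make the link between memory and attention concrete, here is a minimal sketch of a single key-value memory read. It is not the course's project code: the function names, the dot-product scoring and the toy vectors are all assumptions chosen for illustration. The query is scored against every key, the scores are turned into weights with a softmax, and the result is the weighted sum of the values, which is the same soft lookup that attention layers perform.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def kv_memory_read(query, keys, values):
    """Soft lookup: score the query against each key, then blend the values.

    Hypothetical helper for illustration; dot-product scoring is an assumption.
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    width = len(values[0])
    read = [sum(w * value[i] for w, value in zip(weights, values))
            for i in range(width)]
    return read, weights

# A query that strongly matches the first key reads back (almost) the first value.
read, weights = kv_memory_read(
    query=[10.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[1.0, 2.0], [3.0, 4.0]],
)
```

Because every step is a differentiable operation, gradients can flow through the lookup, which is what lets a network learn what to store and what to retrieve.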

Conclusion

CS-7650 has been one of the better courses I've taken in OMSCS. The lectures were done very well and the subject matter was very relevant given the recent surge in popularity around LLMs and NLP. I did kind of wish that the course covered more recent advances in the field like RLHF, but it was still a great foundational introduction to the field.

My Favourite Restaurants of 2023

Ben Yu — Mon, 01 Jan 2024 19:43:00 GMT

January - Empress by Boon

Nestled within Empress by Boon lies a culinary treasure: Uni & Sardine Fried Rice, a standout amidst a prix fixe menu that oscillates between hits and misses. The rice was skillfully infused with the briny essence of the sardines, the creaminess of the uni, and an impressive sense of 鑊氣 (wok hei) that evokes the essence of authentic Cantonese cuisine.

Moreover, the ambiance at Empress by Boon is nothing short of captivating. The restaurant boasts breathtaking views of Chinatown that served as a picturesque backdrop to our Chinese New Year meal. One also can't help but be enchanted by the warm welcome and attentive service.

February - 金蓬萊 Golden Formosa

金蓬萊 (Golden Formosa) is a 1-Michelin-star restaurant tucked away in a quiet neighbourhood in the Shilin District. The restaurant effortlessly marries tradition with innovation, showcasing dishes that are a symphony of flavors and textures.

The 蓬萊排骨酥 (Crispy Pork Ribs) set the tone for our extravagant lunch, presenting a harmonious blend of crispy exterior and tender, succulent meat. Each bite was a burst of savory delight, perfectly balanced with aromatic spices that lingered on the palate.

The 乾拌古早味蚵仔麵線 (Oyster Noodles), a lesser-known delight on the menu, surprised and delighted with its simplicity yet depth of flavor. The noodles, perfectly cooked, were bathed in a savory sauce that carried the essence of fresh, plump oysters. Each slurp was a celebration of the sea, as the delicate yet distinct flavor of the oysters intertwined flawlessly with the umami-rich sauce.

The 烏魚子炒飯 (Prime Mullet Roe Fried Rice) was a revelation—a testament to the chef's artistry. The rice, delicately infused with the essence of prime mullet roe, offered a medley of umami flavors that danced on the taste buds. The subtle yet distinct seafood notes elevated the dish to a level of unparalleled indulgence.

However, the pièce de résistance was the 佛跳牆 (Buddha Jumps Over the Wall)—a culinary masterpiece that surpassed all expectations. This traditional Taiwanese delicacy was a complex symphony of premium ingredients, each contributing its unique essence. I'm not usually a fan of taro and traditional Chinese dried seafood, but the rich, flavorful broth enveloped a luxurious assortment of seafood and meats, creating a range of flavors and textures that was nothing short of extraordinary. I've never had soup and taro that was so luxurious and velvety!

The attentive service and inviting ambiance further enhanced the overall dining experience, creating a memorable afternoon that celebrated the diverse and exquisite flavors of Taiwanese gastronomy.

March - House of Prime Rib

House of Prime Rib sets the standard for quality dining in San Francisco. The moment you walk in, you're greeted by a festive energy that exudes warmth and tradition.

The portions are massive, and the prime rib is always cooked to a perfect medium rare. Juicy, tender, and precisely prepared, it's the kind of dish that beckons for seconds, even if you're struggling not to overeat. The quality is consistent, making each visit just as exceptional as the last.

Don't underestimate the sides here—they're as remarkable as the main attraction. And here's a tip: don't sleep on the creamed spinach! These seemingly humble sides are always a delightful surprise, adding layers of flavor that complement the star dish beautifully.

House of Prime Rib isn't just a meal; it's an experience. It's the kind of place where you'll find yourself eagerly planning your next visit while still savoring the remnants of your current feast. If you're in San Francisco, this spot is an absolute must-try for a classic, indulgent dining experience.

April - Rintaro

Rintaro is probably one of my favourite restaurants in SF. We came by on a random weekday night and opted for their set menu, but I'd highly recommend going a la carte and ordering to your heart's desire.

We started with the Kani Dashimaki Tamago, a fluffy, light folded omelette blended with local San Francisco Dungeness crab. It was a perfect starter and really highlighted Rintaro's vision of Northern Californian and Japanese cuisine.

Next we had the San Ten Mori, several pieces of sashimi that showcased high-quality ingredients. The San Diego bluefin tuna was a highlight, but we didn't think it was anything particularly mind-blowing or adventurous.

The Tsukune (minced chicken skewers) at Rintaro emerged as a true highlight, capturing the essence of perfectly grilled skewers. Each bite was a testament to the chef's mastery, offering a symphony of flavors that tantalized the taste buds.

The Chizu Tori Katsu, a fried delight, impressed with its impeccable preparation. Notably light and devoid of excess oil, it retains a crispy texture while allowing the chicken to shine. The accompanying katsu sauce added an extra layer of savory magic.

Ending on a unique note, the Hojicha Panna Cotta might not have been a personal favorite, but its distinctiveness cannot be overlooked. Its unique toasted tea notes in the syrup added a touch of adventure to the whole dining experience and made for a great end to a wonderful meal.

May - 景成 - City View Restaurant

Regarded by many as the pinnacle of dim sum in San Francisco, City View's classics set the standard for what good Cantonese food should taste like. From the impeccably crafted 蝦餃 (Shrimp Dumplings) to the flavorful 糯米雞 (Stuffed Sticky Rice in Bamboo Leaves), each dish carries the hallmark of expert craftsmanship and attention to detail.

However, what truly steals the show is their XO醬炒腸粉 (Fried rice rolls with XO sauce). A rarity to find executed at such a high standard, this dish is a testament to City View's dedication to authenticity and innovation. The 腸粉 was expertly wok-fried, achieving a textural marvel—bouncy and QQ, with a delightful crunch on the exterior. The marriage of flavors between the XO sauce and the delicate rice rolls was a symphony of tastes that's hard to forget.

June - Noodle in a Haystack

Noodle in a Haystack has the hardest-to-get reservation in the city for good reason. You can feel Clint and Yoko's dedication to their craft through the attention to the smallest of details in each one of their dishes. Seating is very intimate, with an L-shaped bar surrounding the prep area. We opted for the sake pairing, which came with 8 different dishes.

1) Our soirée began with a Financier adorned with Caviar. The gentle sweetness of smoked shoyu harmonized with an exquisite touch reminiscent of a sophisticated lox bagel, a whimsical yet refined appetizer.

2) Next, the Chawanmushi unveiled itself with an audacious twist. Chicken intertwined with the nuanced depths of dashi-infused egg and seaweed. The XO sauce played mischievously, adding textures and layers that challenged the norms of this classic dish.

3) Enter the Cold Tomato and Uni Ramen—a delicate dance of flavors. The sundried tomatoes lent a surprising depth to the broth, while the velvety richness of uni bestowed an opulence that resonated with each spoonful, crafting a symphony of sensation.

4) Bluefin Tuna and Arugula Salad: meticulously selected, it was a testament to the restaurant's uncompromising commitment to quality.

5) The A5 Wagyu Beef and Curry arrived, accompanied by ethereal fried milk bread, each bite a sublime exploration. The beef melted like poetry, while the curry caressed the senses, culminating in a crescendo of flavor and tenderness. The dish was extremely playful and hit a nostalgic note for me, reminding me of the best parts of Japanese comfort food.

6) The Yuzu Daikon Pickles offered a palate-cleansing interlude—a clean, citrusy burst that revitalized the senses, leaving a trail of zesty elegance. However, it was the humble cucumbers that stole the spotlight—a seemingly unassuming creation transformed into a mesmerizing delicacy. The balance of salt, sugar, and shio konbu created a harmonious dance on the palate, leaving an enduring impression.

7) Lastly, the Shio Butter, Corn, Whelk and Clam Ramen: an opus of depth and complexity that rewrote the boundaries of noodle artistry. As the konbu butter melts into the clam broth, the ramen transforms into the most deeply flavourful seafood broth. The whelk and corn provide a great textural contrast to the amazingly toothsome noodles and chashu. This was quite possibly the best single bowl of ramen I've ever had the privilege to try.

8) Dessert was a combination of shaved yuzu ice and burnt Basque cheesecake. I'm not much of a dessert person, but both were a satisfying way to end an exquisite meal.

Noodle in a Haystack transcends a mere dining experience—it's an immersive tapestry of flavors, textures, and narratives. Each dish is a chapter in a story, orchestrated by a chef's genius and enriched by hosts who transform a meal into an unforgettable saga. A reservation here isn't just access; it's an entrée into the extraordinary.

July - Llama San

Llama San isn't just a restaurant; it's a collision of Japanese and Peruvian cuisine that beckons the palate on an exhilarating journey. I was able to snag a seat in July at the bar for a quick dinner. They offer a prix fixe menu but I opted to order a la carte.

1) Marasheen Oysters, corn cream, grilled baby corn and papa seca. The grilled corn added a textural complexity to the dish and complemented the brininess of the oysters perfectly.

2) The Mackerel Ceviche is like a canvas painted with Peruvian zest—a vibrant melody of freshness and tanginess that sparks an instant connection with your taste buds.

3) Iberico Pork Tonkatsu, Udon Verde & Tsukemono Cucumber - This dish was the undisputed star of the night. This Katsu, an epitome of culinary brilliance, seduces with tenderness and an explosion of flavors. The Udon Verde was a creamy, flavor-packed symphony with a nuanced peppery kick. It's like a fusion of the familiar and the unexpected, an intriguing dance of taste and texture that marries beautifully with the standout amazingly fried iberico pork.

Llama San isn't just about food; it's a celebration of innovative fusion that bridges continents. It's where Peruvian vibrancy meets Japanese finesse, inviting your palate on an uncharted voyage through a world of extraordinary flavors.

August - The Anchovy Bar

If you're going to The Anchovy Bar, you definitely have to try their most popular dish: the Anchovy toast. The anchovies, carefully arranged atop the bread, unveil a tapestry of briny richness that dances across the palate. Each bite, a delicate interplay of umami, harmonizes with a subtle olive oil drizzle, adding depth without overwhelming the senses.

The Anchovy Bar proves that focusing on sourcing the highest quality local ingredients and orchestrating their preparation with precision and heart can create a culinary composition that delights the senses and elevates a common dish to extraordinary heights.

September - Sparrow and Wolf

A hidden gem of Vegas: skip the fancy casino buffets and celebrity chef spots and come here instead! Their prix fixe menu typically consists of 8 different dishes that change regularly based on seasonality. Some highlights from what we tried:

1) Oxtail hummus—a revelation that redefines traditional hummus. The richness of stewed oxtail harmonized with the creamy chickpea base, elevating it to an indulgent, savory delight that leaves an unforgettable impression.

2) Foie Gras Chashu Banh Mi: a playful twist on a classic. The opulence of foie gras meets the succulent chashu in a fusion of textures and flavors that dance gracefully on the taste buds, delivering an indulgent and innovative experience.

3) Octopus Confit—an epitome of culinary finesse. Tender and succulent, it embodies meticulous preparation and artistry, offering a delicate balance of flavors and kick of spice that delightfully surprised with each bite.

October - Angler

I first heard of this place from David Chang's podcast a couple of years ago. Angler is a sea-life focused Michelin-starred restaurant from the Saison group. The embered oysters and Parker House rolls were a deadly delicious combo that will knock your socks off. The uni and trout roe rice was so buttery and briny in the best possible way and was the surprise highlight of our night. The grilled sea bream was a bit dry for our taste, but it was made up for by an amazing vermouth butter sauce. I wouldn't recommend the grilled hen of the woods mushroom personally. It was cooked well, but the sauce was too reminiscent of Frank's RedHot sauce.

Overall an amazing experience with impeccable service. Seeing the chefs cook in the open concept kitchen was also a total delight!

November - Kokkari Estiatorio

An SF institution that lives up to the hype. This classic Greek restaurant is named after a small fishing village on the island of Samos and is the sister restaurant of the acclaimed Evvia Estiatorio in Palo Alto. You'll be greeted by a cozy cabin-like interior adorned with a welcoming fireplace and extensive woodwork, making you feel right at home.

Lamb and fresh seafood are a definite must-order when you're here. Their lamb shanks were cooked to perfection: simple, light but packed with flavor. The sea bass was also delightful, offered grilled or steamed. We found the grilled skin was a bit too charred, making it slightly overwhelming given the fish's more delicate flavor.

The surprise of the night was the homemade grilled pita with Melitzanosalata, Favasalata and Tirokafteri. The pita had an amazingly crispy exterior but was still fluffy and light. This was the first time I've tried favasalata, which was amazingly light but still packed a punch in terms of flavor.

This meal exceeded all expectations and I'm already looking forward to coming again.

December - ILCHA

The soy-marinated shrimp at ILCHA is a rare culinary gem in SF. Imagine succulent shrimp, delicately marinated in a luscious soy marinade that creates a perfect balance of salty and savory notes. Each bite encapsulates a harmonious blend of umami-rich soy, gently infusing the shrimp with layers of depth and a hint of sweetness. What sets ILCHA's soy-marinated shrimp apart is the meticulousness of the marinade, which not only enhances the natural sweetness of the shrimp but also imparts a tantalizing complexity that elevates the dish to an unforgettable dining experience. And don't forget the rice! The perfectly cooked Koshihikari rice and egg provide a perfect backdrop for all the fatty goodness of the shrimp heads.

Festive Trip

Ben Yu — Tue, 26 Dec 2023 01:33:20 GMT

'Festive Trip' is an invitation to embrace the magic of the moment, to revel in the joyous experience of exploration and discovery. I hoped to capture the essence of joyful ski adventures through an endless spiral of vibrant, alternating trees, reminiscent of a kaleidoscope of colors.

Sushi Making Progression 2023

Ben Yu — Mon, 25 Dec 2023 06:05:54 GMT

I set a goal for myself to learn how to make sushi over 2023. Let's see how I progressed!

April 22 - Tamago Sushi

April 29 - Katsuo, Ikura and Salmon Nigiri

May 7 - Negitoro, Ebi, Salmon Nigiri & Unagi Handrolls

May 13 - Sushi Rolling 101 with Chef Mark Gyotoku. Hosomaki and California Rolls

May 20 - Shime Saba and Salmon Hakozushi

May 28 - Hotate Scallop Nigiri & Salmon, Takuan Maki

June 10 - Salmon Aburi

June 18 - Salmon (Aburi or Regular) and Ikura

July 2 - Shrimp and Smoked Salmon Hakozushi

July 7 - Chutoro Nigiri and Salmon Hakozushi

Aug 8 - Unagi Nigiri

Aug 13 - Mosaic Sushi Attempt 1

Aug 19 - Flower Sushi and Mosaic Attempt 2

Aug 26 - Triple Tomoe sushi, Mosaic Sushi Attempt 3 and Tamago Nigiri

Sept 1 - Salmon, Ikura and Mosaic Attempt 4

Oct 7th - Salmon and Mosaic Attempt 5

Nov 5th - Abstract Flower Sushi

Nov 26 - Picnic Gimbap

Dec 2 - Salmon

Dec 3 - Negitoro and Cucumber Maki

Dec 7 - Sushi Demo at Work

Dec 9 - Mosaic Attempt 6

Dec 23 - Homemade Sushi Platter - Green Dragon Roll, Black Dragon Roll, Mosaic Sushi, Spam Musubi, Unagi Nigiri and Maki

pink dream

Ben Yu — Wed, 20 Dec 2023 23:08:00 GMT

‘Pink Dream’ invites viewers to immerse themselves in the intangible, to explore the connections that weave through our existence. Color palette inspired by Annie.

CS-7643 Deep Learning

Ben Yu — Mon, 30 Oct 2023 01:09:28 GMT

Instructor(s): Zsolt Kira

CS-7643 is a foundational course for the OMSCS Machine Learning specialization. It largely follows the curriculum of similar Deep Learning courses like CS231n from Stanford, going through the evolution of deep learning through a Computer Vision lens. We cover Image Classification, Perceptrons, Backpropagation, Convolutional Neural Networks, Recurrent Neural Networks, Deep Reinforcement Learning and finally Attention and Transformers.

The most interesting part of the course was the special topics covered by researchers from Meta AI. We also had the chance to meet them virtually during office hours and ask questions regarding their research. I was able to meet some of the collaborators on the No Language Left Behind translation model, which can translate across over 200 different languages!

I took the summer variation of the course, which has one fewer assignment than the regular four during the Fall/Spring semesters. The course still culminated in a final open-ended group project. Each assignment consisted of basic proofs or exercises from the past weeks' lecture topics, a paper review, and a coding assignment with a substantial experimental and analysis component. I'll briefly cover the most interesting parts of only the papers I reviewed, as assignment contents are still confidential.

1) Weight Agnostic Neural Networks

The paper demonstrates a novel network search algorithm that can solve a given machine learning problem without any explicit weight training. They demonstrate that this method is able to find minimal architectures that can solve reinforcement learning tasks like 2D bipedal walking and driving. They also demonstrate it can find architectures to solve supervised learning problems like MNIST digit classification. The results of this research seem deeply connected to learning and evolution. It appears to signal that our brains may not actually be giant general purpose learning machines and the neural architecture of our brains may bias ourselves towards specific ways of learning. In my mind, this could even indicate different modes of thought or reasoning that are beyond our human comprehension due to limitations of our existing brain structure and architecture.

We’ve also seen this in the history of deep learning innovations, where novel architectures seem to be the triggering point for large improvements in performance (RNNs, LSTMs, CNNs, Transformers, Diffusion Networks, etc.). We are also constrained by the limitations of our search algorithms, as our best method remains gradient descent optimization. New search algorithms could potentially unlock further innovations in deep learning.

2) Taskonomy: Disentangling Task Transfer Learning

In this paper, Zamir et al. explore the structure and relationships between different visual learning tasks via transfer learning. They use a fully computational approach to model the relationships between twenty-six different semantic tasks such as finding surface normals, 2D segmentation, edge detection, etc. The authors also demonstrate that taxonomy transfer generalises to novel tasks that are not in their trained task dictionary, and they train the taxonomy on other datasets to show that what they found is generalizable. This leads the reader to conclude that there is an inherent structure in visual tasks that is being learned by the neural networks and that this structure can be used to model redundancies across different tasks and reused via transfer learning.

This study seems to indicate that deep neural networks are capable of learning high level features or concepts that roughly map to our own perceived relationships or actual physical relationships between different visual tasks (surface normal to depth maps via derivative). In order to learn where to transfer from a new learning task, we may want to leverage our own prior knowledge to find networks that were trained on similar tasks on a conceptual level.

3) Do Vision Transformers See Like Convolutional Neural Networks?

This paper identifies key structural differences in the features learned by ResNet-based CNNs versus Vision Transformers (ViTs). They identify that ViTs are better at incorporating global information than ResNets at lower layers. The paper identifies key parts of the transformer architecture that lead to such performance, such as the importance of information flow through skip connections and how global average pooling vs a CLS token helps maintain spatial localization.

My personal takeaway is that the historical motivations for convolutional neural networks and finding operations that mimic our ‘common sense’ understanding of existing vision systems are likely incorrect. The paper suggests that the power of attention and transformers in representing global features is more important and can lead to potentially better performance. Similar to my learnings from Paper 1, network architecture is probably the largest contributing factor of a model's representational power. Future research should focus on different architectures that can further improve model performance.

Final Project - AlphaZero & Connect 4

For my final project, I worked with another classmate to study an implementation of AlphaZero that could play Connect 4. The most fascinating part of this model is that it leverages Deep Reinforcement learning and self-play to learn how to play the game. This means that all the concepts it learns were self-taught without any human input or bias!

AlphaZero architecture and MCTS Search

We did an ablation study to understand how much its architecture and reliance on Monte-Carlo Tree Search affected performance. We also explored using linear probes to see if we could tease out whether it was learning any specific game concepts from training. I hope to find some free time in the coming months to continue in that vein of research.

Hugging Face Deep RL Course

Ben Yu — Sun, 09 Apr 2023 22:15:29 GMT

Back in December, Hugging Face released an eight-unit course covering the fundamentals of Deep Reinforcement Learning. The course covers fundamental theories of Deep RL and core libraries, and gives you hands-on experience training your own agents in unique environments ranging from classical control problems all the way to video games like Space Invaders and even Doom!

As opposed to a more classical graduate course like OMSCS's CS-7642, this course puts a larger emphasis on major advancements in the past couple of years that deep learning techniques have introduced to the field. The course covers the following topics:

Q-Learning, Deep Q-Learning and MC vs TD Learning
Policy Gradient with REINFORCE
Actor-Critic Methods
Multi-Agent Reinforcement Learning
Proximal Policy Optimization

I'll try to highlight the portions of the course that I found the most interesting or that were particularly unique to this course.

Units 1-3: Q-Learning, Deep Q-Learning and MC vs TD Learning

The course first formulates the reinforcement learning problem and the basic paradigms of solving RL problems. We first focus on two paradigms within the Model-free RL algorithms: Policy-based Methods vs Value-based methods.

Taxonomy of RL Algorithms (OpenAI - Spinning Up)

We start off with Value-Based methods, where we want to learn a value function that maps a state to its expected value. The course reviews the Bellman Equation, which defines how one can recursively calculate the value of any given state. We then briefly look at two major learning paradigms for training value-based methods: Monte Carlo Learning, where you update your value function based on an entire episode of data, or Temporal Difference (TD) Learning, where we update our value every n steps.
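The difference between the two paradigms is easiest to see side by side. The sketch below is only an illustration under simple assumptions: states are dictionary keys, `values` is a plain dict, and the function names and default hyperparameters are made up for this example. The TD(0) update bootstraps from the current estimate of the next state after a single step, while the Monte Carlo update waits for the full episode and uses the actual discounted return.

```python
def td0_update(values, state, reward, next_state, done, alpha=0.1, gamma=0.99):
    """One-step TD: move V(s) toward r + gamma * V(s') right after a single step."""
    target = reward + (0.0 if done else gamma * values[next_state])
    values[state] += alpha * (target - values[state])

def mc_update(values, episode, alpha=0.1, gamma=0.99):
    """Every-visit Monte Carlo: move V(s) toward the observed discounted return.

    `episode` is a list of (state, reward) pairs collected until termination.
    """
    running_return = 0.0
    for state, reward in reversed(episode):
        running_return = reward + gamma * running_return
        values[state] += alpha * (running_return - values[state])

# TD can learn from a single transition...
values = {"a": 0.0, "b": 0.0}
td0_update(values, "a", reward=1.0, next_state="b", done=False)

# ...whereas Monte Carlo needs the whole episode first.
values = {"a": 0.0, "b": 0.0}
mc_update(values, episode=[("a", 0.0), ("b", 1.0)])
```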

The course then builds on this to introduce Q-Learning, which is an off-policy value-based method with TD(0) learning. We implement Q-learning from scratch and solve some basic OpenAI gym problems like Frozen Lake and Taxi Driving.
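As a rough sketch of what implementing Q-learning from scratch looks like, here is a tabular version on a tiny hand-rolled corridor. The corridor environment, its `step` function and all hyperparameters are hypothetical stand-ins chosen so the example needs no dependencies; they are not the Gym environments or settings used in the course.

```python
import random

N_STATES = 5          # corridor cells 0..4; reaching cell 4 ends the episode
LEFT, RIGHT = 0, 1

def step(state, action):
    """Hypothetical deterministic corridor environment (not the Gym API)."""
    next_state = max(state - 1, 0) if action == LEFT else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy action selection with random tie-breaking."""
    if rng.random() < epsilon or q_row[LEFT] == q_row[RIGHT]:
        return rng.choice([LEFT, RIGHT])
    return LEFT if q_row[LEFT] > q_row[RIGHT] else RIGHT

def train_q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = choose_action(q[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            # Off-policy TD(0) target: bootstrap from the best next action,
            # regardless of which action the behaviour policy takes next.
            best_next = 0.0 if done else max(q[next_state])
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
            state = next_state
    return q

q_table = train_q_learning()
```

After training, the greedy policy read off `q_table` walks right in every cell, since the only reward sits at the right end of the corridor.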

One major limitation of Q-learning is it's a tabular method which stores a value for each state-action pair, which is very memory intensive for problems with high state and action space dimensionality. The key innovation with the advent of deep learning is we can now approximate the Q-table with a deep neural network.

DQN adds several tricks to enable better generalization: Memory Replay and a Value/Target Network

Similar to my work in Georgia Tech's CS-7642 course, we implement our own DQN network based on Mnih et al. Their paper introduces the concept of Experience Replay, which acts as a buffer to store previous experiences and lets the network train on a larger range of samples, rather than the sequential experiences it gets during a normal training episode. Mnih et al. also add the concept of a target network, which helps stabilise training. With a single network we are both shifting the Q-values and the TD target with each update. By having a separate network, we can model the TD target separately and avoid oscillations during training.
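The two tricks can be sketched without any deep learning library. The snippet below is a simplified illustration, not the course's or the paper's implementation: `ReplayBuffer`, `td_target` and `sync_target` are hypothetical names, and the networks are stood in for by plain lists of Q-values and parameter dicts.

```python
import copy
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size store of transitions; the oldest entries are dropped first."""

    def __init__(self, capacity, seed=None):
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random sampling breaks the correlation between consecutive steps.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)

def td_target(reward, next_q_values, done, gamma=0.99):
    """TD target, where next_q_values come from the frozen target network."""
    return reward if done else reward + gamma * max(next_q_values)

def sync_target(online_params, target_params):
    """Copy the online parameters into the target network (done every N steps)."""
    target_params.clear()
    target_params.update(copy.deepcopy(online_params))
```

In a training loop, the online network would be updated on batches drawn from the buffer, while `sync_target` is called only occasionally so that the targets stay fixed between syncs.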

My Deep Q-Network Agent playing Space Invaders

Understanding these concepts, we train an agent leveraging Stable-Baselines3 to play Space Invaders!

Unit 4: Policy Gradient with REINFORCE

A second approach to reinforcement learning is to learn the policy function itself rather than deriving it from a value function. To do this we parameterize the policy, typically modelling it as a probability distribution over a set of actions (a stochastic policy). You can then model the policy with a function or neural network and optimise it by maximising the policy's performance using gradient ascent.

TODO: Write-up on Policy Gradient Theorem Derivation

To explore Policy Gradient methods, we first implement the REINFORCE algorithm, a basic Monte Carlo approach that estimates the return from sampled trajectories.
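The core of REINFORCE is weighting each action's log-probability by the discounted return that followed it. A minimal sketch of that computation for one trajectory, with hypothetical function names and plain floats standing in for the tensors a real implementation would use:

```python
def discounted_returns(rewards, gamma):
    """Return G_t = r_t + gamma * r_{t+1} + ... for every step t."""
    returns = []
    g = 0.0
    for r in reversed(rewards):   # accumulate from the end of the episode
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def reinforce_loss(log_probs, rewards, gamma=0.99):
    """Negative of sum_t log pi(a_t | s_t) * G_t for one sampled trajectory.

    Minimising this loss is gradient ascent on the expected return.
    """
    returns = discounted_returns(rewards, gamma)
    return -sum(lp * g for lp, g in zip(log_probs, returns))


print(discounted_returns([1.0, 1.0, 1.0], gamma=0.5))
print(reinforce_loss([-1.0, -1.0, -1.0], [1.0, 1.0, 1.0], gamma=0.5))
```

In an autograd framework the log-probabilities would carry gradients back to the policy network, and the loss would typically be averaged over several sampled trajectories.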

REINFORCE agent solving Cart Pole and playing Flappy Bird/Pixelcopter

We then train two agents to solve the classic Cart Pole problem and also Flappy Bird!

Unit 6: Actor-Critic Methods

One downside of the REINFORCE algorithm is its high variance, since it relies on Monte Carlo sampling. To mitigate this you need to sample a large number of trajectories, which reduces sample efficiency and can be cost prohibitive.

One methodology that tries to combat this is the Actor-Critic approach, which combines policy-based and value-based methods. You train an Actor, which learns the policy function, and a Critic, which learns a value function that assists the policy by measuring how good each action taken was. By scoring each action against the Critic's estimate rather than the raw sampled return, we reduce gradient variance.
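One common way the Critic's estimate enters the Actor's update is through a one-step advantage, which replaces the raw Monte Carlo return as the weight on each log-probability. The sketch below assumes that formulation; the function name and sample numbers are illustrative only.

```python
def td_advantages(rewards, values, last_value, gamma=0.99):
    """One-step advantage A_t = r_t + gamma * V(s_{t+1}) - V(s_t).

    `values` holds the Critic's estimates V(s_t) for each visited state and
    `last_value` is its estimate for the state after the final step
    (0.0 if the episode terminated there).
    """
    next_values = values[1:] + [last_value]
    return [r + gamma * nv - v
            for r, v, nv in zip(rewards, values, next_values)]


# Two-step rollout with made-up rewards and Critic estimates.
advantages = td_advantages(rewards=[1.0, 1.0], values=[0.5, 0.5],
                           last_value=0.0, gamma=1.0)
print(advantages)
```

A positive advantage means the action did better than the Critic expected and its probability is pushed up; a negative one pushes it down.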

A2C agent solving robotics/control problems like learning to walk or manipulating a robotic arm!

We then train two agents to solve several classic control problems from PyBullet and Panda-Gym with Stable-Baselines3 and A2C.

Unit 7: Multi-Agent Reinforcement Learning

We take a brief detour into the world of Multi-Agent RL. The course's treatment of this problem space is very brief compared to Georgia Tech CS-7642's coverage, as we only review several common implementation patterns. There's no exploration of the game theory underpinnings of this problem space.

Multi-Agent Soccer

The most interesting part of this course is the introduction of Self-Play, which leverages multiple copies of our agent to learn and train itself. We briefly look into the ML-Agents library, which leverages the Unity Game Engine to train agents in pre-made environments. The library sets up matches between different copies of our agent based on their ELO rating. It then continually matches agents against each other, and each agent should gradually improve and learn from the process!
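For reference, the textbook Elo update looks like the sketch below. This is the standard formula, not code from ML-Agents; the K-factor of 32 and the 400-point scale are conventional defaults assumed for the example, and the library's internal bookkeeping may differ.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return new (rating_a, rating_b) after one match.

    score_a is 1.0 for a win by A, 0.5 for a draw and 0.0 for a loss.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Two equally rated copies of the agent; the first one wins.
print(update_elo(1200.0, 1200.0, score_a=1.0))
```

Beating a higher-rated opponent moves the rating more than beating a weaker one, which is what lets a rating track whether newer copies of the agent are actually improving.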

Unit 8: Proximal Policy Optimization

No Deep RL course would be complete without looking at Proximal Policy Optimization (PPO), a state-of-the-art RL policy optimization algorithm. It's model-free, has few hyperparameters, and typically performs very well on most RL problems out of the box.

PPO and its algorithmic brethren approach the RL problem by trying to converge on an optimal policy while avoiding overly large policy updates during training. This is primarily motivated by empirical observations that smaller updates tend to converge to an optimal solution, while larger policy updates can lead to "falling off a cliff" with no chance of recovering to a previously better policy. PPO achieves this by enforcing a constraint on its objective function:

TODO: insert clip objective function
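As a sketch of that constraint, the clipped surrogate objective usually quoted from the PPO paper is L_CLIP = E[min(r_t * A_t, clip(r_t, 1 - eps, 1 + eps) * A_t)], where r_t is the probability ratio between the new and old policy and A_t is the advantage. The per-sample version below uses plain floats; the function name and the default eps of 0.2 are assumptions for illustration.

```python
import math


def clipped_surrogate(new_log_prob, old_log_prob, advantage, clip_eps=0.2):
    """Per-sample PPO clipped objective (to be maximised)."""
    # Probability ratio r_t = pi_new(a|s) / pi_old(a|s), from log-probs.
    ratio = math.exp(new_log_prob - old_log_prob)
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Taking the minimum removes any incentive to push the ratio
    # outside [1 - eps, 1 + eps] in the direction the advantage favours.
    return min(ratio * advantage, clipped_ratio * advantage)


# The new policy makes the action 1.5x more likely than the old one did.
log_ratio = math.log(1.5)
print(clipped_surrogate(log_ratio, 0.0, advantage=1.0))    # gain is capped
print(clipped_surrogate(log_ratio, 0.0, advantage=-1.0))   # penalty is not
```

With a positive advantage the objective stops growing once the ratio passes 1 + eps, so a single batch cannot drag the policy arbitrarily far from the one that collected the data.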

We implement our own version of PPO with CleanRL as a reference implementation. We check our implementation on the classic Lunar Lander problem:

Solving Lunar Lander with PPO - 2M timesteps

Finally, to demonstrate the versatility of PPO, we try out PPO with the SampleFactory library and train an agent to play Doom!

Playing a simplified DOOM level with PPO

Conclusion & Next Steps

100% Completion!

Completing all 8 units and having your 12 models pass the required benchmark will reward you with a certificate of completion. Having all your assignments pass at 100% will get you an honors certificate!

This course only briefly explored the world of Reinforcement Learning. For myself, I'm going to explore multi-agent systems, try building an RLHF system and applications with LLMs, read up on Decision Transformers and play around with MineRL. I'd also like to explore building my own game adapter, like integrating Mupen64 and Star Fox 64 into an RL training library.

Zen Japanese Restaurant - 2022

Ben Yu — Sat, 31 Dec 2022 22:29:13 GMT

It's been more than 2 years since I was last at Zen Japanese Restaurant. We opted for the $180 CAD Sushi Omakase course. Like last time, it was 13 pieces of nigiri, but with the addition of appetizers and dessert!

Appetizer Course - Abalone with Caviar and cucumber jelly salad. Fried Lobster with snow pea. Duck breast with miso daikon

Tuna

Sea Bream with a dusting of yuzu

Chutoro

Shrimp

Trout

Amberjack

Amberjack and Flounder. The flounder had an amazing texture that was soft but rubbery and chewy at the same time.

Tuna Belly

Scallop

Uni

Skipjack

Eel - Favourite piece of the meal. The flavour and texture were almost like birthday cake

Tuna Handroll - The nori was amazingly crunchy and crisp

Dessert - Melon, Mochi with Red Bean and Strawberry and Mango Sorbet

Sushi Making Progression 2018-2022

Ben Yu — Wed, 28 Dec 2022 15:24:00 GMT

Dec 2022

Oct 2022

Mar 2022

Jan 2022

2020

2018

Kusama Semi-Infinite Mirrors

Ben Yu — Wed, 28 Dec 2022 07:06:48 GMT

Ramping back up with three.js.

Inspiration:

Link: https://denim-jungle-blinker.glitch.me/

CS-7642 Reinforcement Learning

Ben Yu — Mon, 26 Dec 2022 00:19:52 GMT

Instructor(s): Charles Isbell / Michael Littman
Course Page: Link

CS-7642 is a core course for the OMSCS Machine Learning specialization. It serves as an introduction to reinforcement learning, and a continuation of CS-7641 Machine Learning.

At the time of writing, the course consists of 3 major written assignments, 6 homework assignments and a final exam. The course follows Richard Sutton's RL Book very heavily, as do most undergraduate/graduate courses nowadays. The assignments were the main highlight of the course and are designed to mostly be open-ended and force you to demonstrate your understanding of the material. You are challenged to write a technical paper (6 page max) usually either solving a particular reinforcement learning problem or replicating a key result in RL research.

Assignment 1 - Temporal Difference Learning

You are tasked with replicating key results from Sutton's seminal 1988 paper on temporal difference learning methods. We basically need to show that TD(λ) is more efficient than conventional supervised learning on the final outcome (the Widrow-Hoff rule in the paper). This intuitively makes sense, as we're now updating our agent continuously rather than waiting for the final outcome label. We also run several experiments looking at the trade-offs between different lambda parameters. As with most things in ML, there's a tradeoff in setting lambda, as you want to balance how far you look into the future against how fast you propagate learnings to your agent.
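To give a feel for what lambda controls, here is a minimal tabular sketch of TD(λ) with accumulating eligibility traces on a five-state random walk. It is not a replication of the paper's experiments, which use repeated presentations of fixed training sets; the step size, episode count and seed are assumptions.

```python
import random

N = 5   # non-terminal states 0..4; every walk starts in the middle


def run_episode(values, lam, alpha, rng, gamma=1.0):
    """One random walk, updating `values` in place with TD(lambda)."""
    traces = [0.0] * N
    state = N // 2
    while True:
        next_state = state + rng.choice((-1, 1))
        if next_state < 0:            # fell off the left end: outcome 0
            reward, next_value, done = 0.0, 0.0, True
        elif next_state >= N:         # fell off the right end: outcome 1
            reward, next_value, done = 1.0, 0.0, True
        else:
            reward, next_value, done = 0.0, values[next_state], False
        td_error = reward + gamma * next_value - values[state]
        traces[state] += 1.0          # accumulating trace for this visit
        for s in range(N):
            values[s] += alpha * td_error * traces[s]
            traces[s] *= gamma * lam  # older visits get less credit
        if done:
            return
        state = next_state


def rms_error(lam, episodes=200, alpha=0.05, seed=0):
    rng = random.Random(seed)
    values = [0.5] * N
    for _ in range(episodes):
        run_episode(values, lam, alpha, rng)
    truth = [(i + 1) / (N + 1) for i in range(N)]   # 1/6, 2/6, ..., 5/6
    return (sum((v - t) ** 2 for v, t in zip(values, truth)) / N) ** 0.5


for lam in (0.0, 0.5, 1.0):
    print(lam, round(rms_error(lam), 3))
```

With lambda at 0 only the most recent state is credited for each TD error, and with lambda at 1 every visited state is, which behaves like a Monte Carlo update.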

Assignment 2 - Lunar Lander

We get to apply what we've learned to a harder, more modern toy problem. You are tasked with solving OpenAI's Lunar Lander environment. Your agent needs to land a 2D lander without crashing. Your lander has left/right and upward thrusters, and you're rewarded if you land safely and softly within the target area.

A successful run of my lunar lander agent

To solve this problem we leverage Deep Q-Networks and implement a DQN agent with experience replay. This technique was first introduced and popularized by DeepMind researchers Mnih et al. back in 2015. I essentially replicated their algorithm verbatim from their paper in PyTorch (we are restricted from using any existing libraries like rl-baselines).

Assignment 3 - Football

The problems get harder! We are now tasked with solving a multi-agent reinforcement learning problem. In this assignment we're given a modified version of Google's Football environment, and we're tasked with training an agent that can play 3v3 football. If you thought training one agent was already difficult, you now have the added problem of training several agents that have to co-ordinate and interact with the environment together. The goal is to demonstrate an improvement in agent behaviour compared to 3 provided baseline algorithms.

My Agent learning how to pass and shoot

My paper ended up investigating how centralized critic methods improve learning performance and potentially help agents better co-ordinate with each other.

Conclusion

CS-7642 has been one of the more challenging and rewarding courses I've taken in OMSCS. Definitely complement your learning with RL courses from other universities. Most notably, I watched David Silver's RL Lectures. Berkeley's Deep RL course was also extremely helpful for understanding current state-of-the-art algorithms that weren't covered heavily in the lecture material, like Deep Q-Learning and PPO. Reinforcement learning is a very fascinating field that's advancing very quickly. Most interestingly, it played a pivotal part in ChatGPT's recent success, which relied on RL with Human Feedback for its training.

I'll be continuing my learning journey into the Spring as I take HuggingFace's Deep RL course. See you then!

Ben Yu does Benu

Ben Yu — Sun, 25 Dec 2022 16:51:51 GMT

I try to recreate my favourite dishes from Benu - a 3 Michelin star restaurant in San Francisco and one of The World's 50 Best Restaurants back in 2019. Head chef Corey Lee draws from many different cuisines, with a focus on Korean and Cantonese techniques and flavours.

I could only find frozen mackerel and had to fry it for it to be edible. I also couldn't figure out the correct medley of vegetables and what they used for the outer wrap. Just gave up on this one :(

Attempt #1 - Couldn't find jellyfish. Just did a light tempura batter and garnished with 'leaves'

Pear marinade on the steak worked very well. Glaze on the baby anchovies could have been sweeter. Not sure how they made a sauce of that consistency, so I made a chimichurri with scallion and basil

Abalone - Steamed mine in oyster sauce and scallion & garlic rather than basting in butter

Attempt #2 - Jellyfish added a great textural component. Still couldn't quite get the batter right. This time it was too thick

Milk Pudding - I couldn't get the same consistency and I skipped the peat jam/sauce

Dreams of Blade Runner

Ben Yu — Tue, 01 Nov 2022 14:13:00 GMT

Model: Stable Diffusion 1.5

Prompt:

💡

A Chinese ink painting of a modern city and pokemon. The city is bustling and raining with a neon glow. In the distance, there are dragon pokemon fighting each other. by zhang zeduan, mi fu, painting on silk, immaculate scale, hyper-realistic

CS-6460 Educational Technology

Ben Yu — Mon, 17 Oct 2022 08:06:00 GMT

Instructor(s): David Joyner
Course Page: Link

This class was simultaneously an introductory course about educational technology and an advanced, project-oriented class on designing or researching technology’s intersection with education. The course provides students with information about a large number of topics within educational technology, including pedagogical strategies, research methodologies, current tools, open problems, and broader issues.

COURSE HIGHLIGHTS

CS-6460 was an extremely open-ended course, so you'll likely get as much out of this course as you put into it. You're given the option to pursue one of three tracks:

1) Development - Work on a project/tool that will improve educational technology

2) Research - Conduct research on some field in educational technology, typically some form of study or survey of MOOCs

3) Content - Develop your own course material and/or MOOC

For myself, I chose to pursue a combination of the development and research tracks, using this course as a structured format to learn more about Natural Language Processing and its application to educational technology.

Course Gotchas

Start early, especially if you're taking this course during the summer semester! Look ahead at the assignments and prepare ahead of time. There is a substantial amount of writing in the first couple of weeks and you'll be reading A LOT of papers.
If you already have a topic or project you want to tackle, structure your research and preparation before the course starts. It'll make your life a lot easier since you can spend more time on development/research.
There is a course participation component. You should be able to get full marks just by completing regular peer feedback. Make sure you review how the points are calculated so you know, at a minimum, how many points you currently have. The course instructor will provide you with snapshots every month, but it'll also help to have a mental model of where you should be at any point in the semester.

Research Topic: Multi-Document Summarization

For my research topic, I investigated the problem of multi-document summarization of medical research for literature reviews as part of a shared task for the workshop on Scholarly Document Processing 2022. The goal of the task was to build a machine learning model that could be applied to any set of medical research documents and generate a succinct summary that is understandable by a medical researcher. This task used two datasets of review summaries derived from the scientific literature [1][2]. Participating teams were then evaluated using automated and human evaluation metrics.

I wasn’t able to make any improvements on the dataset benchmark, but I was able to establish some evidence that current summarization metrics are insufficient in measuring summarization accuracy. I also built a small web tool to demonstrate the viability of summarization models for future investigators. Luckily enough my work was accepted into the workshop and presented at the workshop proceedings at COLING 2022!