This AI Paper Proposes Retentive Networks (RetNet) as a Foundation Architecture for Large Language Models: Achieving Training Parallelism, Low-Cost Inference, and Good Performance

The Transformer, originally developed to overcome the sequential training bottleneck of recurrent models, has since become the de facto architecture for large language models. However, the Transformer's O(N) per-step complexity and memory-bound key-value cache make it costly to deploy, trading training parallelism for inefficient inference. As sequences grow longer, inference slows,…
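To see why the key-value cache is memory-bound, a back-of-the-envelope sketch helps: the cache grows linearly with sequence length, since every layer must keep its keys and values for all previous tokens. The model dimensions below are illustrative assumptions (a hypothetical 7B-scale configuration), not figures from the paper.

```python
def kv_cache_bytes(n_layers, n_heads, head_dim, seq_len, bytes_per_elem=2):
    """Size of the KV cache: 2 tensors (K and V) per layer,
    each of shape (n_heads, seq_len, head_dim)."""
    return 2 * n_layers * n_heads * head_dim * seq_len * bytes_per_elem

# Assumed config: 32 layers, 32 heads of dim 128, fp16 (2 bytes/element).
for seq_len in (1_024, 8_192, 32_768):
    gib = kv_cache_bytes(32, 32, 128, seq_len) / 2**30
    print(f"seq_len={seq_len:>6}: {gib:.1f} GiB per sequence")
```

Under these assumptions the cache goes from 0.5 GiB at 1K tokens to 16 GiB at 32K tokens for a single sequence, which is the deployment cost RetNet's recurrent inference mode is designed to avoid.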

We Know That LLMs Can Use Tools, But Did You Know They Can Also Make New Tools? Meet LLMs As Tool Makers (LATM): A Closed-Loop System Allowing LLMs To Make Their Own Reusable Tools

Large language models (LLMs) have excelled at a wide range of NLP tasks and have shown encouraging signs of achieving some features of artificial general intelligence. Recent research has also revealed the possibility of supplementing LLMs with external tools, considerably increasing their problem-solving capabilities and efficiency, similar to how human intelligence has evolved. However, the…

Researchers from UC Berkeley Introduce Gorilla: A Finetuned LLaMA-based Model that Surpasses GPT-4 on Writing API Calls

A recent breakthrough in the field of Artificial Intelligence is the introduction of Large Language Models (LLMs). These models let us process language far more effectively, powering advances in Natural Language Processing (NLP) and Natural Language Understanding (NLU). They perform well across a wide variety of tasks, including text summarization, question…

Meet CHARM: A New Artificial Intelligence AI Tool that can Decode Brain Cancer’s Genome during Surgery for Real-Time Tumor Profiling

In a groundbreaking development, Harvard researchers have unveiled an artificial intelligence (AI) tool capable of rapidly decoding a brain tumor’s DNA during surgery, providing critical information that can significantly impact patient outcomes. This innovative technology, known as CHARM (Cryosection Histopathology Assessment and Review Machine), has the potential to revolutionize the field of neurosurgery by enabling…

Do You Really Need Reinforcement Learning (RL) in RLHF? A New Stanford Research Proposes DPO (Direct Preference Optimization): A Simple Training Paradigm For Training Language Models From Preferences Without RL

When trained on massive datasets, huge unsupervised LMs acquire capabilities that surprise even their creators. These models, however, are trained on text produced by people with a diverse range of motivations, objectives, and abilities, and not all of those objectives and behaviors are worth emulating. It is important to carefully select the model’s desired responses and…

Meet StyleAvatar3D: A New AI Method for Generating Stylized 3D Avatars Using Image-Text Diffusion Models and a GAN-based 3D Generation Network

Since the advent of large-scale image-text pairings and sophisticated generative architectures such as diffusion models, generative models have made tremendous progress in producing high-fidelity 2D images. These models eliminate manual involvement by allowing users to create realistic visuals from text prompts. However, due to the lack of diversity and accessibility of 3D training data compared to…

Meet LLaMaTab: An Open-Source Chrome Extension that Runs an LLM Entirely in the Browser

A Chrome add-on called LLaMaTab New Tab displays a different image of a llama every time a new tab opens. It’s a silly add-on, but it can keep one going when things become tough. LLaMaTab New Tab is a fantastic extension if one is using Chrome and wants…

Thinking Like an Annotator: Generation of Dataset Labeling Instructions

We are all amazed by the recent advancements in AI models. We have watched generative models evolve from quirky image-generation algorithms to the point where it is challenging to tell AI-generated content from the real thing. All these advancements are made possible thanks to two main points….

3 Questions: Honing robot perception and mapping

Walking to a friend’s house or browsing the aisles of a grocery store might feel like simple tasks, but they in fact require sophisticated capabilities. That’s because humans are able to effortlessly understand their surroundings and detect complex information about patterns, objects, and their own location in the environment. What if robots could perceive their…