Over the years, the impact of large language models on search has been profound, particularly with the emergence of ChatGPT and other chatbots. Today, we’ll delve into the world of Large Language Models and explore some of the cutting-edge LLMs that are shaping the landscape.
When discussing technology in 2023, it’s impossible to overlook trending subjects like Generative AI and large language models (LLMs), which play a crucial role in powering AI chatbots. Following the launch of OpenAI’s ChatGPT, the competition to develop the most exceptional LLM has intensified significantly.
Various players, including major corporations, startups, and the open-source community, are dedicating their efforts to crafting the most advanced large language models. The number of released LLMs has already surpassed hundreds, but which ones truly stand out as the most capable contenders?
Let’s explore some of the foremost large language models making waves today. They possess impressive natural language processing capabilities and are shaping the architecture of future models.
In 2023, OpenAI unveiled GPT-4, the latest and most extensive addition to their GPT series. Following the transformer-based architecture like its predecessors, GPT-4’s exact parameter count hasn’t been publicly disclosed, yet speculations suggest it exceeds a staggering 170 trillion.
What sets GPT-4 apart is its multimodal capability, allowing it to handle both text and images, a departure from its predecessors that focused solely on language. OpenAI introduced a notable feature known as the “system message,” empowering users to specify the tone and task they desire. For more ChatGPT tips, read our guide on How to Protect Your Sensitive Work Data When Using ChatGPT.
The academic realm witnessed GPT-4’s remarkable abilities, showcasing human-level performance in various exams. The model’s release triggered discussions about its potential proximity to artificial general intelligence (AGI), implying equivalence or even surpassing human intellect. Currently, GPT-4 plays a pivotal role in powering Microsoft Bing’s search engine, and it can be accessed via ChatGPT Plus, with plans of integration into Microsoft Office products.
GPT-3.5, an improved iteration of GPT-3, features a reduced parameter count. To enhance its performance, GPT-3.5 underwent fine-tuning through reinforcement learning with valuable input from human feedback. Notably, ChatGPT relies on GPT-3.5 as its core version. Within the GPT-3.5 family, there exist, multiple models, with the GPT-3.5 turbo standing out as the most capable, as stated by OpenAI. The training data for GPT-3.5 encompasses information up until September 2021.
While GPT-3.5 did find its way into integration with the Bing search engine initially, it has since been succeeded by the more advanced GPT-4.
Google’s AI chatbot Bard is powered by the Pathways Language Model (PaLM), a massive transformer-based model containing 540 billion parameters. Developed by Google and trained on multiple TPU 4 Pods, which are custom hardware designed for machine learning, PaLM exhibits remarkable abilities in handling reasoning tasks such as coding, math, classification, and question answering. Additionally, PaLM showcases its prowess in breaking down intricate tasks into more manageable subtasks.
The name “PaLM” is derived from Google’s Pathways research project, aimed at creating a unified model to cater to diverse use cases. Consequently, various fine-tuned versions of PaLM have been developed, each tailored for specific purposes. Med-PaLM 2, for example, is designed for life sciences and medical information applications, while Sec-PaLM focuses on cybersecurity deployments, accelerating the process of threat analysis.
In 2018, Google unveiled BERT, a family of Language Models (LLMs). BERT operates as a transformer-based model, proficient at transforming sequences of data into other sequences of data. Its architecture comprises a series of transformer encoders, incorporating a remarkable 342 million parameters. Initially, BERT underwent pre-training on an extensive corpus of data, following which it was fine-tuned to excel in various tasks, including natural language inference and assessing sentence text similarity.
Notably, BERT played a pivotal role in enhancing query comprehension in the 2019 version of Google search, proving its efficacy in improving search results and user experience.
Cohere, a startup driven by former Google Brain team members, including Aidan Gomez, a co-founder recognized for their contributions to the transformative “Attention is all you Need” paper, has set its sights on addressing generative AI needs for enterprises. Unlike many AI companies, Cohere focuses on catering to corporations seeking advanced AI solutions.
With a diverse range of models, Cohere offers options that vary in size, ranging from a mere 6B parameters to robust models trained on an impressive 52B parameters. If you’re a business owner on the lookout for the finest LLM to integrate into your product, exploring Cohere’s models would be a worthwhile endeavor.
Among Cohere’s recent innovations, the Cohere Command model stands out for its remarkable accuracy and robustness, earning commendation from the prestigious Stanford HELM, which ranked it highest in accuracy compared to its peers. Notably, prominent companies like Spotify, Jasper, and HyperWrite have embraced Cohere’s model to elevate their AI experiences.
The Technology Innovation Institute has introduced Falcon 40B, a transformer-based, causal decoder-only model. This remarkable model, specifically designed for English language tasks, comes with an impressive 40 billion parameters. The best part is, that it is an open-source project, allowing developers to access and leverage its capabilities freely.
For those looking for alternatives with fewer parameters, Falcon 1B (1 billion parameters) and Falcon 7B (7 billion parameters) offer scaled-down variants of the Falcon model, catering to various project needs and resource requirements.
In November 2022, Meta introduced Galactica, their LLM tailored for scientists, and within a mere three days of its public release, it caused quite a stir. Galactica underwent training on an extensive dataset consisting of 48 million academic papers, lecture notes, textbooks, and online resources. As is common with many models, it gave rise to AI “hallucinations” that raised concerns among the scientific community. These generated outputs sounded convincingly authoritative, making them challenging to identify swiftly, particularly in a domain where precision is paramount.
Since the leakage of LLaMA models online, Meta has taken a full-fledged approach towards open-source. They officially launched various sizes of LLaMA models, ranging from 7 billion parameters to 65 billion parameters. According to Meta, their LLaMA-13B model surpasses OpenAI’s GPT-3 model, which was trained on a massive 175 billion parameters.
Many developers have embraced LLaMA, leveraging it to fine-tune and create some of the most exceptional open-source models available. However, it’s important to note that LLaMA’s usage is limited to research purposes only, and it cannot be employed commercially, unlike TII’s Falcon model.
Speaking of the LLaMA 65B model, it has demonstrated remarkable capabilities across a wide range of applications. It proudly holds a place among the top 10 models on the Open LLM Leaderboard hosted by Hugging Face. Meta assures that no proprietary materials were involved in training the model; instead, they relied on publicly available data sources such as CommonCrawl, C4, GitHub, ArXiv, Wikipedia, StackExchange, and more.
Among the several LLaMA-derived models, Guanaco-65B has turned out to be the best open-source LLM, just after the Falcon model. In the MMLU test, it scored 52.7 whereas the Falcon model scored 54.1. Similarly, in the TruthfulQA evaluation, Guanaco came up with a 51.3 score and Falcon was a notch higher at 52.5. There are four flavors of Guanaco: 7B, 13B, 33B, and 65B models. All of the models have been fine-tuned on the OASST1 dataset by Tim Dettmers and other researchers.
As to how Guanaco was fine-tuned, researchers came up with a new technique called QLoRA that efficiently reduces memory usage while preserving full 16-bit task performance. On the Vicuna benchmark, the Guanaco-65B model outperforms even ChatGPT (GPT-3.5 model) with a much smaller parameter size.
The best part is that the 65B model has trained on a single GPU having 48GB of VRAM in just 24 hours. That shows how far open-source models have come in reducing cost and maintaining quality. To sum up, if you want to try an offline, local LLM, you can give a shot at Guanaco models.