AI Models

Clicking the links below will open the listed AI tools in new tabs in your browser. The list includes the frontier LLM models (proprietary, open-source, regional and reasoning) as well as consolidators (like Poe) and a few fine-tuned models (like Latimer). Agents and browser extensions are below, but API tools are now on a separate page.

PROPRIETARY FRONTIER MODELS: Here are some AI models you should know. They are from different companies, using different neural networks and with different personalities and abilities. The paid versions are often substantially better and smarter.

THE BIG THREE 

There are now thousands of GenAI products, models and tools. Some models are more creative writers while others are better at math or video. Specialized models (below) offer more realistic conversation (Sesame), better infographics (Napkin) or scientific research tools (Future House), but if you want to use just one model, the big three (ChatGPT from OpenAI, Gemini from Google and Claude from Anthropic) still provide the best collection of power and features in one place.

All of them have slow, medium and fast models (slow for analysis, medium for most tasks, and fast chat when you just need a quick idea from your creative Uncle Claude). ChatGPT 5 will even select the right mode for your prompt. The big three also offer the most standard features in one place: reasoning and deep research models, voice mode, visual mode (to see images and documents), the ability to create images, documents and code (and run the code) and a mobile app. There are specialty tools for video (Hailuo, Kling and Google’s own Veo 3) but you can also make video natively in ChatGPT and Gemini; making video within a smarter multimodal model allows for easier prompting but offers fewer cinematic tools. Even these general tools are much better at math now.

What you get for free changes constantly and the names and placements of models are a confusing mess. You will need to hunt for buttons and drop-down menus to find what you want while you ponder the difference between Deep Research (Gemini) and Extended Thinking (Claude).

You will need to log in, even for the free versions, but this also allows you to adjust your settings to turn off training features. You may or may not want ChatGPT to remember everything about you in order to respond better.

Start with a month of a paid model and explore. (Poe and other consolidators allow you to try a variety of models through one platform for one fee.) Also try the app (and voice mode): it will give you a better idea of how AI is about to change everything. Here is a chart of the features (and what they are called) in some basic models. I attempt to update this, but the specific names of the models change very often.

AI | Fast Chat | Medium | Slow Analysis | Internet Search | Voice | Doc Access | Create Docs
ChatGPT | 5 Mini | 5 Auto | 5 Thinking or 5 Pro | Search button | Yes | Upload only | Canvas
Gemini | Flash | Pro | Deep Research | Integrated with Google | App only | Google Drive | Canvas
Claude | Haiku | Sonnet | Opus & “Extended Thinking” | Web search default is “on” | App only | Google Drive | Artifacts
Grok | Grok 3 | Grok 4 | Deeper Search | Default “on” Deep Search | App only | X | Yes
DeepSeek | LLM | V3 | R1 + Deep Research | Search button | Not yet | No | No
Qwen | Flash | 3 | 3 + Thinking + Deep Research | Search button | Yes | No | Yes
Perplexity | Sonar | Model Choice | Model Choice + Deep Research | Designed for search from the beginning | Yes | No | Labs & Spaces

Copilot is essentially another version of ChatGPT (Microsoft is a major investor in OpenAI) that integrates with Microsoft products, so it is better with Excel and PowerPoint, but none are great with that yet. If your organization gives you access to this in MS Office you should also be FERPA and HIPAA compliant. In Copilot you can try Think Deeper (the o1 reasoning model). Paid ChatGPT ($20/month) gets you more advanced versions of all of this and the ability to create custom GPTs (see Custom Bots).

You will also find other differences in the big three: Gemini gives you the option of turning your research report into a podcast, an infographic or a quiz. Claude seems more creative and better at writing (if overly verbose) and GPT 4.5 is a specialized writing tool. Google also has a deal with the Associated Press to get real-time news updates. Also try Google AI Studio, which has the latest betas and cool new tools.

OTHER MODELS

  • Grok 4 (released July 9, 2025) is at the top of many leaderboards, but is designed with “extreme personalities.” It can analyze images and has real-time access to the internet and the social network X. Grok Studio is a collaborative space and also has direct Google Drive integration.
  • Sonus is a new “family” of proprietary models (Pro, Air, Mini and Pro with Reasoning, see below) that is already competitive with the very best existing models. For the moment, this is the best (only) way you can try the new reasoning models for free: here.
  • WolframAlpha combines the computational powers of Wolfram|Alpha with ChatGPT. Google’s AlphaGeometry 2 now competes at the level of gold-medal students in the International Mathematical Olympiad, but is not generally available. Even the general models are much better at math than earlier AI. Read this about AI and Math.
  • Ernie from Baidu (the Chinese Google search engine) is another multimodal frontier model. Ernie X1 is a reasoning model.
  • Pi is focused on emotional intelligence, dialogue and role-playing. At the moment, it can talk, but can’t hear, so you have to text it.
  • You.com is set up to be a search engine competitor to Google but with more privacy and easier customization (so it now faces competition from Gemini, but also ChatGPT Search). 
  • Amazon Nova is a family of good Class 2 models (so on a par with GPT-4).
  • Ethan Mollick has written an excellent summary (Jan 26, 2025) of the differences and how to pick which model to use. 

OPEN SOURCE MODELS: There are now open source models that are just as good as the best proprietary frontier models, and even better in some specialized areas. You can download most of these models from GitHub, Azure (Microsoft) or HuggingFace. You can then fine-tune and run them on your laptop, which deals with most privacy issues but also transfers the security risk to you.

  • Chinese DeepSeek is strong with text and can also search the web. You can try the excellent R1 model here. It is a cheaper API option and it was built for a fraction of the price/chips/energy of the big models through clever engineering: Multi-head Latent Attention (MLA) compresses what the model has to keep in memory while it reads, lower-precision arithmetic turned out to be good enough, and a mixture-of-experts design means that only a fraction of the model is active for any one token, for huge energy, cost and time savings. Here is a great non-technical summary of why DeepSeek is important or you could read this tech paper.
  • Qwen 3 from Alibaba has a range of models that can do all of the usual things and allows you to determine the reasoning level with a slider. You can also use Qwen Chat in guest mode without a login (although you have to log in to use voice and some other features). Qwen2.5 beat GPT-4o, Claude-3.5 Sonnet, and DeepSeek-V3 while Qwen2-Math does very well at math.
  • Meta AI is now Llama 4 (and now a family of models), which is a huge Class 3 model (which means it can remember more pages than others), but Meta also seems to have fudged the benchmarks. It does not require a login.
  • Mistral (available as an API and Le Chat and also in a reasoning version called Magistral) is an open source LLM from France that has real-time internet search (with press wire access for news) and is very fast and more multilingual than the big three. It also creates great images using Black Forest Labs Flux Ultra. Mistral Medium outperformed Claude 3.7 and GPT-4o in many benchmarks.
  • Kimi is an excellent free multimodal open source Chinese AI that has a particularly large context window (good for long papers, prompts, and conversations; you can upload 50 files of up to 100MB each), does very well in math and coding (beating GPT-4o and Claude Sonnet 3.5 on Codeforces), searches the web, can analyze charts and also has reasoning.
  • MiniMax is another excellent Chinese AI company that has open source reasoning models (M1) and other tools including video and agents.
  • Deep Cognito also has a family of open-source models in a variety of sizes.
  • MiMo from Xiaomi is an open-source reasoning model that outperforms o1-mini.
  • HuggingChat (from HuggingFace) is a chatbot running on Llama. Start here to get a sense of what open source can do. No login is required.
  • Falcon (Mamba 7B) is an open source LLM from the UAE that uses a new “state space” architecture (SSLM) instead of the transformer architecture.

REGIONAL and CULTURALLY-SPECIFIC MODELS: Since people and cultures think differently, we are starting to see LLMs that are trained on culturally specific data sets. Note that if you want a culturally specific answer, you can and should still try this with the frontier models (try asking Claude to respond as a Black professor and compare the response to Latimer).

  • Latimer (named after African-American engineer Lewis Latimer) aims to better represent diverse communities by adding further training from (verified and licensed) books, oral histories and sources from Black and Brown communities. (Latimer is a fine-tuned version of Llama.)
  • Fanar is a “culturally and regionally aware” Arabic LLM fluent in Arabic dialects from the Qatar Ministry of Communications and Information Technology (MCIT) and the Qatar Computing Research Institute of Hamad Bin Khalifa University (HBKU).
  • Doubao (from ByteDance) also has voice mode and is one of the most popular AI tools in China. (It is in Chinese, but can be used in English with Google translation in your browser.)
  • Mistral Saba is a 24B parameter version of the French Mistral model trained on curated datasets from across the Middle East and South Asia. It supports Arabic and many Indian-origin languages like Tamil.  
  • Nanda is trained on a dataset containing 65 billion Hindi tokens.
  • Jais is “trained on the largest Arabic dataset ever used to train an open-source foundational model, ensuring linguistic accuracy and cultural sensitivity across standard Arabic and its dialects.”
  • GigaChat is an open source Russian language model. Claude, Gemini and DeepSeek all do better on MERA (the leaderboard for testing how models do with Russian tasks) but it is still useful to have a cultural Russian model.
  • Sherkala is pre-trained on Kazakh and English sources with some Russian and Turkish sources.
  • Hunyuan from Chinese Tencent is a 13B parameter open-source model that scores on par with DeepSeek despite being 15% of the size (so it is more efficient), and it is another native Chinese speaker.
  • LatamGPT from Chile’s National Center for Artificial Intelligence (Cenia) is an open source model trained on “characteristic data from Latin America” and is due to launch in fall of 2025.
  • There is also a group of behavioral scientists exploring the training of LLMs on historic texts (Viking, Latin, Medieval Arabic, etc.).

REASONING MODELS: The next big thing is models that work through problems before answering. They do NOT actually reason (although it appears that way) but they have internal instructions that break problems down into steps, which (especially when combined with web searching) improves accuracy and allows much more complicated problem solving. You need to use them a little differently (more here): give one something hard to do and note (or ask) how it describes its reasoning. Look at this example. The progress here has been rapid and substantial (read this report about the new o3 from Dec 2024), but in mid-January 2025 they were mostly behind paywalls. Many of the free versions now also include a button that activates this feature.

  • ChatGPT (o3 and 4.5) has a “Thinking” or “Reason” mode in the chat box (the button types and location keep changing). You can also access OpenAI’s o1 model through Copilot by clicking the Think Deeper button. With a paid Pro subscription there are many more models, but the numbering is a joke.
  • In Gemini this is the “Deep Reasoning” button (which includes web search), also available in Google’s AI Studio.
  • DeepSeek’s v3 R1 is also open source, which means you can download and build with it. You can try DeepSeek R1 on an American server using Perplexity (Pro Search). Here is a great non-technical summary of how DeepSeek works that includes a good summary of how reasoning models work.
  • Qwen 3 (another Chinese open source reasoning model) allows you to control how long it thinks, so you get to decide based upon the difficulty of the task. It is also available in HuggingFace.
  • Magistral is the reasoning version of Mistral. It is close to DeepSeek in scores but falls behind in math and code.
  • Kimi also has reasoning, but I’ve not tried it yet.
  • Claude.ai Sonnet also has a “thinking mode” for paid subscribers.
  • MiniMax has an open-source reasoning model.
  • Manus is an agent (see below) but also works as a reasoning model that can conduct research and produce documents.
  • Sonus Pro with Reasoning (a new model from a start-up!): select the Pro version with Reasoning turned on.

EpochAI is an important independent organization that is keeping track of these models, how they compare and where we might be going. They maintain a great dashboard comparing capabilities of the best models (against their own benchmarks) and also this larger data set of virtually all models. They produce excellent reports about trends including a recent prediction that AI will continue to improve rapidly.

CONSOLIDATORS: Poe (currently $5/month!!), ChatPlayground ($17/month) and ChatHub are consolidators that provide access to multiple AI models through one interface.

CUSTOM BOTS & TUTORS: How to prompt a simulation and how to build a custom bot have their own pages. Each of the big models also has a way to build and then distribute your own fine-tuned applications, but there are also educational platforms like BoodleBox, which allows the teacher to see everything students do and has lots of other faculty features like “coach mode,” which is the chat default (and won’t provide students with direct answers). There are also GPTs (from OpenAI), Assistants (from HuggingFace) and Bots (from Poe). Faculty-developed writing tutors, for example, include one from Mark Marino, AI Tutor Pro from a group of Canadian faculty and MyEssayFeedback (in beta) from Eric Kean. The University of Sydney has now created Cogniti, which also allows complete security and control over student use, but there will be some institutional cost (and I think pricing is still being worked out for other countries).

  • BoodleBox, SchoolAI, MagicSchool and Khanmigo all provide tools to help with specific tasks that are free, FERPA compliant and secure. This includes creating specialized tutors. BoodleBox has these instructions. In SchoolAI go to Spaces and then Create. You can simply prompt it (“Help students master content X by providing an overview and asking questions,” etc.) or you can upload documents and set a standard for mastery. Importantly, SchoolAI also has a backend that tells you how students have engaged and what they might still be confused about. Here is a great example (Solving Linear Equations in One Variable, from Rebecca Tyler at Great Falls College MSU).
  • Snorkl provides feedback to students on their verbal or visual thinking.
  • How to Build Your Own Customized Chatbot (free chapter from Levy and Albertos (2024), Teaching Effectively with ChatGPT).
  • How to use Speaker Progress in Microsoft Teams to get feedback on your/student presentations

MINI MODELS and EDGE AI: These are smaller, faster and more specialized (often) OPEN SOURCE tools that you customize to live and run on your phone. Note that the ways to make an LLM better are model size (see Frontier models above), data set size and the amount of training. Since it is not clear that larger, more capable models will be cost effective, these faster, smaller models (with more training) may end up being more useful. Apple Intelligence will test this idea. More small models are coming.

  • Phi-3.5 from Microsoft comes in three sizes Mini, Small and Medium (3.8-41B parameters)
  • OpenELM is the Apple version that comes in four sizes (270M-3B parameters)
  • Gemma is the open source smaller model from Google also in several sizes

BROWSER EXTENSIONS: Google now has its Gemini AI built into its browser, but there are new browsers (like Dia, and Comet from Perplexity), with more to come, that will transform searching, research and shopping. You can also add an AI extension to your browser. If you use Chrome, some good free extensions are Perplexity AI (in the Chrome store here), SciSpace (which does everything SciSpace above does, but in your browser), Merlin AI, Bing, or Clipy AI: now every time you do a Google search, you will also get an AI response.

AGENTS: A chatbot can only chat with you, but an “agent” can plan and execute a series of tasks, like building you a website or finding information on your computer. Agents can use multiple tools and know when to switch, so an AI agent can manage a workflow. Here are details about the “Agent2Agent” (A2A) and “Model Context Protocol” (MCP) standards that create these two-way connections between agents, data sources and AI-powered tools.
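
To make the plan-and-execute idea a little more concrete, here is a toy sketch in Python. Everything in it is hypothetical: the two stub tools (search_web and write_file), the hard-coded plan and the run_agent helper are stand-ins I made up for illustration. A real agent would ask an LLM to write the plan and would call real services rather than these stubs.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stub tools: each takes a text argument and returns a text result.
def search_web(query: str) -> str:
    return f"[stub search results for '{query}']"

def write_file(content: str) -> str:
    return f"[stub: saved {len(content)} characters]"

# The agent's "toolbox": a lookup from tool name to function.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search_web": search_web,
    "write_file": write_file,
}

def run_agent(plan: List[Tuple[str, str]]) -> List[str]:
    """Execute each (tool_name, argument) step in order and log the results."""
    log = []
    previous = ""
    for tool_name, argument in plan:
        tool = TOOLS[tool_name]  # switch to whichever tool this step needs
        # "{previous}" lets a step reuse the output of the step before it.
        result = tool(argument.replace("{previous}", previous))
        log.append(f"{tool_name}: {result}")
        previous = result
    return log

# A fixed two-step plan; in a real agent the model itself would produce this.
plan = [
    ("search_web", "active learning techniques"),
    ("write_file", "Notes: {previous}"),
]
for line in run_agent(plan):
    print(line)
```

The point of the sketch is only the shape of the loop: a list of steps, a set of tools, and the output of one step feeding the next.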

Start by watching the demos from Genspark or Manus. Here are some use case examples. Then ask it to build you an interactive course website using the best research and including links to video and with interactive learning activities (or just a new episode of a TV show you like). I have also built simulations with Macaly, which pitches itself as more of a vibe-coding app. Here is a website created with MiniMax by Marc Watkins. ChatGPT o3 is starting to have some agentic capabilities. Devin is not quite there, but other early tools include Swarm and Codex (also open source!) from OpenAI, Claude Computer Use (Claude Code and even 3.7 can also do tasks like create and debug code), Kimi-Researcher and Asana. OpenAI has introduced Operator (Jan 23, 2025); it is called “computer use” in Copilot. Here is a demo (from Graham Clay) where Operator has been asked to write an essay in a Google Doc at human speed with edits. There are now lots of demos of agents doing students’ homework. Another use of agents (that is also about the growing use of synthetic data) is this simulated hospital with AI agents as both patients and doctors, which allowed the AI doctors to gain experience (treating 10,000 patients) and “evolve” to become better. LinkedIn has an agent that helps recruit job seekers (and also an AI jobs match tool). Zoom has also given its AI companion some agentic capability.

ROBOTS: There are new robots and robot platforms coming out every day, but start with the ridable hydrogen-powered Corleo from Kawasaki.

You can find a complete list of AI products (tracked by Ithaka S+R) here

Here is a great AI guide for students.