On Tuesday, OpenAI announced a sizable update to its large language model API offerings (including GPT-4 and gpt-3.5-turbo), including a new function-calling capability, significant cost reductions, and a 16,000 token context window option for the gpt-3.5-turbo model.
In large language models (LLMs), the “context window” is like a short-term memory that stores the contents of the prompt input or, in the case of a chatbot, the entire contents of the ongoing conversation. In language models, increasing context size has become a technological race, with Anthropic recently announcing a 75,000-token context window option for its Claude language model. In addition, OpenAI has developed a 32,000-token version of GPT-4, but it is not yet publicly available.
Try this: https://openrouter.ai