To share our experience with both open-source and closed-source LLMs: we initially considered adopting the open-source model Llama 2 for one of our use cases. We deployed the model on one of our cloud servers, and it performed satisfactorily. However, as our workload increased, scalability became a crucial factor. With over 10,000 hits per day in our scenario, we had to get more computing power out of our machines, which meant scaling up our infrastructure and incurring additional costs.
We then evaluated the OpenAI and Gemini models, and both met our performance expectations. On scalability, however, Gemini 1.0 proved more flexible than the OpenAI model. With the help of the Google support team, we could easily increase the per-day and per-minute rate limits, whereas OpenAI models gave us less room to do so. The cost was also lower than the hosting charges for the open-source model, so we ultimately chose Gemini 1.0.
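To make the rate-limit point concrete, here is a minimal sketch of how a daily request volume translates into a per-minute limit. The function names and the `peak_factor` parameter are illustrative assumptions, not part of any provider's API, and the peak factor of 5 is a placeholder rather than a measured figure.

```python
import math

MINUTES_PER_DAY = 24 * 60


def average_per_minute(requests_per_day: float) -> float:
    """Average request rate if traffic were spread evenly over the day."""
    return requests_per_day / MINUTES_PER_DAY


def required_per_minute_limit(requests_per_day: float, peak_factor: float = 1.0) -> int:
    """Per-minute limit needed to absorb peaks.

    peak_factor is a hypothetical multiplier: how much busier the
    busiest minute is than the daily average. Measure it from your
    own traffic; it is not a value taken from the article.
    """
    if requests_per_day < 0:
        raise ValueError("requests_per_day must be non-negative")
    if peak_factor < 1:
        raise ValueError("peak_factor must be at least 1")
    return math.ceil(average_per_minute(requests_per_day) * peak_factor)


if __name__ == "__main__":
    daily = 10_000  # the daily volume mentioned above
    print(required_per_minute_limit(daily))                 # evenly spread traffic
    print(required_per_minute_limit(daily, peak_factor=5))  # assumed 5x peak
```

Even traffic at 10,000 hits per day averages about 7 requests per minute, so the per-minute limit mainly matters when traffic is bursty. Comparing the result against a provider's published per-minute and per-day quotas shows whether a limit increase is needed.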
From a cost perspective, a closed-source model proved more economical for our particular use case. However, the choice between open-source and closed-source may still depend on specific user requirements. Factors such as data privacy, security, performance, and reliability are critical in the decision-making process.
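A comparison like this can be sketched as simple arithmetic: pay-per-token API pricing on one side, always-on hosting on the other. Everything in the sketch below is an assumption for illustration. The function names are hypothetical, and the prices, token counts, and instance counts are placeholders, not our actual figures or any vendor's real rates.

```python
def monthly_api_cost(
    requests_per_day: float,
    tokens_per_request: float,
    price_per_1k_tokens: float,
    days: int = 30,
) -> float:
    """Cost of a pay-per-token API over one month."""
    total_tokens = requests_per_day * tokens_per_request * days
    return total_tokens / 1000 * price_per_1k_tokens


def monthly_hosting_cost(
    instances: int,
    hourly_rate: float,
    days: int = 30,
) -> float:
    """Cost of self-hosting on always-on cloud instances for one month."""
    return instances * hourly_rate * 24 * days


if __name__ == "__main__":
    # All values below are hypothetical placeholders; substitute your own.
    api = monthly_api_cost(
        requests_per_day=10_000,
        tokens_per_request=1_000,
        price_per_1k_tokens=0.001,
    )
    hosting = monthly_hosting_cost(instances=2, hourly_rate=1.5)
    cheaper = "API" if api < hosting else "self-hosting"
    print(f"API: {api:.2f}  hosting: {hosting:.2f}  cheaper: {cheaper}")
```

The result flips depending on the inputs: high, steady volume with long prompts favors fixed hosting, while moderate or bursty volume tends to favor per-token pricing. The sketch also leaves out engineering time, monitoring, and the non-cost factors listed above.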
The journey of finding the right LLM is unique for each organization, and we hope our insights help others navigate their own path!