Coinbase Cuts AI Spending Nearly in Half, Makes Open-Weight Models Default Option

According to BlockBeats, Coinbase CEO Brian Armstrong said on June 27 that the key to keeping AI costs stable while token usage grows exponentially is not restricting usage but using better default models and caching. Coinbase now defaults to open-weight models such as GLM 5.2 and Kimi 2.7 through its LLM gateway, while still encouraging engineers to pick the appropriate model for a specific task. The company noted that 91% of employees never hit their usage caps, so instead of lowering quotas it switched to lower-cost default models.
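
As a rough illustration of the "default model with per-task override" pattern described above, a gateway could resolve the model roughly as follows. This is a minimal sketch, not Coinbase's implementation: the `Request` type, the `resolve_model` function, and the model identifier string are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical identifier: the article names the model, not its gateway ID.
DEFAULT_MODEL = "glm-5.2"


@dataclass
class Request:
    prompt: str
    model: Optional[str] = None  # explicit choice by the engineer, if any


def resolve_model(request: Request, default: str = DEFAULT_MODEL) -> str:
    """Return the caller's explicit model, otherwise the gateway default."""
    return request.model or default
```

Under this assumption, changing the single default shifts the cost of every request that does not name a model, while engineers who need a specific model keep that choice.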

Coinbase has also implemented cache-aware request handling and model routing that takes cache hit rates into account. For example, after the cache implementation was optimized, the cache hit rate for LibreChat rose from 5% to 60%. With these practices, Coinbase has cut AI spending by nearly half even as token usage continues to grow.
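
The article does not say how routing on cache hit rates works in practice. One possible reading, sketched below under stated assumptions, is that the gateway tracks how many prompt tokens each backend served from cache and prefers the backend with the highest observed rate. The `CacheStats` class, the `pick_backend` function, and the backend names are hypothetical.

```python
from collections import defaultdict
from typing import Iterable


class CacheStats:
    """Tracks cached vs. total prompt tokens per backend (hypothetical)."""

    def __init__(self) -> None:
        self.cached = defaultdict(int)
        self.total = defaultdict(int)

    def record(self, backend: str, cached_tokens: int, prompt_tokens: int) -> None:
        # Accumulate token counts reported for one completed request.
        self.cached[backend] += cached_tokens
        self.total[backend] += prompt_tokens

    def hit_rate(self, backend: str) -> float:
        # Fraction of prompt tokens served from cache; 0.0 if no data yet.
        total = self.total[backend]
        return self.cached[backend] / total if total else 0.0


def pick_backend(stats: CacheStats, candidates: Iterable[str]) -> str:
    """Choose the candidate with the highest observed cache hit rate."""
    return max(candidates, key=stats.hit_rate)
```

For example, after `stats.record("a", 5, 100)` and `stats.record("b", 60, 100)`, `pick_backend(stats, ["a", "b"])` selects `"b"`. A real router would presumably also weigh price, latency, and model quality, none of which the article details.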
