<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[We Cut Our OpenAI Costs by 50% Without Changing the Model]]></title><description><![CDATA[<p dir="auto"><img src="/assets/uploads/files/1771382665499-micheile-henderson-zvprbbmt8qa-unsplash.jpg" alt="micheile-henderson-ZVprbBmT8QA-unsplash.jpg" class=" img-responsive img-markdown" /></p>
<p dir="auto">When we launched our AI chatbot product, it worked beautifully.</p>
<p dir="auto">Great responses. Happy users. Strong engagement.</p>
<p dir="auto">But there was one problem: our cost.</p>
<p dir="auto">On average, our OpenAI cost per user was around $5 per month.</p>
<p dir="auto">At first, that didn’t feel scary. But then we did the math.</p>
<p dir="auto">If we scaled to 1,000 users, that would be $5,000 per month.</p>
<p dir="auto">LLM API pricing is token-based, so our bill depended on how many requests we made and how many tokens each request carried. We couldn’t reduce requests, but we could reduce tokens.</p>
<p dir="auto">And that’s where TOON came in.</p>
<p dir="auto"><strong>What Is TOON?</strong><br />
TOON (Token-Oriented Object Notation) is a compact, human-readable encoding of the JSON data model designed specifically for LLM prompts.</p>
<p dir="auto"><a href="https://toonformat.dev" target="_blank" rel="noopener noreferrer nofollow ugc">https://toonformat.dev</a></p>
<p dir="auto">Think of it as a translation layer:</p>
<p dir="auto">Use JSON programmatically in your backend.<br />
Convert it to TOON before sending it to the LLM.<br />
After you get the response, convert it back to JSON.</p>
<p dir="auto">TOON combines:</p>
<p dir="auto">YAML-style indentation for nesting<br />
CSV-style tabular format for uniform arrays</p>
<p dir="auto"><strong>The Real Problem: JSON Is Token-Expensive</strong><br />
Imagine our chatbot needs to send product data to the model:</p>
<pre><code>
[
  {
    "id": 201,
    "title": "Wireless Bluetooth Headphones",
    "seoTitle": "Best Wireless Bluetooth Headphones 2025",
    "seoDescription": "Premium noise-cancelling wireless headphones with 30-hour battery life.",
    "tags": ["electronics", "audio", "wireless"],
    "price": 129.99,
    "currency": "USD",
    "active": true
  },
  {
    "id": 202,
    "title": "Ergonomic Office Chair",
    "seoTitle": "Comfortable Ergonomic Office Chair for Long Hours",
    "seoDescription": "Adjustable lumbar support chair designed for productivity and comfort.",
    "tags": ["furniture", "office", "comfort"],
    "price": 249.00,
    "currency": "USD",
    "active": true
  }
]
</code></pre>
<p dir="auto">At first glance, this looks normal.</p>
<p dir="auto">But look carefully.</p>
<p dir="auto">For every product, we repeat:</p>
<pre><code>"id"
"title"
"seoTitle"
"seoDescription"
"tags"
"price"
"currency"
"active"
</code></pre>
<p dir="auto">That’s 8 keys repeated for every single row.</p>
<p dir="auto">If you have 100 products, that’s 800 repeated field names.</p>
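<p dir="auto">To get a feel for that overhead, here is a minimal sketch that counts how many times field names appear in a JSON payload and how many characters they take up. The sample records and the <code>key_overhead</code> helper are hypothetical, and characters are only a rough stand-in for tokens.</p>
<pre><code>import json

# Hypothetical sample, shaped like the product records above.
products = [
    {"id": 201, "title": "Wireless Bluetooth Headphones", "price": 129.99},
    {"id": 202, "title": "Ergonomic Office Chair", "price": 249.00},
]


def key_overhead(records):
    """Return (key_occurrences, key_chars, total_chars) for compact JSON."""
    total_chars = len(json.dumps(records, separators=(",", ":")))
    # Every record repeats every field name once.
    key_occurrences = sum(len(record) for record in records)
    # Each key costs its own length plus two quotes and a colon.
    key_chars = sum(len(key) + 3 for record in records for key in record)
    return key_occurrences, key_chars, total_chars


occurrences, key_chars, total_chars = key_overhead(products)
print("key occurrences:", occurrences)
print("characters spent on keys:", key_chars, "of", total_chars)
</code></pre>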
<p dir="auto">LLMs don’t need that repetition.</p>
<p dir="auto">They just need structure.</p>
<p dir="auto"><strong>The Same Data in TOON</strong></p>
<pre><code>products[2]{id,title,seoTitle,seoDescription,tags,price,currency,active}:
  201,"Wireless Bluetooth Headphones","Best Wireless Bluetooth Headphones 2025","Premium noise-cancelling wireless headphones with 30-hour battery life.","electronics|audio|wireless",129.99,USD,true
  202,"Ergonomic Office Chair","Comfortable Ergonomic Office Chair for Long Hours","Adjustable lumbar support chair designed for productivity and comfort.","furniture|office|comfort",249.00,USD,true
</code></pre>
<p dir="auto">What changed?</p>
<p dir="auto">Field names declared once<br />
No repeated keys<br />
No curly braces per row<br />
No structural duplication<br />
Tags flattened into a single pipe-delimited string so each row holds only simple values</p>
<p dir="auto">Same information, far fewer tokens.</p>
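<p dir="auto">As an illustration of the JSON to TOON direction, here is a simplified, hand-rolled encoder for flat records. It is not the official TOON library and it skips parts of the format (for example, it does not normalize numbers such as 249.0). The names <code>encode_table</code> and <code>quote_if_needed</code> are made up for this sketch.</p>
<pre><code>def looks_numeric(text):
    """True if a string would be read back as a number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def quote_if_needed(value):
    """Render one primitive cell, quoting strings that would be ambiguous."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    text = str(value)
    risky = (
        text == ""
        or text != text.strip()
        or any(ch in text for ch in ',:"\\\n')
        or text in ("true", "false", "null")
        or looks_numeric(text)
    )
    if not risky:
        return text
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'


def encode_table(name, rows):
    """Encode a non-empty list of flat, same-keyed dicts as one tabular block."""
    fields = list(rows[0])
    for row in rows:
        if list(row) != fields:
            raise ValueError("rows must share the same keys in the same order")
    # Header declares the row count and the field names exactly once.
    lines = [name + "[" + str(len(rows)) + "]{" + ",".join(fields) + "}:"]
    for row in rows:
        lines.append("  " + ",".join(quote_if_needed(row[f]) for f in fields))
    return "\n".join(lines)


# Hypothetical records; tags are already joined into one string.
products = [
    {"id": 201, "title": "Wireless Bluetooth Headphones",
     "tags": "electronics|audio|wireless", "price": 129.99, "active": True},
    {"id": 202, "title": "Chair, Ergonomic",
     "tags": "furniture|office", "price": 249.0, "active": True},
]
print(encode_table("products", products))
</code></pre>
<p dir="auto">Only the second title gets quoted, because it contains a comma. Everything else goes out bare.</p>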
<p dir="auto"><strong>Our Case</strong><br />
In our production workload we use Python, and integrating TOON was straightforward.</p>
<p dir="auto">We only modified the translation layer:</p>
<p dir="auto">Convert JSON → TOON before sending to the LLM<br />
Convert TOON → JSON after receiving the response</p>
<p dir="auto">That was it.</p>
<p dir="auto">By changing only the input and output formatting layer, we reduced our OpenAI costs by nearly 50%.</p>
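<p dir="auto">For the return trip, here is a rough sketch of turning one tabular block back into Python objects. It is a simplified parser written for illustration, not the official TOON decoder, and it only handles a single flat table. The helper names and the sample <code>reply</code> text are assumptions.</p>
<pre><code>import json


def split_row(line):
    """Split a row on commas, honouring double-quoted cells.

    Returns a list of (text, was_quoted) pairs.
    """
    cells, buf, quoted, in_quotes = [], [], False, False
    chars = iter(line)
    for ch in chars:
        if in_quotes and ch == "\\":
            # Backslash escape inside quotes: take the next character.
            nxt = next(chars, "")
            buf.append("\n" if nxt == "n" else nxt)
        elif ch == '"':
            in_quotes = not in_quotes
            quoted = True
        elif ch == "," and not in_quotes:
            cells.append(("".join(buf), quoted))
            buf, quoted = [], False
        else:
            buf.append(ch)
    cells.append(("".join(buf), quoted))
    return cells


def parse_cell(text, was_quoted):
    """Turn one cell back into a Python value."""
    if was_quoted:
        return text
    text = text.strip()
    literals = {"true": True, "false": False, "null": None}
    if text in literals:
        return literals[text]
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def decode_table(block):
    """Decode one tabular block into (name, list_of_dicts)."""
    lines = [ln for ln in block.splitlines() if ln.strip()]
    name, rest = lines[0].strip().split("[", 1)
    count_text, rest = rest.split("]", 1)
    fields = rest.strip().rstrip(":").strip("{}").split(",")
    rows = []
    for line in lines[1:]:
        values = [parse_cell(t, q) for t, q in split_row(line.strip())]
        if len(values) != len(fields):
            raise ValueError("row width does not match header: " + line)
        rows.append(dict(zip(fields, values)))
    # The declared length is a cheap check on truncated model output.
    if len(rows) != int(count_text):
        raise ValueError("declared length does not match row count")
    return name, rows


# Hypothetical model reply in tabular form.
reply = """products[2]{id,title,price,active}:
  201,Wireless Bluetooth Headphones,129.99,true
  202,"Chair, Ergonomic",249,false"""

name, rows = decode_table(reply)
print(json.dumps({name: rows}, indent=2))
</code></pre>
<p dir="auto">Checking the declared row count and the row width is worth keeping even in a quick sketch, since model output can be cut off or malformed.</p>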
<p dir="auto">Based on our experience, TOON works best when:</p>
<p dir="auto">Your data is flat or mostly flat<br />
You’re sending lists of similar objects<br />
Your structure is simple or semi-structured<br />
You’re not feeding deeply nested hierarchies</p>
<p dir="auto">In our case, we weren’t sending complex nested objects to the model. Most of our chatbot context consisted of structured records (product data, analytics summaries, event logs), which fit perfectly into TOON’s tabular format.</p>
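<p dir="auto">One way to check whether a payload fits this shape is a small helper like the one below. It is only a sketch: <code>is_tabular</code> is a hypothetical name, and the rule it applies (same keys everywhere, no nested dicts or lists) is our reading of the conditions listed above.</p>
<pre><code>def is_tabular(records):
    """True if records is a non-empty list of flat dicts with identical keys."""
    if not isinstance(records, list) or not records:
        return False
    if not all(isinstance(record, dict) for record in records):
        return False
    keys = set(records[0])
    for record in records:
        if set(record) != keys:
            return False
        # Nested containers do not fit into a single table row.
        if any(isinstance(value, (dict, list)) for value in record.values()):
            return False
    return True


flat = [{"id": 1, "title": "A"}, {"id": 2, "title": "B"}]
nested = [{"id": 1, "tags": ["audio", "wireless"]}]
# Joining the list into one string makes the record flat again.
flattened = [dict(record, tags="|".join(record["tags"])) for record in nested]

print(is_tabular(flat), is_tabular(nested), is_tabular(flattened))
</code></pre>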
<p dir="auto">If your data looks like:</p>
<pre><code>[
 { "id": "", "title": "", "price": "" },
 { "id": "", "title": "", "price": "" }
]
</code></pre>
<p dir="auto">You’re likely paying for repeated keys you don’t actually need.</p>
<p dir="auto"><strong>Final Thought</strong><br />
If you’re building with LLMs, take a closer look at your token usage.</p>
<p dir="auto">Before changing models or reducing features, inspect the structure of the data you’re sending.</p>
<p dir="auto">In many cases, the waste isn’t in the intelligence — it’s in the formatting.</p>
<p dir="auto">We only changed our translation layer.</p>
<p dir="auto">Nothing else.</p>
<p dir="auto">And that alone reduced our OpenAI costs by nearly 50%.</p>
<p dir="auto">Try TOON in your own workload.<br />
Measure the difference.<br />
Run the numbers.</p>
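<p dir="auto">A simple way to start measuring is to compare the size of the same payload in both formats. The sketch below uses character count as a crude proxy and accepts any counting function, so you can pass in your tokenizer’s counter instead. The function name and sample data are hypothetical, and the result says nothing about our production numbers.</p>
<pre><code>import json


def savings_percent(before, after, count=len):
    """Percentage reduction from before to after under the given counter."""
    base = count(before)
    if base == 0:
        return 0.0
    return round(100.0 * (base - count(after)) / base, 1)


# Hypothetical payload in both formats.
records = [
    {"id": 201, "title": "Wireless Bluetooth Headphones", "price": 129.99},
    {"id": 202, "title": "Ergonomic Office Chair", "price": 249.0},
]
json_text = json.dumps(records, indent=2)
toon_text = "\n".join([
    "products[2]{id,title,price}:",
    "  201,Wireless Bluetooth Headphones,129.99",
    "  202,Ergonomic Office Chair,249.0",
])

# Character count is only a rough stand-in for tokens.
print("reduction by characters:", savings_percent(json_text, toon_text), "%")
</code></pre>
<p dir="auto">Run it on real prompts from your own logs, since the savings depend heavily on how repetitive your data is.</p>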
<p dir="auto">You may find that scaling becomes much more affordable than you expected.</p>
]]></description><link>https://lankadevelopers.lk/topic/3034/we-cut-our-openai-costs-by-50-without-changing-the-model</link><generator>RSS for Node</generator><lastBuildDate>Tue, 16 Jun 2026 09:08:29 GMT</lastBuildDate><atom:link href="https://lankadevelopers.lk/topic/3034.rss" rel="self" type="application/rss+xml"/><pubDate>Wed, 18 Feb 2026 02:46:08 GMT</pubDate><ttl>60</ttl></channel></rss>