Benchmark a stock with Equity Strategist
Run the Equity Strategist agent over a list of tickers in parallel and aggregate each rationale and its latency into a single table. The same harness shape works for A/B-testing against another provider.
import { Chaos, EQUITY_MODEL } from '@chaoslabs/ai-sdk';

const chaos = new Chaos({ apiKey: process.env.CHAOS_API_KEY! });
const tickers = ['NVDA', 'AAPL', 'MSFT', 'TSLA'];

const results = await Promise.all(
  tickers.map(async (t) => {
    const start = Date.now();
    const r = await chaos.chat.responses.create({
      model: EQUITY_MODEL,
      input: [
        {
          type: 'message',
          role: 'user',
          content: `Is ${t} fairly valued at the current price? Give a rationale.`,
        },
      ],
      metadata: {
        user_id: 'benchmark-runner',
        session_id: `benchmark-${t}-${start}`,
      },
    });

    // info blocks are emitted at runtime but aren't yet in the public Block helper union;
    // widen locally to read the discriminator + content.
    const blocks = (r.messages ?? [])
      .filter((m): m is Extract<typeof m, { type: 'block' }> => m.type === 'block')
      .map((m) => m.data.block as { type: string; content?: string });

    const rationale = blocks
      .filter((b) => b.type === 'info' && typeof b.content === 'string')
      .map((b) => b.content)
      .join('\n\n');

    return { ticker: t, rationale, latencyMs: Date.now() - start };
  })
);

console.table(results);
Adapting for benchmarking
To compare against an incumbent provider, swap the chaos.chat.responses.create(...) call for that provider's client and leave the rest of the harness as it is, so latency, rationale length, and answer quality can be compared side by side. Keep the Date.now() capture wrapped around each call: the delta measures end-to-end latency, including network and streaming time.
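One way to make that swap mechanical is to reduce each provider to a single "ticker in, rationale text out" function and run them all through the same timing wrapper. The sketch below is illustrative only: the names `Ask`, `Row`, `timeCall`, `runBenchmark`, and `summarize` are hypothetical and not part of `@chaoslabs/ai-sdk`, and it assumes each provider call can be wrapped to return its rationale as a plain string.

```typescript
// Hypothetical adapter type: any provider reduced to "ticker in, rationale out".
type Ask = (ticker: string) => Promise<string>;

interface Row {
  provider: string;
  ticker: string;
  rationale: string;
  latencyMs: number;
}

interface Summary {
  provider: string;
  runs: number;
  meanLatencyMs: number;
  meanRationaleChars: number;
}

// Time one call end to end with the same Date.now() delta as the recipe above.
async function timeCall(provider: string, ticker: string, ask: Ask): Promise<Row> {
  const start = Date.now();
  const rationale = await ask(ticker);
  return { provider, ticker, rationale, latencyMs: Date.now() - start };
}

// Start every (provider, ticker) call up front and wait for all of them.
async function runBenchmark(providers: Record<string, Ask>, tickers: string[]): Promise<Row[]> {
  const calls: Promise<Row>[] = [];
  for (const name of Object.keys(providers)) {
    for (const ticker of tickers) {
      calls.push(timeCall(name, ticker, providers[name]));
    }
  }
  return Promise.all(calls);
}

// Collapse the rows into one line per provider: run count, mean latency,
// and mean rationale length in characters.
function summarize(rows: Row[]): Summary[] {
  const byProvider = new Map<string, Row[]>();
  for (const row of rows) {
    const group = byProvider.get(row.provider) ?? [];
    group.push(row);
    byProvider.set(row.provider, group);
  }
  return Array.from(byProvider.entries()).map(([provider, group]) => ({
    provider,
    runs: group.length,
    meanLatencyMs: group.reduce((sum, r) => sum + r.latencyMs, 0) / group.length,
    meanRationaleChars: group.reduce((sum, r) => sum + r.rationale.length, 0) / group.length,
  }));
}
```

In use, the Chaos entry in `providers` would wrap the `chaos.chat.responses.create(...)` call and the info-block extraction shown earlier, and the incumbent entry would wrap its own client; `console.table(summarize(await runBenchmark(providers, tickers)))` then prints one row per provider. Because all calls are started concurrently, per-call latency can include contention between requests, so a sequential loop may be the fairer choice when the absolute numbers matter. Answer quality still needs a manual or rubric-based review of the `rationale` text.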