Generative benchmarking to create representative test data for LLMs 07-08-2025 https://research.trychroma.com/generative-benchmarking https://github.com/chroma-core/generative-benchmarking