Listwise reranking with uncertainty-aware adaptive computation via Bayesian modeling of documents' relevance with TrueSkill
A bilingual benchmark for abstraction, comprehension, and reasoning evaluation in academic contexts
Multi-scale adaptive context RAG for long-context large language models