<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"><channel><description>Assistant professor in Natural Language Processing at the University of Edinburgh and visiting professor at NVIDIA | A Kleene star shines on the hour of our meeting.</description><link>https://bsky.app/profile/edoardo-ponti.bsky.social</link><title>@edoardo-ponti.bsky.social - Edoardo Ponti</title><item><link>https://bsky.app/profile/edoardo-ponti.bsky.social/post/3lqwtsngghs2h</link><description>🚀 By *learning* to compress the KV cache in Transformer LLMs, we can generate more tokens for the same compute budget. &#xA;&#xA;This unlocks *inference-time hyper-scaling*&#xA;&#xA;For the same runtime or memory load, we can boost LLM accuracy by pushing reasoning even further!</description><pubDate>06 Jun 2025 12:33 +0000</pubDate><guid isPermaLink="false">at://did:plc:ctizc4hhwflolzdos4gregjo/app.bsky.feed.post/3lqwtsngghs2h</guid></item><item><link>https://bsky.app/profile/edoardo-ponti.bsky.social/post/3lnnklig7ls2e</link><description>Sparse attention is one of the most promising strategies to unlock long-context processing and long-generation reasoning in LLMs.&#xA;&#xA;We performed the most comprehensive study on training-free sparse attention to date.&#xA;&#xA;Here is what we found:</description><pubDate>25 Apr 2025 15:39 +0000</pubDate><guid isPermaLink="false">at://did:plc:ctizc4hhwflolzdos4gregjo/app.bsky.feed.post/3lnnklig7ls2e</guid></item><item><link>https://bsky.app/profile/edoardo-ponti.bsky.social/post/3lgzycek3cc2s</link><description>I have a scholarship for a PhD in efficient memory and tokenization in LLM architectures at &#xA;@edinburgh-uni.bsky.social!&#xA;&#xA;Eligibility: UK home fee status&#xA;&#xA;Starting date: flexible, from July 2025 onwards.&#xA;&#xA;https://informatics.ed.ac.uk/study-with-us/our-degrees/postgraduate-research-and-cdts/postgraduate-research-funding/phd-efficient-llm-inference&#xA;&#xA;Please contact me if you&#39;re interested!</description><pubDate>31 Jan 2025 12:20 +0000</pubDate><guid 
isPermaLink="false">at://did:plc:ctizc4hhwflolzdos4gregjo/app.bsky.feed.post/3lgzycek3cc2s</guid></item><item><link>https://bsky.app/profile/edoardo-ponti.bsky.social/post/3lgzx7fdjuk2d</link><description>Code and models for Dynamic Memory Compression are finally available!&#xA;&#xA;Stay tuned for architectures with even more efficient inference.&#xA;&#xA;https://developer.nvidia.com/blog/dynamic-memory-compression/</description><pubDate>31 Jan 2025 12:00 +0000</pubDate><guid isPermaLink="false">at://did:plc:ctizc4hhwflolzdos4gregjo/app.bsky.feed.post/3lgzx7fdjuk2d</guid></item><item><link>https://bsky.app/profile/edoardo-ponti.bsky.social/post/3ldv4zvoovk22</link><description>We&#39;re hiring a lecturer or reader in embodied NLP at the University of Edinburgh!&#xA;&#xA;Deadline: 31 Jan 2025&#xA;Call for applications: https://elxw.fa.em3.oraclecloud.com/hcmUI/CandidateExperience/en/job/11812</description><pubDate>22 Dec 2024 09:46 +0000</pubDate><guid isPermaLink="false">at://did:plc:ctizc4hhwflolzdos4gregjo/app.bsky.feed.post/3ldv4zvoovk22</guid></item><item><link>https://bsky.app/profile/edoardo-ponti.bsky.social/post/3ldra3xd6ps2t</link><description>**Grounded typology**: a new paradigm.&#xA;&#xA;Traditionally, linguists posit functions to compare forms in different languages; however, these are aprioristic and partly arbitrary.&#xA;&#xA;Instead, we resort to perceptual modalities (like vision) as measurable proxies for function.&#xA;&#xA;[contains quote post or other embedded content]</description><pubDate>20 Dec 2024 20:30 +0000</pubDate><guid isPermaLink="false">at://did:plc:ctizc4hhwflolzdos4gregjo/app.bsky.feed.post/3ldra3xd6ps2t</guid></item><item><link>https://bsky.app/profile/edoardo-ponti.bsky.social/post/3ld4t5tlksc2y</link><description>&#xA;Two amazing papers from my students at #NeurIPS today:&#xA;&#xA;⛓️💥 Switch the vocabulary and embeddings of your LLM tokenizer zero-shot on the fly 
(@bminixhofer.bsky.social)&#xA;https://neurips.cc/virtual/2024/poster/95143&#xA;&#xA;🌊 Align your LLM gradient-free with spectral editing of activations (Yifu Qiu)&#xA;https://neurips.cc/virtual/2024/poster/93529</description><pubDate>12 Dec 2024 17:45 +0000</pubDate><guid isPermaLink="false">at://did:plc:ctizc4hhwflolzdos4gregjo/app.bsky.feed.post/3ld4t5tlksc2y</guid></item><item><link>https://bsky.app/profile/edoardo-ponti.bsky.social/post/3lbyyrsfhkc2d</link><description>We had a blast at this year&#39;s @ellis.eu Dagstuhl seminar on &#34;Modular and Agentive LLMs&#34;. &#xA;&#xA;Thanks everyone for participating!</description><pubDate>28 Nov 2024 11:50 +0000</pubDate><guid isPermaLink="false">at://did:plc:ctizc4hhwflolzdos4gregjo/app.bsky.feed.post/3lbyyrsfhkc2d</guid></item><item><link>https://bsky.app/profile/edoardo-ponti.bsky.social/post/3lbi3m4d5wk2d</link><description>P.S. Make sure to follow @pnawrot.bsky.social!&#xA;&#xA;[contains quote post or other embedded content]</description><pubDate>21 Nov 2024 18:25 +0000</pubDate><guid isPermaLink="false">at://did:plc:ctizc4hhwflolzdos4gregjo/app.bsky.feed.post/3lbi3m4d5wk2d</guid></item><item><link>https://bsky.app/profile/edoardo-ponti.bsky.social/post/3lbhlqzgbcc2a</link><description>Last 5 days to apply for a PhD at #EdinburghNLP!&#xA;&#xA;Deadline: November 25&#xA;&#xA;https://www.ed.ac.uk/studying/postgraduate/degrees/index.php?r=site/view&amp;edition=2025&amp;id=491&#xA;&#xA; If you are passionate about:&#xA;&#xA;- adaptive tokenization and memory in foundation models&#xA;- modular deep learning&#xA;- computational typology&#xA;&#xA;please message me or meet me at #NeurIPS2024!</description><pubDate>21 Nov 2024 13:41 +0000</pubDate><guid isPermaLink="false">at://did:plc:ctizc4hhwflolzdos4gregjo/app.bsky.feed.post/3lbhlqzgbcc2a</guid></item><item><link>https://bsky.app/profile/edoardo-ponti.bsky.social/post/3lbeyh62ga22v</link><description>Another nano gem from my amazing student &#xA;Piotr 
Nawrot!&#xA;&#xA;A repo &amp; notebook on sparse attention for efficient LLM inference: https://github.com/PiotrNawrot/nano-sparse-attention&#xA;&#xA;This will also feature in my #NeurIPS 2024 tutorial &#34;Dynamic Sparsity in ML&#34; with André Martins: dynamic-sparsity.github.io&#xA;&#xA;Stay tuned!</description><pubDate>20 Nov 2024 12:51 +0000</pubDate><guid isPermaLink="false">at://did:plc:ctizc4hhwflolzdos4gregjo/app.bsky.feed.post/3lbeyh62ga22v</guid></item></channel></rss>