<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"><channel><description>Postdoc at Meta and University of Washington in NLP. Before: PhD from Charles University (Prague 🏰).&#xA; Interested in going into the inner workings of neural networks 🔍, multilinguality 🌍, tokenization 🔡 and fairer NLP ⚖️ (he/him)</description><link>https://bsky.app/profile/tomlim.bsky.social</link><title>@tomlim.bsky.social - Tomasz Limisiewicz</title><item><link>https://bsky.app/profile/tomlim.bsky.social/post/3mlt7xbj5r22r</link><description>See you there! 🌉🔠&#xA;&#xA;[contains quote post or other embedded content]</description><pubDate>14 May 2026 16:20 +0000</pubDate><guid isPermaLink="false">at://did:plc:uulvy5pdfmvxbrxtzeibwq2x/app.bsky.feed.post/3mlt7xbj5r22r</guid></item><item><link>https://bsky.app/profile/tomlim.bsky.social/post/3mkzwjnfvys26</link><description>We present Compute Optimal Tokenization! 🔠&#xA;Commonly, LLM scaling works stick to one tokenizer, sweeping data/model size.&#xA;But what happens when we control the tokenizer’s compression rate (bytes/token)? &#xA;Here we sweep tokenizers, params, and data across compute budgets: [1/N]</description><pubDate>04 May 2026 14:55 +0000</pubDate><guid isPermaLink="false">at://did:plc:uulvy5pdfmvxbrxtzeibwq2x/app.bsky.feed.post/3mkzwjnfvys26</guid></item><item><link>https://bsky.app/profile/tomlim.bsky.social/post/3lubayfns4c2y</link><description>Check the BLT poster at @aclmeeting.bsky.social . 
It’s just a foretaste before the main presentation at @tokshop.bsky.social next week from Artidoro Pagnoni!</description><pubDate>18 Jul 2025 20:11 +0000</pubDate><guid isPermaLink="false">at://did:plc:uulvy5pdfmvxbrxtzeibwq2x/app.bsky.feed.post/3lubayfns4c2y</guid></item><item><link>https://bsky.app/profile/tomlim.bsky.social/post/3lu7fe6sskc2s</link><description>It’d be great to meet at the Tokenization Workshop @tokshop.bsky.social #icml&#xA; tomorrow July 18 starting at 8:45 in Meeting 112-113!&#xA;&#xA;[contains quote post or other embedded content]</description><pubDate>18 Jul 2025 02:24 +0000</pubDate><guid isPermaLink="false">at://did:plc:uulvy5pdfmvxbrxtzeibwq2x/app.bsky.feed.post/3lu7fe6sskc2s</guid></item><item><link>https://bsky.app/profile/tomlim.bsky.social/post/3lu2amyu24s2y</link><description>I&#39;m pleased to be in Vancouver for @ICML this week 🇨🇦🤖. I&#39;ll be happy to chat about multilingual, multimodal LMs and tokenization(free).</description><pubDate>16 Jul 2025 01:16 +0000</pubDate><guid isPermaLink="false">at://did:plc:uulvy5pdfmvxbrxtzeibwq2x/app.bsky.feed.post/3lu2amyu24s2y</guid></item><item><link>https://bsky.app/profile/tomlim.bsky.social/post/3lqnpjk3ypc2e</link><description>If you have experience with tokenization (who doesn’t?), your help with reviewing will be hugely appreciated! 
🔠🔡&#xA;&#xA;[contains quote post or other embedded content]</description><pubDate>02 Jun 2025 21:23 +0000</pubDate><guid isPermaLink="false">at://did:plc:uulvy5pdfmvxbrxtzeibwq2x/app.bsky.feed.post/3lqnpjk3ypc2e</guid></item><item><link>https://bsky.app/profile/tomlim.bsky.social/post/3lmukzmog5c2f</link><description>It’s finally official: the long-awaited Tokenization Workshop is here!&#xA;&#xA;[contains quote post or other embedded content]</description><pubDate>15 Apr 2025 17:10 +0000</pubDate><guid isPermaLink="false">at://did:plc:uulvy5pdfmvxbrxtzeibwq2x/app.bsky.feed.post/3lmukzmog5c2f</guid></item><item><link>https://bsky.app/profile/tomlim.bsky.social/post/3llwewx7tc22k</link><description>So, apparently, confusing these two buttons can ignite a serious flame-war in reviewer-author discussion🔥 @aclmeeting.bsky.social</description><pubDate>03 Apr 2025 17:01 +0000</pubDate><guid isPermaLink="false">at://did:plc:uulvy5pdfmvxbrxtzeibwq2x/app.bsky.feed.post/3llwewx7tc22k</guid></item><item><link>https://bsky.app/profile/tomlim.bsky.social/post/3llokom4ho22x</link><description>Excited to continue my research adventure as a postdoc at @uwnlp.bsky.social and Meta! I’ve joined @lukezettlemoyer.bsky.social’s fantastic lab. Together, we plan to rethink how LLMs perceive data to extend their capabilities to uncharted languages and, further, beyond text!</description><pubDate>31 Mar 2025 14:23 +0000</pubDate><guid isPermaLink="false">at://did:plc:uulvy5pdfmvxbrxtzeibwq2x/app.bsky.feed.post/3llokom4ho22x</guid></item><item><link>https://bsky.app/profile/tomlim.bsky.social/post/3laqgrg2igs2l</link><description>Tokenization is so back! 
at #EMNLP&#xA;&#xA;[contains quote post or other embedded content]</description><pubDate>12 Nov 2024 08:41 +0000</pubDate><guid isPermaLink="false">at://did:plc:uulvy5pdfmvxbrxtzeibwq2x/app.bsky.feed.post/3laqgrg2igs2l</guid></item><item><link>https://bsky.app/profile/tomlim.bsky.social/post/3laopkmlszc2g</link><description>#firstpost&#xA;&#xA;Are you working on NLP for low-resource or non-Latin script languages?&#xA;&#xA;If yes, I have great news for you! Our MYTE tokenizer and MyT5 models 🪲 are now easily available through🤗. It’s easy to try:</description><pubDate>11 Nov 2024 16:13 +0000</pubDate><guid isPermaLink="false">at://did:plc:uulvy5pdfmvxbrxtzeibwq2x/app.bsky.feed.post/3laopkmlszc2g</guid></item></channel></rss>