Why General Tokenizers Struggle with Drug Safety Text
A practical look at why general-purpose tokenizers fragment pharmacovigilance text, inflate token counts, and miss domain-specific meaning.
Pharmacovigilance, Tokenization, AI, NLP
Blog
Long-form writing, technical observations, demos, and field notes from the systems I build and study.