Most AI chat applications (such as ChatGPT or Claude) stream their responses to the client as markdown text. As each new chunk of text arrives, the front end typically re-parses the entire markdown document to render the updated message. This works, but it can quickly slow down the UI for long responses.
I’ve been obsessing over ways to make this more efficient, so I wrote a markdown parser that parses streaming markdown (semi-)incrementally. Instead of re-processing the whole document each time, it parses only what’s new, processing each line exactly once. Block-level nodes are buffered until they’re complete (for example, once a paragraph is done and won’t be extended by more text). This also makes it possible to parse the markdown on the server, which is exactly what the main demo does. As a result, animating markdown blocks becomes much simpler and more efficient.
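To make the idea concrete, here’s a minimal sketch (not the actual library, and only handling headings and paragraphs): streamed chunks are buffered into lines, each completed line is processed exactly once, and a block is emitted only when something closes it, e.g. a blank line ending a paragraph.

```typescript
type Block = { type: "heading" | "paragraph"; text: string };

class IncrementalParser {
  private lineBuffer = "";              // partial line awaiting more chunks
  private openParagraph: string[] = []; // lines of an unfinished paragraph
  private completed: Block[] = [];

  // Feed a newly arrived chunk; returns blocks completed by this chunk.
  push(chunk: string): Block[] {
    const before = this.completed.length;
    this.lineBuffer += chunk;
    let nl: number;
    // Only fully received lines are processed; the tail stays buffered.
    while ((nl = this.lineBuffer.indexOf("\n")) !== -1) {
      const line = this.lineBuffer.slice(0, nl);
      this.lineBuffer = this.lineBuffer.slice(nl + 1);
      this.handleLine(line);
    }
    return this.completed.slice(before);
  }

  // Call when the stream ends to flush any unfinished line/block.
  finish(): Block[] {
    const before = this.completed.length;
    if (this.lineBuffer) this.handleLine(this.lineBuffer);
    this.lineBuffer = "";
    this.closeParagraph();
    return this.completed.slice(before);
  }

  private handleLine(line: string): void {
    if (line.trim() === "") {
      this.closeParagraph(); // blank line completes the open paragraph
    } else if (line.startsWith("#")) {
      this.closeParagraph(); // a heading both closes and is itself a block
      this.completed.push({ type: "heading", text: line.replace(/^#+\s*/, "") });
    } else {
      this.openParagraph.push(line); // paragraph may still be extended
    }
  }

  private closeParagraph(): void {
    if (this.openParagraph.length) {
      this.completed.push({ type: "paragraph", text: this.openParagraph.join(" ") });
      this.openParagraph = [];
    }
  }
}
```

Because blocks are only emitted once they can no longer change, the UI can render (or animate in) each one exactly once instead of re-rendering the whole document per chunk.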
Here’s a demo if you’d like to see it in action: https://markdownparser.vercel.app/experimental
Feel free to type 'Render a table with 10 rows' to see each table row animate in.
I’ve spent a lot of time thinking about this problem, so if you’re working on similar issues, I'd love to chat.
nayajunimesh [op] · 13 hours ago
I also wrote an interactive blog post sharing my thoughts on what makes a better markdown streaming UI here: https://nimeshnayaju.com/markdown. The library grew out of those opinions.