Would you say they're interesting to explore after spending much time on them? Do you feel one could put them to pragmatic use in certain contexts, or are they too much of a toy, where most of the time getting a service / coherent LLM would ease the work?
Yes. I think learning them, and learning their limitations, is actually the best way to learn neural networks.
Give a class of students an image of horizontal lines where every second line is a solid color and the rest are random static. See how their left-to-right Markov chains do here (they should make ~50% correct predictions: the solid lines are easy, the static is hopeless).
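A minimal sketch of that exercise (hypothetical setup: 8-bit grayscale, even rows one solid value, odd rows uniform random static; the chain is a first-order transition table trained and evaluated on the same image):

```python
import numpy as np

rng = np.random.default_rng(0)
H = W = 64
img = np.empty((H, W), dtype=np.uint8)
img[0::2] = 128                                # even rows: one solid grey value
img[1::2] = rng.integers(0, 256, (H // 2, W))  # odd rows: random static

# First-order left-to-right Markov chain: count successors of each pixel value.
counts = np.zeros((256, 256), dtype=np.int64)
for row in img:
    np.add.at(counts, (row[:-1], row[1:]), 1)
pred_table = counts.argmax(axis=1)             # most likely next value per current value

# Predict every pixel from its left neighbor.
correct = total = 0
for row in img:
    correct += int((pred_table[row[:-1]] == row[1:]).sum())
    total += W - 1
print(f"accuracy: {correct / total:.2f}")      # ~0.50: solid rows hit, static rows miss
```

The solid rows are predicted almost perfectly (128 always follows 128), while predictions inside the static rows land at roughly 1/256 chance, so the overall accuracy comes out near the ~50% figure above.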
Then rotate the image 90 degrees. Have the class observe that the chain they trained on the original image now scores roughly 0% on the rotated one (every second pixel being random will do that). What to do? Maybe feed the image in both directions and weight toward the better one with a perceptron? Hey, first step to learning a neural network!
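One way to sketch that second step (same hypothetical image as before; the chain is trained on the original orientation, then applied to the rotated image scanning both left-to-right and top-to-bottom; the "perceptron" here is an averaged-perceptron gate that learns which scan direction to trust from the chain's own transition confidences):

```python
import numpy as np

rng = np.random.default_rng(0)
H = W = 64
img = np.empty((H, W), dtype=np.uint8)
img[0::2] = 128                                # solid rows
img[1::2] = rng.integers(0, 256, (H // 2, W))  # static rows

# Train the first-order chain on left-to-right pairs of the ORIGINAL image.
counts = np.zeros((256, 256), dtype=np.int64)
for row in img:
    np.add.at(counts, (row[:-1], row[1:]), 1)
pred_table = counts.argmax(axis=1)
conf_table = counts.max(axis=1) / np.maximum(counts.sum(axis=1), 1)

rot = np.rot90(img)                            # rows and columns swap roles

acc_h = float((pred_table[rot[:, :-1]] == rot[:, 1:]).mean())
acc_v = float((pred_table[rot[:-1, :]] == rot[1:, :]).mean())
print(f"rotated, left-to-right: {acc_h:.2f}")  # near 0: learned transitions are wrong sideways
print(f"rotated, top-to-bottom: {acc_v:.2f}")  # ~0.5: columns still look like the training rows

# Averaged-perceptron gate: features are the chain's confidence in each
# directional prediction; label +1 if only the vertical guess was right,
# -1 if only the horizontal one was.
w, b = np.zeros(2), 0.0
w_sum, b_sum, n = np.zeros(2), 0.0, 0
for i in range(1, rot.shape[0]):
    for j in range(1, rot.shape[1]):
        left, up, actual = rot[i, j - 1], rot[i - 1, j], rot[i, j]
        hit_h = pred_table[left] == actual
        hit_v = pred_table[up] == actual
        if hit_h == hit_v:
            continue                           # tie: no training signal
        f = np.array([conf_table[left], conf_table[up]])
        y = 1.0 if hit_v else -1.0
        if y * (w @ f + b) <= 0:               # classic perceptron mistake update
            w += y * f
            b += y
        w_sum += w; b_sum += b; n += 1
w, b = w_sum / max(n, 1), b_sum / max(n, 1)    # averaging stabilizes the gate

# Route each prediction through whichever direction the gate prefers.
correct = total = 0
for i in range(1, rot.shape[0]):
    for j in range(1, rot.shape[1]):
        left, up = rot[i, j - 1], rot[i - 1, j]
        f = np.array([conf_table[left], conf_table[up]])
        prev = up if (w @ f + b) > 0 else left
        correct += int(pred_table[prev] == rot[i, j])
        total += 1
acc_gate = correct / total
print(f"gated by direction: {acc_gate:.2f}")   # recovers the vertical chain's accuracy
```

The gate is a single linear unit deciding, per pixel, which directional prediction to pass through; it learns to distrust the confident-but-wrong horizontal scan and recovers the top-to-bottom accuracy, which is the "weight towards the best one" idea in miniature.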
From there you can keep iterating until you no longer really have Markov chains, but neural networks with a kind of attention mechanism.