I Fixed My LLM OOM Crashes by Shrinking the Draft Model (Speculative Decoding on Real Hardware)

· Dev.to