Submissions from runanywhere.ai

		FASTEST LLM decode engine on Apple Silicon. 658 tok/s on M4-Max, beats MLX by 19% (runanywhere.ai)
		5 points by sanchitmonga 14 hours ago | 3 comments