They have thought about how they can improve the DS experience. Inconsistent storage? Delta Lake. Slow Spark queries? Databricks Delta. Model management? MLflow (I haven't adopted this, but can't pin down why -- on face value it seems great). Development environment? Databricks Connect. Cluster management? Core.
But the same is not true for SQL analysts. Today's offering does not empathise with them, and I'm not convinced that integrating Redash is a genuine answer to their needs.
The upside here is that (1) Databricks (or at least, Databricks' marketing) appears to be prioritising this need, and (2) a lot of people are betting a lot of money that they can do it well.
>Model management? MLflow (I haven't adopted this, but can't pin down why -- on face value it seems great).
Probably because you want your code to be about the problem you're trying to solve, not about tracking experiments. It's similar to anti-lock braking or electronic stability control in a car: you want them on by default, not something you have to activate every five minutes while driving.
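To make the "on by default" point concrete, here's a toy sketch (not MLflow's API -- its closest real equivalent is `mlflow.autolog()`): a hypothetical decorator that records parameters, result, and duration of every run automatically, so the modelling code itself contains no tracking calls.

```python
import functools
import time

# Hypothetical stand-in for an experiment store; a real tracker would
# persist this somewhere durable.
RUNS = []

def tracked(fn):
    """Automatically log arguments, result, and duration of every call."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        RUNS.append({
            "name": fn.__name__,
            "params": {"args": args, "kwargs": kwargs},
            "metric": result,
            "seconds": time.perf_counter() - start,
        })
        return result
    return wrapper

@tracked
def train(learning_rate, epochs):
    # Toy "training" that returns a score; a real model fit would go here.
    return round(1 - 1 / (learning_rate * epochs + 1), 3)

score = train(0.1, epochs=20)
```

The point of the sketch: tracking is opted into once, at the decorator, rather than sprinkled as explicit log calls through the body of `train` -- the ABS analogy in code form.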
Tomorrow looks sunny.