We consider it a glorified cron replacement. The main selling point is its scheduling feature and the ability to view logs via the web UI it provides.

You write DAGs in Python to do 'stuff' and schedule them to run, say, every hour. You then get a history of runs and failures, can see what went wrong, and can rerun things if needed.

Those are the pros.

Cons: when new devs try to treat it as a programming paradigm, things can get difficult to work with. Some aspects aren't easily automatable, e.g. creating users. It needs to make its authentication options more obvious, and it would be good to have finer-grained control over who can do what in the Airflow UI.

Overall we're quite happy with it, and we also use it for data science as well as data feeds, data workflows, and ETL processes.



The guys at WePay have attempted to build role-based access control, according to the shared "RBAC talk" slides below [1]. I don't know why their repo [2] can't be accessed now, though.

[1] https://www.meetup.com/ja-JP/Bay-Area-Apache-Airflow-Incubat...

[2] https://github.com/wepay/airflow-webserver


The RBAC UI has since been merged into Airflow and is now released in Airflow 1.10.

https://github.com/apache/incubator-airflow/tree/master/airf...

https://github.com/apache/incubator-airflow/commit/05e1861e2...

To enable it, set `rbac = True` under the `[webserver]` section of your airflow.cfg, or via env var:

    export AIRFLOW__WEBSERVER__RBAC=True
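
The env var name isn't special to RBAC: Airflow reads overrides named `AIRFLOW__{SECTION}__{KEY}`, upper-cased, with double underscores. A tiny sketch of that naming rule, in case you want to generate these in a deploy script. The helper `airflow_env_var` is my own hypothetical function, not part of Airflow.

```python
def airflow_env_var(section, key):
    """Return the env var name that overrides `key` in `[section]` of airflow.cfg.

    Hypothetical helper: it only encodes the AIRFLOW__{SECTION}__{KEY} convention.
    """
    return "AIRFLOW__{}__{}".format(section.upper(), key.upper())


print(airflow_env_var("webserver", "rbac"))  # AIRFLOW__WEBSERVER__RBAC
```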


That's great to know, thank you. Eager to try 1.10!



