10 Reasons for Migrating to Prefect 2
From Prefect 1 or another orchestration framework
At Prefect, we have an ambassador program called Club42. One of our club members recently asked: why should I migrate to Prefect 2? If you are using Prefect 1 or another orchestrator, here is a list of reasons you should consider migrating to Prefect 2.
#1 Just Python
Prefect 2 embraces the philosophy that your code is already the best representation of your workflow. You shouldn't have to learn new logic to implement branching and conditional flow. Loops shouldn't be forbidden. Therefore, in Prefect 2, DAGs are optional. Your workflow graph doesn't have to be directed or acyclic. You simply write native Python with function calls, `if/else` statements, or `for/while` loops, and you can optionally add tasks to your `@flow` if you want more visibility, caching, parallelism, concurrency, and other features that `@task` provides. Dataflow in Prefect is quite literally as easy as `.py`.
#2 Ephemeral API
Remember when you had to first deploy or register your workflow DAG before you could see your workflow execution in the UI? That’s no longer the case in Prefect 2. You write a Python function and run it like any Python script.
```python
from prefect import flow

@flow(log_prints=True)
def hello():
    print("Greetings from Prefect! 🤗")

if __name__ == "__main__":
    hello()
```
Then… magic! ✨ You can see the workflow execution and logs directly in the UI.
The same applies to Python functions executed based on triggers from AWS Lambda. You add the Prefect Cloud workspace URL and API key, and your serverless functions become observable in your Prefect coordination plane.
This also allows you to run your flows with other schedulers, e.g., with CRON jobs or scheduled GitHub Actions while maintaining the observability from the Prefect UI.
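For instance, a minimal GitHub Actions workflow (the schedule, file names, and secret names below are illustrative) can run a flow on CRON while Prefect Cloud keeps the observability:

```yaml
name: run-flow
on:
  schedule:
    - cron: "0 6 * * *"   # every day at 06:00 UTC
jobs:
  run:
    runs-on: ubuntu-latest
    env:
      PREFECT_API_URL: ${{ secrets.PREFECT_API_URL }}
      PREFECT_API_KEY: ${{ secrets.PREFECT_API_KEY }}
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.10"
      - run: pip install -r requirements.txt
      - run: python flows/hello.py   # hypothetical flow script
```

With the `PREFECT_API_URL` and `PREFECT_API_KEY` settings pointing at your workspace, every scheduled run shows up in the UI like any other flow run.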
#3 Simpler deployment patterns
Prefect 2 allows something simple but unique: multiple deployments of the same flow. This means that you can schedule the same workflow:
- with different parameters — imagine the possibilities where you build one generic flow that can be used by many (business) users depending on which default parameter values are passed to a deployment; those default parameter values are validated with `pydantic` to ensure correct data types, and they can be dynamically overridden at runtime when needed
- across various types of infrastructure — the same flow can run on multiple different Python versions and even on different public clouds and simultaneously on-prem; some users leverage that pattern to test their build artifacts across different Python versions
- across various environments — the same flow can have a separate deployment for `dev` and `prod` simply by pointing to different storage (e.g., a GitHub block with `main` and `dev` branches) and infrastructure blocks (e.g., `dev` and `prod` Kubernetes job blocks)
The deployment experience is simple and intuitive. Once you have defined your storage and infrastructure blocks, you can reuse them across all deployments. The deployment’s entrypoint tells Prefect which flow script and flow function to run, and the queue name helps to ensure that the right agent will pick up scheduled flow runs from that deployment.
Additionally, you can define your deployment from the CLI so that your code is simple and clean. Your deployment metadata can live in your CI/CD pipeline or a bash script without cluttering flow code representing your business logic.
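As a sketch of that CLI pattern (the flow entrypoint, queue, and block names below are hypothetical), two deployments of the same flow might look like:

```shell
# Build and apply a dev and a prod deployment of the same flow
prefect deployment build flows/etl.py:etl \
    --name etl-dev \
    --work-queue dev \
    --storage-block github/dev-repo \
    --infra-block kubernetes-job/dev \
    --apply

prefect deployment build flows/etl.py:etl \
    --name etl-prod \
    --work-queue prod \
    --storage-block github/main-repo \
    --infra-block kubernetes-job/prod \
    --param env=prod \
    --apply
```

Because the storage and infrastructure blocks are referenced by name, swapping environments is a matter of changing two flags, not touching the flow code.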
#4 No GraphQL
One common pain point we've heard from our users is the complexity of GraphQL. While powerful, it can be difficult to integrate with the Python ecosystem, and it can result in unexpected and complicated access patterns. Prefect 2, in contrast, ships with a REST API and Swagger UI, making it easy to interact with and simplifying many API calls.
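As a minimal sketch of what that looks like with only the standard library (the server URL is illustrative, and the endpoint shape is the one you can explore in the Swagger UI), here is a request for recent flow runs:

```python
import json
import urllib.request

API_URL = "http://127.0.0.1:4200/api"  # a locally running Prefect 2 server


def build_filter_request(api_url: str, limit: int = 5) -> urllib.request.Request:
    # POST /flow_runs/filter returns recent flow runs as plain JSON
    payload = json.dumps({"limit": limit, "sort": "START_TIME_DESC"}).encode()
    return urllib.request.Request(
        url=f"{api_url}/flow_runs/filter",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    with urllib.request.urlopen(build_filter_request(API_URL)) as resp:
        for run in json.loads(resp.read()):
            print(run["name"], run["state_type"])
```

No GraphQL client, no query language: an ordinary JSON POST from any HTTP tool does the job.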
#5 Package dependency management
Given that infrastructure blocks are self-contained, packaging your code and dependencies is finally painless. When you create a deployment, your code dependencies can be automatically uploaded to S3, GCS, or Azure blob storage. In the same way, Prefect can clone your entire GitHub repository at runtime so that your code dependencies are properly added to your execution environment. And if you prefer to package your code and dependencies into a single Docker image, that’s possible too!
#6 More open-source features
The open-source Prefect 2 is more powerful than Prefect 1. Concurrency limits, automated flow run notifications, and secrets were previously only available to Prefect Cloud users. All that is now available in Orion.
Blocks let you securely store credentials for external systems and attach business logic relevant to those systems.
#7 Notifications
Along with blocks, we introduced the ability to create notifications. You can configure them directly from the UI without boilerplate code. They allow you to send automated alerts, e.g., when your flow fails, crashes, or gets canceled. Similarly, you can get notified, e.g., via Slack, anytime your critical flow is completed.
#8 Collections & Blocks as integrations
Third-party systems integrations are managed via dedicated libraries that we call Prefect Collections. To avoid unnecessary dependency conflicts, you only install the collections that you need. Additionally, thanks to Prefect Blocks, managing credentials to third-party systems is handled simply and securely — once you fill out the form in the UI or via code, you can load the block in any part of your flow and call its methods.
The collection’s catalog keeps growing every month, and if you want to contribute a new one, our Integrations team streamlined and standardized this process. There is a collection template with prebuilt GitHub Action workflows that makes it easy to build, document, and release your custom collection.
This separation allows integrations to be maintained independently from the core package release cycle.
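As an example (the collection name here is illustrative), installing and registering a single collection is all it takes to make its blocks available:

```shell
# Install only the integration you need, not a monolithic package
pip install prefect-aws

# Register the collection's blocks so they appear in the UI
prefect block register -m prefect_aws
```

Once a block is configured in the UI or via code, it can be loaded anywhere in a flow with the collection's block class, e.g. `AwsCredentials.load("my-creds")`.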
#9 Async support
Prefect 2 supports asynchronous execution out of the box. You can either submit your tasks to the default `ConcurrentTaskRunner`, or make your tasks and flows `async` to speed up IO-bound operations without having to deal with multithreading or distributed compute clusters.
#10 Parallel execution with Dask and Ray
To parallelize execution on a distributed compute cluster, you can add `DaskTaskRunner` or `RayTaskRunner` to your flow decorator. You can either connect to an existing cluster or create one on demand. Unlike in Prefect 1, logs from Dask workers are now captured by the Prefect 2 backend.
Next steps
We’ve launched a dedicated website upgrade.prefect.io to make migration as seamless as possible. If you have any questions, you can reach us via our Slack Community and Discourse.
Happy Engineering!