The Prefect Blog
10 Reasons for Migrating to Prefect 2

From Prefect 1 or another orchestration framework

At Prefect, we have an ambassador program called Club42. One of our club members recently asked: why should I migrate to Prefect 2? If you are using Prefect 1 or another orchestrator, here is a list of reasons you should consider migrating to Prefect 2.

#1 Just Python

Prefect 2 embraces the philosophy that your code is already the best representation of your workflow. You shouldn’t have to learn new logic to implement branching and conditional flow. Loops shouldn’t be forbidden. Therefore, in Prefect 2, DAGs are optional. Your workflow graph doesn’t have to be directed or acyclic. You simply write native Python with function calls, if/else statements, or for/while loops, and you can optionally add tasks to your @flow when you want the visibility, caching, parallelism, concurrency, and other features that @task provides. Dataflow in Prefect is quite literally as easy as .py.

#2 Ephemeral API

Remember when you had to first deploy or register your workflow DAG before you could see your workflow execution in the UI? That’s no longer the case in Prefect 2. You write a Python function and run it like any Python script.

from prefect import flow

@flow(log_prints=True)
def hello():
    print("Greetings from Prefect! 🤗")

if __name__ == "__main__":
    hello()

Then… magic! ✨ You can see the workflow execution and logs directly in the UI.

The same applies to Python functions executed based on triggers from AWS Lambda. You add the Prefect Cloud workspace URL and API key, and your serverless functions become observable in your Prefect coordination plane.

This also allows you to run your flows with other schedulers, e.g., with CRON jobs or scheduled GitHub Actions while maintaining the observability from the Prefect UI.

#3 Simpler deployment patterns

Prefect 2 allows something simple but unique: multiple deployments of the same flow. This means that you can schedule the same workflow:

  • with different parameters — you can build one generic flow that serves many (business) users, with each deployment supplying different default parameter values; those defaults are validated with pydantic to ensure correct data types, and they can be overridden dynamically at runtime when needed
  • across various types of infrastructure — the same flow can run on multiple different Python versions and even on different public clouds and simultaneously on-prem; some users leverage that pattern to test their build artifacts across different Python versions
  • across various environments — the same flow can have a separate deployment for dev and prod simply by pointing to different storage (e.g., GitHub block with main and dev branches) and infrastructure blocks (e.g. dev and prod Kubernetes job blocks)

The deployment experience is simple and intuitive. Once you have defined your storage and infrastructure blocks, you can reuse them across all deployments. The deployment’s entrypoint tells Prefect which flow script and flow function to run, and the queue name helps to ensure that the right agent will pick up scheduled flow runs from that deployment.

Additionally, you can define your deployment from the CLI so that your code is simple and clean. Your deployment metadata can live in your CI/CD pipeline or a bash script without cluttering flow code representing your business logic.

#4 No GraphQL

One common pain point we’ve heard from our users is the complexity of GraphQL. While powerful, it can be difficult to integrate with the Python ecosystem, and it can result in unexpected and complicated access patterns. Prefect 2, in contrast, ships with a REST API and Swagger UI, making it easy to interact with and simplifying many API calls.

#5 Package dependency management

Given that infrastructure blocks are self-contained, packaging your code and dependencies is finally painless. When you create a deployment, your code dependencies can be automatically uploaded to S3, GCS, or Azure blob storage. In the same way, Prefect can clone your entire GitHub repository at runtime so that your code dependencies are properly added to your execution environment. And if you prefer to package your code and dependencies into a single Docker image, that’s possible too!

#6 More open-source features

The open-source Prefect 2 is more powerful than Prefect 1. Concurrency limits, automated flow run notifications, and secrets were previously only available to Prefect Cloud users. All that is now available in Orion.

Blocks allow you to securely store credentials for external systems, and they can carry business logic relevant to those systems.

#7 Notifications

Along with blocks, we introduced the ability to create notifications. You can configure them directly from the UI without boilerplate code. They allow you to send automated alerts, e.g., when your flow fails, crashes, or gets canceled. Similarly, you can get notified, e.g., via Slack, anytime your critical flow is completed.

#8 Collections & Blocks as integrations

Third-party systems integrations are managed via dedicated libraries that we call Prefect Collections. To avoid unnecessary dependency conflicts, you only install the collections that you need. Additionally, thanks to Prefect Blocks, managing credentials to third-party systems is handled simply and securely — once you fill out the form in the UI or via code, you can load the block in any part of your flow and call its methods.

The collections catalog keeps growing every month, and if you want to contribute a new one, our Integrations team has streamlined and standardized the process. There is a collection template with prebuilt GitHub Actions workflows that makes it easy to build, document, and release your custom collection.

This separation allows integrations to be maintained independently from the core package release cycle.

#9 Async support

Prefect 2 supports asynchronous execution out of the box. You can either submit your tasks to the default ConcurrentTaskRunner, or make your tasks and flows async to make IO-based operations faster without having to deal with multithreading or distributed compute clusters.

#10 Parallel execution with Dask and Ray

To parallelize execution on a distributed compute cluster, you can add DaskTaskRunner or RayTaskRunner to your flow decorator. You can either connect to an already existing cluster or create one on demand. In comparison to Prefect 1, logs from Dask workers are now captured by the Prefect 2 backend.

Next steps

We’ve launched a dedicated website upgrade.prefect.io to make migration as seamless as possible. If you have any questions, you can reach us via our Slack Community and Discourse.

Happy Engineering!

Anna Geller

Lead Community Engineer at Prefect, Data Professional, Cloud & .py fan. www.annageller.com. Get my articles via email: https://annageller.medium.com/subscribe