OpenMLDB Node

Overview

OpenMLDB is an open-source machine learning database that provides a full-stack FeatureOps solution for production.

The OpenMLDB task plugin is used to execute tasks on an OpenMLDB cluster.

Create Task

  • Click Project Management -> Project Name -> Workflow Definition, and click the Create Workflow button to enter the DAG editing page.
  • Drag the OpenMLDB task node from the toolbar onto the canvas.

Task Parameters

  • zookeeper: The ZooKeeper address of the OpenMLDB cluster, e.g. 127.0.0.1:2181.
  • zookeeper path: The ZooKeeper path of the OpenMLDB cluster, e.g. /openmldb.
  • Execute Mode: The initial execution mode, offline or online. You can switch modes within the SQL statement.
  • SQL statement: The SQL statement to execute.
  • Custom parameters: User-defined Python parameters; they replace the ${variable} placeholders in the script.
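
The ${variable} replacement described for custom parameters can be illustrated with a minimal sketch. This is not the plugin's own implementation; it uses Python's standard string.Template only to show the effect, and the helper name render_sql, the parameter names src and table, and their values are hypothetical.

```python
from string import Template


def render_sql(sql: str, params: dict) -> str:
    """Replace ${name} placeholders in sql with values from params."""
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return Template(sql).safe_substitute(params)


# Hypothetical statement and custom parameters, for illustration only.
sql = "LOAD DATA INFILE '${src}' INTO TABLE ${table};"
rendered = render_sql(sql, {"src": "file:///tmp/train.csv", "table": "demo_db.t1"})
print(rendered)
```

A placeholder with no matching custom parameter is left as-is in this sketch, which makes a missing parameter easy to spot in the rendered statement.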

Task Examples

Load data

We use LOAD DATA to load data into the OpenMLDB cluster. Offline mode is selected here, so the data is loaded into offline storage.
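
As a rough illustration of what could go into the SQL statement field for this task, the sketch below composes a LOAD DATA statement as a Python string. The source path, the table name demo_db.t1, and the chosen options are assumptions made for the example; check the OpenMLDB SQL reference for the options your version supports.

```python
def load_data_sql(src: str, table: str) -> str:
    """Compose a LOAD DATA statement for the task's SQL statement field."""
    # The options below (CSV with a header row, append mode) are assumptions.
    return (
        f"LOAD DATA INFILE '{src}' INTO TABLE {table} "
        "OPTIONS(format='csv', header=true, mode='append');"
    )


# Hypothetical source file and target table.
stmt = load_data_sql("file:///tmp/train.csv", "demo_db.t1")
print(stmt)
```

With Execute Mode set to offline, a statement like this would target offline storage, as described above.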

Feature extraction

We use SELECT INTO to do feature extraction. Offline mode is selected here, so the SQL runs on the offline engine.
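
A feature-extraction statement of this kind might look like the sketch below, held in a Python string so it can be pasted into the SQL statement field. The table demo_db.t1, the columns id, user_id, ts, and amount, the window definition, and the output path are all hypothetical.

```python
# Hypothetical feature query: a one-day rolling sum per user, written to a file.
FEATURE_SQL = """
SELECT
    id,
    sum(amount) OVER w AS amount_sum_1d
FROM demo_db.t1
WINDOW w AS (
    PARTITION BY user_id
    ORDER BY ts
    ROWS_RANGE BETWEEN 1d PRECEDING AND CURRENT ROW
)
INTO OUTFILE '/tmp/feature_data'
OPTIONS(mode='overwrite');
""".strip()

print(FEATURE_SQL)
```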

Environment to Prepare

Start the OpenMLDB Cluster

You should create an OpenMLDB cluster first. For a production environment, see deploy OpenMLDB.

For a quick start, you can follow run OpenMLDB in docker.

Python Environment

The OpenMLDB task uses the OpenMLDB Python SDK to connect to the OpenMLDB cluster, so a Python environment is required.
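
To make the connection step concrete, here is a minimal sketch of how the zookeeper and zookeeper path task parameters could be passed to the Python SDK. It assumes the SDK's DB-API entry point openmldb.dbapi.connect accepts zk and zkPath keyword arguments, as shown in the OpenMLDB Python SDK documentation; the helper names are hypothetical, and the plugin's actual script may differ.

```python
def connection_kwargs(zookeeper: str, zookeeper_path: str) -> dict:
    """Map the task parameters to the keyword arguments assumed for the SDK."""
    return {"zk": zookeeper, "zkPath": zookeeper_path}


def run_statement(sql: str, zookeeper: str = "127.0.0.1:2181",
                  zookeeper_path: str = "/openmldb") -> None:
    """Execute one SQL statement on the cluster (requires a running cluster)."""
    # Imported lazily so this module loads even where the SDK is not installed.
    import openmldb.dbapi

    # Assumption: connect() takes zk and zkPath, per the SDK documentation.
    db = openmldb.dbapi.connect(**connection_kwargs(zookeeper, zookeeper_path))
    cursor = db.cursor()
    cursor.execute(sql)


kwargs = connection_kwargs("127.0.0.1:2181", "/openmldb")
print(kwargs)
```

Calling run_statement only makes sense on a host that can reach the cluster's ZooKeeper address.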

python3 is used by default. You can set PYTHON_HOME to use a custom Python environment.

Make sure the OpenMLDB Python SDK is installed on the host where the worker server is running, using pip install openmldb.
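
One way to confirm the installation is to check whether the openmldb module is importable from the interpreter the worker will use. The helper name sdk_installed is hypothetical; the check relies only on the standard library.

```python
import importlib.util


def sdk_installed(module_name: str = "openmldb") -> bool:
    """Return True if module_name can be found by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


if sdk_installed():
    print("OpenMLDB Python SDK found")
else:
    print("OpenMLDB Python SDK missing; run: pip install openmldb")
```

Run this with the same interpreter the worker server uses; otherwise the result may not reflect the worker's environment.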