## Introduction
This module contains the implementation patch and installation scripts for "MADlib", which supports AI in DB in openGauss.
Currently, openGauss supports the machine learning algorithms in MADlib 1.17.

## Before Installation
MADlib relies on plpython2, so openGauss must be compiled with Python support.

#### Check Python Environment
The Python version must be 2.7.12 or later; 2.7.17 or 2.7.18 is highly recommended.

1) If your Python version is 2.7.12 or later, install the development package with `yum install python-devel`. Otherwise, go to step 2.
2) Build and install Python 2.7.18 yourself, passing the '**--enable-shared**' option to configure.
```
./configure --prefix=YOUR_XXX --enable-shared --enable-unicode=ucs4
make -sj;make install -sj
```

#### Re-compile Database
Compile openGauss, passing the '**--with-python**' option to configure.

MADlib Installation
----------------------------------------------------------------------------
### Compile
1. Patch MADlib.
    ```
    tar -zxf apache-madlib-1.17.0-src.tar.gz
    cp madlib.patch apache-madlib-1.17.0-src
    cd apache-madlib-1.17.0-src/
    patch -p1 < madlib.patch
    ```
2. Compile MADlib.

    **MADlib will download dependent software while compiling.**

    1. If your machine can connect to the Internet, you can run:
        ```
        # Replace {YOUR_MADLIB_INSTALL_FOLDER} with your install folder.
        ./configure -DCMAKE_INSTALL_PREFIX={YOUR_MADLIB_INSTALL_FOLDER} \
            -DPOSTGRESQL_EXECUTABLE=$GAUSSHOME/bin/ \
            -DPOSTGRESQL_9_2_EXECUTABLE=$GAUSSHOME/bin/ \
            -DPOSTGRESQL_9_2_CLIENT_INCLUDE_DIR=$GAUSSHOME/bin/ \
            -DPOSTGRESQL_9_2_SERVER_INCLUDE_DIR=$GAUSSHOME/bin/
        make && make install -sj
        ```
    2. If your machine cannot download dependencies online, you must download the dependent software yourself:
        - PyXB-1.2.6.tar.gz, http://sourceforge.net/projects/pyxb/files/PyXB-1.2.6.tar.gz
        - eigen-branches-3.2.tar.gz, https://github.com/madlib/eigen/archive/branches/3.2.tar.gz
        - boost_1_61_0.tar.gz

        ```
        # Replace {YOUR_MADLIB_INSTALL_FOLDER} with your install folder and
        # {YOUR_DEPENDENCY_FOLDER} with the local folder holding the downloaded archives.
        ./configure -DCMAKE_INSTALL_PREFIX={YOUR_MADLIB_INSTALL_FOLDER} \
            -DPYXB_TAR_SOURCE={YOUR_DEPENDENCY_FOLDER}/PyXB-1.2.6.tar.gz \
            -DEIGEN_TAR_SOURCE={YOUR_DEPENDENCY_FOLDER}/eigen-branches-3.2.tar.gz \
            -DBOOST_TAR_SOURCE={YOUR_DEPENDENCY_FOLDER}/boost_1_61_0.tar.gz \
            -DPOSTGRESQL_EXECUTABLE=$GAUSSHOME/bin/ \
            -DPOSTGRESQL_9_2_EXECUTABLE=$GAUSSHOME/bin/ \
            -DPOSTGRESQL_9_2_CLIENT_INCLUDE_DIR=$GAUSSHOME/bin/ \
            -DPOSTGRESQL_9_2_SERVER_INCLUDE_DIR=$GAUSSHOME/bin/
        make && make install -sj
        ```
3. Finished.

### Install MADlib
#### Install Python packages
Some algorithms depend on Python packages.
```
pip install numpy==1.14.5
pip install pandas==0.24.2
pip install scipy
```

Connect to your database server with gsql and create the database:
```
create database <YOUR_DATABASE> dbcompatibility='B';
```

Then install MADlib into that database with madpack:
```
cd {YOUR_MADLIB_INSTALL_FOLDER}
./madpack -s <YOUR_SCHEMA> -p opengauss -c <DATABASE_USERNAME>@127.0.0.1:<PORT>/<YOUR_DATABASE> install
```

#### Additional software
1) If you use Facebook Prophet:
```
pip install pystan
pip install holidays==0.9.8
pip install fbprophet==0.3.post2
```
2) If you use XGBoost:
```
pip install xgboost
pip install scikit-learn
```

#### Primary/secondary
You need to copy Python and `{YOUR_MADLIB_INSTALL_FOLDER}` to the same path on the secondary machine.
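
The copy to the secondary machine could be scripted roughly as follows. This is only a sketch, assuming `rsync` is installed on both machines and SSH access to the secondary is already set up. The variable names (`PYTHON_HOME`, `MADLIB_HOME`, `STANDBY_HOST`, `DRY_RUN`), the default paths, and the host name are hypothetical placeholders, not part of MADlib or openGauss. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own paths and secondary host.
PYTHON_HOME="${PYTHON_HOME:-/opt/python2.7}"        # --prefix used when building Python
MADLIB_HOME="${MADLIB_HOME:-/opt/madlib}"           # {YOUR_MADLIB_INSTALL_FOLDER}
STANDBY_HOST="${STANDBY_HOST:-standby.example.com}" # secondary machine

# Copy one directory to the same path on the given host.
# With DRY_RUN=1 (the default) the rsync command is printed instead of run.
sync_dir() {
    dir=$1
    host=$2
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "rsync -a $dir/ $host:$dir/"
    else
        rsync -a "$dir/" "$host:$dir/"
    fi
}

sync_dir "$PYTHON_HOME" "$STANDBY_HOST"
sync_dir "$MADLIB_HOME" "$STANDBY_HOST"
```

Run it once as-is to review the printed commands, then again with `DRY_RUN=0` to perform the copy. The trailing slashes make `rsync` copy the directory contents into the identical path on the secondary, which matches the same-path requirement above.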