Additional Examples

The examples below load local files and a MySQL table into StarRocks with the Sling CLI.

Loading CSV, JSON & Parquet files into StarRocks

# Setup Connection
$ sling conns set STARROCKSLOCAL url=starrocks://root:@localhost:9030/
11:55AM INF connection `STARROCKSLOCAL` has been set in /Users/me/.sling/env.yaml. Please test with `sling conns test STARROCKSLOCAL`

$ sling conns test STARROCKSLOCAL
11:55AM INF success!

# Loading a CSV file without a header row (downloaded from https://github.com/slingdata-io/sling-cli/files/14201759/example1.csv)
$ sling run --src-stream 'file:///Users/me/Downloads/example1.csv' --tgt-conn STARROCKSLOCAL --tgt-object 'albert.call_center2' --mode full-refresh --src-options '{"header":false}' --primary-key col_001
11:24AM INF connecting to target database (starrocks)
11:24AM INF reading from source file system (file)
11:24AM INF writing to target database [mode: full-refresh]
11:24AM INF streaming data
11:24AM INF dropped table `albert`.`call_center2`
11:24AM INF created table `albert`.`call_center2`
11:24AM INF inserted 4 rows into `albert`.`call_center2` in 0 secs [7 r/s]
11:24AM INF execution succeeded

# Loading a JSON file (downloaded from https://github.com/slingdata-io/sling-cli/files/14201762/example2.json)
$ sling run --src-stream 'file:///Users/me/Downloads/example2.json' --tgt-conn STARROCKSLOCAL --tgt-object 'albert.call_center4' --mode full-refresh --src-options '{"flatten":true}' --primary-key code
11:30AM INF connecting to target database (starrocks)
11:30AM INF reading from source file system (file)
11:30AM INF writing to target database [mode: full-refresh]
11:30AM INF streaming data
11:30AM INF dropped table `albert`.`call_center4`
11:30AM INF created table `albert`.`call_center4`
11:30AM INF inserted 1 rows into `albert`.`call_center4` in 0 secs [2 r/s]
11:30AM INF execution succeeded

# Loading a Parquet file
$ sling run --src-stream 'file:///Users/me/sandbox/tpcds-parquet/call_center.parquet' --tgt-conn STARROCKSLOCAL --tgt-object 'albert.call_center' --mode full-refresh --primary-key cc_call_center_sk
1:28PM INF connecting to target database (starrocks)
1:28PM INF reading from source file system (file)
1:28PM INF writing to target database [mode: full-refresh]
1:28PM INF streaming data
1:28PM INF dropped table `albert`.`call_center`
1:28PM INF created table `albert`.`call_center`
1:28PM INF inserted 6 rows into `albert`.`call_center` in 1 secs [6 r/s]
1:28PM INF execution succeeded

Loading Data from MySQL to StarRocks

# Download the Sample Employees Database
$ wget https://github.com/datacharmer/test_db/releases/download/v1.0.7/test_db-1.0.7.tar.gz
$ gunzip test_db-1.0.7.tar.gz
$ tar xvf test_db-1.0.7.tar

# Launching the StarRocks and MySQL Containers
$ docker run --rm -p 9030:9030 -p 8030:8030 -p 8040:8040 -it starrocks/allin1-ubuntu
$ docker container run -d --name=LocalMySQLDB -p 3306:3306 -e MYSQL_ROOT_PASSWORD=password mysql
$ mysql -P 3306 -h 127.0.0.1 -u root -p --prompt="mysql > " < ./employees.sql
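
A freshly started MySQL container usually needs a few seconds before it accepts connections, so the import above can fail if it runs immediately after `docker container run`. The following is a minimal sketch of a retry helper, not part of Sling or MySQL; the name `wait_for` and its arguments are hypothetical and chosen for illustration only.

```shell
#!/usr/bin/env bash
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_for <attempts> <delay-seconds> <command> [args...]
# (wait_for is a hypothetical helper for this sketch.)
wait_for() {
  local attempts="$1"
  local delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # Run the probe command quietly; success means the service is up.
    if "$@" >/dev/null 2>&1; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after $attempts attempt(s)"
  return 1
}

# Example (not run here): poll the MySQL container every 2 seconds, up to 30 times.
# wait_for 30 2 mysqladmin ping -h 127.0.0.1 -P 3306 -u root -ppassword
```

Running the commented `wait_for` line before the `mysql ... < ./employees.sql` import is one way to avoid a connection error on a slow start. It assumes the `mysqladmin` client is installed alongside `mysql`.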

# Target Database Setup
$ mysql -P 9030 -h 127.0.0.1 -u root --prompt="StarRocks > "
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 14
Server version: 5.1.0 3.2.2-269e832

Copyright (c) 2000, 2024, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

StarRocks > create database albert;
Query OK, 0 rows affected (0.02 sec)

# Sling Setup
$ sling conns set MYSQLLOCAL url=mysql://root:password@localhost:3306/employees
5:01PM INF connection `MYSQLLOCAL` has been set in /Users/me/.sling/env.yaml. Please test with `sling conns test MYSQLLOCAL`

$ sling conns test MYSQLLOCAL
5:01PM INF success!

$ sling conns set STARROCKSLOCAL url=starrocks://root:@localhost:9030/albert
11:55AM INF connection `STARROCKSLOCAL` has been set in /Users/me/.sling/env.yaml. Please test with `sling conns test STARROCKSLOCAL`

$ sling conns test STARROCKSLOCAL
11:55AM INF success!

# Sling Execution
$ sling run --src-conn MYSQLLOCAL --src-stream employees.employees --tgt-conn STARROCKSLOCAL --tgt-object albert.employees --tgt-options '{ table_keys: { primary: [ emp_no ], hash: [ emp_no ] } }'
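
After the run finishes, a quick sanity check is to compare row counts on both sides. StarRocks speaks the MySQL protocol on port 9030 (as used in the target setup above), so the same `mysql` client can query source and target. This is a sketch only: the helper names `count_query`, `row_count`, and `compare_counts` and the `RUN_CHECK` variable are hypothetical, and the hosts, ports, and credentials are taken from the commands in this section.

```shell
#!/usr/bin/env bash
# Sketch: compare row counts between the MySQL source and the StarRocks target.

# Build the COUNT(*) query for a fully qualified table name.
count_query() {
  printf 'SELECT COUNT(*) FROM %s;' "$1"
}

# Run the count through the mysql client and print only the value.
# -N drops the column header, -B selects batch (tab-separated) output.
# Extra arguments (for example -ppassword) are passed through to mysql.
row_count() {
  local port="$1"
  local table="$2"
  shift 2
  mysql -N -B -h 127.0.0.1 -P "$port" -u root "$@" -e "$(count_query "$table")"
}

# Report whether the two counts match; return non-zero on a mismatch.
compare_counts() {
  if [ "$1" = "$2" ]; then
    echo "match: $1 rows"
  else
    echo "mismatch: source=$1 target=$2"
    return 1
  fi
}

# Query the databases only when explicitly requested and mysql is installed.
if [ "${RUN_CHECK:-0}" = "1" ] && command -v mysql >/dev/null 2>&1; then
  src="$(row_count 3306 employees.employees -ppassword)"
  tgt="$(row_count 9030 albert.employees)"
  compare_counts "$src" "$tgt"
fi
```

Saved as a script, it can be run with `RUN_CHECK=1` once both containers are up. A matching count does not prove the rows are identical, but a mismatch is a cheap early signal that the load was incomplete.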
