Bulk Load Examples

The following examples show how to use various ybload options.
Note: The output includes a printout of the command-line options that you used; however, do not copy and reuse this command information exactly as it is printed. Quotes may have been removed, variables substituted, wildcard characters expanded, and so on.

The first example shows the ABOUT CLIENT message block, which is always returned to the console unless the --quiet option and associated logging options are used. Subsequent examples omit this output.

Load the match table from a single CSV file, using the -d option to name the database

$ ybload -d premdb --username bobr -t match -W --format csv /home/bobr/match.csv 
Password for user bobr: 
18:18:32.650 [ INFO] ABOUT CLIENT:
    app.cli_args         = -d premdb --username bobr -t match -W --format csv /home/bobr/match.csv 
    app.name_and_version = ybload version 2.0.0-10711
    java.home            = /usr/lib/jvm/java-8-oracle/jre
    java.version         = 1.8.0_101
    jvm.memory           = 981.50 MB (max=14.22 GB)
    jvm.name_and_version = Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)
    jvm.options          = -Xmx16g, -Xms1g, -ea, -Dapp.name=ybload, -Dapp.pid=19168, -Dapp.repo=/usr/lib/ybtools/lib, -Dapp.home=/usr/lib/ybtools, -Dbasedir=/usr/lib/ybtools
    jvm.vendor           = Oracle Corporation
    os.name_and_version  = Linux 4.4.0-31-generic (amd64)

18:18:32.697 [ INFO] Gathering metadata on input files
18:18:32.914 [ INFO] Assuming source encoding matches database server encoding: LATIN9
18:18:32.946 [ INFO] Starting 1 source PreReaders
18:18:32.967 [ INFO] Auto-detected line separator = '\n'
18:18:32.977 [ INFO] Configuration (record/field separation):
    --format                           : CSV
    --delimiter                        : ,
    --linesep                          : \n
    --quote-char                       : "
    --escape-char                      : "
    --trim-white                       : false
    --skip-blank-lines                 : true
    --on-missing-field                 : ERROR
    --on-extra-field                   : ERROR
    --on-unescaped-embedded-quote      : ERROR
    Internal scanner                   : RecordScannerQuote_1_1
18:18:33.004 [ INFO] Using database locale: C
18:18:33.846 [ INFO] Bad rows will be written to /home/bobr/match.csv.20190129181833.bad
18:18:33.854 [ INFO] Starting import of 1 files
18:18:34.127 [ INFO] Starting 4 segment readers
18:18:34.128 [ INFO] Opening transaction #1...
18:18:34.183 [ INFO] Opened transaction #1
18:18:34.234 [ INFO] Flushing last of 8606 rows for transaction #1
18:18:34.238 [ INFO] Committing 8606 rows into transaction #1...
18:18:34.256 [ INFO] Committed transaction #1 after a total of 292604 bytes and 8606 good rows
18:18:34.275 [ INFO] READ:305.8KB(193.0KB/s). ROWS G/B: 8606/0( 5.30K/s). WRITE:285.7KB(180.3KB/s).  TIME E/R:   0:00:01/ --:--:--
18:18:34.276 [ INFO] SUCCESSFUL BULK LOAD: Loaded 8606 good rows in   0:00:01 (READ: 193.0KB/s WRITE: 180.3KB/s)
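
The configuration block in this output shows the CSV defaults in effect: comma delimiter, fields quoted with ", and " doubled as the escape character. The following sketch shows input in that shape; the rows and columns are hypothetical, not the real premdb match data:

```shell
# Two hypothetical rows in the default CSV shape: comma-delimited,
# fields quoted with ", and an embedded quote escaped by doubling it.
cat > /tmp/match_sample.csv <<'EOF'
1,1992,Arsenal,Coventry City,0,3
2,1992,Leeds United,"Wimbledon ""Dons""",2,1
EOF
wc -l < /tmp/match_sample.csv
```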

Load the same table from a CSV file with a single header line

$ ybload -d premdb --username bobr -t match -W --format csv --num-header-lines 1 /home/bobr/match_with_header.csv 
Password for user bobr: 
...
18:39:24.888 [ INFO] Gathering metadata on input files
18:39:25.163 [ INFO] Assuming source encoding matches database server encoding: LATIN9
18:39:25.233 [ INFO] Starting 1 source PreReaders
18:39:25.256 [ INFO] Auto-detected line separator = '\n'
18:39:25.263 [ INFO] Configuration (record/field separation):
    --format                           : CSV
    --delimiter                        : ,
    --linesep                          : \n
    --quote-char                       : "
    --escape-char                      : "
    --trim-white                       : false
    --skip-blank-lines                 : true
    --on-missing-field                 : ERROR
    --on-extra-field                   : ERROR
    --on-unescaped-embedded-quote      : ERROR
    Internal scanner                   : RecordScannerQuote_1_1
18:39:25.290 [ INFO] Using database locale: C
18:39:26.047 [ INFO] Bad rows will be written to /home/bobr/match_with_header.csv.20190129183926.bad
18:39:26.055 [ INFO] Starting import of 1 files
18:39:26.326 [ INFO] Starting 4 segment readers
18:39:26.327 [ INFO] Opening transaction #1...
18:39:26.374 [ INFO] Opened transaction #1
18:39:26.411 [ INFO] Flushing last of 8606 rows for transaction #1
18:39:26.414 [ INFO] Committing 8606 rows into transaction #1...
18:39:26.433 [ INFO] Committed transaction #1 after a total of 292604 bytes and 8606 good rows
18:39:26.452 [ INFO] READ:305.8KB(194.7KB/s). ROWS G/B: 8606/0( 5.35K/s). WRITE:285.7KB(181.9KB/s).  TIME E/R:   0:00:01/ --:--:--
18:39:26.453 [ INFO] SUCCESSFUL BULK LOAD: Loaded 8606 good rows in   0:00:01 (READ: 194.7KB/s WRITE: 181.9KB/s)
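
With --num-header-lines 1, ybload skips the first line of each input file. A sketch of such a file (the header names and row values are hypothetical):

```shell
# First line is a header row that --num-header-lines 1 tells ybload to skip.
{ echo 'seasonid,season_year,hometeam,awayteam,ftscore,htscore'
  echo '1,1992,Arsenal,Coventry City,0-3,0-1'; } > /tmp/match_with_header.csv
head -1 /tmp/match_with_header.csv
```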

Load the same table from multiple CSV files

$ ybload -d premdb --username bobr -t match -W --format csv /home/bobr/premdata/matchdata/*
Password for user bobr: 
...
18:50:39.278 [ INFO] Auto-detected line separator = '\n'
18:50:39.288 [ INFO] Configuration (record/field separation):
    --format                           : CSV
    --delimiter                        : ,
    --linesep                          : \n
    --quote-char                       : "
    --escape-char                      : "
    --trim-white                       : false
    --skip-blank-lines                 : true
    --on-missing-field                 : ERROR
    --on-extra-field                   : ERROR
    --on-unescaped-embedded-quote      : ERROR
    Internal scanner                   : RecordScannerQuote_1_1
18:50:39.320 [ INFO] Using database locale: C
18:50:40.119 [ INFO] Bad rows will be written to /home/bobr/ybload.20190129185040.bad
18:50:40.127 [ INFO] Starting import of 3 files
18:50:40.416 [ INFO] Starting 4 segment readers
18:50:40.417 [ INFO] Opening transaction #1...
18:50:40.461 [ INFO] Opened transaction #1
18:50:40.557 [ INFO] Flushing last of 25818 rows for transaction #1
18:50:40.561 [ INFO] Committing 25818 rows into transaction #1...
18:50:40.586 [ INFO] Committed transaction #1 after a total of 877812 bytes and 25818 good rows
18:50:40.607 [ INFO] READ:917.4KB(570.9KB/s). ROWS G/B: 25818/0(15.69K/s). WRITE:857.2KB(533.5KB/s).  TIME E/R:   0:00:01/ --:--:--
18:50:40.607 [ INFO] SUCCESSFUL BULK LOAD: Loaded 25818 good rows in   0:00:01 (READ: 570.9KB/s WRITE: 533.5KB/s)
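
The trailing * in the command is expanded by the shell before ybload runs, so each matching file arrives as a separate argument (hence "Starting import of 3 files"). For example, one data file split into three parts (the file contents here are hypothetical):

```shell
# Split one data file into three parts; a glob such as /tmp/matchdata/*
# then passes all three files to a single ybload invocation.
rm -rf /tmp/matchdata && mkdir -p /tmp/matchdata
seq 1 9 | sed 's/^/row,/' > /tmp/all.csv
split -l 3 /tmp/all.csv /tmp/matchdata/match_part_
ls /tmp/matchdata | wc -l
```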

Load a table with console logging turned off

This example turns off console logging, sends the output to a file, and sets the log level for the output that goes to that file.
$ ybload -d premdb --username bobr -t match -W --format csv --quiet --logfile load_match_log --logfile-log-level INFO /home/bobr/match.csv 
Password for user bobr: 

$ more load_match_log
2019-01-30 16:00:21.830 [ INFO] <main>  ABOUT CLIENT:
    app.cli_args         = -d premdb --username bobr -t match -W --format csv --quiet --logfile load_match_log --logfile-log-level INFO /home/bobr/match.csv 
    app.name_and_version = ybload version 2.0.0-10711
    java.home            = /usr/lib/jvm/java-8-oracle/jre
    java.version         = 1.8.0_101
    jvm.memory           = 981.50 MB (max=14.22 GB)
    jvm.name_and_version = Java HotSpot(TM) 64-Bit Server VM (build 25.101-b13, mixed mode)
    jvm.options          = -Xmx16g, -Xms1g, -ea, -Dapp.name=ybload, -Dapp.pid=12595, -Dapp.repo=/usr/lib/ybtools/lib, -Dapp.home=/usr/lib/ybtools, -Dbasedir=/usr/lib/ybtools
    jvm.vendor           = Oracle Corporation
    os.name_and_version  = Linux 4.4.0-31-generic (amd64)

2019-01-30 16:00:21.831 [ INFO] <main>  Logfile written to load_match_log
...
2019-01-30 16:00:23.580 [ INFO] <main>  SUCCESSFUL BULK LOAD: Loaded 8606 good rows in   0:00:01 (READ: 179.2KB/s WRITE: 167.5KB/s)
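
Because --quiet suppresses console output, the log file is the place to confirm the result. A sketch of a post-load check; here the final status line is simulated rather than produced by a real load:

```shell
# Simulate the tail of a ybload log file, then scan it for the final
# status line that a successful load writes (as shown in the output above).
printf 'SUCCESSFUL BULK LOAD: Loaded 8606 good rows\n' > /tmp/load_match_log
grep -q 'SUCCESSFUL BULK LOAD' /tmp/load_match_log && echo 'load ok'
```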

Stop and roll back a load if more than 3 bad rows are found, and log the errors

$ ybload -d premdb --username bobr -t team --max-bad-rows 3 -W --quiet --logfile load_team_log --logfile-log-level error /home/bobr/badteam.csv 
Password for user bobr: 

$ more load_team_log
2019-01-30 16:10:46.559 [ERROR] <main>  Unable to separate row into fields for parsing: Too few fields (8 < 9)
Bad row = "4,53,Barnsley,Tykes,Barnsley,Oakwell Stadium,23009,0"
2019-01-30 16:10:46.561 [ERROR] <main>  Unable to separate row into fields for parsing: Too few fields (8 < 9)
Bad row = "18,67,Everton,Toffees,Liverpool,Goodison Park,40221,38.124"
2019-01-30 16:10:46.561 [ERROR] <main>  Unable to separate row into fields for parsing: Too few fields (8 < 9)
Bad row = "31,80,Oldham Athletic,Latics,Oldham,Boundary Park,13309,0"
2019-01-30 16:10:46.562 [ERROR] <main>  Unable to separate row into fields for parsing: Too few fields (8 < 9)
Bad row = "32,81,Portsmouth,Pompey,Portsmouth,Fratton Park,21100,0"
2019-01-30 16:10:46.562 [FATAL] <main>  Number of bad rows exceeded maximum (3)
2019-01-30 16:10:46.620 [FATAL] <main>  FAILED BULK LOAD: Nothing committed
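
Each rejected row in this log has 8 comma-separated fields where the team table expects 9. A quick pre-load check along the same lines, using one of the bad rows from the log above (a simple field count; it does not handle commas inside quoted fields):

```shell
# Flag rows whose field count differs from the 9 columns the table expects.
printf '18,67,Everton,Toffees,Liverpool,Goodison Park,40221,38.124\n' > /tmp/badteam.csv
awk -F',' 'NF != 9 {print "line " NR ": " NF " fields"}' /tmp/badteam.csv
```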

Load a table from a text file with a special escape character

$ ybload -d premdb --username bobr -W -t matchstats --format text --escape-char '/' /home/bobr/premdata/matchstats.txt 
Password for user bobr:
...
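
One reason to override the escape character for text format is data that contains the usual escape character as literal content; assuming a PostgreSQL-style backslash default for text format, choosing '/' avoids collisions with values such as Windows paths. The file below is entirely hypothetical:

```shell
# Hypothetical tab-delimited matchstats data containing backslashes;
# with --escape-char '/', the backslashes remain ordinary data.
printf 'C:\\stats\\week1\t3\t1\n' > /tmp/matchstats.txt
cat /tmp/matchstats.txt
```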

Load a table from a pipe-delimited file with literal "null" values

$ ybload -d premdb --username bobr -W -t team --format text --delimiter '|' --nullmarker 'null' /home/bobr/premdata/team2019.txt
Password for user bobr: 
...
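
A sketch of what such a pipe-delimited file might contain; with --nullmarker 'null', the literal string null loads as a SQL NULL. The rows here are hypothetical:

```shell
# The literal string null in the last field marks a missing value.
cat > /tmp/team2019.txt <<'EOF'
41|90|Brentford|Bees|London|Griffin Park|12300|null
42|91|Leeds United|Whites|Leeds|Elland Road|37890|0
EOF
grep -c '|null' /tmp/team2019.txt
```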

Load a table from a pipe-delimited file with more fields than the table has columns

$ ybload -d premdb --username bobr -W -t hometeam --format text --delimiter '|' --on-extra-field remove /home/ybdata/hometeam.txt
Password for user bobr: 
...
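
A sketch of a row with one more pipe-delimited field than the table has columns; with --on-extra-field remove, ybload drops the trailing extras instead of rejecting the row as the default (ERROR) would. The row and the assumed 3-column hometeam table are hypothetical:

```shell
# Assuming hometeam has 3 columns, this row carries a 4th, extra field.
printf '1|Arsenal|London|extra\n' > /tmp/hometeam.txt
awk -F'|' '{print NF " fields"}' /tmp/hometeam.txt
```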

Load a calendar table with a specific date format

The --date-style option is set to MDY, which means that ybload can parse and load incoming date values such as 08-13-2018.

$ ybload -d premdb --username bobr -t season_date -W --format text --delimiter '|' --date-style MDY /home/ybdata/season_dates.txt
Password for user bobr:
...
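
A sketch of the MDY-style input such a file might hold; with --date-style MDY, a value like 08-13-2018 is read as month 08, day 13. The row is hypothetical:

```shell
# Month-day-year date values, matching --date-style MDY.
cat > /tmp/season_dates.txt <<'EOF'
27|08-13-2018|05-12-2019
EOF
grep -Eo '[0-9]{2}-[0-9]{2}-[0-9]{4}' /tmp/season_dates.txt
```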

Load a table with a case-sensitive name

Note the -t syntax required to load a table that was created with a quoted identifier:
$ ybload -d premdb --username bobr -W -t \"AwayTeam\" --delimiter ',' /home/premdata/awayteam.csv
...
12:19:02.492 [ INFO] Bulk Loading /home/premdata/awayteam.csv into TABLE "public"."AwayTeam" at 2018-10-27T12:19:01.757-07:00[America/Los_Angeles]
...
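
The backslashes matter because the shell strips one layer of quoting: \"AwayTeam\" reaches ybload as "AwayTeam" with the double quotes intact, which is what a quoted identifier requires. You can see what the shell actually delivers with printf:

```shell
# The escaped quotes survive shell processing, so the argument ybload
# receives includes the double quotes around the mixed-case name.
printf '%s\n' \"AwayTeam\"
# prints "AwayTeam"
```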

See also Loading a Table from Amazon S3.