Analyzing Rejected Rows

This section describes some common cases in which rows are rejected and explains how to interpret and fix the errors. Common error conditions include:
  • Incorrect field delimiter specified in the ybload command
  • Incorrect line separator specified in the ybload command
  • Mismatch between number of columns in table and number of fields in input data
  • Mismatch between data types in table and values in input data fields
  • Blank lines in source files (or unexpected header/footer information)
  • Out-of-range values in source files (for numeric data)
  • Incorrect format for dates and timestamps
  • Incorrect format for NULL values

Incorrect Number of Fields

The following example is a simple case where the number of fields in the source file (6) is greater than the number of columns in the target table (5).
$ more match.csv.20160707112858.bad
# Bulk Loading /home/premdata/match.csv into TABLE matchstats at 2016-07-07T11:28:58.607-07:00[America/Los_Angeles]

#error: lineByteRange(0-35)
#reason: Too many fields (6 > 5)
1,1992-08-01 00:00:00,2,52,0-1,-  

#error: lineByteRange(35-70)
#reason: Too many fields (6 > 5)
1,1992-08-01 00:00:00,2,55,0-1,-  

#error: lineByteRange(70-105)
#reason: Too many fields (6 > 5)
1,1992-08-01 00:00:00,2,63,2-1,-  

...

To fix this load, you can modify the target table, modify the source file, or use the --csv-allow-too-many-fields option in the ybload command.

Incorrect Field Delimiter

Note the location in the row (...<---) where the error is occurring. The arrow points to a sequence of dots (...), which represent the extent of the field that generated the error.

In this case, the problem is the field delimiter for the rows, not the date format of the first column. The load was attempted with a default comma delimiter, but the source file uses a pipe character as its delimiter. The bulk loader ignores the pipe characters, attempts to read all five columns as a single date column, and fails (predictably) on the date format.

The fieldinfo line in the output contains the following information:
  • destName: the name of the destination table column that the field in the source file is being parsed into.
  • destType: the data type of the destination table column.
#error: lineByteRange(810-890):fieldByteRange(1-75)
#reason: Doesn't match any of the allowed date formats: YMD,DMONY,MONDY
#fieldinfo: srcIndex:1 destName:matchday destType:DATE  options:{"formats":[{"style":"YMD","delim":"auto"},{"style":"DMONY","delim":"auto"},{"style":"MONDY",
"delim":"auto"}]}
#..........................................................................<--------- Doesn't match any of the allowed date formats: YMD,DMONY,MONDY
 1993-08-14 | Liverpool               | Sheffield Wednesday     | 2-0     |    

#error: lineByteRange(890-970):fieldByteRange(1-75)
#reason: Doesn't match any of the allowed date formats: YMD,DMONY,MONDY
#fieldinfo: srcIndex:1 destName:matchday destType:DATE  options:{"formats":[{"style":"YMD","delim":"auto"},{"style":"DMONY","delim":"auto"},{"style":"MONDY",
"delim":"auto"}]}
#..........................................................................<--------- Doesn't match any of the allowed date formats: YMD,DMONY,MONDY
 1993-08-14 | Southampton             | Everton                 | 0-2     |    

To fix this load, add the delimiter option to the ybload command: --csv-delimiter '|'

Incorrect Date Format

In this example, the incoming dates do not match any of the default formats. See ybload Date Formats.
#error: lineByteRange(0-58):fieldByteRange(38-56)
#reason: Doesn't match any of the allowed date formats: YMD,DMONY,MONDY
#fieldinfo: srcIndex:10 destName:saletime destType:TIMESTAMP  options:{"dateformats":[{"style":"YMD","delim":"auto"},{"style":"DMONY","delim":"auto"},{"style":"MONDY","delim":"auto"}],"timeformats":[{"style":"HMSs","delim":"auto"},{"style":"HMS","delim":"auto"},{"style":"HM","delim":"auto"}]}
#                                     ..................<--------- Doesn't match any of the allowed date formats: YMD,DMONY,MONDY
1|1|36861|21191|7872|1875|4|728|109.2|2/18/2008 02:36:48

#error: lineByteRange(571-628):fieldByteRange(38-55)
#reason: Doesn't match any of the allowed date formats: YMD,DMONY,MONDY
#fieldinfo: srcIndex:10 destName:saletime destType:TIMESTAMP  options:{"dateformats":[{"style":"YMD","delim":"auto"},{"style":"DMONY","delim":"auto"},{"style":"MONDY","delim":"auto"}],"timeformats":[{"style":"HMSs","delim":"auto"},{"style":"HMS","delim":"auto"},{"style":"HM","delim":"auto"}]}
#                                     .................<--------- Doesn't match any of the allowed date formats: YMD,DMONY,MONDY
11|12|45635|8435|4769|2042|2|130|19.5|8/4/2008 03:06:36

Incorrect Format for NULL Values

In this example, the parser does not recognize that the \N string is intended to be the null string. The parser expects an integer value for the column.
#error: lineByteRange(7035-7069):fieldByteRange(30-32)
#reason: Non-digit character
#fieldinfo: srcIndex:5 destName:venueseats destType:INTEGER  options:{}
#                             ..<--------- Non-digit character
250|Sahara Hotel|Las Vegas|NV|\N

#error: lineByteRange(7136-7170):fieldByteRange(30-32)
#reason: Non-digit character
#fieldinfo: srcIndex:5 destName:venueseats destType:INTEGER  options:{}
#                             ..<--------- Non-digit character
253|Mirage Hotel|Las Vegas|NV|\N
This example can be fixed with the --nullmarker option:
$ ybload --csv-delimiter '|' --nullmarker '\N' -t venue /home/data/venue.tbl
...

Fields Not Allowed To Be Null

In this example, the DDL for the table declared that some Boolean columns could not be null:

#error: lineByteRange(1324-1403):fieldByteRange(58-58)
#reason: Field not allowed to be null
#fieldinfo: srcIndex:9 destName:likesports destType:BOOLEAN NOT NULL  options:{}
#                                                         <--------- Field not allowed to be null
12|FVK28WAS|Bruce|Beck|Kona|OH|ac@velit.ca|(617) 527-9908|||FALSE|||FALSE||||

#error: lineByteRange(135-267):fieldByteRange(100-100)
#reason: Field not allowed to be null
#fieldinfo: srcIndex:9 destName:likesports destType:BOOLEAN NOT NULL  options:{}
#                                                                                                   <--------- Field not allowed to be null
2|PGL08LJI|Vladimir|Humphrey|Murfreesboro|SK|Suspendisse.tristique@nonnisiAenean.edu|(783) 492-1886||||TRUE|TRUE|||TRUE|FALSE|TRUE

#error: lineByteRange(1403-1532):fieldByteRange(102-102)
#reason: Field not allowed to be null
#fieldinfo: srcIndex:9 destName:likesports destType:BOOLEAN NOT NULL  options:{}
#                                                                                                     <--------- Field not allowed to be null
13|QTF33MCG|Henry|Cochran|Bossier City|QC|Aliquam.vulputate.ullamcorper@amalesuada.org|(783) 105-0989||TRUE|||||TRUE|TRUE|TRUE|