It would be really useful if there were a configuration option for the Table.read() method that would allow for non-blocking execution, i.e. one that would not stop on the first error but would instead return a collection of all errors from the validation of the full stream.
There are scenarios in which it is really difficult to have users fix one row at a time. It also adds effort and complexity on the integrator's side, as the only way to mimic such behavior is a recursive approach, which is inefficient: it involves opening the stream multiple times and resuming reading from the line after the last failed one.
@spilio @anuveyatsu
This idea is in the air. And it's implemented for tabulator-py. I've added a task description.
roll changed the title from "Option for non-blocking CSV validation with collection of errors" to "Option to collect all cast errors for table.read() call" on Dec 10, 2017
Overview
For now, table.read fails with a tableschema.Error on the first cast error. We'd like to have the ability to get all errors from the table.read() call.

Here is an example of how it's implemented for tabulator with a force_parse option: https://github.com/frictionlessdata/tabulator-py#force-parse

We could, e.g., use a force_cast option.

From @spilio
It would be really useful if there were a configuration option for the Table.read() method that would allow for non-blocking execution, i.e. one that would not stop on the first error but would instead return a collection of all errors from the validation of the full stream.

There are scenarios in which it is really difficult to have users fix one row at a time. It also adds effort and complexity on the integrator's side, as the only way to mimic such behavior is a recursive approach, which is inefficient: it involves opening the stream multiple times and resuming reading from the line after the last failed one.
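
To make the request concrete, here is a minimal, library-independent sketch of the behaviour being asked for. It does not use tableschema at all: `read_rows`, `CastError`, the `casters` mapping, and the `(rows, errors)` return shape are all hypothetical names chosen for illustration, and only the `force_cast` flag name comes from the suggestion above. It assumes rows arrive as dicts of strings and that a cast failure surfaces as `ValueError` or `TypeError`.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


class CastError(Exception):
    """Hypothetical per-cell cast failure (a stand-in, not tableschema.Error)."""

    def __init__(self, row_number: int, field: str, value: Any, reason: str):
        super().__init__(
            f"row {row_number}, field {field!r}: cannot cast {value!r} ({reason})"
        )
        self.row_number = row_number
        self.field = field
        self.value = value


def read_rows(
    rows: Iterable[Dict[str, Any]],
    casters: Dict[str, Callable[[Any], Any]],
    force_cast: bool = False,
) -> Tuple[List[Dict[str, Any]], List[CastError]]:
    """Cast every row in a single pass and return (cast_rows, errors).

    force_cast=False: raise on the first cast error (fail-fast behaviour).
    force_cast=True: keep reading, collect every error, and leave out
    rows that contain at least one failed cell.
    """
    cast_rows: List[Dict[str, Any]] = []
    errors: List[CastError] = []
    for row_number, row in enumerate(rows, start=1):
        cast_row: Dict[str, Any] = {}
        row_ok = True
        for field, caster in casters.items():
            value = row.get(field)
            try:
                cast_row[field] = caster(value)
            except (TypeError, ValueError) as exc:
                error = CastError(row_number, field, value, str(exc))
                if not force_cast:
                    raise error from exc
                errors.append(error)
                row_ok = False
        if row_ok:
            cast_rows.append(cast_row)
    return cast_rows, errors


# Sample data: row 2 has two bad cells, row 3 has one.
rows = [
    {"id": "1", "price": "9.5"},
    {"id": "x", "price": "oops"},
    {"id": "3", "price": "n/a"},
]
casters = {"id": int, "price": float}

good, errors = read_rows(rows, casters, force_cast=True)
for error in errors:
    print(error)
```

With `force_cast=True` the source is read once and every failing cell is reported together, so there is no need to reopen the stream and resume after each failure. Whether failed rows should be dropped, returned uncast, or returned with placeholders is a design decision this sketch does not try to settle.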