Add itemType constraint to array type #409

Closed
akariv opened this issue Apr 30, 2017 · 16 comments · Fixed by frictionlessdata/datapackage-v2-draft#38

@akariv
Member

akariv commented Apr 30, 2017

This attribute will allow specifying the exact items that are allowed in an array.

e.g.

{
   "name": "years",
   "type": "array",
   "constraints": {
      "typeOf": "year"
   }
}
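A minimal sketch of how a validator might honor the descriptor above. Everything here is illustrative: `typeOf` is only the proposed constraint name, and `ITEM_CHECKERS`, `is_year`, and `validate_array_cell` are hypothetical helpers, not part of any Frictionless library.

```python
import json

# Descriptor from the example above; "typeOf" is the proposed constraint,
# not part of the published Table Schema spec.
field = {"name": "years", "type": "array", "constraints": {"typeOf": "year"}}


def is_year(value):
    # Assumption: a year item arrives as a JSON integer (booleans excluded).
    return isinstance(value, int) and not isinstance(value, bool)


# Hypothetical registry of per-item checks, keyed by Table Schema type name.
ITEM_CHECKERS = {"year": is_year}


def validate_array_cell(cell, field):
    """Return True if the cell is a JSON array whose items all pass typeOf."""
    items = json.loads(cell)
    if not isinstance(items, list):
        return False
    type_of = field.get("constraints", {}).get("typeOf")
    if type_of is None:
        return True  # no item constraint declared
    check = ITEM_CHECKERS[type_of]
    return all(check(item) for item in items)
```

With this sketch, a cell such as `[2015, 2016]` would pass and `[2015, "x"]` would not.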

Related

Doing this with appropriate semantics would also resolve

@rufuspollock rufuspollock added this to the v1.1 milestone May 24, 2017
@rufuspollock
Contributor

@akariv I really like it. Again, maybe for v1.1.

@danfowler
Contributor

Related: #381

@pwalsh
Member

pwalsh commented May 29, 2017

+1 on this, and we should do a PR once v1 settles.

@rufuspollock
Contributor

@akariv would you be up for doing a PR to implement this?

I also think that rather than just have the type we might as well go the whole hog and just have the whole field schema be reused for the individual item. In this case the property name might be something like an itemSchema or itemType.

Alternatively, since most of the constraints in constraints only make sense for individual values, we could implicitly apply them to the individual values in the array (e.g. max, min, enum, etc.), in which case we only need typeOf.
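A rough sketch of that second alternative, where the field's existing constraints are applied to each item rather than to the array as a whole. The keys `minimum`, `maximum`, and `enum` are the existing Table Schema constraint names; the functions `item_satisfies` and `array_satisfies` are hypothetical.

```python
import json


def item_satisfies(value, constraints):
    """Check one array item against value-level constraints."""
    if "minimum" in constraints and value < constraints["minimum"]:
        return False
    if "maximum" in constraints and value > constraints["maximum"]:
        return False
    if "enum" in constraints and value not in constraints["enum"]:
        return False
    return True


def array_satisfies(cell, constraints):
    """Apply the field's constraints implicitly to every item of the array."""
    items = json.loads(cell)
    return all(item_satisfies(item, constraints) for item in items)
```

One consequence of this design is that array-level uses of the same constraint names (for example, a minimum on the array itself) would no longer be expressible without a separate property.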

Thoughts @roll ...

@akariv
Member Author

akariv commented Apr 30, 2020

@rufuspollock that's my suggestion in #410

@rufuspollock
Contributor

rufuspollock commented Apr 30, 2020 via email

@akariv
Member Author

akariv commented Apr 30, 2020

table type could be seen as syntactic sugar for an array field of itemType=object and a mandatory schema property.

In both cases the datatype is the same in all array entries.

@roll
Member

roll commented May 4, 2020

I think itemType should be as simple as possible. I would move all the complexity to something like #410 and v2 of the specs.

@rufuspollock
Contributor

@roll generally agree - though could you clarify what "as simple as possible" looks like for you - and would it address the related stuff e.g. #549?

@roll
Member

roll commented May 4, 2020

@rufuspollock

I think there are two options

Simple

https://specs.frictionlessdata.io/table-schema/#array

array
The field contains data that is a valid JSON format arrays.

Because it's JSON, we can probably treat items just as native JSON values (itemType):

  • string
  • number
  • boolean
  • null

It will mean no processing on the Table Schema level, just checking that a JSON array has items that all share the given type, e.g. [1,2,3] or [true,false], re-using native JSON types.

We can also add an itemConstraints property.

This option will keep things simple.
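To make the simple option concrete, here is a minimal sketch that only checks native JSON types and does no Table Schema parsing. The names `JSON_TYPE_CHECKS` and `items_match_json_type` are made up for illustration.

```python
import json

# The four native JSON item types listed above, mapped to Python checks.
# bool is a subclass of int in Python, so it is excluded from "number".
JSON_TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "null": lambda v: v is None,
}


def items_match_json_type(cell, item_type):
    """Return True if cell is a JSON array whose items all have item_type."""
    items = json.loads(cell)
    if not isinstance(items, list):
        return False
    check = JSON_TYPE_CHECKS[item_type]
    return all(check(item) for item in items)
```

Under these assumptions, `[1,2,3]` matches `number` and `[true,false]` matches `boolean`, while a mixed array such as `[1,"a"]` matches neither.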

Complex

Another option is to have a full itemSchema or something like it, as you mentioned. This adds Table Schema level processing to the array's items, e.g. the ability to extract parsed values from items given as strings, such as ["10$", "30$"] or ["2020-02-01", "2020-04-05"].

With the second option, we add an ability to process array's items with all the power of Table Schema.

This option is much more complex though.
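For contrast, a sketch of the complex option, in which each item is parsed according to a nested item schema. The `itemSchema` key and the `PARSERS` table are assumptions for illustration; a real implementation would also need formats, constraints, and the rest of the field options.

```python
import json
from datetime import date

# Hypothetical parsers keyed by Table Schema type name. Only two types are
# covered here; the default ISO format is assumed for "date".
PARSERS = {
    "date": date.fromisoformat,
    "integer": int,
}


def parse_array_cell(cell, field):
    """Parse every item of a JSON array using the field's item schema."""
    item_schema = field["itemSchema"]  # assumed property name
    parse = PARSERS[item_schema["type"]]
    return [parse(item) for item in json.loads(cell)]


field = {"name": "dates", "type": "array", "itemSchema": {"type": "date"}}
parsed = parse_array_cell('["2020-02-01", "2020-04-05"]', field)
```

Here `parsed` holds `datetime.date` objects rather than strings, which is the extra processing the simple option deliberately avoids.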

@akariv
Member Author

akariv commented May 4, 2020 via email

@roll
Member

roll commented May 4, 2020

But then, we would need also itemConstraints, itemFormat,
itemDecimalChar and there's no end to this.

Yea exactly, without all the options a Table Schema type is partially useless.

I think that itemType should have the same semantics as type (i.e. read a tableschema
type name and not a json).

It can be itemJsonType. It will

I'm not sure that it's the best option; I'm just trying to figure out the pros.

@JDziurlaj

There is something elegant in reusing the existing schema structures when adding this feature. I don't think constraining the arrays to simple JSON types would provide a lot of value. I favor @akariv's proposal.

@rufuspollock
Contributor

OK, it looks like we have convergence here.

@pwalsh
Member

pwalsh commented Sep 29, 2021

@roll @rufuspollock

I see frictionlessdata/frictionless-py#627 has been referenced here but it solves a fundamentally different issue.

It would be great to see this original request added. It is very common for all members of an array to share the same type, and declaring that as part of Table Schema would allow:

  • In the SQL driver, use of Array fields for backends that support it (which would be a more logically correct mapping than JSONB fields, and allow round-tripping data)
  • In the Elasticsearch driver, use of array fields
  • Most likely most other data backends that support an array as a type would require members to have the same type
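As an illustration of the SQL point, a storage driver could pick a native array column when the item type is declared and fall back to JSONB otherwise. This is only a sketch: the `itemType` property, the `PG_TYPES` mapping, and `column_type` are assumptions and do not describe how any existing Frictionless driver behaves.

```python
# Assumed mapping from Table Schema type names to PostgreSQL column types.
PG_TYPES = {
    "string": "TEXT",
    "integer": "BIGINT",
    "number": "DOUBLE PRECISION",
    "boolean": "BOOLEAN",
    "date": "DATE",
}


def column_type(field):
    """Choose a PostgreSQL column type for a Table Schema field descriptor."""
    field_type = field.get("type", "string")
    if field_type != "array":
        return PG_TYPES.get(field_type, "TEXT")
    base = PG_TYPES.get(field.get("itemType"))
    # Without a known item type the driver cannot pick a typed array column.
    return f"{base}[]" if base else "JSONB"
```

For example, an array field declared with `itemType` of `integer` would map to `BIGINT[]`, while an untyped array would still map to `JSONB`.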

Note that this behavior (and also #410) is already implemented in some form in https://github.com/frictionlessdata/tableschema-elasticsearch-py and has seen a good deal of successful use in production systems.

@rufuspollock
Contributor

Happy to have PRs on this, or even to start with a pattern and implement it in, say, Python. Remember that the complex thing with the specs now is that all drivers then need to upgrade to be compliant.

@roll roll modified the milestones: v1.1, v2 Apr 14, 2023
@roll roll removed the ready for PR label Jan 3, 2024