Document CubiQL data requirements #92

Merged (6 commits) on Aug 2, 2018

zeginis
Contributor

@zeginis zeginis commented Jun 22, 2018

Fix #78

Member

@RickMoynihan RickMoynihan left a comment

Looks good to me, will ask others here to check it before merging.

- Reference area should be defined as: http://purl.org/linked-data/sdmx/2009/dimension#refArea
- Reference period should be defined as: http://purl.org/linked-data/sdmx/2009/dimension#refPeriod
- Time values should be expressed using reference.data.gov.uk e.g. http://reference.data.gov.uk/id/year/2016
- A qb:codeList should be defined for each dimension of the cube (except refArea and refPeriod) that contains *only* the values used at the cube
Member

Minor comment: "used at the cube" should be "used in the cube".

@mohadelrezk

mohadelrezk commented Jul 12, 2018

Hi @zeginis,
Do you recommend changing our cube creation pipeline to work with the current data restrictions?
Or should we ignore those restrictions, since they are temporary problems that you are currently addressing?

@zeginis
Contributor Author

zeginis commented Jul 12, 2018

@mohadelrezk some of them will be fixed but not all of them (at least for now). The status is the following:

@mohadelrezk

@zeginis
Regarding the qb:measureType issue, do you mean including it only as a qb:component,
even if each observation measures all of the stated measures?
Or having it in every observation entry as well (this would be an overhead and would increase cube size for no reason)?
I am referring to IC-14.

@zeginis
Contributor Author

zeginis commented Jul 13, 2018

@mohadelrezk for multiple measures, CubiQL adopts only the "Measure dimension" approach defined in the QB vocabulary. This approach requires:

  • each observation to have only one measure (even if there are many measures in the cube)
  • the DSD to define qb:measureType as a qb:dimension
  • each observation to use qb:measureType

"Measure dimension" is more flexible and extensible, but it comes with an increased cube size, as you said.

CubiQL currently requires qb:measureType to be used even if there is only one measure in the cube. This should be fixed (#65).
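
The per-observation requirements above can be checked mechanically. A minimal sketch, assuming each observation is held as a plain Python dict keyed by prefixed property name; the function name, the dict layout, and the ex: measure names are illustrative assumptions and not part of CubiQL:

```python
MEASURE_TYPE = "qb:measureType"

def measure_dimension_problems(observation, declared_measures):
    """Return a list of reasons why `observation` does not follow the
    "Measure dimension" approach (an empty list means it looks fine)."""
    problems = []
    # Measures that actually carry a value on this observation.
    present = [m for m in declared_measures if m in observation]
    if len(present) != 1:
        problems.append(f"expected exactly one measure, found {len(present)}")
    measure_type = observation.get(MEASURE_TYPE)
    if measure_type is None:
        problems.append("missing qb:measureType")
    elif measure_type not in present:
        problems.append(f"no value for the measure named by qb:measureType ({measure_type})")
    return problems

# Hypothetical measures, for illustration only.
declared = ["ex:count", "ex:rate"]
single = {MEASURE_TYPE: "ex:count", "ex:count": 10}
multi = {"ex:count": 10, "ex:rate": 0.5}

print(measure_dimension_problems(single, declared))
print(measure_dimension_problems(multi, declared))
```

The second observation is reported because it carries two measures and no qb:measureType, which is the shape the "Multi-measure observations" approach produces.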

@mohadelrezk

If I kept all measures in one observation as it is, and added qb:measureType for each of them, will that work with CubiQl?

Having a single observation for every measure reading is not the best choice, especially for our marine pilot.

This is an example of an observation from one dataset belonging to our marine pilot:

ogi:observations_dadfeed1-d146-4e75-a9bb-337b08483841
    rdf:type qb:Observation ;
    qb:dataSet ogi:IWaveBNetwork_spectral_ds ;
    ogi:MeanWavePeriod_Tm01 5.07 ;
    ogi:latitude 54.2311 ;
    ogi:longitude -10.146 ;
    ogi:PeakSpread 35.6 ;
    ogi:EnergyPeriod 6.11931 ;
    ogi:buoy_id "1"^^xsd:string ;
    ogi:SignificantWaveHeight 0.88 ;
    ogi:PeakPeriod 7.69 ;
    ogi:PeakDirection 271.4 ;
    ogi:MeanWavePeriod_Tm02 4.545 ;
    ogi:station_id "Belmullet Wave Buoy Berth B"^^xsd:string ;
    ogi:qcflag "0"^^xsd:int .

For that reason I would prefer the method we adopted, as it conforms to IC-14:

IC-14
All measures present
In a qb:DataSet which does not use a Measure dimension then each individual qb:Observation must have a value for every declared measure.
https://www.w3.org/TR/vocab-data-cube/
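
As a rough illustration of what IC-14 asks for, here is a minimal sketch that lists the declared measures an observation has no value for. It assumes observations are plain Python dicts keyed by prefixed property name; the function name and the choice of which ogi: properties count as measures are assumptions made for the example:

```python
def missing_measures(observation, declared_measures):
    """Return the declared measures that `observation` has no value for.
    Under IC-14 (no measure dimension) this list must be empty."""
    return [m for m in declared_measures if m not in observation]

# Assumption: these three properties are declared as measures in the DSD.
declared = ["ogi:SignificantWaveHeight", "ogi:PeakPeriod", "ogi:PeakDirection"]

# Abridged copy of the observation above, with one measure left out on purpose.
observation = {
    "qb:dataSet": "ogi:IWaveBNetwork_spectral_ds",
    "ogi:SignificantWaveHeight": 0.88,
    "ogi:PeakPeriod": 7.69,
}

print(missing_measures(observation, declared))
```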

@zeginis
Contributor Author

zeginis commented Jul 31, 2018

"If I kept all measures in one observation as it is, and added qb:measureType for each of them, will that work with CubiQl?"

No, this approach will not work. CubiQL supports only the "Measure dimension" approach for handling multiple measures per cube. This approach is more flexible than "Multi-measure observations"; in your example, for instance, it is not easy to define the unit of each measure. Of course, this comes with an increased cube size.

To continue we have two options:

  • Transform the marine pilot data to have only one measure per observation. @mohadelrezk how many observations do they currently have? What is their current size, and what is the expected size after the transformation?

  • Extend CubiQL to support "Multi-measure observations". CubiQL was built on the core assumption that observations have only one measure (and use qb:measureType). @lkitching @RickMoynihan do you think it would be easy to extend CubiQL to also support multiple measures per observation (https://www.w3.org/TR/vocab-data-cube/#dsd-mm)?
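
For the first option, the transformation could look roughly like the sketch below. It is only an illustration: observations are assumed to be plain Python dicts keyed by prefixed property name, the function name and the subject-naming scheme (original subject plus the measure's local name) are hypothetical, and the list of measure properties is an assumption about the marine data.

```python
MEASURE_TYPE = "qb:measureType"

def split_observation(subject, observation, measures):
    """Yield (subject, properties) pairs, one per measure found in
    `observation`, each carrying a single measure plus qb:measureType."""
    # Everything that is not a measure is copied onto every new observation.
    shared = {k: v for k, v in observation.items() if k not in measures}
    for measure in measures:
        if measure not in observation:
            continue
        local_name = measure.split(":", 1)[-1]
        single = dict(shared)
        single[MEASURE_TYPE] = measure
        single[measure] = observation[measure]
        # Hypothetical naming scheme for the new observation subjects.
        yield f"{subject}_{local_name}", single

# Abridged version of the marine observation quoted earlier in this thread.
marine = {
    "rdf:type": "qb:Observation",
    "qb:dataSet": "ogi:IWaveBNetwork_spectral_ds",
    "ogi:latitude": 54.2311,
    "ogi:longitude": -10.146,
    "ogi:SignificantWaveHeight": 0.88,
    "ogi:PeakPeriod": 7.69,
}
measures = ["ogi:SignificantWaveHeight", "ogi:PeakPeriod"]

for new_subject, props in split_observation("ogi:obs_1", marine, measures):
    print(new_subject, props[MEASURE_TYPE], props[props[MEASURE_TYPE]])
```

An observation with n measures becomes n observations, each repeating the shared properties, which is where the increase in cube size comes from.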

@lkitching
Contributor

@zeginis - Yes, it should be possible to support multiple measures per observation. We plan to improve the dataset data model; once that is done, it should be straightforward to generate different queries depending on the dataset structure.

@lkitching lkitching merged commit 0afd037 into Swirrl:master Aug 2, 2018
@zeginis zeginis deleted the DocumentDataReqs branch August 2, 2018 12:38
@zeginis
Contributor Author

zeginis commented Aug 2, 2018

@lkitching when are you planning to implement this extension?
Will it be ready soon? Otherwise we should transform the data to have only one measure per observation so that we can go on with the pilots.
