Commit

Update paper.md
typo, spacing
hjwilli authored Nov 27, 2024
1 parent 68344fe commit cab5a63
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion joss/paper.md
@@ -44,9 +44,10 @@ Loading, cleansing, and organizing data can dominate the time spent on a data sc

Existing extract, transform and load (ETL) technologies such as [Microsoft SQL Server Integration Services](https://docs.microsoft.com/en-us/sql/integration-services/sql-server-integration-services) help with data staging. Similarly, data manipulation tools like [pandas](https://pandas.pydata.org) facilitate transformation of series and matrix data. **Carnival** distinguishes itself by offering a lightweight data caching mechanism coupled with data manipulation services built on a property graph rather than arrays and data frames. Graphs present an alternative to relational data structures that more naturally represent complex and highly relational data and are more adaptive to change. A property graph database is an implementation of the graph structure that represents data as nodes and directed edges (relationships) between the nodes, where nodes and edges can have properties (key/value pairs) associated with them. Carnival’s combination of features and graph data representation empowers informaticians and programmers working in complex data domains to build pipelines, utilities, and applications that are comparatively richer in semantics and provenance.

-Knowledge bases in Resource Description Framework (RDF) triplestores can be valuable tools to harmonize and enrich complex data. Transforming source relational data into RDF triples reflecting a data model is challenging. While there exist relational-to-RDF mappers such as Karma [@10.1007/978-3-662-46641-4_40], the configuration process is labor intensive and the resulting triples may not match a data model particularly one of sufficient complexity.
+Knowledge bases in Resource Description Framework (RDF) triplestores can be valuable tools to harmonize and enrich complex data. Transforming source relational data into RDF triples reflecting a data model is challenging. While there exist relational-to-RDF mappers such as Karma [@10.1007/978-3-662-46641-4_40], the configuration process is labor intensive and the resulting triples may not match a data model, particularly one of sufficient complexity.

**Carnival** was developed to create domain-specific property graph data models and to provide tools for building robust pipelines that import and manage data in those models. There are two main components to Carnival. The primary component is a layer built on top of [Apache TinkerPop](https://tinkerpop.apache.org) that seeks to provide more standardized and semantically driven methods of interacting with a property graph. An additional component is a data caching mechanism that supports the efficient aggregation of data from disparate sources. The main features of Carnival are:

- a graph modeling framework that ensures graph data remain consistent,
- a data caching mechanism that eases the computational burden of data aggregation during the development process and promotes data provenance,
- a lightweight graph algorithm framework that facilitates the creation of graph building components with automated provenance tracking.
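
The property graph structure described in the paper text (nodes and directed edges, each carrying key/value properties) and the idea of graph-building components with provenance tracking can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it does not use Carnival or Apache TinkerPop, and the names `PropertyGraph`, `run_builder`, `load_patients`, and the `created_by` edge label are hypothetical, chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Node:
    label: str
    props: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    label: str
    source: Node  # the edge is directed from source ...
    target: Node  # ... to target
    props: Dict[str, Any] = field(default_factory=dict)


class PropertyGraph:
    """Toy in-memory property graph (hypothetical, not Carnival's API)."""

    def __init__(self) -> None:
        self.nodes: List[Node] = []
        self.edges: List[Edge] = []

    def add_node(self, label: str, **props: Any) -> Node:
        node = Node(label, dict(props))
        self.nodes.append(node)
        return node

    def add_edge(self, label: str, source: Node, target: Node, **props: Any) -> Edge:
        edge = Edge(label, source, target, dict(props))
        self.edges.append(edge)
        return edge

    def run_builder(self, name: str, builder: Callable[["PropertyGraph"], None]) -> Node:
        """Run a graph-building function and record which nodes it created."""
        n_before = len(self.nodes)
        builder(self)
        created = self.nodes[n_before:]  # nodes added by this builder only
        process = self.add_node("Process", name=name)
        for node in created:
            # Provenance: each new node points at the process that made it.
            self.add_edge("created_by", node, process)
        return process


def load_patients(graph: PropertyGraph) -> None:
    # Made-up example data, not taken from the paper.
    patient = graph.add_node("Patient", id="p1")
    encounter = graph.add_node("Encounter", date="2024-11-27")
    graph.add_edge("has_encounter", patient, encounter, source_system="ehr")


g = PropertyGraph()
proc = g.run_builder("load_patients", load_patients)
provenance = [e for e in g.edges if e.label == "created_by"]
```

Here provenance is stored as ordinary nodes and edges in the same graph, so it can be queried like any other data. Whether Carnival represents provenance this way is not stated in the text shown; treat that design as an assumption of the sketch.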
