update data sources and about page
naustica committed Oct 24, 2024
1 parent a04fb88 commit d15f1ec
Showing 9 changed files with 140 additions and 15 deletions.
11 changes: 10 additions & 1 deletion about.Rmd
@@ -24,12 +24,16 @@ We are based at the [Göttingen State and University Library](https://www.sub.un

We want to thank [Maëlle Salmon](https://masalmon.eu/) for encouraging us to start a blog about our work. As a technical framework for the blog, we are using [Distill for R Markdown](https://rstudio.github.io/distill/), a new web publishing format optimized for scientific and technical writing.

-*Dr. Anne Hobert*, *Nick Haupka*, *Najko Jahn*
+*Dr. Anne Hobert*, *Nick Haupka*, *Sophia Dörner*, *Najko Jahn*

## Journal publications

We also publish in scholarly journals about our work.

Haupka, N., Culbert, J., Schniedermann, A, Jahn, N., Mayr, P. (2024). Analysis of the Publication and Document Types in OpenAlex, Web of Science, Scopus, Pubmed and Semantic Scholar. <https://arxiv.org/abs/2406.15154>

Haupka, N. (2024). Analyse der Abdeckung wissenschaftlicher Publikationen auf Semantic Scholar im Kontext von Open Access. *Bibliothek Forschung und Praxis*, 48(2), 362–-373. <https://doi.org/10.1515/bfp-2023-0057>

Jahn, N. (2024). How open are hybrid journals included in transformative agreements? <https://arxiv.org/abs/2402.18255>

Culbert, J., Hobert, A., Jahn, N., Haupka, N., Schmidt, M., Donner, P., Mayr, P. (2024). Reference Coverage Analysis of OpenAlex compared to Web of Science and Scopus. <https://arxiv.org/abs/2401.16359>
@@ -90,6 +94,11 @@ Chamberlain, S., Zhu, H., Jahn, N., Boettiger, C., Ram, K. *rcrossref: Client fo

Jahn, N (2022). *roadoi: Find Free Versions of Scholarly Publications via Unpaywall*. <https://CRAN.R-project.org/package=roadoi> | <https://docs.ropensci.org/roadoi/>.

Python-Packages (selection):

Haupka, N., Paul Morrison. *unpywall - Interfacing the Unpaywall API with Python*.
<https://pypi.org/project/unpywall> | <https://unpywall.readthedocs.io/>

Dashboards (selection):

[Hybrid Open Access Dashboard (HOAD)](https://subugoe.github.io/hoaddash/). See our blog post: <https://www.coalition-s.org/blog/introducing-the-hybrid-open-access-dashboard-hoad/>
21 changes: 21 additions & 0 deletions data.Rmd
@@ -71,6 +71,17 @@ Anyone can view and query our publicly available [Open Scholarly Data warehouse

:::

## Status Semantic Scholar

::: l-body-outset

| Snapshot | Directory | Table | Schema | Procedure | Last Changed | Coverage | Number of rows |
|------------|--------------|----------------------|----------------------|-----------|--------------|-----------|-----------------|
| 2024-05-28 | papers/ | [semantic_scholar.papers](https://console.cloud.google.com/bigquery?ws=!1m4!1m3!3m2!1ssubugoe-collaborative!2ssemantic_scholar) | / | [Repo](https://github.com/naustica/MA/blob/main/download_papers.py) | 10.06.2024 | All | 218.668.220 |
| 2024-05-28 | venues/ | [semantic_scholar.venues](https://console.cloud.google.com/bigquery?ws=!1m4!1m3!3m2!1ssubugoe-collaborative!2ssemantic_scholar) | / | [Repo](https://github.com/naustica/MA/blob/main/download_venues.py) | 10.06.2024 | All | 194.578 |

:::

## Status Openalex

::: l-body-outset
@@ -85,4 +96,14 @@ Anyone can view and query our publicly available [Open Scholarly Data warehouse
| 2024-09-23 | topics/ | [openalex.topics](https://console.cloud.google.com/bigquery?ws=!1m4!1m3!3m2!1ssubugoe-collaborative!2sopenalex) | schema_openalex_topics.json | [Repo](https://github.com/naustica/openalex) | 09.10.2024 | All | 4.516 |
| 2024-09-27 | works/ | [openalex.works](https://console.cloud.google.com/bigquery?ws=!1m4!1m3!3m2!1ssubugoe-collaborative!2sopenalex) | schema_openalex_work.json | [Repo](https://github.com/naustica/openalex) | 09.10.2024 | All | 259.375.778 |

:::

## Status OpenAlex Document Type classification by SUB Göttingen

::: l-body-outset

| Snapshot | Directory | Table | Schema | Procedure | Last Changed | Coverage | Number of rows |
|------------|--------------|----------------------|----------------------|-----------|--------------|-----------|-----------------|
| 2024-09-27 | works/ | [resources.classification_article_reviews_september_2024](https://console.cloud.google.com/bigquery?ws=!1m4!1m3!3m2!1ssubugoe-collaborative!2sresources) | schema_document_types.json | [Repo](https://github.com/naustica/openalex_doctypes/tree/classifier/classifier) | 15.10.2024 | All | 151.099.815 |

:::
7 changes: 6 additions & 1 deletion docs/about.html
@@ -2205,9 +2205,11 @@ <h2 id="about-this-blog">About this Blog!</h2>
<li>R-related training and outreach activities</li>
</ul>
<p>We want to thank <a href="https://masalmon.eu/">Maëlle Salmon</a> for encouraging us to start a blog about our work. As a technical framework for the blog, we are using <a href="https://rstudio.github.io/distill/">Distill for R Markdown</a>, a new web publishing format optimized for scientific and technical writing.</p>
-<p><em>Dr. Anne Hobert</em>, <em>Nick Haupka</em>, <em>Najko Jahn</em></p>
+<p><em>Dr. Anne Hobert</em>, <em>Nick Haupka</em>, <em>Sophia Dörner</em>, <em>Najko Jahn</em></p>
<h2 id="journal-publications">Journal publications</h2>
<p>We also publish in scholarly journals about our work.</p>
<p>Haupka, N., Culbert, J., Schniedermann, A, Jahn, N., Mayr, P. (2024). Analysis of the Publication and Document Types in OpenAlex, Web of Science, Scopus, Pubmed and Semantic Scholar. <a href="https://arxiv.org/abs/2406.15154" class="uri">https://arxiv.org/abs/2406.15154</a></p>
<p>Haupka, N. (2024). Analyse der Abdeckung wissenschaftlicher Publikationen auf Semantic Scholar im Kontext von Open Access. <em>Bibliothek Forschung und Praxis</em>, 48(2), 362–-373. <a href="https://doi.org/10.1515/bfp-2023-0057" class="uri">https://doi.org/10.1515/bfp-2023-0057</a></p>
<p>Jahn, N. (2024). How open are hybrid journals included in transformative agreements? <a href="https://arxiv.org/abs/2402.18255" class="uri">https://arxiv.org/abs/2402.18255</a></p>
<p>Culbert, J., Hobert, A., Jahn, N., Haupka, N., Schmidt, M., Donner, P., Mayr, P. (2024). Reference Coverage Analysis of OpenAlex compared to Web of Science and Scopus. <a href="https://arxiv.org/abs/2401.16359" class="uri">https://arxiv.org/abs/2401.16359</a></p>
<p>Taubert, N., Hobert, A., Jahn, N., Bruns, A., &amp; Iravani, E. (2024). Understanding differences of the OA uptake within the German University landscape (2010–2020): Part 2—repository-provided OA. Scientometrics. <a href="https://doi.org/10.1007/s11192-024-05003-5" class="uri">https://doi.org/10.1007/s11192-024-05003-5</a></p>
@@ -2248,6 +2250,9 @@ <h2 id="software">Software</h2>
<p>Jahn, N. <em>europepmc: R Interface to the Europe PubMed Central RESTful Web Service</em>. <a href="https://CRAN.R-project.org/package=europepmc" class="uri">https://CRAN.R-project.org/package=europepmc</a> | <a href="https://docs.ropensci.org/europepmc/" class="uri">https://docs.ropensci.org/europepmc/</a></p>
<p>Chamberlain, S., Zhu, H., Jahn, N., Boettiger, C., Ram, K. <em>rcrossref: Client for Various ‘CrossRef’ ‘APIs’</em>. <a href="https://CRAN.R-project.org/package=rcrossref" class="uri">https://CRAN.R-project.org/package=rcrossref</a> <a href="https://docs.ropensci.org/rcrossref/" class="uri">https://docs.ropensci.org/rcrossref/</a></p>
<p>Jahn, N (2022). <em>roadoi: Find Free Versions of Scholarly Publications via Unpaywall</em>. <a href="https://CRAN.R-project.org/package=roadoi" class="uri">https://CRAN.R-project.org/package=roadoi</a> | <a href="https://docs.ropensci.org/roadoi/" class="uri">https://docs.ropensci.org/roadoi/</a>.</p>
<p>Python-Packages (selection):</p>
<p>Haupka, N., Paul Morrison. <em>unpywall - Interfacing the Unpaywall API with Python</em>.
<a href="https://pypi.org/project/unpywall" class="uri">https://pypi.org/project/unpywall</a> | <a href="https://unpywall.readthedocs.io/" class="uri">https://unpywall.readthedocs.io/</a></p>
<p>Dashboards (selection):</p>
<p><a href="https://subugoe.github.io/hoaddash/">Hybrid Open Access Dashboard (HOAD)</a>. See our blog post: <a href="https://www.coalition-s.org/blog/introducing-the-hybrid-open-access-dashboard-hoad/" class="uri">https://www.coalition-s.org/blog/introducing-the-hybrid-open-access-dashboard-hoad/</a></p>
<p><a href="https://subugoe.github.io/metacheck/">metacheck: Open Access Metadata Compliance Checker</a></p>
90 changes: 90 additions & 0 deletions docs/data.html
@@ -2189,7 +2189,9 @@ <h3>Contents</h3>
<ul>
<li><a href="#status-crossref" id="toc-status-crossref">Status Crossref</a></li>
<li><a href="#status-unpaywall" id="toc-status-unpaywall">Status Unpaywall</a></li>
<li><a href="#status-semantic-scholar" id="toc-status-semantic-scholar">Status Semantic Scholar</a></li>
<li><a href="#status-openalex" id="toc-status-openalex">Status Openalex</a></li>
<li><a href="#status-openalex-document-type-classification-by-sub-göttingen" id="toc-status-openalex-document-type-classification-by-sub-göttingen">Status OpenAlex Document Type classification by SUB Göttingen</a></li>
</ul>
</nav>
</div>
@@ -2454,6 +2456,55 @@ <h3 id="historical-snapshots-upw_history">Historical Snapshots (upw_history)</h3
</tbody>
</table>
</div>
<h2 id="status-semantic-scholar">Status Semantic Scholar</h2>
<div class="l-body-outset">
<table>
<colgroup>
<col style="width: 9%" />
<col style="width: 11%" />
<col style="width: 17%" />
<col style="width: 17%" />
<col style="width: 8%" />
<col style="width: 11%" />
<col style="width: 8%" />
<col style="width: 13%" />
</colgroup>
<thead>
<tr class="header">
<th>Snapshot</th>
<th>Directory</th>
<th>Table</th>
<th>Schema</th>
<th>Procedure</th>
<th>Last Changed</th>
<th>Coverage</th>
<th>Number of rows</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>2024-05-28</td>
<td>papers/</td>
<td><a href="https://console.cloud.google.com/bigquery?ws=!1m4!1m3!3m2!1ssubugoe-collaborative!2ssemantic_scholar">semantic_scholar.papers</a></td>
<td>/</td>
<td><a href="https://github.com/naustica/MA/blob/main/download_papers.py">Repo</a></td>
<td>10.06.2024</td>
<td>All</td>
<td>218.668.220</td>
</tr>
<tr class="even">
<td>2024-05-28</td>
<td>venues/</td>
<td><a href="https://console.cloud.google.com/bigquery?ws=!1m4!1m3!3m2!1ssubugoe-collaborative!2ssemantic_scholar">semantic_scholar.venues</a></td>
<td>/</td>
<td><a href="https://github.com/naustica/MA/blob/main/download_venues.py">Repo</a></td>
<td>10.06.2024</td>
<td>All</td>
<td>194.578</td>
</tr>
</tbody>
</table>
</div>
<h2 id="status-openalex">Status Openalex</h2>
<div class="l-body-outset">
<table style="width:100%;">
@@ -2553,6 +2604,45 @@ <h2 id="status-openalex">Status Openalex</h2>
</tbody>
</table>
</div>
<h2 id="status-openalex-document-type-classification-by-sub-göttingen">Status OpenAlex Document Type classification by SUB Göttingen</h2>
<div class="l-body-outset">
<table>
<colgroup>
<col style="width: 9%" />
<col style="width: 11%" />
<col style="width: 17%" />
<col style="width: 17%" />
<col style="width: 8%" />
<col style="width: 11%" />
<col style="width: 8%" />
<col style="width: 13%" />
</colgroup>
<thead>
<tr class="header">
<th>Snapshot</th>
<th>Directory</th>
<th>Table</th>
<th>Schema</th>
<th>Procedure</th>
<th>Last Changed</th>
<th>Coverage</th>
<th>Number of rows</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>2024-09-27</td>
<td>works/</td>
<td><a href="https://console.cloud.google.com/bigquery?ws=!1m4!1m3!3m2!1ssubugoe-collaborative!2sresources">resources.classification_article_reviews_september_2024</a></td>
<td>schema_document_types.json</td>
<td><a href="https://github.com/naustica/openalex_doctypes/tree/classifier/classifier">Repo</a></td>
<td>15.10.2024</td>
<td>All</td>
<td>151.099.815</td>
</tr>
</tbody>
</table>
</div>
<div class="sourceCode" id="cb1"><pre class="sourceCode r distill-force-highlighting-css"><code class="sourceCode r"></code></pre></div>
<!--radix_placeholder_article_footer-->
<!--/radix_placeholder_article_footer-->
2 changes: 1 addition & 1 deletion docs/index.html
@@ -2312,7 +2312,7 @@ <h1 class="posts-list-caption" data-caption="Blog | Scholarly Communication Anal
<a href="posts/oal_document_types_classifier/" class="post-preview">
<script class="post-metadata" type="text/json">{"categories":[]}</script>
<div class="metadata">
-<div class="publishedDate">Oct. 23, 2024</div>
+<div class="publishedDate">Oct. 24, 2024</div>
<div class="dt-authors">
<div class="dt-author">Nick Haupka</div>
</div>
4 changes: 2 additions & 2 deletions docs/index.xml
@@ -11,14 +11,14 @@ to publish case-studies rapidely showing how to support data-driven workflows an
decision-making around scholarly communication in libraries using R.
</description>
<generator>Distill</generator>
-<lastBuildDate>Wed, 23 Oct 2024 00:00:00 +0000</lastBuildDate>
+<lastBuildDate>Thu, 24 Oct 2024 00:00:00 +0000</lastBuildDate>
<item>
<title>Identifying journal article types in OpenAlex</title>
<dc:creator>Nick Haupka</dc:creator>
<link>https://subugoe.github.io/scholcomm_analytics/posts/oal_document_types_classifier</link>
<description>Identifying suitable types of journal articles for bibliometric analyses is important. In this blog post, I present a document type classifier that helps to identify research contributions like original research articles using Crossref and OpenAlex. The classifier and classified OpenAlex records are openly available.</description>
<guid>https://subugoe.github.io/scholcomm_analytics/posts/oal_document_types_classifier</guid>
-<pubDate>Wed, 23 Oct 2024 00:00:00 +0000</pubDate>
+<pubDate>Thu, 24 Oct 2024 00:00:00 +0000</pubDate>
<media:content url="https://subugoe.github.io/scholcomm_analytics/posts/oal_document_types_classifier/distill-preview.png" medium="image" type="image/png" width="3896" height="2213"/>
</item>
<item>