---
layout: homepage
---
<div
class="hero-subheader hero-subheader--no-bottom{% if page.grid_navigation %} hero-subheader--before-out{% endif %}">
<div class="container">
<div class="row vertical-align">
<div class="col-sm-6 mobile-margin">
<!--<p class="subheadline">SNORKEL:</p>-->
<h1>Programmatically Build Training Data</h1>
</div>
<!--<div class="col-sm-6"><img src="/doks-theme/assets/images/layout/Pattern 3.png" alt="Pattern 3" /></div>-->
</div>
<div class="row row-spacing mobile-padding">
<div class="oss-transition-block">
<p>
<b>
The Snorkel team is now focusing their efforts on Snorkel Flow, an end-to-end AI application development platform based on the core ideas behind Snorkel—check it out <a href="https://www.snorkel.ai/" style="color:blue;">here</a>!
</b>
</p>
<p>
The Snorkel project started at Stanford in 2016 with a simple technical bet: that it would increasingly be the training data, not the models, algorithms, or infrastructure, that decided whether a machine learning project succeeded or failed.
Given this premise, we set out to explore the radical idea that you could bring mathematical and systems structure to the messy and often entirely manual process of training data creation and management, starting by empowering users to programmatically label, build, and manage training data.
</p>
<p>
To say that the Snorkel project succeeded and expanded beyond what we had ever expected would be an understatement.
The basic goal of a research repo like Snorkel is to provide a minimum viable framework for testing and validating hypotheses.
Four years later, we’ve been fortunate to do not just this, but to develop and deploy early versions of Snorkel in partnership with some of the world’s leading organizations like <a href="https://ai.googleblog.com/2019/03/harnessing-organizational-knowledge-for.html" style="color:blue;">Google</a>, <a href="https://dl.acm.org/doi/abs/10.1145/3329486.3329492" style="color:blue;">Intel</a>, <a href="https://www.cell.com/patterns/fulltext/S2666-3899(20)30019-2" style="color:blue;">Stanford Medicine</a>, and many more; author over <a href="https://snorkel.ai/technology" style="color:blue;">sixty peer-reviewed publications</a> on our findings around Snorkel and related innovations in weak supervision modeling, data augmentation, multi-task learning, and more; be included in courses at top-tier universities; support production deployments in systems that you’ve likely used in the last few hours; and work with an amazing community of researchers and practitioners from industry, medicine, government, academia, and beyond.
</p>
<p>
However, we increasingly realized, from conversations with users in weekly office hours, workshops, and online discussions, and with industry partners, that the Snorkel project was just the very first step.
The ideas behind Snorkel change not just how you label training data, but so much of the entire lifecycle and pipeline of building, deploying, and managing ML: how users inject their knowledge; how models are constructed, trained, inspected, versioned, and monitored; how entire pipelines are developed iteratively; and how the full set of stakeholders in any ML deployment, from subject matter experts to ML engineers, is incorporated into the process.
</p>
<p>
Over the last year, we have been building the platform to support this broader vision: <a href="https://snorkel.ai/platform" style="color:blue;">Snorkel Flow</a>, an end-to-end machine learning platform for developing and deploying AI applications.
Snorkel Flow combines many of the concepts of the Snorkel project with a range of newer techniques around weak supervision modeling, data augmentation, multi-task learning, data slicing and structuring, monitoring and analysis, and more. Together, these form a whole that is greater than the sum of its parts, and one that we believe makes ML truly faster, more flexible, and more practical than ever before.
</p>
<p>
Moving forward, we will be focusing our efforts on Snorkel Flow.
We are extremely grateful to all of you who have contributed to the Snorkel project, and we are excited for you to check out our next chapter <a href="http://snorkel.ai/" style="color:blue;">here</a>.
</p>
</div>
</div>
<div class="row row-spacing mobile-padding">
<div class="light-blue-card-container">
<div class="border-card">
<p class="subheadline">Build</p>
<h3>
Build Training Sets Programmatically
</h3>
<p>
Labeling and managing training datasets by hand is one of the biggest bottlenecks in machine learning.
In Snorkel, write heuristic functions to do this programmatically instead!
</p>
<a href="/get-started/" class="btn" target="_blank">Get Started!</a>
</div>
<div class="border-card">
<p class="subheadline">Model</p>
<h3>Model Weak Supervision</h3>
<p>
Programmatic or <i>weak</i> supervision sources can be noisy and correlated.
Snorkel uses novel, theoretically grounded unsupervised modeling techniques to automatically clean and
integrate them.
</p>
<a href="/blog/" class="btn">Read More!</a>
</div>
<div class="border-card">
<p class="subheadline">Train</p>
<h3>Train Modern ML Models</h3>
<p>
Snorkel outputs clean, confidence-weighted training datasets that easily plug into any modern machine
learning framework.
</p>
<a href="/use-cases/" class="btn">Tutorials</a>
</div>
</div>
</div>
<div class="row row-spacing mobile-padding vertical-align">
<div class="col-sm-5 mobile-margin hidden-xs">
<img src="/doks-theme/assets/images/layout/Overview.png" alt="Quickstart" />
</div>
<div class="col-sm-1"></div>
<div class="col-sm-6">
<p class="subheadline">start in minutes</p>
<h1>Quickstart</h1>
<div class="code-block">
<p># For pip users<br>pip install snorkel</p>
<p># For conda users<br>conda install snorkel -c conda-forge</p>
<!-- <span style="color: #9D3FA7;">import</span><span style="color: #18171C;"> snorkel</span> -->
</div>
<a href="https://github.com/snorkel-team/snorkel" class="btn" target="_blank">GitHub</a>
<a href="/get-started/" class="btn" target="_blank">Get Started</a>
</div>
</div>
</div>
<div class="light-blue">
<div class="container">
<div class="nav-grid-light-blue">
<div class="row">
{% assign cases = site.use_cases | sort: 'order' %}
{% for tutorial in cases limit:3 %}
<div class="col-sm-6 col-lg-4">
<a href="{% if jekyll.environment == 'production' %}{{
site.doks.baseurl
}}{% endif %}{{ tutorial.url }}" class="nav-grid__item_white">
<div class="nav-grid__content" data-mh>
<p class="purple-numbers">{{ forloop.index | prepend: '00' | slice: -2, 2 }}</p>
{% if tutorial.category %}
<p class="purple">{{ tutorial.category }}</p>
{% endif %}
<h2 class="nav-grid__title">{{ tutorial.title }}</h2>
<p>{{ tutorial.excerpt }}</p>
</div>
<p class="nav-grid__btn_light_blue">
{{ tutorial.cta | default: "READ MORE" }}
<i class="icon icon--arrow-right"></i>
</p>
</a>
</div>
{% endfor %}
</div>
</div>
</div>
</div>
<div class="container-fluid">
<div class="row cta-row row-spacing">
<a class="card-cta" href="/use-cases/">SEE ALL TUTORIALS <i class="icon icon--arrow-right"></i></a>
</div>
</div>
<div class="container">
<div class="row row-spacing tweets">
<h3>RECENT <a href="https://twitter.com/SnorkelAI">@SNORKELAI</a> NEWS</h3>
<div class="block"></div>
<div class="timeline" id="chrome">
{% for tweet in site.data.tweets %}
<iframe border=0 frameborder=0 height=625 src="https://twitframe.com/show?url={{tweet}}"
loading="lazy"></iframe>
{% endfor %}
</div>
<div class="timeline" id="notChrome">
{% for tweet in site.data.tweets %}
<iframe class="lazy" border=0 height="625" scrolling="no" frameborder=0
data-src="https://twitframe.com/show?url={{tweet}}"></iframe>
{% endfor %}
</div>
<div class="col-sm-12 all-tweets">
<a href="https://twitter.com/SnorkelAI">SEE ALL TWEETS <i
class="icon icon--arrow-right"></i></a>
</div>
</div>
</div>
<div class="container">
<div class="row mobile-padding flex-row-center">
<h3>USERS &amp; SPONSORS</h3>
</div>
<div class="row row-spacing mobile-padding">
<div align="center">
<img width="85%" src="/doks-theme/assets/images/layout/Group.png" alt="Logos">
</div>
</div>
</div>
</div>
<div class="dark-blue">
<div class="container" id="blogs">
<div class="nav-grid-light-blue">
<div class="row">
{% assign posts = site.posts | sort: 'order' %}
{% for post in posts limit:4 %}
<div class="col-sm-6 col-lg-3">
<a href="{% if jekyll.environment == 'production' %}{{
site.doks.baseurl
}}{% endif %}{{ post.url }}" class="nav-grid__item">
<div class="nav-grid__content" data-mh>
<p class="purple-numbers">{{ forloop.index | prepend: '00' | slice: -2, 2 }}</p>
<h2 class="nav-grid__title">{{ post.title }}</h2>
<p>{{ post.excerpt }}</p>
</div>
<p class="nav-grid__btn_light_blue">
{{ post.cta | default: "READ MORE" }}
<i class="icon icon--arrow-right"></i>
</p>
</a>
</div>
{% endfor %}
</div>
</div>
</div>
</div>
<div class="container-fluid">
<div class="row cta-dark-row">
<a class="dark-card-cta" href="/blog/">SEE ALL BLOGS <i class="icon icon--arrow-right"></i></a>
</div>
</div>