Commit

docs: update readme and templates (#1463)
lumianph authored Mar 17, 2022
1 parent 24db81b commit 9bf27b0
Showing 6 changed files with 36 additions and 112 deletions.
37 changes: 8 additions & 29 deletions .github/ISSUE_TEMPLATE/bug_report.md
@@ -3,41 +3,20 @@ name: Bug report
about: Create a report to help us improve
title: ''
labels: bug
assignees: imotai

assignees: aceforeverd
---

Issue tracker is **ONLY** used for reporting bugs. New features should be discussed on our discussion

<!--- Provide a general summary of the issue in the Title above -->

## Expected Behavior

<!--- Tell us what should happen -->

## Current Behavior
**Bug Description**

<!--- Tell us what happens instead of the expected behavior -->

## Possible Solution
<!--- Not obligatory, but suggest a fix/reason for the bug, -->

## Steps to Reproduce
<!--- Provide a link to a live example, or an unambiguous set of steps to -->
<!--- reproduce this bug. Include code to reproduce, if relevant -->
1.
2.
3.
4.
**Expected Behavior**

## Context (Environment)
<!--- How has this issue affected you? What are you trying to accomplish? -->
<!--- Providing context helps us come up with a solution that is most useful in the real world -->

<!--- Provide a general summary of the issue in the Title above -->

## Detailed Description
<!--- Provide a detailed description of the change or addition you are proposing -->
**Steps to Reproduce**

## Possible Implementation
<!--- Not obligatory, but suggest an idea for implementing addition or change -->
1.
2.
3.
4.
9 changes: 3 additions & 6 deletions .github/ISSUE_TEMPLATE/feature_request.md
@@ -4,17 +4,14 @@ about: Suggest an idea for this project
title: ''
labels: ''
assignees: ''

---

**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
**Describe the feature you'd like**

**Describe the solution you'd like**
A clear and concise description of what you want to happen.

**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.


**Additional context**

Add any other context or screenshots about the feature request here.
24 changes: 0 additions & 24 deletions .github/ISSUE_TEMPLATE/pull_request_template.md

This file was deleted.

14 changes: 0 additions & 14 deletions .github/pull_request_template.md
@@ -1,10 +1,3 @@
* **Please check if the PR fulfills these requirements**

- The commit message follows our guidelines
- Tests for the changes have been added (for bug fixes / features)
- Docs have been added / updated (for bug fixes / features)


* **What kind of change does this PR introduce?** (Bug fix, feature, docs update, ...)


@@ -15,10 +8,3 @@

* **What is the new behavior (if this is a feature change)?**



* **Does this PR introduce a breaking change?** (What changes might users need to make in their application due to this PR?)



* **Other information**:
33 changes: 13 additions & 20 deletions README.md
@@ -67,48 +67,43 @@ The figure above shows the workflow of FeatureOps based on OpenMLDB. From offlin

## 5. Build & Install

:point_right: [Read more](docs/en/compile.md)
:point_right: [Read more](https://openmldb.ai/docs/zh/dev/deploy/index.html)

## 6. QuickStart

**Cluster and Standalone Versions**

OpenMLDB offers two deployment versions: the *cluster version* and the *standalone version*. The cluster version is suitable for large-scale applications and provides scalability and high availability. The lightweight standalone version, which runs on a single node, is ideal for small businesses and demonstrations. The two versions have the same functionality but different limitations for particular functions. Please refer to [this document](https://docs.openmldb.ai/v/0.4/content-2/standalone_vs_cluster) for details.
OpenMLDB offers two deployment versions: the *cluster version* and the *standalone version*. The cluster version is suitable for large-scale applications and provides scalability and high availability. The lightweight standalone version, which runs on a single node, is ideal for small businesses and demonstrations. The two versions have the same functionality but different limitations for particular functions. Please refer to [this document](https://openmldb.ai/docs/zh/dev/tutorial/standalone_vs_cluster.html) for details.

**Getting Started with OpenMLDB**

:point_right: [OpenMLDB QuickStart](https://docs.openmldb.ai/v/0.4/content-1/openmldb_quickstart)
:point_right: [OpenMLDB QuickStart](https://openmldb.ai/docs/zh/dev/quickstart/openmldb_quickstart.html)

## 7. Use Cases

We are making efforts to build a list of real-world use cases based on OpenMLDB to demonstrate how it can fit into your business. Please stay tuned.

| Application | Tools | Brief Introduction |
| ------------------------------------------------------------ | ------------------ | ------------------------------------------------------------ |
| [New York City Taxi Trip Duration](https://docs.openmldb.ai/use_case/taxi_tour_duration_prediction) | OpenMLDB, LightGBM | This is a challenge from Kaggle to predict the total ride duration of taxi trips in New York City. You can read [more detail here](https://www.kaggle.com/c/nyc-taxi-trip-duration/). It demonstrates using the open-source tools OpenMLDB + LightGBM to easily build an end-to-end machine learning application. |
| [New York City Taxi Trip Duration](https://openmldb.ai/docs/zh/dev/use_case/taxi_tour_duration_prediction.html) | OpenMLDB, LightGBM | This is a challenge from Kaggle to predict the total ride duration of taxi trips in New York City. You can read [more detail here](https://www.kaggle.com/c/nyc-taxi-trip-duration/). It demonstrates using the open-source tools OpenMLDB + LightGBM to easily build an end-to-end machine learning application. |

## 8. Documentation

We have released the Chinese version of the full documentation; you can find it here:

- The main site: [https://docs.openmldb.ai/](https://docs.openmldb.ai/)
- The mirror site in China: [http://docs-cn.openmldb.ai/](http://docs-cn.openmldb.ai/)

We are working on the English documentation, and it will be released very soon.
- Chinese documentation: [https://openmldb.ai/docs/zh](https://openmldb.ai/docs/zh)
- English documentation: coming soon

## 9. Roadmap

| Version | Est. release date | Highlight features |
| ------- | ----------------- | ------------------------------------------------------------ |
| 0.5.0 | 2022 Q1 | - Monitoring APIs and tools for online serving <br />- Efficient queries over a fairly long period of time by window functions <br />- Kafka/Pulsar connector support for online data sources <br />- The online storage engine supports external storage devices. |
| 0.5.0 | 2022 Q1 | - Monitoring APIs and tools for online serving <br />- Efficient queries over a fairly long period of time by window functions <br />- Kafka/Pulsar connector support for online data sources <br />- The online storage engine supports external storage devices.<br />- UDF support |

Furthermore, a few important features are on the development roadmap but have not been scheduled yet. We appreciate any feedback on those features.

- A cloud-native OpenMLDB
- Adaptors to open-source machine learning lifecycle management platforms, such as MLflow and Airflow
- Fast recovery based on Intel® Optane™ Persistent Memory
- Adaptors to open-source machine learning lifecycle management platforms, such as Airflow
- Automatic feature extraction
- Lightweight OpenMLDB for edge computing
- A lightweight OpenMLDB for edge computing

## 10. Contributors

@@ -125,27 +120,25 @@ Let's clap hands for our community contributors :clap:

## 11. Community

- **Website**: [https://openmldb.ai/](https://openmldb.ai) (coming soon)
- **Website**: [https://openmldb.ai/en](https://openmldb.ai/en)

- **Email**: [[email protected]](mailto:[email protected])

- **[Slack](https://join.slack.com/t/openmldb/shared_invite/zt-ozu3llie-K~hn9Ss1GZcFW2~K_L5sMg)**

- **[GitHub Issues](https://github.com/4paradigm/OpenMLDB/issues)** and **[GitHub Discussions](https://github.com/4paradigm/OpenMLDB/discussions)**: If you are a serious developer, you are most welcome to join our discussion on GitHub. GitHub Issues is used to report bugs and collect new requirements. GitHub Discussions is mostly used by our project maintainers to publish and comment on RFCs.

- **[Blogs (English)](https://openmldb.medium.com/)**

- [**Blogs (Chinese)**](https://www.zhihu.com/column/c_1417199590352916480)

- **WeChat Groups (Chinese)**:

<img src="images/wechat.png" alt="img" width=120 />

## 12. Publications & Blogs
## 12. Publications

- Cheng Chen, Jun Yang, Mian Lu, Taize Wang, Zhao Zheng, Yuqiang Chen, Wenyuan Dai, Bingsheng He, Weng-Fai Wong, Guoan Wu, Yuping Zhao, and Andy Rudoff. *[Optimizing in-memory database engine for AI-powered on-line decision augmentation using persistent memory](http://vldb.org/pvldb/vol14/p799-chen.pdf)*. International Conference on Very Large Data Bases (VLDB) 2021.
- [In-Depth Interpretation of the Latest VLDB 2021 Paper: Artificial Intelligence Driven Real-Time Decision System Database and Optimization Based on Persistent Memory](https://medium.com/@fengxindai0/in-depth-interpretation-of-the-latest-vldb-2021-paper-artificial-intelligence-driven-real-time-f2a818bcf2b2)
- [Predictive maintenance — 5 minutes demo of an end to end machine learning project](https://towardsdatascience.com/predictive-maintenance-5minutes-demo-of-an-end-to-end-machine-learning-project-60941f1c9793)
- [Compared to Native Spark 3.0, We Have Achieved Significant Optimization Effects in the AI Application Field](https://towardsdatascience.com/we-have-achieved-significant-optimization-effects-in-the-ai-application-field-compared-to-native-2a055e47250f)
- [MLOp Practice: Using OpenMLDB in the Real-Time Anti-Fraud Model for the Bank’s Online Transaction](https://towardsdatascience.com/practice-of-openmldbs-transaction-real-time-anti-fraud-model-in-the-bank-s-online-event-40ab41fec6d4)

## 13. [The User List](https://github.com/4paradigm/OpenMLDB/discussions/707)

31 changes: 12 additions & 19 deletions README_cn.md
@@ -42,7 +42,7 @@ MLOps provides a full-stack technical solution for putting AI engineering into production; as one of its

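**SQL-centric development and management experience:** A low-barrier yet powerful database development experience; feature computation scripts are developed and deployed online entirely in SQL.

**Performance optimizations customized for feature computation:** Offline feature computation uses the [OpenMLDB Spark distribution optimized for feature computation](https://docs.openmldb.ai/v/0.4/content-2/openmldbspark_distribution); online real-time feature computation delivers latencies on the order of tens of milliseconds for complex queries under high-throughput pressure, fully meeting high-concurrency, low-latency performance requirements.
**Performance optimizations customized for feature computation:** Offline feature computation uses the [OpenMLDB Spark distribution optimized for feature computation](https://openmldb.ai/docs/zh/dev/tutorial/openmldbspark_distribution.html); online real-time feature computation delivers latencies on the order of tens of milliseconds for complex queries under high-throughput pressure, fully meeting high-concurrency, low-latency performance requirements.

**Production-grade features:** Designed for large-scale enterprise applications, with a steadily improving set of production-grade features, including disaster recovery, high availability, seamless scale-out and scale-in, smooth upgrades, monitoring, and support for heterogeneous memory architectures.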

@@ -58,53 +58,50 @@ MLOps provides a full-stack technical solution for putting AI engineering into production; as one of its

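3. **Is OpenMLDB simply a feature store?**

OpenMLDB includes all the functionality of a feature store and provides a more complete full-stack FeatureOps solution. Beyond feature storage, it offers a SQL-based database development experience, the [OpenMLDB Spark distribution optimized for feature computation](https://docs.openmldb.ai/v/0.4/content-2/openmldbspark_distribution), index structures optimized for real-time feature computation, online feature serving, and production-grade operations and management. In addition, OpenMLDB is also used as a high-performance time-series feature database.
OpenMLDB includes all the functionality of a feature store and provides a more complete full-stack FeatureOps solution. Beyond feature storage, it offers a SQL-based database development experience, the [OpenMLDB Spark distribution optimized for feature computation](https://openmldb.ai/docs/zh/dev/tutorial/openmldbspark_distribution.html), index structures optimized for real-time feature computation, online feature serving, and production-grade operations and management. In addition, OpenMLDB is also used as a high-performance time-series feature database.

4. **Why does OpenMLDB choose SQL as its development language?**

SQL is both concise and powerful. Choosing SQL and a database development experience lowers the barrier to development and also makes collaboration and sharing across departments easier. In addition, practical experience with OpenMLDB shows that SQL is functionally complete for expressing feature computation and has stood the test of long-term practice.

## 5. Build & Install

:point_right: [Click here](https://docs.openmldb.ai/content-4)
:point_right: [Click here](https://openmldb.ai/docs/zh/dev/deploy/index.html)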

## 6. QuickStart

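**Cluster and Standalone Versions**

OpenMLDB has two deployment modes: the cluster version and the standalone version. The cluster version suits production environments with large-scale data and provides good scalability and high availability; the standalone version suits small-data scenarios or trial use and is easier to deploy and use. The two versions are functionally identical, but certain specific functions have different limitations; see [this document](https://docs.openmldb.ai/v/0.4/content-2/standalone_vs_cluster) for details.
OpenMLDB has two deployment modes: the cluster version and the standalone version. The cluster version suits production environments with large-scale data and provides good scalability and high availability; the standalone version suits small-data scenarios or trial use and is easier to deploy and use. The two versions are functionally identical, but certain specific functions have different limitations; see [this document](https://openmldb.ai/docs/zh/dev/tutorial/standalone_vs_cluster.html) for details.

**Getting Started with OpenMLDB**

:point_right: [OpenMLDB QuickStart Guide](https://docs.openmldb.ai/v/0.4/content-1/openmldb_quickstart)
:point_right: [OpenMLDB QuickStart Guide](https://openmldb.ai/docs/zh/dev/quickstart/openmldb_quickstart.html)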

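## 7. Use Cases

We are working to build a list of real-world OpenMLDB use cases as a reference for how OpenMLDB can bring value to your business. Please stay tuned for updates to the list.

| Application | Tools | Brief Introduction |
| ------------------------------------------------------------ | ------------------ | ------------------------------------------------------------ |
| [New York City Taxi Trip Duration](https://docs.openmldb.ai/use_case/taxi_tour_duration_prediction) | OpenMLDB, LightGBM | This is a Kaggle challenge to predict taxi trip durations in New York City. You can read more in the [description of this use case](https://www.kaggle.com/c/nyc-taxi-trip-duration/). This case demonstrates using the open-source OpenMLDB + LightGBM stack to quickly build a complete machine learning application. |
| [New York City Taxi Trip Duration](https://openmldb.ai/docs/zh/dev/use_case/taxi_tour_duration_prediction.html) | OpenMLDB, LightGBM | This is a Kaggle challenge to predict taxi trip durations in New York City. You can read more in the [description of this use case](https://www.kaggle.com/c/nyc-taxi-trip-duration/). This case demonstrates using the open-source OpenMLDB + LightGBM stack to quickly build a complete machine learning application. |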

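## 8. OpenMLDB Documentation

We have released the full product documentation; you can find it at the following addresses:

- Main site: [https://docs.openmldb.ai/](https://docs.openmldb.ai/)
- Mirror site in mainland China: [http://docs-cn.openmldb.ai/](http://docs-cn.openmldb.ai/)
- Chinese documentation: [https://openmldb.ai/docs/zh/](https://openmldb.ai/docs/zh/)
- English documentation: coming soon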


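## 9. Roadmap

| Version | Est. release date | Highlight features |
| ------ | ------------ | ------------------------------------------------------------ |
| 0.5.0 | 2022 Q1 | - Monitoring module for online serving<br />- Long time window support <br />- Support for ingesting third-party online data streams, including Kafka and Pulsar<br />- The storage engine for real-time feature computation supports external storage devices |
| 0.5.0 | 2022 Apr | - Monitoring module for online serving<br />- Long time window support <br />- Support for ingesting third-party online data streams, including Kafka and Pulsar<br />- The storage engine for real-time feature computation supports external storage devices<br />- UDF support |

In addition, the OpenMLDB roadmap includes some important planned features that have not yet been scheduled. We welcome any feedback:

- A cloud-native version
- Adaptors to end-to-end machine learning lifecycle management platforms, such as MLflow and Airflow
- Integration of fast recovery based on Optane persistent memory
- Adaptors to end-to-end machine learning lifecycle management platforms, such as Airflow
- Integration of automatic feature generation
- A lightweight edge version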

@@ -123,21 +120,17 @@ OpenMLDB has two deployment modes: the cluster version and the standalone version (st

## 11. Community

- Website: [https://openmldb.ai/](https://openmldb.ai) (coming soon)
- Website: [https://openmldb.ai/](https://openmldb.ai)
- **Email**: [[email protected]](mailto:[email protected])
- **[Slack](https://join.slack.com/t/openmldb/shared_invite/zt-ozu3llie-K~hn9Ss1GZcFW2~K_L5sMg)**
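- **[GitHub Issues](https://github.com/4paradigm/OpenMLDB/issues)** and **[GitHub Discussions](https://github.com/4paradigm/OpenMLDB/discussions)**: If you are a serious developer, you are very welcome to join our developer community on GitHub and take part in our development iterations up close. GitHub Issues is mainly used to collect bugs and new feature requests; GitHub Discussions is mainly used by the development team to publish and discuss RFCs.
- [**Technical Blog**](https://www.zhihu.com/column/c_1417199590352916480)
- **WeChat Groups:**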
<img src="images/wechat.png" alt="img" width=120 />

## 12. Publications & Technical Blogs
## 12. Publications

* Cheng Chen, Jun Yang, Mian Lu, Taize Wang, Zhao Zheng, Yuqiang Chen, Wenyuan Dai, Bingsheng He, Weng-Fai Wong, Guoan Wu, Yuping Zhao, and Andy Rudoff. *[Optimizing in-memory database engine for AI-powered on-line decision augmentation using persistent memory](http://vldb.org/pvldb/vol14/p799-chen.pdf)*. International Conference on Very Large Data Bases (VLDB) 2021.
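* [4Paradigm's OpenMLDB optimization paper accepted by VLDB, a top international database conference](https://zhuanlan.zhihu.com/p/401513878)
* [OpenMLDB in practice: deploying an in-transaction anti-fraud model at a bank](https://zhuanlan.zhihu.com/p/389599785)
* [OpenMLDB in practice in AIOps: anomaly detection for trading systems](https://zhuanlan.zhihu.com/p/393602288)
* [Intelligent prediction of remaining hardware life in 5 minutes](https://zhuanlan.zhihu.com/p/399346826)

## 13. [The User List](https://github.com/4paradigm/OpenMLDB/discussions/707)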

