Skip to content

Commit

Permalink
Communication protocol is now an async generator
Browse files Browse the repository at this point in the history
The communication protocol now uses an async generator to return data. The dashboards have been updated to reflect this as well. Additionally, some piece of this pull request and in here to fix Python 3.11 issues.
---------

Co-authored-by: Piotr Mitros <[email protected]>
  • Loading branch information
bradley-erickson and pmitros authored Oct 22, 2024
1 parent e6908cc commit 5c25463
Show file tree
Hide file tree
Showing 53 changed files with 2,381 additions and 447 deletions.
1 change: 1 addition & 0 deletions CONTRIBUTORS.TXT
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
Piotr Mitros
Oren Livne
Paul Deane
Bradley Erickson
14 changes: 12 additions & 2 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ PACKAGES ?= wo,awe
run:
# If you haven't done so yet, run: make install
# we need to make sure we are on the virtual env when we do this
cd learning_observer && python learning_observer --watchdog=restart
cd learning_observer && python learning_observer

venv:
# This is unnecessary since LO installs requirements on install.
Expand Down Expand Up @@ -34,6 +34,7 @@ install-packages: venv
pip install -e learning_observer/[${PACKAGES}]

# Just a little bit of dependency hell...

# The AWE Components are built using a specific version of
# `spacy`. This requires an out-of-date `typing-extensions`
# package. There are few other dependecies that require a
Expand All @@ -42,7 +43,16 @@ install-packages: venv
# components.
# TODO remove this extra step after AWE Component's `spacy`
# is no longer version locked.
pip install -U typing-extensions
# This is no longer an issue, but we will leave until all
# dependecies can be resolved in the appropriate locations.
# pip install -U typing-extensions

# On Python3.11 with tensorflow, we get some odd errors
# regarding compatibility with `protobuf`. Some installation
# files are missing from the protobuf binary on pip.
# Using the `--no-binary` option includes all files.
pip uninstall -y protobuf
pip install --no-binary=protobuf protobuf==4.25

# testing commands
test:
Expand Down
3 changes: 3 additions & 0 deletions awe_requirements.txt
Original file line number Diff line number Diff line change
@@ -1,3 +1,6 @@
spacy==3.4.4
pydantic==1.10
spacytextblob==3.0.1
AWE_SpellCorrect @ git+https://github.com/ETS-Next-Gen/AWE_SpellCorrect.git
AWE_Components @ git+https://github.com/ETS-Next-Gen/AWE_Components.git
AWE_Lexica @ git+https://github.com/ETS-Next-Gen/AWE_Lexica.git
Expand Down
11 changes: 10 additions & 1 deletion learning_observer/learning_observer/adapters/adapter.py
Original file line number Diff line number Diff line change
Expand Up @@ -51,11 +51,20 @@ def dash_to_underscore(event):

return event


common_transformers = [
dash_to_underscore
]

def add_common_migrator(migrator, file):
'''Add a migrator to the common transformers list.
TODO
We ought check each module on startup for migrators
and import them instead of using this function to
add them to the transformations.
'''
print('Adding migrator', migrator, 'from', file),
common_transformers.append(migrator)


class EventAdapter:
def __init__(self, metadata=None):
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
# Communication Protocol / Query Language

## Motivation

In our first version of this system, we would simply compile the state for all the students, and ship that to the dashboard. However, that didn't allow us to make interactive dashboards, so we created a query language. This is inspired by SQL (with JOIN and friends), but designed for streaming data.

It can be written in Python or, soon, JavaScript, which compile queries to a JSON object. The JSON object is very similar to SQL.

## Security model

We allow two modes of operation:

- **Predefined queries** are designed for production use. The client cannot make arbitrary queries.
- **Open queries** are designed for development and data analysis, for example, working from a Jupyter notebook. This allows arbitrary queries, including ones which might not be performant or which might reveal sensitive data.

The latter should only be used in a trusted environment, and on a read replica.

## Shorthand / Getting Started

For common queries, we have shorthand notation, to maintain simplicity. In the majority of cases, we want just want the latest reducer data for either a single student or a classroom of students.

In `module.py`, you see this line:

```python
EXECUTION_DAG = learning_observer.communication_protocol.util.generate_base_dag_for_student_reducer('student_event_counter', 'my_event_module')
```

This is shorthand for a common query which JOINs the class roster with the output of the reducers. The Python code for the query itself is [here](https://github.com/ETS-Next-Gen/writing_observer/blob/berickson/workshop/learning_observer/learning_observer/communication_protocol/util.py#L58), but the jist of the code is:

```python
'roster': course_roster(runtime=q.parameter('runtime'), course_id=q.parameter("course_id", required=True)),
keys_node: q.keys(f'{module}.{reducer}', STUDENTS=q.variable('roster'), STUDENTS_path='user_id'),
select_node: q.select(q.variable(keys_node), fields=q.SelectFields.All),
join_node: q.join(LEFT=q.variable(select_node), RIGHT=q.variable('roster'), LEFT_ON='provenance.provenance.value.user_id', RIGHT_ON='user_id')
```

You can add a `print(EXECUTION_DAG)` statement to see the JSON representation this compiles to.

To see the data protocol, open up develop tools from your browser, click on network, and see the communication_protocol response.

## Playing / Debugging / Interactive operations

* `debugger.py` has a view for executing queries manually.
* `explorer.py` has a view for showing predefined queries already on the server, and running those.

As of this writing, these are likely broken, as it has not been recently tested and there were code changes. Both of these should also:
* Be available from the Jupyter notebook in the future
* Have a command line / test case version

## Python Query Language



## JSON Query Language

Original file line number Diff line number Diff line change
@@ -1,6 +1,13 @@
'''
This provides a web interface for making queries via the
communication protocol and seeing the text of the results.
TODO:
* This isn't really a debugger. Perhaps this should be called
interactive mode? Or developer mode? Or similar?
* Ideally, this should be moved to the Jupyter notebook
* Make work with the new async generator pipeline
'''

from dash import html, callback, Output, Input, State
Expand All @@ -11,6 +18,7 @@
import lo_dash_react_components as lodrc


# These are IDs for page elements, used in the layout and for callbacks
prefix = 'communication-debugger'
ws = f'{prefix}-websocket'
status = f'{prefix}-connection-status'
Expand All @@ -27,7 +35,7 @@ def layout():
html.H1('Communication Protocol Debugger'),
lodrc.LOConnection(
id=ws,
url='ws://localhost:8888/wsapi/communication_protocol'
url='ws://localhost:8888/wsapi/communication_protocol' # HACK/TODO: This might not be 8888. We should use the default port.
),
html.Div(id=status)
]
Expand Down Expand Up @@ -69,6 +77,9 @@ def layout():


def create_status(title, icon):
'''
Are we connected to the server? Connecting? Disconnected? Used by update_status below
'''
return html.Div(
[
html.I(className=f'{icon} me-1'),
Expand All @@ -82,6 +93,9 @@ def create_status(title, icon):
Input(ws, 'state')
)
def update_status(state):
'''
Called when we connect / disconnect / etc.
'''
icons = ['fas fa-sync-alt', 'fas fa-check text-success', 'fas fa-sync-alt', 'fas fa-times text-danger']
titles = ['Connecting to server', 'Connected to server', 'Closing connection', 'Disconnected from server']
index = state.get('readyState', 3) if state is not None else 3
Expand All @@ -93,6 +107,10 @@ def update_status(state):
Input(message, 'value')
)
def determine_valid_json(value):
'''
Disable or enable to submit button, so we can only submit a
query if it is valid JSON
'''
if value is None:
return True
try:
Expand All @@ -108,6 +126,9 @@ def determine_valid_json(value):
State(message, 'value')
)
def send_message(clicks, value):
'''
Send a message to the communication protocol on the web socket.
'''
if clicks is None:
raise PreventUpdate
return value
Expand All @@ -118,6 +139,10 @@ def send_message(clicks, value):
Input(ws, 'message')
)
def receive_message(message):
'''
Shows messages from the web socket in the field with ID
`response` (defined on top)
'''
if message is None:
return {}
return json.loads(message.get("data", {}))
Loading

0 comments on commit 5c25463

Please sign in to comment.