diff --git a/docs/kafka_notes.md b/docs/kafka_notes.md
new file mode 100644
index 00000000..db01e179
--- /dev/null
+++ b/docs/kafka_notes.md
@@ -0,0 +1,55 @@
# Structure notes

Table for students

student | sessions | hashes
---|---|---
example@email.com | \[kafka_topic_id_00, kafka_topic_id_23\] | \[hash_of_kafka_topic_id_00\]

Create session

* Create Kafka stream
* Add student to student table with session name (REDIS or KAFKA)

Closing behavior

* Socket gets closed
  * Times out?
  * Could we determine when the document is closed?
* Message sent to data server
  * The number of requests between servers will be dramatically fewer than the number of messages between client and WO
  * Could use a simple REST API
* Data server will:
  * Compute the hash
    * Read data from the Kafka stream
    * Compute `hash = hash(data)` with our own hash function
      * SHA-256 with some salt
      * Create a Merkle tree where each leaf is a log event from the session; use the root hash as `hash`
    * Append `hash` to the student's `hashes`
  * Back up data
    * Determine the file location, `f = str(hash[:2]/hash[2:])`, or some other way of deriving a path from the hash
    * Create a file at `f`
      * WYAG notes that Linux handles many directories better than many files in a single directory
    * Write data to `f`

Things TODO:

* Turn on Kafka
* Write some of our data to the Kafka stream
  * Make the naming scheme easier to understand, for testing
  * Just run it locally and collect the data yourself
* Create a simple REST API
  * POST command to student
  * Compute the hash, pull data, and store it

How do we create the Merkle tree? (See the sketch at the end of this file.)

* What is the purpose of the Merkle tree?
  * Where does it fit in?
  * How will it be used?
  * What is the purpose of the DAG portion?
* Is it automatically stored as a Merkle tree? That is, is the directory structure of our store itself a Merkle tree?
* How are Merkle trees being used here?
  * Do we store each log event as a Merkle tree? Why?
  * Do we store each log file as a Merkle tree? Why?
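As a concreteness check, here is a minimal sketch of the close-out hashing described above. It assumes SHA-256 with a fixed salt, a pairwise Merkle fold over per-event leaf hashes, and the `hash[:2]/hash[2:]` path scheme; every name here is illustrative, not settled API.

```python
import hashlib
import os

SALT = b"some-salt"  # assumption: the salt is a deployment-level secret


def leaf_hash(event: bytes) -> bytes:
    """Hash a single log event; each leaf of the Merkle tree is one event."""
    return hashlib.sha256(SALT + event).digest()


def merkle_root(events: list) -> bytes:
    """Fold leaf hashes pairwise until a single root hash remains."""
    if not events:
        raise ValueError("session has no events")
    level = [leaf_hash(e) for e in events]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [hashlib.sha256(a + b).digest()
                 for a, b in zip(level[::2], level[1::2])]
    return level[0]


def backup_path(root: bytes, base: str = "session_store") -> str:
    """`f = str(hash[:2]/hash[2:])` -- a two-character directory fan-out."""
    h = root.hex()
    return os.path.join(base, h[:2], h[2:])
```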
diff --git a/docs/ncsu_setup.md b/docs/ncsu_setup.md
new file mode 100644
index 00000000..c97fd5bf
--- /dev/null
+++ b/docs/ncsu_setup.md
@@ -0,0 +1,304 @@
# NCSU system setup guide
# ==================================================

The system is currently set up for RHEL 8 on the NCSU systems. We run Python 3.9 and connect to the AWE Workbench code. This guide assumes that we are installing into a VM with those tools installed as packages. An installation script has been added to the `servermanagement` directory.


Installation on RHEL 8 requires:

- Python 3.9 or 3.10 (3.9 is still the default).
- redis.x86_64 5.0.3-5.module+el8.4.0+12927+b9845322 @rhel-8-for-x86_64-appstream-rpms
- redis-devel.x86_64 5.0.3-5.module+el8.4.0+12927+b9845322 @rhel-8-for-x86_64-appstream-rpms


# Older RHEL 7 Notes
# ==================================================

The following is a guide to help with the installation of Learning Observer (LO) on NCSU systems.

This guide assumes you are using an RHEL system.
Additionally, depending on where on the system you place the repository, you may need to run all commands as a sudo user.

## Requirements

LO is confirmed to work on `Python 3.8`.
Along with the base install of Python, LO requires the Python developer tools.
These can be installed with the following commands:

```bash
sudo yum install rh-python38 # base python
sudo yum install python38-devel # developer tools for python 3.8
```

On RHEL 7, `python38-devel` is no longer recognized as a package.
To properly fetch the developer tools, use the following:

```bash
sudo subscription-manager repos --enable rhel-7-server-optional-rpms --enable rhel-server-rhscl-7-rpms
sudo yum install rh-python38-python-devel.x86_64
```

The Python installation should be located at `/opt/rh/rh-python38`.
Note this location for future sections.

There is a chance you'll encounter an issue when installing the requirements, specifically `py-bcrypt`.
The developer tools do not end up in exactly the place the build expects, so we need to create a symbolic link between the correct location and where they actually live.
To create this link, use the following:

```bash
cd /opt/rh/rh-python38/root
sudo ln -s usr/include/ . # check that Python.h exists in usr/include/python3.8/Python.h
```

Note that we are creating a link between the subdirectory `/opt/rh/rh-python38/root/usr/include` and `/opt/rh/rh-python38/root`.
Using `/usr/include` will result in an incorrect link.

## Install

### Virtual Environment

To make sure we are using the proper installation of Python, we will use a virtual environment.
To create one, run the following command:

```bash
/path/to/python3.8 -m venv /path/of/desired/virtual/environment
```

Again, keep note of where the virtual environment is located for future steps.

### Config files

For each system, you'll need to create a new `creds.yaml` file within the `/path/to/repo/learning_observer` directory.
This file defines which types of connections are allowed to be made to the system.
Luckily, there is an example file you can copy located in the `/path/to/repo/learning_observer/learning_observer` directory.
When you attempt to run the system later in this setup guide, if anything is misconfigured here, the system will tell you what's wrong.

Some of the main changes that need to be made are:

1. types of `auth` allowed; for a simple setup, just remove the `google` child and all of its subchildren
1. `aio` session secret and max age
1. `event_auth` to allow access from various locations (like Chromebooks)
1. `server` for reconfiguring the port information
1. `config:logging` for determining the `max_size` (in bytes) of each log file and the total `backups` to keep around before rotating

More configurables are expected to be included in this config file in the future.
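To make those items concrete, here is a hedged sketch of the shape such a file might take. The key names follow the list above, but the exact schema may differ; start from the shipped example file rather than this snippet, and treat every value below as a placeholder.

```yaml
# Illustrative creds.yaml shape only -- copy the example file and adjust.
auth: {}                  # simple setup: google child removed
aio:
  session_secret: replace-with-a-long-random-string
  session_max_age: 3600   # seconds
event_auth:
  local_storage: {}       # placeholder: allow events from e.g. Chromebooks
server:
  port: 8888
config:
  logging:
    max_size: 10485760    # bytes per log file
    backups: 5            # rotated files kept before deletion
```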
### Package installation

Before we get started installing packages, we must ensure that the `pip` in our virtual environment is up to date.
Some of the packages located in the `requirements.txt` file require `wheel` to be installed first.
After the base requirements are installed, we will also need to install the local packages (the Writing Observer module and the Learning Observer module).
To handle all the installs, use the following:

```bash
cd writing_observer # cd into the top level of the repository
/path/to/venv/bin/pip install --upgrade pip # upgrade pip
/path/to/venv/bin/pip install wheel # install wheel
/path/to/venv/bin/pip install -r requirements.txt # install package requirements
/path/to/venv/bin/pip install -e learning_observer/ # install learning observer module
/path/to/venv/bin/pip install -e modules/writing_observer/ # install writing observer module
```

### Needed directories

When installing Learning Observer for the first time, we need to create a few directories.
Use the following commands:

```bash
mkdir /path/to/repo/learning_observer/learning_observer/static_data/course_lists
mkdir /path/to/repo/learning_observer/learning_observer/static_data/course_rosters
mkdir /path/to/repo/learning_observer/learning_observer/logs
mkdir /path/to/repo/learning_observer/learning_observer/logs/startup
```

### Proxy server

By default, LO runs on port 8888.
Configure nginx, or another proxy server, to forward external traffic to LO's port.

### Executable files

If this is the first time you are running the server on your system, you might need to make the shell scripts in the `servermanagement` directory executable.
To do this, use the following commands:

```bash
chmod +x /path/to/repo/servermanagement/RunLearningObserver.sh
chmod +x /path/to/repo/servermanagement/BackupWebSocketLogs.sh
```

## System specific changes

There are various lines of code that point to specific servers.
For each setup, we need to make sure these are pointing to the proper place.

### Server

#### Auth information

On the server, we need to point the redirect URI to the server we are working with.
Depending on how the credentials file was handled, this change may not be necessary to get the system running.
The redirect URI is used with the Google login; if that is not used, then this step is not needed.
It is located in `/path/to/repo/learning_observer/learning_observer/auth/social_sso.py`.

#### Server management

Additionally, we need to set up the server management files in the `/path/to/repo/servermanagement` directory.

In the `RunLearningObserver.sh` file, you'll want to set the system variables to match the current system.

```bash
VIRTUALENV_PATH="/full/path/to/venv" # the script activates $VIRTUALENV_PATH/bin/activate
LEARNING_OBSERVER_LOC="/full/path/to/repo/learning_observer"
LOGFILE_DEST="/path/to/log/storage"
```

In the `BackupWebSocketLogs.sh` file, you'll want to set the log directory to the same place you set in `RunLearningObserver.sh` and set where the logs should be backed up.

```bash
LOGFILE_SRC="/path/to/log/storage"
LOGFILE_DEST="/path/to/log/backups"
```

### Client

On the client side, we need to pass the correct server address to the `websocket_logger()` call in the `/path/to/repo/extension/extension/background.js` file.
If the server has SSL enabled, then the address should start with `wss://`.
If SSL is not enabled, then the address should start with `ws://`.
If a proxy server is not set up yet, make sure to include the port number (default 8888) at the end of the address.
An example of each case is shown below:

```js
websocket_logger("wss://writing.csc.ncsu.edu/wsapi/in/") // SSL enabled, nginx set up
websocket_logger("ws://writing.csc.ncsu.edu:8888/wsapi/in/") // SSL not enabled, nginx not set up
```
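Before wiring up the extension, you can sanity-check whichever address you chose. The repository's `testcode/WebSocketTest.py` (included later in this changeset) does exactly this; a minimal variant is below. The URL is the NCSU example from above, so substitute your own; the server currently rejects the empty event, and even a rejection proves it is listening.

```python
# Minimal websocket connectivity probe (see testcode/WebSocketTest.py).
import asyncio
import websockets


async def probe(url):
    async with websockets.connect(url) as ws:
        await ws.send("")  # a reject here still shows the endpoint is up

asyncio.run(probe("wss://writing.csc.ncsu.edu/wsapi/in/"))
```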
## Running the server

There are two different ways we can run the system.
One is better for debugging, whereas the other is best when you want to run the server and leave it up.
We suggest completely testing the installation with the debugging steps first.

### For debugging

To run the system for debugging, we will just run the Learning Observer module.
This will output all the log information to the console.
To do this, use the following commands:

```bash
cd /path/to/repo/learning_observer
/path/to/venv/bin/python learning_observer/ # run the learning observer module from within the learning_observer directory
```

You should see any errors printed directly to the console.

### As a server

To run the system as a server, we will run the `RunLearningObserver.sh` script.
This activates the virtual environment, runs the server, and pipes output into the log location we set up during the **System specific changes** section.
Run the following command:

```bash
./servermanagement/RunLearningObserver.sh
```

Check the logs for any errors.

## Connecting the client

The client is run through a Google Chrome extension.
To properly use the client, you must sign into Chrome and use the same account to access Google Docs.

From there, navigate to the extension manager located in settings.
Turn on Developer Mode (top right), then click the `Load Unpacked` button.
This opens a file explorer, where you should locate the repository.
More specifically, select the `writing_observer/extension/extension` directory.
This will unpack the extension and make it available for use in Google Chrome.

To make sure it is working, click on the `background page` link on the extension's card within the extension manager.
This opens an inspect window.
On this window, select the `Console` tab.
Next, open a Google Doc and start typing.
You should see events within the console.
Ensure there are no error messages sprinkled in.

## Backing up logs

Whenever a websocket connection is made, the server creates a new log file for that connection in addition to the primary log files.
We need to back up both the generic log files and all of the websocket-specific logs.

### General logs

The main logger for events is `event_logger.json`.
This is automatically rotated via the built-in Python logging module.
The settings for this file are handled via the `creds.yaml` file that you previously set up.
Simply changing the values and restarting the server will update the logging process.

### Websocket logs

The websocket logs take a little more setting up.
We will set up a daily `cron` job to run a backup script, `/path/to/repo/servermanagement/BackupWebSocketLogs.sh`.
The backup script archives any logs that match the websocket pattern and were last modified more than **60 minutes** ago.
Next, the backup script removes any files that match the pattern and were last modified more than **120 minutes** ago.

To set up the cron job, we first enter the crontab utility, then add a line for the backup script.

```bash
crontab -e # open the cron job menu

0 1 * * * /usr/bin/sh /full/path/to/repo/servermanagement/BackupWebSocketLogs.sh # line to add to the crontab
# Run it daily at 1:00 AM
```
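For reference, the heart of that script (added later in this changeset) is a `find | tar` pipeline over the timestamp-named websocket logs; the variables are the script's own:

```bash
# Archive websocket logs untouched for 60+ minutes, then delete any
# that have sat for 120+ minutes (and so have already been archived).
find $LOGFILE_SRC -name "????-??-??T*.log" -mmin +60 -print0 | tar -czvf $BACKUP_NAME --null -T -
find $LOGFILE_SRC -name "????-??-??T*.log" -mmin +120 -delete
```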
# Usage Notes
# =================================================================
# Instructions for Configuring Writing Observer on RHEL Installations

### Install Global Dependencies
1. sudo yum install redis
2. sudo yum install git
3. sudo yum install nginx

## Install Required RH Python 3.8
4. sudo subscription-manager repos --enable rhel-7-server-optional-rpms \
   --enable rhel-server-rhscl-7-rpms
5. sudo yum -y install @development
6. sudo yum -y install rh-python38

* rh-python38 dev tools are also required

## Setup RH Python 38 and Virtual Envs
7. scl enable rh-python38 bash
8. python --version
* The output should indicate that Python 3.8 is active
9. sudo pip install virtualenvwrapper
10. source `/opt/rh/rh-python38/root/usr/local/bin/virtualenvwrapper.sh` (note: `source` is a shell builtin, so it cannot be run through `sudo`)

## Install Local Dependencies
11. sudo git clone `https://github.com/ArgLab/writing_observer`
12. cd writing_observer
13. make install
14. sudo mkvirtualenv learning_observer
15. pip install -r requirements.txt
16. cd learning_observer
17. python setup.py develop
18. python learning_observer

* At this point, follow the system's further instructions until the process runs on port 8888 by default

## Server Setup
19. Populate creds.yaml with the required Google Cloud parameters
20. Configure nginx on `port 80` as a proxy for Learning Observer on `port 8888`
21. Replace all instances of `writing.csc.ncsu.edu` with your custom server address in all files in directory `~/writing_observer/learning_observer/learning_observer/auth`

## Client/Extension Setup
22. Replace all instances of `writing.csc.ncsu.edu` with your custom server address in `~/writing_observer/extension/extension/background.js`
* If SSL is not enabled for the server, all websocket protocols should begin with `ws://` as opposed to `wss://`
23. Open Chrome and navigate to `chrome://extensions`
24. Click on "Load Unpacked". Select `~/writing_observer/extension/extension` and ensure that it is enabled
25. Select `background page` on the extension's section and ensure no errors are present
26. Open a Google Docs document while signed into Chrome and ensure websocket communication between client and server is active
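For steps 21 and 22 above, a hedged one-liner can save hand-editing. `your.server.edu` is a placeholder; run the `grep` alone first to confirm the file list before rewriting in place.

```bash
# Find files mentioning the NCSU host, then substitute the new address in place.
grep -rl 'writing.csc.ncsu.edu' \
    ~/writing_observer/learning_observer/learning_observer/auth \
    ~/writing_observer/extension/extension/background.js \
    | xargs sed -i 's/writing\.csc\.ncsu\.edu/your.server.edu/g'
```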
diff --git a/docs/usagenotes.md b/docs/usagenotes.md
new file mode 100644
index 00000000..fe1d8305
--- /dev/null
+++ b/docs/usagenotes.md
@@ -0,0 +1,8 @@
## Usage Notes

For the moment, this doc acts as shared working knowledge for using the system with students.


## Extension Security

NOTE: The current working version of the Chrome extension requires that the user is logged into both a Google account (which is registered to the classroom) *and* Chrome. This requirement stems from the security model used by the extension. On Chromebook devices this is the default behavior. Supporting other devices will require a change in future work.
\ No newline at end of file
diff --git a/extension/extension/inject.js b/extension/extension/inject.js
new file mode 100644
index 00000000..4581c03a
--- /dev/null
+++ b/extension/extension/inject.js
@@ -0,0 +1,20 @@
/*
  Inject script. This is a web-accessible resource (web_accessible_resources)
  used to pass the id of the document to the extension as a globally
  accessible value. It is called by the injectScript function in writing.js,
  which makes the result accessible through an event listener.
*/

let script = document.createElement('script')
script.id = 'tmpScript'

const code = "_docs_flag_initialData.info_params.token"
script.textContent = 'document.getElementById("tmpScript").textContent = JSON.stringify(' + code + ')'

document.documentElement.appendChild(script)

let result = script.textContent

window.postMessage({ from: 'inject.js', data: result })

script.remove()
diff --git a/extension/extension/service_worker.js b/extension/extension/service_worker.js
new file mode 100644
index 00000000..138ba942
--- /dev/null
+++ b/extension/extension/service_worker.js
@@ -0,0 +1,8 @@
// Combining the two background scripts into one to serve
// as a single service worker script

try {
  importScripts("./writing_common.js", "./background.js");
} catch (e) {
  console.log(e);
}
diff --git a/learning_observer/learning_observer/main.py b/learning_observer/learning_observer/main.py
index 9d27fde3..e9b0e279 100644
--- a/learning_observer/learning_observer/main.py
+++ b/learning_observer/learning_observer/main.py
@@ -50,6 +50,14 @@
 # Run argparse
 args = settings.parse_and_validate_arguments()

+# This will need to move, but for the moment we hack this in to
+# require the GPU.
+import spacy
+#spacy.prefer_gpu()
+#debug_log("Preferring GPU Use.")
+spacy.require_gpu()
+debug_log("Requiring GPU Use.")
+

 def configure_event_loop():
     '''
diff --git a/learning_observer/learning_observer/rosters.py b/learning_observer/learning_observer/rosters.py
index 4329eb5b..0f233f92 100644
--- a/learning_observer/learning_observer/rosters.py
+++ b/learning_observer/learning_observer/rosters.py
@@ -163,7 +163,6 @@ def adjust_external_gc_ids(resp_json):
         # Pull the actual profile data.
         student_profile = student_json['profile']
-
         # Calculate the new ID to use for our student.
         google_id = auth.google_id_to_user_id(student_profile['id'])
diff --git a/modules/wo_highlight_dashboard/wo_highlight_dashboard/module.py b/modules/wo_highlight_dashboard/wo_highlight_dashboard/module.py
index cbf8b1f8..0a0c86ce 100644
--- a/modules/wo_highlight_dashboard/wo_highlight_dashboard/module.py
+++ b/modules/wo_highlight_dashboard/wo_highlight_dashboard/module.py
@@ -30,6 +30,37 @@
     "webfonts/fa-solid-900.ttf": d.FONTAWESOME_TTF
 }

+# As of today, our goal isn't to have consistent versions installed,
+# so much as to verify hashes to block man-in-the-middle
+# attacks. We're keeping the versions which appear on different
+# systems here.
+#
+# We may (and should) remove deprecated versions in the future, but we
+# do expect to continue to work with more than one version.
+#
+# A better design would map version URLs to SHA hashes, under
+# DRY. That can be done once we either kill "old" above or figure out
+# what URL that came from.
+# At that point, we can replace Minty_URLs
+# with THIRD_PARTY["css/bootstrap.min.css"]["hash"]

# Minty_URLs = [
#     'https://cdn.jsdelivr.net/npm/bootswatch@5.1.3/dist/minty/bootstrap.min.css',
#     'https://cdn.jsdelivr.net/npm/bootswatch@5.2.3/dist/minty/bootstrap.min.css',
# ]

# if (dbc.themes.MINTY not in Minty_URLs):
#     print("WARN:: Unrecognized Minty URL detected: {}".format(dbc.themes.MINTY))
#     print("You will need to update dash bootstrap components hash value.\n")

# FontAwesome_URLs = [
#     "https://use.fontawesome.com/releases/v6.3.0/css/all.css",
#     "https://use.fontawesome.com/releases/v6.1.1/css/all.css"
# ]

# if (dbc.icons.FONT_AWESOME not in FontAwesome_URLs):
#     print("WARN:: Unrecognized Fontawesome URL detected: {}".format(dbc.icons.FONT_AWESOME))
#     print("You will need to update the FontAwesome bootstrap components hash value.\n")


 COURSE_DASHBOARDS = [{
     'name': NAME,
     'url': "/wo_highlight_dashboard/dash/dashboard/",
diff --git a/modules/writing_observer/writing_observer/aggregator.py b/modules/writing_observer/writing_observer/aggregator.py
index f45e0671..6a81b422 100644
--- a/modules/writing_observer/writing_observer/aggregator.py
+++ b/modules/writing_observer/writing_observer/aggregator.py
@@ -13,11 +13,12 @@
 import learning_observer.constants as constants
 import learning_observer.kvs
 import learning_observer.settings
-from learning_observer.stream_analytics.fields import KeyField, KeyStateType, EventField
 import learning_observer.stream_analytics.helpers
 # import traceback
 import learning_observer.util

+from learning_observer.log_event import debug_log
+
 pmss.register_field(
     name='use_nlp',
     description='Flag for loading in and using AWE Components. These are '\
@@ -176,7 +177,6 @@ async def get_latest_student_documents(student_data):
     # Compile a list of the active students.
     active_students = [s for s in student_data if 'writing_observer.writing_analysis.last_document' in s]
-
     # Now collect documents for all of the active students.
     document_keys = ([
         learning_observer.stream_analytics.helpers.make_key(
@@ -190,6 +190,7 @@
     kvs_data = await kvs.multiget(keys=document_keys)

+
     # Return blank entries if no data, rather than None. This makes it possible
     # to use item.get with defaults sanely. For the sake of later alignment
     # we also zip up the items with the keys and users that they come from
@@ -208,7 +209,9 @@
         # Now insert the student data and pass it along.
         doc['student'] = student
         writing_data.append(doc)
-
+
+    debug_log(writing_data)
+
     return writing_data


@@ -223,16 +226,19 @@
     return writing_data


-async def merge_with_student_data(writing_data, student_data):
-    '''
-    Add the student metadata to each text
-    '''
+# async def merge_with_student_data(writing_data, student_data):
+#     '''
+#     Add the student metadata to each text. Because we may have
+#     fewer entries in writing_data than student_data, we iterate
+#     over student_data, locating the writing data that matches
+#     each entry, if any.
+# ''' - for item, student in zip(writing_data, student_data): - if 'edit_metadata' in item: - del item['edit_metadata'] - item['student'] = student - return writing_data +# for item, student in zip(writing_data, student_data): +# if 'edit_metadata' in item: +# del item['edit_metadata'] +# item['student'] = student +# return writing_data # TODO the use_nlp initialization code ought to live in a @@ -375,6 +381,7 @@ async def latest_data(runtime, student_data, options=None): # single_doc.update(annotated_text) :return: The latest writing data. ''' + debug_log("WritingObserver latest_data students:", student_data) # HACK we have a cache downstream that relies on redis_ephemeral being setup # when that is resolved, we can remove the feature flag @@ -385,29 +392,18 @@ async def latest_data(runtime, student_data, options=None): # Get the latest documents with the students appended. writing_data = await get_latest_student_documents(student_data) - # Strip out the unnecessary extra data. + # Strip out the unnecessary edit_metadata from the merged + # student and writing data. writing_data = await remove_extra_data(writing_data) - # print(">>> WRITE DATA-premerge: {}".format(writing_data)) - - # This is the error. Skipping now. - writing_data_merge = await merge_with_student_data(writing_data, student_data) - # print(">>> WRITE DATA-postmerge: {}".format(writing_data_merge)) - - # #print(">>>> PRINT WRITE DATA: Merge") - # #print(writing_data) - - # just_the_text = [w.get("text", "") for w in writing_data] - - # annotated_texts = await writing_observer.awe_nlp.process_texts_parallel(just_the_text) - - # for annotated_text, single_doc in zip(annotated_texts, writing_data): - # if annotated_text != "Error": - # single_doc.update(annotated_text) - - writing_data = await merge_with_student_data(writing_data, student_data) + # Now process the remaining data. Previously this called + # for merge_with_student_data however that is unnecessary + # as the steps above will already integrate the student + # information into the writing data. writing_data = await processor(writing_data, options) + debug_log("WritingObserver latest_data result: ", writing_data) + return {'latest_writing_data': writing_data} diff --git a/modules/writing_observer/writing_observer/writing_analysis.py b/modules/writing_observer/writing_observer/writing_analysis.py index c2cb9b33..2c848296 100644 --- a/modules/writing_observer/writing_observer/writing_analysis.py +++ b/modules/writing_observer/writing_observer/writing_analysis.py @@ -338,7 +338,6 @@ async def last_document(event, internal_state): Small bit of data -- the last document accessed. This can be extracted from `document_list`, but we don't need that level of complexity for the 1.0 dashboard. - This code accesses the code below which provides some hackish support functions for the analysis. Over time these may age off with a better model. @@ -352,6 +351,46 @@ async def last_document(event, internal_state): return False, False +# Basic class tests and extraction. +# ------------------------------- +# A big part of this project is wrapping up google doc events. +# In doing that we are reverse-engineering some of the elements +# particularly the event types. This code provides some basic +# wrappers for event types to simplify extraction of key elements +# and to simplify event recognition. +# +# Over time this will likely expand and will need to adapt to keep +# up with any changes in the event structure. For now it is just +# a thin abstraction layer on a few of the pieces. 
def is_visibility_eventp(event):
    """
    Given an event, return True if it is a visibility
    event, which indicates a change in which doc is
    shown or active.

    Here we look for an event whose 'client' dict
    contains an 'event' field with the value
    'visibility'.
    """
    event_type = event.get('client', {}).get('event', None)
    return (event_type == 'visibility')


def is_keystroke_eventp(event):
    """
    Given an event, return True if it is a keystroke
    event, which indicates that the user typed in the
    document.

    Here we look for an event whose 'client' dict
    contains an 'event' field with the value
    'keystroke'.
    """
    event_type = event.get('client', {}).get('event', None)
    return (event_type == 'keystroke')


# Simple hack to match URLs. This should probably be moved as well
# but for now it works.
#
diff --git a/servermanagement/AddWOtoVENV.sh b/servermanagement/AddWOtoVENV.sh
new file mode 100755
index 00000000..c0187663
--- /dev/null
+++ b/servermanagement/AddWOtoVENV.sh
@@ -0,0 +1,109 @@
#!/usr/bin/env bash
#
# AddWOtoVENV
# Collin F. Lynch

# This script takes as its argument a specified VENV. It
# then adds the Learning Observer, Writing Observer, and
# the dashboard. Construction of the VENV can be done
# using the SetupVENV script located in this directory.


# Argument
# --------------------------------------------
# This takes a single argument that should point
# to the directory of the VENV. You can then
# use this to make any necessary changes.
VIRTUAL_ENV="$1"
echo "USING VENV: $VIRTUAL_ENV"



# Parameters:
# ---------------------------------------------
# Change these if you need to use a different
# python or pip. Otherwise leave them as-is.
PYTHON_CMD="python"
PIP_CMD="pip"

CODE_REPOS_LOC="../../"

# Activate VENV
# ---------------------------------------------------------
source "$VIRTUAL_ENV/bin/activate"


# Installation
# ----------------------------------------------------------
# If we plan to use a GPU then this section must also
# be run. Comment out the code below if you do not
# want CUDA installed, or edit it for your library
# version.
#
# Note that by default we seem to be unable to rely
# on spacy to pull the right CUDA on its own.
#echo -e "\n=== Installing Spacy CUDA, comment out if not needed. ==="
#echo -e "\n Using CUDA v. 117"
#"$PIP_CMD" install spacy[cuda117]

# If you are using CUDA 12.1, as we are on some
# systems, then spacy's passthrough install will
# not work. Therefore you will need a two-step
# process.
echo -e "\n Using CUDA v. 12.x"
"$PIP_CMD" install cupy-cuda12x
"$PIP_CMD" install spacy[cuda12x]


# Install basic requirements.
echo -e "\n=== Installing Requirements.txt ==="
cd ..
"$PIP_CMD" install -r requirements.txt

echo -e "\n=== Installing Learning Observer ==="
make install
#cd learning_observer
#"$PYTHON_CMD" setup.py develop

"$PIP_CMD" install --upgrade spacy[cuda12x]
"$PIP_CMD" install --upgrade pydantic


echo -e "\n=== Installing Modules ==="
cd ../modules/

echo -e "\n--- Installing Writing Observer ---"
cd ./writing_observer
"$PYTHON_CMD" setup.py develop
cd ..

echo -e "\n--- Installing lo_dash_react_components. ---"
cd ./lo_dash_react_components
nvm install
nvm use
npm install
"$PYTHON_CMD" setup.py develop
"$PIP_CMD" install .
cd ..

echo -e "\n--- Installing wo_highlight_dashboard. ---"
cd ./wo_highlight_dashboard
"$PYTHON_CMD" setup.py develop
cd ..
---" +cd ./wo_common_student_errors +"$PYTHON_CMD" setup.py develop +cd .. + + +echo -e "\n--- Installing bulk analysius (askGPT). ---" +cd ./wo_bulk_essay_analysis +"$PYTHON_CMD" setup.py develop +cd .. + + + + + diff --git a/servermanagement/BackupWebSocketLogs.sh b/servermanagement/BackupWebSocketLogs.sh new file mode 100644 index 00000000..c3dbd59a --- /dev/null +++ b/servermanagement/BackupWebSocketLogs.sh @@ -0,0 +1,17 @@ +# System Variables +# -------------------------------------- +LOGFILE_SRC="/usr/local/share/Projects/WritingObserver/Repo-Fork/writing_observer/learning_observer/learning_observer/logs" +LOGFILE_DEST="/usr/local/share/Projects/WritingObserver/Repo-Fork/writing_observer/learning_observer/learning_observer/logs" + +# Make the backup name +# --------------------------------------- +LOG_DATE=$(date "+%m-%d-%Y--%H-%M-%S") +BACKUP_NAME="$LOGFILE_DEST/learning_observer_backup_$LOG_DATE.tar.gz" +echo $BACKUP_NAME; + +# Create the backup +# --------------------------------------- +echo "Backing up web socket logs" +find $LOGFILE_SRC -name "????-??-??T*.log" -mmin +60 -print0 | tar -czvf $BACKUP_NAME --null -T - +echo "Removing backed up web sockets logs" +find $LOGFILE_SRC -name "????-??-??T*.log" -mmin +120 -delete diff --git a/servermanagement/RunLearningObserver.sh b/servermanagement/RunLearningObserver.sh new file mode 100755 index 00000000..ec1ee977 --- /dev/null +++ b/servermanagement/RunLearningObserver.sh @@ -0,0 +1,46 @@ +#!/usr/bin/env bash +# =============================== +# RunLearningObserver.sh +# Collin F. Lynch +# +# This bash script provides a simple wrapper to run the +# learning observer service and pipe the data to a logfile +# over time this should be integrated into the systemd +# service process. This uses static variables to specify +# the location of the virtualenv and the command and +# specifies the location for the running logfile. + +# System Variables +# -------------------------------------- +VIRTUALENV_PATH="/usr/local/share/projects/WritingObserver/VENV/WOVenv" +#VIRTUALENV_PYTHON="/usr/local/share/Projects/WritingObserver/VirtualENVs/learning_observer/bin/python3.9" +LEARNING_OBSERVER_LOC="/usr/local/share/projects/WritingObserver/Repositories/ArgLab_writing_observer/learning_observer" +LOGFILE_DEST="/usr/local/share/projects/WritingObserver/Repositories/ArgLab_writing_observer/learning_observer/learning_observer/logs" + + +# Make the logfile name +# --------------------------------------- +LOG_DATE=$(date "+%m-%d-%Y--%H-%M-%S") +LOGFILE_NAME="$LOGFILE_DEST/learning_observer_service_$LOG_DATE.log" +echo $LOG_NAME; + +DOC_PROCESSOR_LOG="$LOGFILE_DEST/document_processor_service_$LOG_DATE.log" +echo $DOC_PROCESSOR_LOG; + +# Now run the thing. +# -------------------------------------- +echo "Running Learning Observer Service..." +cd $LEARNING_OBSERVER_LOC + +source $VIRTUALENV_PATH/bin/activate + +nohup python learning_observer/doc_processor.py > $DOC_PROCESSOR_LOG 2>&1 & +DOC_PROCESS_ID=$! +echo $DOC_PROCESS_ID > $LOGFILE_DEST/doc_run.pid + +nohup python learning_observer > $LOGFILE_NAME 2>&1 & +PROCESS_ID=$! +echo $PROCESS_ID > $LOGFILE_DEST/run.pid + +# Set the number of allowed open files to something large 8192 +prlimit --pid $PROCESS_ID --nofile=8192 diff --git a/servermanagement/SetupVENV.sh b/servermanagement/SetupVENV.sh new file mode 100755 index 00000000..0483b0c1 --- /dev/null +++ b/servermanagement/SetupVENV.sh @@ -0,0 +1,44 @@ +#!/usr/bin/env bash +# +# SetupVENV.sh +# Collin F. 
diff --git a/servermanagement/SetupVENV.sh b/servermanagement/SetupVENV.sh
new file mode 100755
index 00000000..0483b0c1
--- /dev/null
+++ b/servermanagement/SetupVENV.sh
@@ -0,0 +1,44 @@
#!/usr/bin/env bash
#
# SetupVENV.sh
# Collin F. Lynch

# This script performs the basic VENV setup necessary for our LO
# server. When called, it takes as arguments a name and the path
# for the VENV storage. It then generates the VENV and upgrades
# the local pip install. It does *not* install the workbench or
# LO code. That part must be done with separate scripts that
# are located in this folder and in the AWE_Workbench code.


# Argument Parsing
# -----------------------------------------------
# The first argument to the script will specify the name of
# the virtual environment. Use something simple like WOVenv.
VIRTUAL_ENV_NAME=$1

# The second should be a path to your working directory (above the
# repositories) where you will actually run the code.
VIRTUAL_ENV_LOC=$2


# Parameters
# -----------------------------------------------
# Change these params if you need to shift python
# or pip versions. Otherwise leave them as-is.

PYTHON_CMD="python3.9"
PIP_CMD="pip"

# Execution
# ---------------------------------------------------------
echo "1) Generating VENV"
"$PYTHON_CMD" -m venv "$VIRTUAL_ENV_LOC/$VIRTUAL_ENV_NAME"

# Initialize
echo "2) Starting $VIRTUAL_ENV_NAME"
source "$VIRTUAL_ENV_LOC/$VIRTUAL_ENV_NAME/bin/activate"

# Update the Pip Version.
echo "3) Upgrading Pip"
"$PIP_CMD" install --upgrade pip
diff --git a/servermanagement/learning_observer_logrotate b/servermanagement/learning_observer_logrotate
new file mode 100644
index 00000000..d3dc6bef
--- /dev/null
+++ b/servermanagement/learning_observer_logrotate
@@ -0,0 +1,22 @@
/path/to/repo/learning_observer/learning_observer/logs/*.pid
{
    daily
    rotate 2
    olddir /path/to/backup
    compress
    missingok
    notifempty
}

/path/to/repo/learning_observer/learning_observer/logs/*.json
/path/to/repo/learning_observer/learning_observer/logs/learning_observer_service*.log
/path/to/repo/learning_observer/learning_observer/logs/debug.log
/path/to/repo/learning_observer/learning_observer/logs/incoming_websocket.log
{
    daily
    rotate 5
    olddir /path/to/backup
    compress
    missingok
    notifempty
}
diff --git a/testcode/TestRedis.py b/testcode/TestRedis.py
new file mode 100644
index 00000000..ca7cb42e
--- /dev/null
+++ b/testcode/TestRedis.py
@@ -0,0 +1,22 @@
#!/usr/bin/env python
# Simple asyncio redis test.


import asyncio
import asyncio_redis


async def example():
    # Create Redis connection
    connection = await asyncio_redis.Connection.create(host='localhost', port=6379)

    # Set a key
    await connection.set('my_key', 'my_value')

    # When finished, close the connection.
    connection.close()


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(example())
diff --git a/testcode/WebSocketTest.py b/testcode/WebSocketTest.py
new file mode 100644
index 00000000..0ad026e0
--- /dev/null
+++ b/testcode/WebSocketTest.py
@@ -0,0 +1,22 @@
#!/usr/bin/env python
# ==================================
# WebSocketTest.py
# Collin F. Lynch.
#
# This is a simple piece of code that I put together to
# ping the websockets API of the server just to confirm
# that it is running.
#
# Just gets a reject at the moment, which is fine.


import asyncio
import websockets


def test_url(url, data=""):
    async def inner():
        async with websockets.connect(url) as websocket:
            await websocket.send(data)
    return asyncio.get_event_loop().run_until_complete(inner())


test_url("wss://writing.csc.ncsu.edu/wsapi/in")