algorithm for determining similar topics based on votes #5

Open
ddfridley opened this issue Feb 18, 2021 · 6 comments

Comments

@ddfridley
Contributor

ddfridley commented Feb 18, 2021

Each user will see 2 topics that they created, plus 2 topics from each of 9 other people (20 total).
The user is asked to group the topics that are similar and to pick the most representative one in each group.
There will be hundreds of users, and each user will be shown items at random, but every user's input will be seen an equal number of times.
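One possible way to get "random, but seen an equal number of times" is to shuffle the users into a ring and show each user their own topics plus those of the next 9 users around it. This is only a sketch under that assumption; `assign_topics`, `topics_by_user`, and `others_per_user` are made-up names, not anything that exists in the project.

```python
import random


def assign_topics(topics_by_user, others_per_user=9, seed=None):
    """Return {user: [topics shown to that user]} using a shuffled ring.

    Each user sees their own topics plus the topics of the next
    `others_per_user` users around the ring, so every author's topics
    are shown to exactly `others_per_user + 1` users.
    """
    users = list(topics_by_user)
    if len(users) <= others_per_user:
        raise ValueError("need more users than others_per_user")
    rng = random.Random(seed)
    rng.shuffle(users)  # random ring order
    n = len(users)
    shown = {}
    for i, user in enumerate(users):
        # Authors whose topics this user reviews: self plus the next ones.
        authors = [users[(i + k) % n] for k in range(others_per_user + 1)]
        batch = [t for a in authors for t in topics_by_user[a]]
        rng.shuffle(batch)  # so position does not reveal the author
        shown[user] = batch
    return shown
```

With 2 topics per user and the default of 9 others, every user gets 20 items and every topic appears in exactly 10 users' lists.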

Develop an algorithm for figuring out the most representative groupings, based on what users group together and what they select as most representative.
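As a starting point for discussion, here is one way such an algorithm could look: count how often two topics were grouped together out of the times both were shown to the same user, merge pairs above a threshold, and pick the member that was chosen as representative most often per showing. The response format, the `threshold` value, and all function names are assumptions for illustration, not a settled design.

```python
from collections import Counter
from itertools import combinations


def representative_groups(responses, threshold=0.5):
    """responses: list of dicts such as
    {"shown": ["a", "b", "c"],
     "groups": [{"members": ["a", "b"], "representative": "a"}, ...]}
    Group members are assumed to be a subset of "shown".
    """
    shown = Counter()      # topic -> times shown
    co_shown = Counter()   # pair  -> times both shown to the same user
    together = Counter()   # pair  -> times placed in the same group
    rep = Counter()        # topic -> times picked as representative
    for r in responses:
        shown.update(r["shown"])
        co_shown.update(combinations(sorted(r["shown"]), 2))
        for g in r["groups"]:
            together.update(combinations(sorted(g["members"]), 2))
            rep[g["representative"]] += 1

    # Union-find: merge topics whose co-grouping rate meets the threshold.
    parent = {t: t for t in shown}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), n in together.items():
        if co_shown[(a, b)] and n / co_shown[(a, b)] >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = {}
    for t in shown:
        clusters.setdefault(find(t), []).append(t)

    result = []
    for members in clusters.values():
        # Representative: highest "picked per showing" rate, ties by id.
        best = max(members, key=lambda t: (rep[t] / shown[t], t))
        result.append({"members": sorted(members), "representative": best})
    return result
```

Merging by threshold is transitive (a~b and b~c puts a and c together), which may or may not be what we want; a stricter clustering rule could replace the union-find step.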

There will be a round-based system: if there are 1000 users, each user will review a group of 20 topics from 10 different users. Then the top 200 most representative groups (10%) will be distributed at random in groups of 20 to users again, and they will be asked to group them. If there are 10000 people, another round will be added.
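The "keep the top 10%, then deal them out again in groups of 20" step could be sketched roughly like this. It assumes some score per group already exists (how to compute it is the open question of this issue); `next_round_batches`, `keep_fraction`, and `batch_size` are hypothetical names.

```python
import random


def next_round_batches(scores, keep_fraction=0.10, batch_size=20, seed=None):
    """scores: {group_id: score}, higher meaning more representative.

    Keeps the top `keep_fraction` of groups and deals them out at
    random into batches of `batch_size` for the next round.
    """
    keep = max(1, round(len(scores) * keep_fraction))
    # Sort by score descending; break ties by id so the result is stable.
    ranked = sorted(scores, key=lambda g: (-scores[g], str(g)))
    survivors = ranked[:keep]
    rng = random.Random(seed)
    rng.shuffle(survivors)
    return [survivors[i:i + batch_size]
            for i in range(0, len(survivors), batch_size)]
```

With 2000 scored items this yields 200 survivors in 10 batches of 20. That is a single pass over the survivors; covering every user would mean dealing them out several times.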

The number of topics a user creates and the number of other people's inputs shown to a user should be variables that we can tweak based on user experience. (Basically: are 20 items too many, so that users burn out?)
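To keep those two numbers tweakable, they could live in one small settings object from which the per-user item count is derived. A minimal sketch; `RoundConfig` and its field names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoundConfig:
    topics_per_user: int = 2   # topics each user creates
    others_shown: int = 9      # other people whose topics a user reviews

    @property
    def items_per_user(self) -> int:
        # Own topics plus the same number from each other person.
        return self.topics_per_user * (self.others_shown + 1)
```

`RoundConfig()` gives the 20 items described above; `RoundConfig(others_shown=4)` would drop that to 10 if 20 turns out to be too many.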

@ddfridley
Contributor Author

Derek will investigate, try to draw out a process, and look into implementing it in JavaScript or Python.

ddfridley added the Inprogress label (issue is assigned to someone and being worked on) Mar 18, 2021
@ddfridley
Contributor Author

David and Derek met on 3/30 and generated some example data to use for grouping, but more is needed. The goal is to finish the process workflow for next week. We hope to have basic JavaScript for processing the initial data.

@ddfridley
Contributor Author

Derek wrote a test that groups responses and groups them by count.
A challenge is how to keep track of how often topics are shown.
Derek and David are working on a Google Form to collect initial data to seed the algorithm.
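For the "how often was a topic shown" challenge, one simple option is to log every showing at the moment a list is handed to a user, next to the count of identical groupings. This is a sketch and not Derek's test; `ExposureLog` and its methods are hypothetical.

```python
from collections import Counter


class ExposureLog:
    """Tracks how often topics were shown and how often each exact
    grouping was submitted."""

    def __init__(self):
        self.shown = Counter()    # topic -> times shown
        self.groups = Counter()   # frozenset of topics -> times submitted

    def record(self, shown_topics, groups):
        """shown_topics: topics one user saw; groups: lists of topics."""
        self.shown.update(shown_topics)
        for members in groups:
            # frozenset makes ["a", "b"] and ["b", "a"] the same grouping.
            self.groups[frozenset(members)] += 1

    def least_shown(self, k):
        """The k topics shown least often (ties broken by name)."""
        return sorted(self.shown, key=lambda t: (self.shown[t], t))[:k]
```

`log.groups.most_common()` then gives the groupings ordered by count, and `least_shown` could help decide which topics still need more views.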

@ddfridley
Contributor Author

@dplem is working on an algorithm to determine how many times a topic has been seen, and then to figure out how often a topic has been voted up.
Here is some data that may help with generating test data:
https://colab.research.google.com/drive/1EJol2OxLwWbgdn4_e98pZeFv4ZQqkOKr?usp=sharing

On Friday we will talk about how to read in a CSV file.
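As a possible starting point for that conversation, here is a small sketch that reads a CSV with Python's standard `csv` module and works out, per topic, how many times it was seen and how often it was voted up. The column names (`user`, `topic`, `voted_up`) and the sample rows are made up; the real file will likely look different.

```python
import csv
import io
from collections import Counter


def vote_rates(csv_file):
    """Return {topic: (times_seen, times_voted_up, up_rate)}.

    Expects one row per (user, topic shown), with a `voted_up` column
    holding "1" when the user voted the topic up.
    """
    seen = Counter()
    up = Counter()
    for row in csv.DictReader(csv_file):
        topic = row["topic"]
        seen[topic] += 1
        if row["voted_up"].strip() == "1":
            up[topic] += 1
    return {t: (seen[t], up[t], up[t] / seen[t]) for t in seen}


# Hypothetical sample data, standing in for a real file on disk.
SAMPLE = """user,topic,voted_up
u1,parks,1
u1,transit,0
u2,parks,1
u2,transit,1
u3,parks,0
"""

rates = vote_rates(io.StringIO(SAMPLE))
```

For a real file, `open("responses.csv", newline="")` would replace the `io.StringIO` wrapper (the filename is a placeholder).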

ddfridley removed the Inprogress label (issue is assigned to someone and being worked on) Jul 9, 2021
@ddfridley
Contributor Author

Ignore the text to the left; this is a whiteboard drawing of an algorithm for figuring it out from the first round.

[Image: 20211126_140920]


2 participants