Social competence survey for bonus points #839

Merged
merged 9 commits into from
Mar 5, 2024

juandpenan
Contributor

** Note: Your contribution is expected to meet the conventions and policies described in the contribution guidelines **

Description

Closes issue #827

Changes proposed in this pull request:

  • Introduced bonuses for receptionist and restaurant tasks.
  • Assigned qualitative assessment scores ranging from 0 to 50 based on a survey for teams.
  • Mandated each team to bring a minimum of two public evaluators and inform the referee beforehand.
  • Provided a Microsoft Form for surveys and an online spreadsheet for result visibility.

Member

@johaq johaq left a comment

Like the idea. Couple of notes:

  • I'm not sure we want the referee to evaluate since they should be focusing on running the test
  • Probably need to keep the group of volunteers consistent for one test block for fairness in evaluation

Aside from the notes please change task to test which is the usual terminology in the rulebook (I know not 100% consistent everywhere but it should be)

general_rules/PenaltiesBonuses.tex (three outdated review threads, resolved)
@juandpenan
Contributor Author

juandpenan commented Jan 12, 2024

Like the idea. Couple of notes:

  • I'm not sure we want the referee to evaluate since they should be focusing on running the test
  • Probably need to keep the group of volunteers consistent for one test block for fairness in evaluation

Aside from the notes please change task to test which is the usual terminology in the rulebook (I know not 100% consistent everywhere but it should be)

Thank you, @johaq, for the review! I've incorporated the suggested changes and added an item based on our discussion in the last meeting: "The referee has the authority to skip the social assessment test if they believe the robot's performance is not suitable for measurement."

Member

@johaq johaq left a comment

Two small typos.
Otherwise looks good to me.

@juandpenan
Contributor Author

Two small typos. Otherwise looks good to me.

Done! Thanks a lot

johaq previously approved these changes Jan 13, 2024
@akinobu1998
Member

The form and spreadsheet appear to be owned by a personal account, and the spreadsheet is private.
Are this URL and QR code the finalized version? If not, we can distribute the QR code via Telegram or as a printed copy.
Online documents sometimes disappear by accident, and I also worry about spam. In either case we would have to change the link.

I think we should also include the question list in the rulebook to track changes and review the questions more easily.

@johaq
Member

johaq commented Jan 20, 2024

I'm not familiar with Office Forms or whether you can host them yourself. Ideally, I think, the form would be on the athome website or hosted by a university.

@sunava sunava self-requested a review February 26, 2024 12:08
@@ -42,6 +42,26 @@ \subsection{Bonus for outstanding performance}\label{rule:outstanding_performanc
\item It is the decision of the \iaterm{Technical Committee}{TC} if (and to which degree) the bonus score is granted.
\end{enumerate}

\subsection{Bonus for perceived social intelligence}\label{rule:perceived_intelligence}
\begin{enumerate}
\item For the test \iterm{Receptionist} in \iterm{Stage~I} and, \iterm{Restaurant} in \iterm{Stage~II} tests. Teams are allowed to request an assessment of the robot's performance regarding its perceived social intelligence.

IMHO this should be mandatory and integrated into the scoring of the tasks, to make sure the league/referee prepares volunteers.
For the data to be usable, we have to define a minimum number of respondents.
For testing the procedure (or getting an overview of the league) in local leagues, we can always opt to skip parts of a task, e.g. this survey.

I don't think it has to be mandatory, but I agree that it should be integrated into the scoring of the tasks. Since you can't lose anything anyway, any rational team should always opt for social scoring.

Yes, I think we can just make it mandatory

Suggested change: "Teams are allowed" -> "Teams are assessed"


\item Every team seeking a social assessment should inform the referee before the test. that they want to be evaluated using perceived social intelligence.

\item After the test is completed, the evaluators will fill out the form accessible via the QR code in Figure \ref{fig:qr-survey}, or through this \href{https://forms.office.com/Pages/ResponsePage.aspx?id=6sSEXw03nkuDDHVvi_G1H7VNGCdGFtZJs0ryJVVWtCFUQVFSWDlYM0FHRVA2QllIT0tOQjI2QUcxQi4u}{link}.

This needs to be hosted somewhere under the league's control, with a fallback sheet in this GitHub repository for printing.

We can figure out hosting later


\item The referee has the authority to skip the social assessment test if they believe the robot's performance is not suitable for measurement.

\item The score will be automatically recorded in this \href{https://urjc-my.sharepoint.com/:x:/r/personal/juan_pena_urjc_es/Documents/ROBOCUP%20@HOME%20PSI%20SCALE%20PROPOSAL.xlsx?d=wfdc816bee34742e1a9e5bea95677985d&csf=1&web=1&e=zRwl4u}{spreadsheet}.

This needs to be hosted somewhere under the league's control. Maybe a fallback script? At least link to, or document, how the score is calculated.

I also think we need to state how the score is calculated: the average of all 15 collected ratings, multiplied by 10, so that a maximum of 50 points can be achieved.

$$\text{Social Score} = \text{scaling} \times \frac{1}{n} \sum_{i=1}^{n} \text{Rating}_i$$

$$\text{scaling} = \frac{\text{max Social Score}}{\text{max Rating}} = 10$$

$$n = |\{\text{Rating}_i\}|$$
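
As a rough illustration of the formula above, the calculation could be scripted as follows. This is only a sketch, not the league's scoring tool: the function name `social_score` and the constants are hypothetical, and the maximum rating of 5 is an assumption inferred from the scaling factor of 10 and the 50-point maximum.

```python
from statistics import mean

MAX_SOCIAL_SCORE = 50  # maximum bonus stated in this pull request
MAX_RATING = 5         # assumption: implied by scaling = 50 / max rating = 10


def social_score(ratings):
    """Return scaling * average rating (hypothetical helper, not rulebook code)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    scaling = MAX_SOCIAL_SCORE / MAX_RATING  # equals 10 under the assumption above
    return scaling * mean(ratings)


# 15 ratings that all have the assumed maximum value give the full bonus.
print(social_score([5] * 15))  # 50.0
```

A fallback script along these lines would let referees recompute the bonus by hand if the hosted spreadsheet were unavailable.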

Member

@LeanderVonSeelstrang LeanderVonSeelstrang left a comment

Generally good, but it should be fully integrated into the task and the method of converting the assessment into a score should be explicitly stated. Also the survey should always be linked. Possibly a non-editable version.

\item This bonus, ranging from 0 to 50, depends on the robot's social performance which will be assessed by Referees using a specially designed scale in a survey.

\item Every team seeking a social assessment should inform the referee before the test. that they want to be evaluated using perceived social intelligence.

Small typo: there is a stray period in the middle, after "test". Also, should -> must. Maybe also specify how long before the test, e.g. an hour.

Any team seeking a social assessment must inform the referee at least one hour before the test that they wish to be evaluated on perceived social intelligence.

Should actually be the day before the test in the team leader meeting

@juandpenan
Contributor Author

Thanks for the feedback. I have made the following changes:

  • Added the survey questions
  • Added the score calculation method
  • Made the score "mandatory" for the Receptionist and Restaurant tests
  • In the coming days I will also look into different options for hosting the survey.

Thanks!

LeroyR previously approved these changes Mar 4, 2024
Member

@LeroyR LeroyR left a comment

lgtm, slight wording change needed as assessment is mandatory

Member

@LeanderVonSeelstrang LeanderVonSeelstrang left a comment

Thanks for the update.

@juandpenan juandpenan dismissed stale reviews from LeanderVonSeelstrang and LeroyR via ed2d76b March 5, 2024 08:16
@juandpenan
Contributor Author

lgtm, slight wording change needed as assessment is mandatory

Done! Thanks

@LeroyR LeroyR removed the request for review from sunava March 5, 2024 09:20
@LeroyR LeroyR merged commit 01eecfe into RoboCupAtHome:master Mar 5, 2024
4 checks passed