2010:Evalutron6000 Walkthrough

From MIREX Wiki

Revision as of 17:03, 14 July 2010

UPDATE 14 July 2010

The 2010 Evalutron has a new look and implementation. It is easier to use than the original implementation. However, it is necessary to read through this document before you start using the system. Enjoy!

Welcome to the Evalutron 6000

To use the Evalutron 6000 you will need a modern web browser (e.g., Firefox, Internet Explorer, Safari, Mozilla, etc.) that supports JavaScript (ECMAScript) and cookies. Evalutron has been tested on Windows XP, MacOS X, and RedHat Linux. If you are using a different platform and having trouble, please try accessing Evalutron 6000 from another machine. If you are still having difficulty, contact mirproject@lists.lis.uiuc.edu.

When first visiting the Evalutron 6000 homepage, you will see a page similar to this (Fig. 1).

2010 e6k home.png

Figure 1. Evalutron 6000 home page.

Register for the Evalutron 6000

If you have an account with the submission system, then you can use the same account for Evalutron 6000. Otherwise, you must register a new account. Click on the "Register" link on the left side of the page to create an account.

The registration page is fairly straightforward (Fig. 2). All fields are required. You can create any username and password you wish. Usernames must be at least 5 characters long, and passwords must be at least 8 characters long and are case-sensitive. After you submit the form, you will receive an email with an activation link. Clicking that link will complete your registration.
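The length rules above can be checked before you submit the form. The following is a minimal sketch in Python, not the Evalutron's own validation code; the function name `registration_problems` and its messages are hypothetical, and only the two minimum lengths come from this walkthrough.

```python
# Minimum lengths stated in this walkthrough.
MIN_USERNAME_LENGTH = 5
MIN_PASSWORD_LENGTH = 8


def registration_problems(username: str, password: str) -> list[str]:
    """Return a list of rule violations; an empty list means both rules pass.

    Hypothetical helper for illustration only. The real registration page
    may apply additional checks that are not described in this document.
    """
    problems = []
    if len(username) < MIN_USERNAME_LENGTH:
        problems.append(
            f"username must be at least {MIN_USERNAME_LENGTH} characters long"
        )
    if len(password) < MIN_PASSWORD_LENGTH:
        problems.append(
            f"password must be at least {MIN_PASSWORD_LENGTH} characters long"
        )
    return problems


if __name__ == "__main__":
    # A 3-character username and a 5-character password break both rules.
    for problem in registration_problems("bob", "short"):
        print(problem)
```

Because passwords are case-sensitive, the sketch deliberately does not change the case of either value.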

2010 e6k register.png

Figure 2. Evalutron 6000 registration page.

Agree to the Informed Consent

Before starting evaluation, every evaluator must read and agree to the terms of the Informed Consent document. Otherwise, you will be redirected to the informed consent page when you try to get evaluation assignments. Clicking the "Informed Consent" link on the left-side menu will show the form (Fig. 3). Because the evaluation uses human judgments of similarity, it is considered a human-subjects research project, and the Evalutron is essentially a survey instrument. To indicate your consent to participate in the evaluation, scroll down the page and check the "I Agree" checkbox below the informed consent document.

If you have questions about your rights as a subject in this research project, you should contact the UIUC IRB office (http://www.irb.uiuc.edu) for more information. The research protocol for this project is IRB# 07066.

2010 e6k consent.png

Figure 3. Evalutron 6000 informed consent page.

Get Your Assignments

To start the evaluation process, click the "My Assignments" link on the left-side menu. The assignment page is similar to Fig. 4. This page shows all tasks available in the Evalutron 6000; initially, no assignment has been given. Be careful to select the task you intend to participate in, and click the "Get Assignment" button under this task. The system will assign you a number of queries to evaluate.

Please note that once assignments are made, they cannot be changed, and you are responsible for finishing the evaluations assigned to you. Therefore, please NEVER click "Get Assignment" for a task you do not intend to participate in.

2010 e6k getAssignment.png

Figure 4. Evalutron 6000 get assignment page.

Evaluate

Clicking the "Evaluate Query" button will lead you to the evaluation page (Fig. 5). This page consists of instructions at the top and a list of query-candidate pairs. Please read the instructions carefully. The query in each of the query-candidate pairs is the same, and it is aligned with each candidate so that you can replay it at any time while you evaluate the candidate. For the Audio Music Similarity (AMS) task, each query and candidate is 30 seconds long. For the Symbolic Melody Similarity (SMS) task, the length varies. Clicking the player button beside a query or candidate will load the song into the player and begin playing. Clicking the player button again will pause it. We recommend that you listen to the entire query at least once before evaluating any candidate files.

2010 e6k eval page.png

Figure 5. Sample evaluation page.

Please note that the list of query-candidate pairs scrolls within the page, so there may be more candidates than are immediately visible. Please scroll to the bottom of the candidate list to make sure you've evaluated each song.

Once you have a feeling for whether or not the candidate is similar to the query, click the "Not Similar", "Somewhat Similar" or "Very Similar" radio button to the right of the query-candidate pair. Each grader will also need to assign a fine-grained score for the similarity of the candidate to the query on a scale of 0-100. To input the fine score, move the slider; once you let go of the slider, the system will automatically record the score.

2010 e6k qcq detail.png

Figure 6. Close up image of a query-candidate pair and evaluation buttons.

Only after you input BOTH the broad category and the fine score will a query-candidate pair be marked green, indicating that you have completed evaluating this query-candidate pair.

2010 e6k qcq done detail.png

Figure 7. Close up image of a completed query-candidate pair.

You can always change your evaluation for any candidate by toggling the radio buttons and adjusting the Fine Score selection scale. Once an evaluation has been made, however, it cannot be retracted, only changed (i.e., you cannot "unvote").

Work on Another Query

At the bottom of the evaluation page, there is a "View All Assignments" button (Fig. 8). At any time, clicking this button will direct you to your assignment page similar to the one shown in Fig. 4. When you have completed evaluating all of the candidates for this query, you may click this button to continue on another query.

2010 e6k eval page bottom.png

Figure 8. Close up image of "View All Assignments" button which loads the assignment page.

You may also click the "My Assignments" link on the left-side menu to go to the My Assignments page.

On the My Assignments page, you may click another "Evaluate Query" button to load the evaluation page (like Fig. 5) for another query.

You will see a list of all of the queries you have evaluated (or are evaluating) on the My Assignments page. You can return to any query by clicking the "Evaluate Query" button underneath it (Fig. 4). You can re-evaluate any candidate for any query at any time, up to the closing of the evaluation system.

Monitor Progress

At any time a grader may monitor his/her progress on the My Assignments page (Fig. 9). Each query has a status bar in which the completed portion is marked green and the unfinished portion red. When all the bars are green, all assignments are completed.
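The relationship between completed pairs and the status bar can be sketched as follows. This is an illustrative Python model only, assuming the green portion of a bar is simply the fraction of query-candidate pairs that have both a broad category and a fine score; the names `PairEvaluation` and `query_progress` are hypothetical and do not come from the Evalutron itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PairEvaluation:
    """One query-candidate pair; None means the value has not been entered."""
    broad_category: Optional[str] = None
    fine_score: Optional[int] = None

    def is_complete(self) -> bool:
        # A pair counts as done only when BOTH values are present.
        # A fine score of 0 is a valid entry, so compare against None.
        return self.broad_category is not None and self.fine_score is not None


def query_progress(pairs: list[PairEvaluation]) -> float:
    """Return the completed fraction of a query's pairs, from 0.0 to 1.0."""
    if not pairs:
        return 0.0
    completed = sum(1 for pair in pairs if pair.is_complete())
    return completed / len(pairs)


if __name__ == "__main__":
    pairs = [
        PairEvaluation("VERY SIMILAR", 90),  # complete
        PairEvaluation("NOT SIMILAR"),       # fine score still missing
        PairEvaluation(),                    # untouched
        PairEvaluation("NOT SIMILAR", 0),    # complete, score of 0 is valid
    ]
    print(f"{query_progress(pairs):.0%} of this query is complete")
```

Under this assumption, a query is finished when `query_progress` returns 1.0 for it.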

2010 e6k progress.png

Figure 9. Progress indicated on the My Assignments page.


Grading Expectations and "Reasonableness"

For each query-candidate pair, we need you to assign BOTH a Broad Category score AND a Fine Score (i.e., a numeric grade between 0 and 100, where 0 represents completely different and 100 represents perfectly similar or identical). You have the freedom to make whatever associations you desire between a particular Broad Category score and its related Fine Score. In fact, we expect to see variations across evaluators with regard to the relationships between Broad Categories and Fine Scores, as this is a normal part of human subjectivity. However, we will be using the two different types of scores to do important inter-related post-Evalutron calculations, so please be thoughtful in selecting your Broad Categories and related Fine Scores. What we are really asking here is that you apply a level of "reasonableness" to both your scores and your associations. For example, if you score a candidate in the VERY SIMILAR category, a Fine Score of 21 would not be, by most standards, "reasonable". The same applies at the other extreme: a Broad Category score of NOT SIMILAR should not be associated with a Fine Score of, say, 72 or 84.
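One way to picture the "reasonableness" request is as a simple consistency check between the two scores. The Python sketch below is illustrative only and is not part of the Evalutron. The cutoff values are assumptions chosen for this example (this document defines no numeric boundaries), and the function name `looks_reasonable` is hypothetical.

```python
# ASSUMED cutoffs, chosen only for illustration. The walkthrough does not
# define any numeric boundary between the Broad Categories.
MAX_FINE_SCORE_FOR_NOT_SIMILAR = 50
MIN_FINE_SCORE_FOR_VERY_SIMILAR = 50


def looks_reasonable(broad_category: str, fine_score: int) -> bool:
    """Return False when a fine score clearly contradicts its broad category."""
    if not 0 <= fine_score <= 100:
        raise ValueError("fine score must be between 0 and 100")

    category = broad_category.strip().upper()
    if category == "NOT SIMILAR":
        return fine_score <= MAX_FINE_SCORE_FOR_NOT_SIMILAR
    if category == "VERY SIMILAR":
        return fine_score >= MIN_FINE_SCORE_FOR_VERY_SIMILAR
    if category == "SOMEWHAT SIMILAR":
        # Left unconstrained here: evaluators are free to choose their own
        # associations for the middle category.
        return True
    raise ValueError(f"unknown broad category: {broad_category!r}")


if __name__ == "__main__":
    # The two examples from the text above are both flagged as inconsistent.
    print(looks_reasonable("VERY SIMILAR", 21))
    print(looks_reasonable("NOT SIMILAR", 72))
```

The check is intentionally loose so that it only flags the kind of clear contradiction described above and leaves room for the normal variation between evaluators.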