NLP Algorithm Evaluation: Help?

gvadalkivir · Jun 28, 2010

Hello guys,

We're beginners here, so please excuse us if our questions are inappropriate in any way. We have a statistics-related problem, and we'll explain it as briefly as possible.

Here it goes. We're working on an algorithm that analyses text sentences (some key words and heuristic rules involved). The algorithm is supposed to judge the sentences the way humans would. That is why we want to COMPARE THE RESULTS given by the PROGRAM and the results given by HUMANS. The more they match, the better.

We set up a study for this: we have around 150 sentences that were analysed both by the program and by a group of students.

There are 6 parameters that should be evaluated for each sentence. Each parameter can take a value between 0 and 1 (0 <= x <= 1). For each sentence we collected:

(a) 6 values given by the algorithm;

(b) around 30 x 6 values given by a number of students (we asked a number of students to evaluate the same sentence because a single person's answer would be too subjective).

Now we want to summarize the human results for each sentence and then compare those aggregated human results with the algorithm's results.

What do you suggest is the best way to do it? Which statistical test should we use?

We plan to use SPSS, but if you know of better software, please say.

Thanks!

Friends from the University of Belgrade.

mmwave · Jun 28, 2010

Hello there,

Thank you for your question. It's great that you are conducting research on comparing the results of your algorithm with human evaluations. This kind of research is important in understanding the performance of algorithms and their potential applications.

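In terms of analyzing your data, there are a few options you can consider. One approach would be to use a correlation analysis to see how closely the results from your algorithm match the results from the human evaluations. For each of the 6 parameters, you would first reduce the roughly 30 student ratings of each sentence to a single value (the mean, for example) and then correlate those values with the algorithm's values across the roughly 150 sentences. This gives you a measure of the strength of the relationship between the two sets of data. You can use a Pearson correlation if the relationship is roughly linear and both sets of data are approximately normally distributed, or a Spearman rank correlation if one or both sets are not.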
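As a rough illustration of the correlation approach, here is a minimal sketch in plain Python for a single parameter. The data values and the names `student_ratings`, `algorithm_scores`, `pearson`, `average_ranks` and `spearman` are made up for the example and are not from your study; it also assumes the mean is an acceptable summary of the student ratings.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data for ONE parameter: one list of student ratings per sentence.
student_ratings = [
    [0.1, 0.3, 0.2],
    [0.2, 0.4, 0.3],
    [0.5, 0.7, 0.6],
    [0.9, 0.7, 0.8],
    [0.6, 0.8, 0.7],
]
algorithm_scores = [0.1, 0.4, 0.5, 0.9, 0.7]

# Summarize the human ratings: one mean value per sentence.
human_means = [mean(r) for r in student_ratings]

r_pearson = pearson(algorithm_scores, human_means)
r_spearman = spearman(algorithm_scores, human_means)
print(f"Pearson r = {r_pearson:.3f}, Spearman rho = {r_spearman:.3f}")
```

You would repeat this for each of the 6 parameters. SPSS and R report the same coefficients together with significance values, so the sketch is only meant to show what is being computed.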

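Another option would be to use a paired t-test to compare the mean scores of the algorithm and the human evaluations for each parameter. The paired version is the appropriate one because both scores refer to the same sentences. This would allow you to see if there is a significant systematic difference between the two sets of data, for example the algorithm scoring consistently higher than the students. Keep in mind that a non-significant difference in means does not by itself show that the scores agree sentence by sentence, so this test complements the correlation rather than replacing it.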
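To make the t-test idea concrete, here is a minimal sketch of the paired t statistic for one parameter, again in plain Python. The numbers and the function name `paired_t_statistic` are invented for illustration. It returns only the statistic and the degrees of freedom; the p-value would come from a t table or from a statistics package (SPSS, R, or SciPy's `scipy.stats.ttest_rel`).

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(x, y):
    """Return (t, degrees of freedom) for paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    # Mean difference divided by its standard error (sample std dev / sqrt(n)).
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical scores for ONE parameter on four sentences.
algorithm_scores = [0.5, 0.75, 0.25, 1.0]
human_means = [0.25, 0.5, 0.25, 0.5]

t_value, dof = paired_t_statistic(algorithm_scores, human_means)
print(f"t = {t_value:.3f} with {dof} degrees of freedom")
```

A large absolute value of t relative to the critical value for the given degrees of freedom suggests the algorithm scores systematically higher or lower than the students on that parameter.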

As for software, SPSS is a good choice for statistical analysis. It has a variety of tools and tests that can help you analyze your data. However, if you are comfortable with other statistical software such as R or SAS, you can also use those for your analysis. The important thing is to choose software that you are familiar with and that has the necessary tools for your analysis.

I hope this helps and good luck with your research! Let me know if you have any further questions.

