简体   繁体   中英

What machine learning paradigm / algorithm can I use to select from a pool of possible choices?

I have a large question bank and students. The goal is to select questions for an exam for a student.

Questions have various properties:

  • Grade Level
  • Subjects (could be multiple: fractions, word problems, addition)
  • How other students did on this question (percent right, wrong, etc)
  • Has the student seen this question before or those like it?

So I want to choose questions for a student based on how the student is doing. My feedback for whether or not it's a "good" exam is the following:

  • Human feedback. A person can review the exam and reject certain questions for qualitative reasons
  • How the student does on the exam? If they got 100% right, that's bad. If they got 20% right, that's bad. We want to target 75%
  • Qualitative feedback on the exam as a whole from the teacher

I feel that a neural network is a possible solution here, but I'm not sure how. Any thoughts?

Thanks in advance.

If I understand the question correctly you would have to learn whether the relation between a question and a student is "good" or "bad"? This would give you a binary classification problem, where the input is a feature vector combining both the question's features as well as that of the student?

You can always throw this in a network and see how it does, I'm guessing that you don't have too many questions and students, but as you're classifying pairs your datasize does increase, which is nice.

I would suggest probabilistic modeling, since you have some noise on your real data introduced by the human evaluation. Two annotator would not for sure give the same "qualitative feedback" about the same exam.

It's best to have a model that takes account of uncertainties; Bayesian approach ! If you have little knowledge about this area, I point you to Bishop - Pattern recognition book - freely available online and you can use library like mc-stan lib or edward-lib . There's also a course about probabilistic modeling on coursera , where in the first chapters they treat an example very close to your use case.

One more comment about your suggestion using NN: since you don't have a lot of features (6 as you have mentioned), NN will easily overfit unless you have millions of data points.. This is slightly easy problem in terms of model complexity and you don't need hidden layers to accomplish a good result.

I hope this will help.

Try also looking over Ranking Algorithms. You can train it for combination (student, question) and point this kind of combinations or generate an ordered function.

I don't have much experience with it but may be worth trying it.

I'm not sure whether neural networks are the best way to go here. They could be, but I thought almost instantly of something else.

Given the information in your question, you might want to check a statistic approach here, with some techniques like PCA or more broadly multivariate analysis .

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM