Health prediction system using data mining in J2EE

I want to develop a health prediction system using data mining. Can anyone give me some tips on how to develop it?

The requirement is this: when the user enters a first symptom, the system checks how many diseases share that symptom. It then offers follow-up options related to the symptom, so that it can differentiate between those diseases and infer which disease it is.
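To make the requirement concrete, here is a minimal in-memory sketch of the narrowing step. The class name `DiseaseNarrower`, its methods, and the sample data (flu and cold) are assumptions for illustration only; a real system would read the disease and symptom data from a database. It also assumes a disease is a candidate only if it has every symptom entered so far.

```java
import java.util.*;

// Hypothetical sketch: names and sample data are illustrative assumptions.
class DiseaseNarrower {

    // Diseases whose symptom sets contain every symptom entered so far.
    static List<String> candidates(Map<String, Set<String>> diseaseSymptoms, Set<String> entered) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : diseaseSymptoms.entrySet()) {
            if (e.getValue().containsAll(entered)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    // Symptoms found in some, but not all, candidates.
    // Asking the user about one of these splits the candidate list.
    static Set<String> followUpSymptoms(Map<String, Set<String>> diseaseSymptoms, Set<String> entered) {
        List<String> cands = candidates(diseaseSymptoms, entered);
        Map<String, Integer> counts = new TreeMap<>();
        for (String disease : cands) {
            for (String symptom : diseaseSymptoms.get(disease)) {
                if (!entered.contains(symptom)) {
                    counts.merge(symptom, 1, Integer::sum);
                }
            }
        }
        Set<String> result = new TreeSet<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() < cands.size()) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    // Tiny illustrative data set; insertion order is kept for stable output.
    static Map<String, Set<String>> sampleData() {
        Map<String, Set<String>> data = new LinkedHashMap<>();
        data.put("Flu", Set.of("fever", "chills", "fatigue"));
        data.put("Cold", Set.of("fatigue"));
        return data;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> data = sampleData();
        Set<String> entered = Set.of("fatigue");
        System.out.println("Candidates: " + candidates(data, entered));
        System.out.println("Ask about: " + followUpSymptoms(data, entered));
    }
}
```

With this sample data, entering "fatigue" leaves both diseases as candidates, and the follow-up symptoms are "chills" and "fever", since either one separates the flu from the cold. Once only one candidate remains, there is nothing left to ask about.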

The added component here (and this is where the data mining and prediction portions come in) is that, when the user enters a symptom, the system should also suggest other symptoms that they might be experiencing. For example, if they have a fever, there is a high probability that they also have chills, so when they enter "fever" it should suggest "chills" as an additional symptom.

Here's one possible database design:

[Image: database design diagram]

Here's an example of what the data would look like:

[Image: example data]

So, basically, this says that the symptoms of the flu are fever, chills, and fatigue, but the only symptom of a cold is "fatigue." (Obviously this isn't complete, but it's good enough for illustration.)

One person searched for a disease whose symptoms were "fever" and "chills." A second person searched for a disease whose only symptom was "chills." A third person searched for a disease whose symptoms were "fever" and "fatigue."

Searches would be done with a stored procedure: you input one or more symptoms, and it inserts a record of the search into the SearchHistory table and retrieves a list of the diseases associated with those symptoms.
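As a rough illustration of what that procedure does, here is an in-memory Java stand-in. It is only a sketch: the class `SymptomSearchService` and its methods are made-up names, a plain list plays the role of the SearchHistory table, and it assumes a disease matches only if it has all of the searched symptoms. In a real J2EE application this logic would live in the stored procedure or behind a JDBC call.

```java
import java.util.*;

// Hypothetical sketch: an in-memory stand-in for the stored procedure.
class SymptomSearchService {
    private final Map<String, Set<String>> diseaseSymptoms;
    // Plays the role of the SearchHistory table: one entry per search.
    private final List<Set<String>> searchHistory = new ArrayList<>();

    SymptomSearchService(Map<String, Set<String>> diseaseSymptoms) {
        this.diseaseSymptoms = diseaseSymptoms;
    }

    // Records the search, then returns the diseases that have all of the symptoms.
    List<String> search(Set<String> symptoms) {
        searchHistory.add(new TreeSet<>(symptoms)); // store a copy of the search
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : diseaseSymptoms.entrySet()) {
            if (e.getValue().containsAll(symptoms)) {
                matches.add(e.getKey());
            }
        }
        return matches;
    }

    // Read-only view of every search recorded so far.
    List<Set<String>> history() {
        return Collections.unmodifiableList(searchHistory);
    }

    // Illustrative data: flu has fever, chills, and fatigue; a cold has only fatigue.
    static SymptomSearchService withSampleData() {
        Map<String, Set<String>> data = new LinkedHashMap<>();
        data.put("Flu", Set.of("fever", "chills", "fatigue"));
        data.put("Cold", Set.of("fatigue"));
        return new SymptomSearchService(data);
    }

    public static void main(String[] args) {
        SymptomSearchService service = withSampleData();
        System.out.println(service.search(Set.of("fever", "chills")));
        System.out.println(service.search(Set.of("chills")));
        System.out.println(service.search(Set.of("fever", "fatigue")));
        System.out.println("Searches recorded: " + service.history().size());
    }
}
```

The three calls in `main` replay the three example searches described above, so the history ends up with three entries that later co-occurrence calculations can read.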

If you want to calculate the odds of two symptoms going together, you can have a user-defined function that calculates the percentage of searches that contain both symptoms. For example, if searches 1, 2, and 3 have both "fever" and "chills" as symptoms, but search 4 has "fever" and "fatigue" and search 5 has just "chills," then 60% of all searches (3 of 5) contain both "fever" and "chills." Note that this is the share of all searches. Among the four searches that include "fever," three (75%) also include "chills," and that is the figure to use if you want the chance that someone who searches for "fever" will also search for "chills."

By the same measure, 20% of all searches (1 of 5) contain both "fever" and "fatigue," but in this case the sample is obviously too small to know whether that's actually representative. (That's one of the downsides to this design: your predictions will get better over time, but the early predictions might not be very accurate.)
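The calculation on the five example searches can be sketched in Java as follows. The class `SymptomCoOccurrence` and its method names are assumptions for illustration, and in the design described here the same logic would sit in a user-defined function over the SearchHistory table. The sketch shows two measures side by side: the share of all searches that contain both symptoms, and the share of searches containing one symptom that also contain the other. It also adds a small `suggest` helper that ranks co-occurring symptoms.

```java
import java.util.*;

// Hypothetical sketch: class and method names are illustrative assumptions.
class SymptomCoOccurrence {

    // Share of ALL searches that contain both symptoms.
    static double jointShare(List<Set<String>> searches, String a, String b) {
        if (searches.isEmpty()) return 0.0;
        long both = searches.stream().filter(s -> s.contains(a) && s.contains(b)).count();
        return (double) both / searches.size();
    }

    // Share of the searches containing `given` that also contain `other`.
    static double conditionalShare(List<Set<String>> searches, String given, String other) {
        long withGiven = searches.stream().filter(s -> s.contains(given)).count();
        if (withGiven == 0) return 0.0;
        long both = searches.stream().filter(s -> s.contains(given) && s.contains(other)).count();
        return (double) both / withGiven;
    }

    // Other symptoms seen in the same searches as `given`, most frequent first.
    static List<String> suggest(List<Set<String>> searches, String given) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Set<String> search : searches) {
            if (!search.contains(given)) continue;
            for (String other : search) {
                if (!other.equals(given)) counts.merge(other, 1, Integer::sum);
            }
        }
        List<String> result = new ArrayList<>(counts.keySet());
        // Stable sort: ties stay in alphabetical order from the TreeMap.
        result.sort((x, y) -> counts.get(y) - counts.get(x));
        return result;
    }

    // The five example searches from the text.
    static List<Set<String>> exampleSearches() {
        return List.of(
            Set.of("fever", "chills"),
            Set.of("fever", "chills"),
            Set.of("fever", "chills"),
            Set.of("fever", "fatigue"),
            Set.of("chills"));
    }

    public static void main(String[] args) {
        List<Set<String>> searches = exampleSearches();
        System.out.println(jointShare(searches, "fever", "chills"));
        System.out.println(conditionalShare(searches, "fever", "chills"));
        System.out.println(suggest(searches, "fever"));
    }
}
```

On the example data, "fever" and "chills" appear together in 3 of 5 searches (60%), and in 3 of the 4 searches that include "fever" (75%). Which measure to show the user is a design choice; the second one answers "given this symptom, how likely is that one?" more directly.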

The advantage of this is that you don't have to manually enter any data on the odds of symptoms occurring together, and the system will "automatically" adapt and improve over time (i.e., your predictions will keep improving as you get more data), depending on what the users of your system search for.

The downside, of course, is that you'd only have data on likely co-occurring symptoms once people started using the system, so early users wouldn't get the benefit of the predictions, and it would take a while before the system was accurate at predicting which symptoms are likely to "go together." (Think of the case above, where you were predicting a 20% chance of "fever" and "fatigue" going together based on only a single search.)

Hope this helps some.
