I am trying to learn a bit of pandas, so I'm going through some R code and trying to reproduce things in Python.
I have the following simple example:
tempdat <- data.frame(unit=c('feet','feet','feet','feet','metres','metres','metres','metres'),
feet=c(50,45,75,60,26,32,40,45))
t.test(feet~unit, alternative='two.sided', conf.level=.95, var.equal=F, data=tempdat)
I want to do the equivalent in Python. This is what I have so far, but the results are different:
import pandas as pd
from scipy import stats
tempdat = pd.DataFrame({'unit':['feet','feet','feet','feet','metres','metres','metres','metres'], 'feet':[50,45,75,60,26,32,40,45]})
feet_group = tempdat[tempdat['unit']=='feet']
metres_group = tempdat[tempdat['unit']=='metres']
# equal_var=False requests Welch's t-test, matching var.equal=F in the R call
stats.ttest_ind(feet_group['feet'], metres_group['feet'], equal_var=False)
At first glance there is an error in the first line: tempdat is a built-in Python dict, so its keys must be unique. After the definition
tempdat={'feet':50,'feet':45,'feet':75,'feet':60,'metres':26,'metres':32,'metres':40,'metres':45}
only the last value for each key is kept:
tempdat={'feet': 60, 'metres': 45}
That is why the test results differ.
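To make the point concrete, here is a minimal sketch using only the standard library. It shows the repeated keys collapsing, then keeps all eight observations by storing one list per group. The helper name `welch_t` is hypothetical (it is not part of pandas or scipy); it computes the unequal-variance t statistic by hand, which I am assuming is what `var.equal=F` asks for in R, so the value can be used as a cross-check.

```python
from math import sqrt
from statistics import mean, variance

# A dict literal with repeated keys keeps only the last value for each key.
collapsed = {'feet': 50, 'feet': 45, 'feet': 75, 'feet': 60,
             'metres': 26, 'metres': 32, 'metres': 40, 'metres': 45}
print(collapsed)  # {'feet': 60, 'metres': 45}

# Storing one list per group keeps all eight observations.
groups = {'feet': [50, 45, 75, 60], 'metres': [26, 32, 40, 45]}

def welch_t(a, b):
    """Welch's t statistic (unequal variances), computed by hand.

    Hypothetical helper for illustration; uses sample variances (n - 1).
    """
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

t_stat = welch_t(groups['feet'], groups['metres'])
print(round(t_stat, 3))
```

If the data is held this way (or in a DataFrame built from lists, one entry per observation), both groups have four values each and the statistic should agree with what R reports.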