Python Google Sheets API

Question

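I am pulling data from the Google Sheets API and running a KS test on it. However, I only want to run the KS test on the numbers, and each string also contains words. For example: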

2020-09-15 00:05:13,chemsense,co,concentration,-0.51058,
2020-09-15 00:05:43,chemsense,co,concentration,-0.75889,
2020-09-15 00:06:09,chemsense,co,concentration,-1.23385,
2020-09-15 00:06:33,chemsense,co,concentration,-1.23191,
2020-09-15 00:06:58,chemsense,co,concentration,-0.94495,
2020-09-15 00:07:23,chemsense,co,concentration,-1.16024,

If I have this as a string, how would I run the KS test only on the last number of each row? For instance, I only want to run the KS test on -.51, -.75, -1.23, -1.23, -.94, -1.16.

Here is a screenshot of my Google Sheet: (image not included)

Here is some of my code:

from scipy import stats
import numpy as np
import gspread
from oauth2client.service_account import ServiceAccountCredentials
import re

np.seterr(divide='ignore', invalid='ignore')


def estimate_cdf(col, bins=10):
    # col: the samples; bins: intended number of histogram bins (currently unused)
    print(col)
    hist, edges = np.histogram(col)
    csum = np.cumsum(hist)
    return csum / csum[-1], edges


scope = ["https://spreadsheets.google.com/feeds",
         "https://www.googleapis.com/auth/spreadsheets",
         "https://www.googleapis.com/auth/drive.file",
         "https://www.googleapis.com/auth/drive"]
creds = ServiceAccountCredentials.from_json_keyfile_name("creds.json", scope)

client = gspread.authorize(creds)

sheet = client.open("sheet1").sheet1  # Opens the spreadsheet

data = sheet.get_all_records()

row = sheet.row_values(3)  # Grab a specific row

number_regex = r'^-?\d+\.?\d*$'

col = sheet.col_values(3)  # Get a specific column
col2 = sheet.col_values(4)

# Keep only the cells that look like plain numbers (pattern first, then string)
adjusted = [float(i) for i in col if re.match(number_regex, i)]
dolphin = estimate_cdf(adjusted, len(adjusted))

print(col)
print(col2)

shtest = stats.shapiro(col)
print(shtest)

# thelight = sheet.update_cell(5, 6, col)
# print(thelight)

k2test = stats.ks_2samp(col, col2, alternative='two-sided', mode='auto')
print(k2test)

Here is some of my error output:

...temperature,64.7959999999999,65.03830769230765', ..., '2020-09-25 11:39:42,metsense,htu21d,temperature,65.066,64.98015384615381', ..., '2020-09-25 11:40:31,metsense,htu21d,temperature,64.976,64.93861538461535', ..., '2020-09-25 11:41:22,metsense,htu21d,temperature,65.048,64.93584615384611', ..., '2020-09-25 11:43:28,metsense,htu21d,temperature,64.9']
Traceback (most recent call last):
  File "C:/Users/james/PycharmProjectsfreshproj/shapiro wilks.py", line 60, in <module>
    shtest =stats.shapiro(col)
  File "C:\\Users\\james\\PycharmProjectsfreshproj\\venv\\lib\\site-packages\\scipy\\stats\\morestats.py", line 1676, in shapiro
    a, w, pw, ifault = statlib.swilk(y, a[:N//2], init)
ValueError: could not convert string to float: ',,,,,'

Process finished with exit code 1

Answer 1

Problem

Given strings from the Google Sheets API, run kstest on the last number of each string.

Solution

A better approach is to get the numbers directly from the Google Sheets API, store them, and pass them to stats.kstest.
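As a minimal sketch of that approach: if the numbers sit in their own column, the cell values can be converted to floats while skipping headers and empty cells. The helper name to_floats and the column index 5 in the commented usage are assumptions for illustration, not something taken from the question's sheet.

```python
def to_floats(cells):
    """Convert cell strings to floats, skipping blanks and non-numeric cells."""
    values = []
    for cell in cells:
        try:
            values.append(float(cell))
        except (TypeError, ValueError):
            continue  # header text, empty cells, and similar
    return values


# Hypothetical usage with gspread, assuming the numbers are alone in column 5:
#   column = sheet.col_values(5)
#   x = to_floats(column)
#   stats.kstest(x, 'norm')

print(to_floats(['concentration', '-0.51058', '', '-0.75889']))
```

Filtering with try/except rather than a regular expression means anything float() accepts is kept, including values written in scientific notation.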

Using the existing strings

You can split each string with str.split and then convert the field you need to a float.

Example

>>> s = '2020-09-15 00:05:43,chemsense,co,concentration,-0.75889,'

>>> s.split(',')
['2020-09-15 00:05:43', 'chemsense', 'co', 'concentration', '-0.75889', '']

>>> s.split(',')[4] # get the number (5th item in the list)
'-0.75889'

>>> float(s.split(',')[4]) # convert to float type
-0.75889

>>> round(float(s.split(',')[4]), 2) # round to 2 decimal places
-0.76

from scipy import stats

# Assuming the strings coming back from the API are in a list
# (named rows rather than str, to avoid shadowing the built-in type)
rows = [
    '2020-09-15 00:05:13,chemsense,co,concentration,-0.51058,',
    '2020-09-15 00:05:43,chemsense,co,concentration,-0.75889,',
    '2020-09-15 00:06:09,chemsense,co,concentration,-1.23385,',
    '2020-09-15 00:06:33,chemsense,co,concentration,-1.23191,',
    '2020-09-15 00:06:58,chemsense,co,concentration,-0.94495,',
    '2020-09-15 00:07:23,chemsense,co,concentration,-1.16024,'
]

x = []

for s in rows:
    # The 5th comma-separated field holds the number
    x.append(float(s.split(',')[4]))

# Compare the sample against the standard normal distribution
stats.kstest(x, 'norm')
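The error in the question mentions a row that is just ',,,,,', which float cannot convert. A hedged sketch of a more defensive parser is below; last_number is a hypothetical helper name, and it assumes the value of interest is always the last non-empty comma-separated field.

```python
def last_number(row):
    """Return the last non-empty comma-separated field as a float, or None."""
    fields = [f.strip() for f in row.split(',') if f.strip()]
    if not fields:
        return None  # rows such as ',,,,,' have no usable field
    try:
        return float(fields[-1])
    except ValueError:
        return None  # the last field is text, not a number


rows = [
    '2020-09-15 00:05:13,chemsense,co,concentration,-0.51058,',
    ',,,,,',
    '2020-09-15 00:05:43,chemsense,co,concentration,-0.75889,',
]

# Keep only the rows that produced a number
x = [v for v in (last_number(r) for r in rows) if v is not None]
print(x)
```

The resulting list x can then be passed to stats.kstest in the same way as above.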


Solution 1
0 2020-10-18 16:09:58
