Filtering and sorting data with pandas
I want to get part of the information returned by an API, but I don't know how to filter the data: I only want selected values, and only for keys that contain the string "BTC". I'm trying to get something like this:
{"BTC_MINT":{"volume":11.00, "high24":0.002, "low24":0.001},
"BTC_NOTE":{"volume":11.00, "high24":0.002, "low24":0.001}}
I started with pandas, but I don't know whether that is the right approach.
import json
import urllib.request

import pandas as pd

link = 'https://poloniex.com/public?command=returnTicker'
with urllib.request.urlopen(link) as rawdata:
    data = rawdata.readall().decode()
data = json.loads(data)
print(data.items())
data = pd.DataFrame([[cur, last, volume, high24, low24]
                     for cur, d in data.items()
                     for last, x, x, x, volume, x, x, high24, low24 in d.items()])
Unfortunately, this code doesn't work. I get the following error:
[cur, last, volume, high24, low24] for cur, d, x, w, d, q in data.items() for last, x, x, x, volume, x, x, high24, low24 in d.items()
ValueError: need more than 2 values to unpack
Could someone tell me how I should do this?
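On the error itself: iterating over `d.items()` yields two-element `(key, value)` tuples, so trying to unpack each one into nine names raises the `ValueError`. Below is a minimal sketch of reading the wanted fields by key instead. It uses a small hand-made dictionary in place of the real API response; the sample values, the `last` field and the `wanted` list are assumptions made for illustration only.

```python
import pandas as pd

# Hand-made stand-in for the parsed API response (values are made up).
data = {
    "BTC_MINT": {"last": "0.0015", "baseVolume": "11.00",
                 "high24hr": "0.002", "low24hr": "0.001"},
    "USDT_ETH": {"last": "10.5", "baseVolume": "250.00",
                 "high24hr": "11.0", "low24hr": "10.0"},
}

# Each item of d.items() is a (key, value) pair, which is why it
# cannot be unpacked into nine separate names.
first_pair = next(iter(data["BTC_MINT"].items()))

# Hypothetical list of the fields to keep.
wanted = ["baseVolume", "high24hr", "low24hr"]

# Look the fields up by key, and skip symbols that do not contain "BTC".
rows = {
    symbol: {field: fields.get(field) for field in wanted}
    for symbol, fields in data.items()
    if "BTC" in symbol
}

# One row per symbol, one column per wanted field.
df = pd.DataFrame.from_dict(rows, orient="index")
```

Filtering while building the dictionary keeps the frame small; filtering after the frame is built works just as well.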
df = pd.DataFrame({symbol: {"baseVolume": data[symbol].get("baseVolume"),
                            "high24hr": data[symbol].get("high24hr"),
                            "low24hr": data[symbol].get("low24hr")}
                   for symbol in data}).T
>>> df.head()
          baseVolume    high24hr     low24hr
BTC_1CR   0.00000000  0.00000000  0.00000000
BTC_ABY   0.01968682  0.00000020  0.00000019
BTC_ADN   0.00000000  0.00000000  0.00000000
BTC_ARCH  0.07205024  0.00004813  0.00004693
BTC_BBR   0.19846259  0.00002123  0.00002115
To keep only the rows whose index label starts with BTC, do the following:
>>> df[df.index.str.startswith('BTC')].head()
          baseVolume    high24hr     low24hr
BTC_1CR   0.00000000  0.00000000  0.00000000
BTC_ABY   0.01968682  0.00000020  0.00000019
BTC_ADN   0.00000000  0.00000000  0.00000000
BTC_ARCH  0.07205024  0.00004813  0.00004693
BTC_BBR   0.19846259  0.00002123  0.00002115
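Since the title also asks about sorting, here is a small sketch that filters and then sorts. It assumes the API returns numbers as strings (the fixed-width values above suggest this, but it is not confirmed), so the columns are converted to numeric first. The rows are made up for illustration.

```python
import pandas as pd

# Made-up rows in the shape shown above; values are assumed to be strings.
df = pd.DataFrame.from_dict(
    {
        "BTC_ABY": {"baseVolume": "0.01968682", "high24hr": "0.00000020",
                    "low24hr": "0.00000019"},
        "BTC_BBR": {"baseVolume": "0.19846259", "high24hr": "0.00002123",
                    "low24hr": "0.00002115"},
        "USDT_BTC": {"baseVolume": "5.00000000", "high24hr": "400.0",
                     "low24hr": "390.0"},
    },
    orient="index",
)

# Convert every column to a numeric dtype so sorting compares numbers,
# not strings.
df = df.apply(pd.to_numeric)

# Keep rows whose label starts with "BTC", then sort by volume, largest first.
btc = df[df.index.str.startswith("BTC")].sort_values("baseVolume", ascending=False)
```

Note that `USDT_BTC` contains "BTC" but does not start with it, so `startswith` drops it; use `df.index.str.contains("BTC")` if such rows should be kept.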
You can pass your dictionary (data) directly to pd.DataFrame to create a pandas DataFrame. To subset it so that it only contains the columns whose name includes the string BTC, you can do:
df = pd.DataFrame(data)
new_cols = [x for x in df.columns if 'BTC' in x]
new_df = df[new_cols]
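For comparison, the same column selection can be sketched with `DataFrame.filter`, which keeps the labels containing a given substring. With a dictionary of dictionaries, the outer keys (the symbols) end up as columns, so a transpose gives one row per symbol. The sample dictionary below is made up and stands in for the API response.

```python
import pandas as pd

# Made-up stand-in for the parsed API response.
data = {
    "BTC_MINT": {"baseVolume": "11.00", "high24hr": "0.002", "low24hr": "0.001"},
    "BTC_NOTE": {"baseVolume": "11.00", "high24hr": "0.002", "low24hr": "0.001"},
    "XMR_LTC": {"baseVolume": "3.00", "high24hr": "0.5", "low24hr": "0.4"},
}

# A dict of dicts puts the outer keys (symbols) in the columns.
df = pd.DataFrame(data)

# Keep only the columns whose label contains "BTC".
new_df = df.filter(like="BTC", axis=1)

# Transpose to get one row per symbol and one column per field.
by_symbol = new_df.T
```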