
Jupyter Notebook - Python Code

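I am working on a Jupyter Notebook that analyzes some data that looks like this:

[Image: the data I am analyzing]

I have to find the following information:

[Image: the questions]

This is what I have tried, but it does not work, and I am completely stuck on how to do part b.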

# Import relevant packages/modules
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats

# Import relevant csv data file
data = pd.read_csv("C:/Users/Hanna/Desktop/Sheridan College/Statistics for Data Science/Assignment 1/MATH37198_Assignment1_Individual/IGN_game_ratings.csv")

# Part a: Determine the z-score of "Super Mario Kart" and print out result
superMarioKart_zscore = data[data['Game']=='Super Mario Kart']   ['Score'].stats.zscore()
print("Z-score of Super Mario Kart: ", superMarioKart_zscore)

# Part b: The top 20 (most common) platforms

# Part c: The average score of all the Shooter games
averageShooterScore = data[data['Group']=='Game']['Score'].mean()
# Print output
print("The average score of all the Shooter games is: ", averageShooterScore)

# Part d: The top two platforms with the most perfect scores (10)

# Part e: The probability of a game randomly selected that is an RPG
# First find the number of games in the list that is an RPG
numOfRPGGames = 0
for game in data['Game']:
    if data['Genre'] == 'RPG':
        numOfRPGGames += 1
# Divide this by the total number of games to find the probability of selecting one
print("The probability of selecting a game that is an RPG is: ", numOFRPGGames/totalNumGames)

# Part f: The probability of a game randomly selected with a score less than 5
# First find the number of games in the list with a score less than 5 using a for loop:
numScoresLessThan5 = 0
for game in data['Game']:
    if data['Score'] < 5:
        numScoresLessThan5 += 1
# Divide this by the total number of games to find the probability of selecting one
print("The probability of selecting a game with a score less than 5 is: ", numScoresLessThan5/totalNumGames)

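Pandas has excellent built-in functions for this kind of problem. Here is a suggestion for solving part b, using some test data that I imported from a CSV. The test.csv I used has only these fields, but the code still works once you change the column name and import your own file.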

[Image: sample CSV structure]

# Import relevant packages/modules
import numpy as np
import pandas as pd

# Import a dummy csv data file
data = pd.read_csv("./test.csv")
# Inspect the data before processing
print(data)

# Extract the column you're interested in counting
initial_column = data['Name']

# value_counts() returns the count of each distinct value, most common first
# (the top-level pd.value_counts() is deprecated in recent pandas releases)
count_object = initial_column.value_counts()

# Create an empty list for receiving the sorted values
sorted_grouped_column = []

# You determine the number of items. In your exercise it is 20.
number_of_items = 3
counter = 0

for i in count_object.keys():
    if counter == number_of_items:
        break
    else:
        sorted_grouped_column.append(i)
        counter = counter + 1

print(sorted_grouped_column)
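
The same idea extends to the other parts, so here is a minimal sketch of how they might look with vectorized pandas operations instead of loops. The small DataFrame is a made-up stand-in for your CSV. The column names `Game`, `Genre` and `Score` come from your code, while `Platform` is only my guess at what the platform column is called. Two things to note about your attempts. First, `data['Genre'] == 'RPG'` compares the whole column at once and produces a boolean Series, so it cannot be used as an `if` condition inside a loop, but its mean is exactly the fraction of matching rows. Second, a z-score only makes sense relative to the whole `Score` column, so it has to be computed before filtering down to one game. I use the population standard deviation (`ddof=0`), which is also the default of `scipy.stats.zscore`.

```python
import pandas as pd

# Made-up stand-in for IGN_game_ratings.csv. The real file may use
# different column names; 'Platform' in particular is an assumption.
data = pd.DataFrame({
    "Game": ["Super Mario Kart", "Doom", "Halo",
             "Final Fantasy VI", "Quake", "Chrono Trigger"],
    "Platform": ["SNES", "PC", "Xbox", "SNES", "PC", "SNES"],
    "Genre": ["Racing", "Shooter", "Shooter", "RPG", "Shooter", "RPG"],
    "Score": [9.0, 10.0, 8.0, 10.0, 4.0, 10.0],
})

# Part a: z-score of one game, measured against all scores
z_scores = (data["Score"] - data["Score"].mean()) / data["Score"].std(ddof=0)
mario_z = z_scores[data["Game"] == "Super Mario Kart"].iloc[0]

# Part b: the 20 most common platforms (head() simply returns
# fewer rows when there are fewer than 20 distinct values)
top_platforms = data["Platform"].value_counts().head(20)

# Part c: average score of all Shooter games
shooter_mean = data.loc[data["Genre"] == "Shooter", "Score"].mean()

# Part d: the two platforms with the most perfect scores
perfect = data[data["Score"] == 10]
top_perfect = perfect["Platform"].value_counts().head(2)

# Parts e and f: the mean of a boolean Series is the fraction of True values
p_rpg = (data["Genre"] == "RPG").mean()
p_low = (data["Score"] < 5).mean()

print("Z-score of Super Mario Kart:", mario_z)
print("Most common platforms:", top_platforms.index.tolist())
print("Average Shooter score:", shooter_mean)
print("Platforms with most perfect scores:", top_perfect.index.tolist())
print("P(RPG):", p_rpg)
print("P(score < 5):", p_low)
```

To run this against your own file, replace the DataFrame with your `pd.read_csv(...)` call and adjust the column names to match your headers.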

