I made a post earlier ( Getting excel data into Database - beginner ) about getting data into SQLite.
I have done some further research and now understand the basics, therefore I have created the following code:
import csv
import sqlite3

conn = sqlite3.connect('financials.db')
cur = conn.cursor()
cur.execute('DROP TABLE IF EXISTS financials')
cur.execute('''
CREATE TABLE "financials"(
    "Mkt_Cap" REAL,
    "EV" REAL,
    "PE" REAL,
    "Yield" REAL
)
''')
fname = input('Enter the name of the csv file:')
# fall back to a default file name when nothing is entered
if len(fname) < 1:
    fname = "data.csv"
with open(fname, newline='') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    for row in csv_reader:
        print(row)
Below is how my CSV data is currently formatted (it just gets scraped and put into a CSV file):
Given that, would I be able to extract the values of the table rows using something like this:
Mkt_cap = row[0]
EV = row[1]
I would then write an INSERT statement and commit to get the data into the database.
Or do I need to reformat my CSV data?
It is a bit tricky because the data in the CSV are transposed. Usually each row would represent a year, with columns for fiscal period, capitalization, EV, etc.
You could transpose the data yourself, but I would use pandas for that. Assuming your CSV looks like this, based on your screenshot:
Valuation,,,,,,
Fiscal Period: December,2017,2018,2019,2020,2021,2022
Capitalization,270120,215323,248119,237119,-,-
Entreprise Value (EV),262351,208330,232655,204634,200604,196917
P/E ratio,25.7x,16.0x,19.1x,67.1x,19.6x,15.3x
Yield,0.94%,1.83%,1.59%,0.83%,1.54%,1.74%
Here is some example code:
import pandas as pd

# header=None: the file has no usable header row; '-' marks missing values
df = pd.read_csv('data.csv', header=None, na_values='-')
# first row does not mean much so let us remove it
df = df.drop(df.index[0])
# transpose the data to get it back in shape
df = df.transpose()
# use first row as header
df.columns = df.iloc[0]
# remove first row from data
df = df.drop(df.index[0])
# iterate over each row
for _, row in df.iterrows():
    print(f'cap: {row["Capitalization"]}\t'
          f'EV: {row["Entreprise Value (EV)"]}\t'
          f'PE: {row["P/E ratio"]}\t'
          f'Yield: {row["Yield"]}')
result:
cap: 270120 EV: 262351 PE: 25.7x Yield: 0.94%
cap: 215323 EV: 208330 PE: 16.0x Yield: 1.83%
cap: 248119 EV: 232655 PE: 19.1x Yield: 1.59%
cap: 237119 EV: 204634 PE: 67.1x Yield: 0.83%
cap: nan EV: 200604 PE: 19.6x Yield: 1.54%
cap: nan EV: 196917 PE: 15.3x Yield: 1.74%
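To store rows like these in the financials table from the question, the trailing "x" and "%" would have to be stripped so the values fit the REAL columns. Here is a minimal sketch of one way to do that. The helper name to_float, the two sample rows and the in-memory database are assumptions for illustration only; with pandas, each tuple would come from df.iterrows() instead.

```python
import math
import sqlite3


def to_float(value):
    """Turn '25.7x' or '0.94%' into a float; None and NaN become None (SQL NULL)."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return None
    return float(str(value).rstrip('x%'))


# Stand-in for two of the rows printed above: (cap, EV, PE, yield)
sample_rows = [
    ('270120', '262351', '25.7x', '0.94%'),
    (float('nan'), '200604', '19.6x', '1.54%'),
]
cleaned = [tuple(to_float(v) for v in row) for row in sample_rows]

# Assumption: in-memory database for the demo; use 'financials.db' for a real file
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE financials (Mkt_Cap REAL, EV REAL, PE REAL, Yield REAL)')
# Parameterized insert: one '?' placeholder per column
conn.executemany('INSERT INTO financials VALUES (?, ?, ?, ?)', cleaned)
conn.commit()

stored = conn.execute('SELECT Mkt_Cap, EV, PE, Yield FROM financials').fetchall()
print(stored)
```

Note that the yield is stored in percentage points here (0.94, not 0.0094); divide by 100 in to_float if you prefer fractions.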
You may want to change your format first.
Currently your labels run down the left-hand side, whereas the machine looks for labels running from left to right.
Also think about sorting and index lookups: would it be easier to retrieve the year as a column, or to step from index to index until you hit a year?
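If you would rather not pull in a library, the reformatting can be sketched with the standard csv module and zip. This is only an illustration: the inline sample is a shortened stand-in for the scraped file, and the names raw and by_year are made up for the example.

```python
import csv
import io

# Shortened stand-in for the scraped file (assumed layout: labels down the left side)
raw = """Valuation,,,,,,
Fiscal Period: December,2017,2018,2019
Capitalization,270120,215323,248119
Yield,0.94%,1.83%,1.59%
"""

rows = list(csv.reader(io.StringIO(raw)))[1:]  # skip the 'Valuation' title row
# zip(*rows) swaps rows and columns; it stops at the shortest row,
# so every row needs the same number of fields
transposed = list(zip(*rows))
header, records = transposed[0], transposed[1:]

# Index each record by its year so a lookup does not have to scan for it
by_year = {rec[0]: dict(zip(header[1:], rec[1:])) for rec in records}
print(by_year['2018']['Capitalization'])
```

With the year as the key, each record reads left to right and can be looked up directly.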