I have tidal height data with a reading every 10 minutes for one year, which I've loaded into a list from a CSV file.
The end result I'm trying to achieve is to graph the tides (as a line or bar chart) and end up with something like this: https://nt.gov.au/__data/assets/pdf_file/0020/430238/2018-mar-tidal-info-ntports-centre-island-graph.pdf
I'm pretty new to programming and have set myself the smaller task of creating a tidal graph from height data for a given day. I would then output multiple graphs to make up a week etc.
import numpy as np
from datetime import datetime
DATA:
010120170010 1.700
010120170020 1.650
for line in csv_reader:
    data_times.append(datetime.strptime(line[0], "%d%m%Y%H%M"))
    data_height.append(float(line[2]))
np_data_times = np.array(data_times)
np_data_height = np.array(data_height)
Create an array with only today's heights

Is there a better way that does the Python equivalent of the SQL 'select * from times where date = today()'? Can I create a dictionary with time: height rather than 2 arrays? (I've read that dicts are unordered, so I stayed away from that approach.)

Plot array divided every 6 hours

I'd also like to provide all data points to the chart but only show times every 3 or 6 hours along the X axis. This would give a smoother and more accurate graph. So far I've only found out how to give data to the x axis and its labels in a 1:1 fashion, when I may want 6:1 or 18:1 etc. Is there a particular method I should be looking at?
import matplotlib.pyplot as plt
plt.title("Tides for today")
plt.xlabel(datetime.date(real_times[0]))
plt.ylabel("Tide Height")
plt.plot(real_times, real_heights)
plt.show()
Don't use a dictionary. This would make everything slow and hard to handle.
I would suggest using pandas instead.
Reading in the data would work like this:
import pandas as pd
from datetime import datetime
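# Parse the DDMMYYYYHHMM timestamps while reading
conv = {"Time": lambda t: datetime.strptime(t, "%d%m%Y%H%M")}
# sep=r"\s+" splits on whitespace (delim_whitespace=True is deprecated in recent pandas)
df = pd.read_csv("datafile.txt", header=None, sep=r"\s+",
                 names=["Time", "Tide"], converters=conv, index_col=0)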
This results in something like:
                     Tide
Time
2017-01-01 00:10:00  1.70
2017-01-01 00:20:00  1.65
2017-01-01 05:20:00  1.35
2017-01-02 00:20:00  1.75
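As a cross-check that does not need a file on disk, here is a minimal sketch of an alternative way to get the same datetime index. It is not the approach above: it reads the time column as text (so the leading zero of "0101..." is kept) and converts it afterwards with pd.to_datetime. The two sample rows are copied from the question; the in-memory buffer and the variable name `sample` are assumptions for illustration.

```python
import io

import pandas as pd

# Two rows in the question's layout: DDMMYYYYHHMM followed by the height.
sample = io.StringIO("010120170010 1.700\n010120170020 1.650\n")

# Read the time column as a string so the leading zero is not lost.
df = pd.read_csv(sample, header=None, sep=r"\s+",
                 names=["Time", "Tide"], dtype={"Time": str})

# Convert the whole column at once, then use it as the index.
df["Time"] = pd.to_datetime(df["Time"], format="%d%m%Y%H%M")
df = df.set_index("Time")

print(df)
```

With a real file, the buffer would simply be replaced by the file name.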
You can now filter the dataframe, e.g. to select only the data from the first of January:
df["2017-01-01":"2017-01-01"]
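To make the "where date = today()" idea from the question concrete, here is a small sketch using .loc with a date string on a datetime index. The data is synthetic (a constant height every 10 minutes over two days), and the names `one_day`, `day` and `same_day` are made up for the example.

```python
from datetime import date

import pandas as pd

# Synthetic stand-in for the tide data: one reading every 10 minutes for two days.
index = pd.date_range("2017-01-01 00:00", "2017-01-02 23:50", freq="10min")
df = pd.DataFrame({"Tide": 1.5}, index=index)

# Partial-string indexing: every row whose timestamp falls on 1 January 2017.
one_day = df.loc["2017-01-01"]

# The same selection driven by a date object; date.today() would work the
# same way if the data covered the current day.
day = date(2017, 1, 2)
same_day = df.loc[day.isoformat()]

print(len(one_day), len(same_day))
```

Each selection keeps the datetime index, so the result can be plotted in the same way as the full dataframe.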
You could directly plot the data like
df["2017-01-01":"2017-01-01"].plot(kind="bar")
or
df["2017-01-01 00:00":"2017-01-01 06:00"].plot(kind="bar")
This will work nicely if the times are equally spaced, because it creates a categorical bar plot. (Just remember that you might need to call pyplot.show() if working in a script.)
You may also use matplotlib to draw the bars:
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.dates
df1 = df["2017-01-01 00:00":"2017-01-01 06:00"]
# Bar width in matplotlib date units (days): the spacing between the first two readings
plt.bar(df1.index, df1.Tide,
        width=np.diff(matplotlib.dates.date2num(df1.index.to_pydatetime()))[0], ec="k")
plt.show()
To get control over the x-axis ticks and labels, this latter matplotlib solution is the way to go. First set the bars to align to the edges with align="edge". Then use formatters and locators as shown in the official dates example. A grid can be defined using plt.grid.
plt.bar(df1.index, df1.Tide,
        width=np.diff(matplotlib.dates.date2num(df1.index.to_pydatetime()))[0],
        align="edge", ec="k")
hours = matplotlib.dates.HourLocator() # every hour
#hours = matplotlib.dates.HourLocator(byhour=range(24)[::3]) # every 3 hours
fmthours=matplotlib.dates.DateFormatter("%m-%d %H:%M")
plt.gca().xaxis.set_major_locator(hours)
plt.gca().xaxis.set_major_formatter(fmthours)
plt.grid(True, axis="x", linewidth=1, color="k")
plt.gcf().autofmt_xdate()
plt.show()
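The same locator and formatter idea carries over to a line plot, which is the other chart type the question mentions: every 10-minute reading is plotted, but only every sixth hour gets a tick label. The sketch below uses a made-up sine curve as stand-in data (not real tide values), and the output file name is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic one-day curve with 10-minute readings (illustrative shape only).
index = pd.date_range("2017-01-01 00:00", "2017-01-01 23:50", freq="10min")
hours = np.asarray((index - index[0]).total_seconds()) / 3600.0
df1 = pd.DataFrame({"Tide": 2.0 + 1.5 * np.sin(2 * np.pi * hours / 12.42)},
                   index=index)

fig, ax = plt.subplots()
ax.plot(df1.index, df1["Tide"])  # all 144 points are drawn

# Ticks and labels only at 00:00, 06:00, 12:00 and 18:00.
ax.xaxis.set_major_locator(mdates.HourLocator(byhour=range(0, 24, 6)))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
ax.set_xlim(index[0], index[0] + pd.Timedelta(days=1))

ax.set_title("Tides for 2017-01-01")
ax.set_ylabel("Tide height")
ax.grid(True, axis="x")
fig.savefig("tide_day.png")
```

For interactive use, drop the matplotlib.use("Agg") line and call plt.show() instead of saving. Changing range(0, 24, 6) to range(0, 24, 3) gives a label every 3 hours.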