I am new to Python and pandas. I am trying to scrape some values from websites and write them to an Excel file in table format. The scraping part works, but I can't write the values to the Excel file. I want each run's values to go into a single row, in a "time-data-data-data..." format.
This is a Python 3 project. I have tried the ".T" (transpose) trick from pandas.
a = sites[k]
r = requests.get(a)
c = r.content
soup = BeautifulSoup(c, "html.parser")
all_code = soup.find_all("td", {"class": "right"})
my_value[k] = all_code[0].text
k += 1
my_file = pandas.read_excel("data.xlsx")
my_file_t = my_file.T
my_file_t[datetime.datetime.now()] = [my_value[0], my_value[1], my_value[2], my_value[3], my_value[4], my_value[5],
my_value[6], my_value[7], my_value[8]]
my_file = my_file_t.T
I want it to write to the data.xlsx file, but the program raises an error. The error is:
File "C:/Users/kayab/Desktop/Projects/WORK IN PROGRESS/borsa/catcher.py", line 25, in <module>
my_value[6], my_value[7], my_value[8]]
File "C:\Users\kayab\AppData\Roaming\Python\Python37\site-packages\pandas\core\frame.py", line 3370, in __setitem__
self._set_item(key, value)
File "C:\Users\kayab\AppData\Roaming\Python\Python37\site-packages\pandas\core\frame.py", line 3445, in _set_item
value = self._sanitize_column(key, value)
File "C:\Users\kayab\AppData\Roaming\Python\Python37\site-packages\pandas\core\frame.py", line 3630, in _sanitize_column
value = sanitize_index(value, self.index, copy=False)
File "C:\Users\kayab\AppData\Roaming\Python\Python37\site-packages\pandas\core\internals\construction.py", line 519, in sanitize_index
raise ValueError('Length of values does not match length of index')
ValueError: Length of values does not match length of index
If you are trying to save all the scraped values to an Excel sheet whose tab is named after the timestamp, you can do the following:
import datetime
import pandas as pd

# Convert the list into a DataFrame
my_file = pd.DataFrame(my_value)
now = datetime.datetime.utcnow()
time_tab = now.strftime("%Y-%m-%d")
# Save the DataFrame as an Excel file
my_file.to_excel('data.xlsx', sheet_name=time_tab)
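As for the ValueError in the question: assigning a list as a new column requires exactly one value per index entry. After `.T`, the index of `my_file_t` is the set of column labels of the original sheet, so the 9 scraped values only fit if data.xlsx has exactly 9 columns. The following is a minimal in-memory sketch of that mismatch. The column names, the timestamp string, and the values are made up for illustration and do not come from the question.

```python
import pandas as pd

# Hypothetical sheet layout: a header row with three columns and no data yet.
sheet = pd.DataFrame(columns=["site_a", "site_b", "site_c"])
transposed = sheet.T  # the index is now the three column labels

values = ["1.0", "2.0"]  # only two values for three index entries

try:
    transposed["2019-01-01 12:00"] = values
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)

# The assignment succeeds once there is exactly one value per index entry.
transposed["2019-01-01 12:00"] = ["1.0", "2.0", "3.0"]
```

Comparing `len(my_file_t.index)` with the number of scraped values before the assignment should show which side is off.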
The whole program can also be written in a more Pythonic way:
import datetime
import pandas as pd
import requests
from bs4 import BeautifulSoup

# sites and idx are assumed to be defined as in the question
site_url = sites[idx]
content_requested = requests.get(site_url)
content = content_requested.content
soup = BeautifulSoup(content, "html.parser")
# find_all returns a list-like ResultSet of matching tags
all_code = soup.find_all("td", {"class": "right"})
# Save the text of every matching cell into a list
my_value = []
for code in all_code:
    my_value.append(code.text)
# Convert the list into a DataFrame
my_file = pd.DataFrame(my_value)
now = datetime.datetime.utcnow()
time_tab = now.strftime("%Y-%m-%d")
# Save the DataFrame as an Excel file
my_file.to_excel('data.xlsx', sheet_name=time_tab)
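The code above writes the values as a single column and overwrites data.xlsx on each run. If the goal is the "time-data-data-data..." row layout from the question, one possible approach is to append a row instead of transposing. The sketch below assumes the workbook has a header row whose first column holds the time. `append_row`, `update_workbook`, and the column names are hypothetical helpers introduced here, not part of the original code.

```python
import datetime
import pandas as pd


def append_row(table, values, timestamp):
    """Return a new table with one extra row: timestamp first, then values."""
    if len(values) != len(table.columns) - 1:
        raise ValueError("expected one value per data column")
    new_row = pd.DataFrame([[timestamp, *values]], columns=table.columns)
    if table.empty:
        return new_row
    return pd.concat([table, new_row], ignore_index=True)


def update_workbook(path, values):
    """Read the workbook at path, append a timestamped row, and write it back."""
    table = pd.read_excel(path)
    table = append_row(table, values, datetime.datetime.now())
    table.to_excel(path, index=False)


# In-memory usage with made-up column names and values.
empty = pd.DataFrame(columns=["time", "site_a", "site_b"])
first = append_row(empty, ["10.5", "20.1"], "2019-01-01 12:00")
second = append_row(first, ["10.7", "19.8"], "2019-01-01 12:05")
print(second)
```

With the question's data this would be called as `update_workbook("data.xlsx", my_value)`, provided data.xlsx already contains a time column followed by one column per site.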