
How to read a link from a cell in Google Spreadsheet if it's inside href tag (gspread)

I am new to Stack Overflow, so I apologize in advance if I do something wrong.

I have a spreadsheet on Google Sheets, for example, this one.

One of its cells contains a link inside an href tag. I want to get both the link and the text of the cell using the Google Sheets API or gspread.

I have already tried this solution, but the access token I get is 'None'.

I have also tried web scraping with BeautifulSoup, but that didn't work either.

For the bs4 attempt, I used this code, which I found here:

from bs4 import BeautifulSoup
import requests

html = requests.get('https://docs.google.com/spreadsheets/d/1v8vM7yQ-27SFemt8_3IRiZr-ZauE29edin-azKpigws/edit#gid=0').text
soup = BeautifulSoup(html, "lxml")
tables = soup.find_all("table")

content = []

for table in tables:
    content.append([[td.text for td in row.find_all("td")] for row in table.find_all("tr")])

print(content)

I figured it out. Here's the full code, in case anyone needs it:

import requests
import gspread
import urllib.parse
import pickle



spreadsheetId = "###"  # Please set the Spreadsheet ID.
cellRange = "Yoursheetname!A1:A100"  # Please set the range in A1 notation. The code below reads only the hyperlink of the first cell in this range.


with open('token_sheets_v4.pickle', 'rb') as token:
    # get this file here
    # https://developers.google.com/identity/sign-in/web/sign-in
    credentials = pickle.load(token)

client = gspread.authorize(credentials)

# 1. Retrieve the access token.
access_token = client.auth.token

# 2. Request to the method of spreadsheets.get in Sheets API using `requests` module.
fields = "sheets(data(rowData(values(hyperlink))))"
url = "https://sheets.googleapis.com/v4/spreadsheets/" + spreadsheetId + "?ranges=" + urllib.parse.quote(cellRange) + "&fields=" + urllib.parse.quote(fields)
res = requests.get(url, headers={"Authorization": "Bearer " + access_token})
print(res)

# 3. Retrieve the hyperlink.
obj = res.json()
print(obj)
link = obj["sheets"][0]['data'][0]['rowData'][0]['values'][0]['hyperlink']
print(link)
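
The code above reads only the first cell of the range. As a rough sketch, the same response could be walked to collect every link together with the cell text, which the question also asks for. This assumes `formattedValue` is added to the fields mask, e.g. `sheets(data(rowData(values(hyperlink,formattedValue))))`. The `collect_links` helper and the sample response are hypothetical and only illustrate the shape of the JSON.

```python
def collect_links(response):
    """Return (row, column, text, url) tuples for every cell that has a hyperlink.

    Row and column indexes are 0-based and relative to the requested range.
    """
    found = []
    for sheet in response.get("sheets", []):
        for grid in sheet.get("data", []):
            for r, row in enumerate(grid.get("rowData", [])):
                # Rows and cells without data may come back as empty dicts.
                for c, cell in enumerate(row.get("values", [])):
                    url = cell.get("hyperlink")
                    if url:
                        found.append((r, c, cell.get("formattedValue"), url))
    return found


# Hypothetical response, shaped like the output of spreadsheets.get.
sample = {
    "sheets": [{
        "data": [{
            "rowData": [
                {"values": [{"formattedValue": "First", "hyperlink": "https://example.com/a"}]},
                {},  # an empty row
                {"values": [{"formattedValue": "plain text"},
                            {"formattedValue": "Second", "hyperlink": "https://example.com/b"}]},
            ]
        }]
    }]
}

for row, col, text, url in collect_links(sample):
    print(row, col, text, url)
```

With the real response, `collect_links(res.json())` would be called in place of indexing `rowData[0]` directly.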

UPDATE:

A more elegant solution is the following. First, create the service:

# Imports inferred from the names used below.
import os
import pickle

from google.auth.transport.requests import Request
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build

CLIENT_SECRET_FILE = 'secret/secret.json'
API_SERVICE_NAME = 'sheets'
API_VERSION = 'v4'
SCOPES = ['https://www.googleapis.com/auth/spreadsheets.readonly']


def Create_Service():
    cred = None

    # Reuse credentials cached by a previous run, if there are any.
    pickle_file = f'secret/token_{API_SERVICE_NAME}_{API_VERSION}.pickle'
    if os.path.exists(pickle_file):
        with open(pickle_file, 'rb') as token:
            cred = pickle.load(token)

    # Refresh expired credentials, or run the OAuth flow if none are usable.
    if not cred or not cred.valid:
        if cred and cred.expired and cred.refresh_token:
            cred.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(CLIENT_SECRET_FILE, SCOPES)
            cred = flow.run_local_server()

        # Cache the credentials for the next run.
        with open(pickle_file, 'wb') as token:
            pickle.dump(cred, token)

    try:
        service = build(API_SERVICE_NAME, API_VERSION, credentials=cred)
        print(API_SERVICE_NAME, 'service created successfully')
        return service
    except Exception as e:
        print('Unable to connect.')
        print(e)
        return None


service = Create_Service()

Then extract the links from every sheet in the spreadsheet in the form of convenient dictionaries:

    fields = "sheets(properties(title),data(startColumn,rowData(values(hyperlink))))"
    
    print(service.spreadsheets().get(spreadsheetId=self.__spreadsheet_id,
                                     fields=fields).execute())
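
The call above only prints the raw response. As a minimal sketch of the "convenient dictionaries" idea, the response could be reshaped into one dictionary per sheet that maps a cell address to its link. The helpers `column_letter` and `links_by_sheet` and the sample response are hypothetical. The sketch also assumes that `startColumn` and `startRow` are absent when they are 0, and `startRow` is not in the fields mask above, so rows are counted from 1 here.

```python
def column_letter(index):
    """Convert a 0-based column index to A1-style letters (0 -> A, 26 -> AA)."""
    letters = ""
    index += 1
    while index:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def links_by_sheet(response):
    """Return {sheet title: {A1 cell address: hyperlink}}."""
    result = {}
    for sheet in response.get("sheets", []):
        title = sheet.get("properties", {}).get("title", "")
        cells = {}
        for grid in sheet.get("data", []):
            # Assumption: a missing start offset means 0.
            start_col = grid.get("startColumn", 0)
            start_row = grid.get("startRow", 0)
            for r, row in enumerate(grid.get("rowData", [])):
                for c, cell in enumerate(row.get("values", [])):
                    if "hyperlink" in cell:
                        address = f"{column_letter(start_col + c)}{start_row + r + 1}"
                        cells[address] = cell["hyperlink"]
        result[title] = cells
    return result


# Hypothetical response for a spreadsheet with two sheets.
sample = {
    "sheets": [
        {"properties": {"title": "Sheet1"},
         "data": [{"rowData": [{"values": [{"hyperlink": "https://example.com/a"}, {}]},
                               {"values": [{}, {"hyperlink": "https://example.com/b"}]}]}]},
        {"properties": {"title": "Sheet2"}, "data": [{}]},
    ]
}

print(links_by_sheet(sample))
```

With the real service, the dictionary passed in would be the result of the `.execute()` call instead of `sample`.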

So, how do fields work? Go to the Spreadsheet object description and look for its JSON representation. If we want to return, for example, the sheets object from that JSON representation, we just use fields = "sheets", because Spreadsheet has a "sheets" field in its JSON representation.

OK, cool. We got the sheets object. How do we access the fields of a sheet object? Just click on that type in the documentation and look at its fields.

Where to look for the object description

So, how do we combine fields? It's easy. For example, to return the "properties" and "data" fields from the sheets object, I write the fields string this way: fields = "sheets(properties,data)". So we just list them like arguments in an ordinary function call, but without spaces.

The same applies to the objects returned by fields such as data, and so on.
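
To make the nesting rule concrete, here is a small sketch that builds a fields string from a nested dictionary. The `build_fields` helper is hypothetical and not part of any Google library. It only follows the convention described above: names separated by commas, with sub-fields in parentheses.

```python
def build_fields(spec):
    """Build a fields string from a nested dict.

    A value of None (or an empty dict) means "return the whole field";
    a nested dict lists the sub-fields to keep.
    """
    parts = []
    for name, children in spec.items():
        if children:
            parts.append(f"{name}({build_fields(children)})")
        else:
            parts.append(name)
    return ",".join(parts)


spec = {
    "sheets": {
        "properties": {"title": None},
        "data": {
            "startColumn": None,
            "rowData": {"values": {"hyperlink": None}},
        },
    }
}

print(build_fields(spec))
```

For this `spec` the result is the same fields string that is passed to `service.spreadsheets().get(...)` earlier in the answer.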
