lxml-html read produces empty list Python 3.6.4

Question

I am trying to read the two-line element set for STRaND-1 from this link: http://celestrak.com/NORAD/elements/cubesat.txt, so I can track it from a ground station I'm building. I don't really understand how to use the tree.xpath command and I'd like to learn how. I'm trying the following code, which I found in a similar question asked on here a while ago:

import numpy as np
from lxml import html
import requests
line_number = 50
for word in range(0,5):
    page = requests.get("http://celestrak.com/NORAD/elements/cubesat.txtid=%s" % word)
    tree = html.fromstring(page.text)
    print(tree.xpath("//b/text()"))

This should print the text between the elements of the HTML page, right? How do I print just a certain line, especially when there is no HTML tag around the text that I want?

Thanks for your time.

Answer 1

Try the solution below to get the required data:

import requests

url = "http://celestrak.com/NORAD/elements/cubesat.txt"
response = requests.get(url)

page_content = response.text
all_lines = [line.strip() for line in page_content.split("\n")]
for index, line in enumerate(all_lines):
    if line == "STRAND-1":
        first_value = all_lines[index + 1]
        second_value = all_lines[index + 2]
        break

print(first_value, second_value, sep="\n")

Output:

1 39090U 13009E   18037.58367953  .00000016  00000-0  21168-4 0  9998
2 39090  98.5328 245.5663 0008674 331.4360  28.6349 14.35009671259097
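
If the satellite name is missing from the file, the loop above finishes without ever assigning first_value, and the final print raises a NameError. Here is a minimal sketch of the same lookup wrapped in a function that reports a miss explicitly. The function name find_tle and the sample text are hypothetical; the sample reuses the two lines from the output above plus a placeholder entry, not live data:

```python
def find_tle(text, name):
    """Return the two element lines that follow `name`, or None if absent."""
    lines = [line.strip() for line in text.splitlines()]
    for index, line in enumerate(lines):
        # Require two more lines after the name so the pair is complete.
        if line == name and index + 2 < len(lines):
            return lines[index + 1], lines[index + 2]
    return None


# Hypothetical sample in the same layout as cubesat.txt: name, line 1, line 2.
sample = """EXAMPLE-SAT
1 placeholder line
2 placeholder line
STRAND-1
1 39090U 13009E   18037.58367953  .00000016  00000-0  21168-4 0  9998
2 39090  98.5328 245.5663 0008674 331.4360  28.6349 14.35009671259097
"""

result = find_tle(sample, "STRAND-1")
if result is None:
    print("satellite not found")
else:
    print(result[0], result[1], sep="\n")
```

With the live page, the same call would presumably be find_tle(response.text, "STRAND-1"), using the response object from the code above.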

Answer 2

I figured out how to do this with help from Andersson. (Thanks a million!)

Using urllib.request.urlopen, a basic for loop, and .decode("utf-8"), I got it to work. I didn't even need lxml. I know this is far from the most elegant implementation of this logic, and any input on how to clean it up and condense it would be appreciated, but at least it works for me.

My Code:

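from urllib.request import urlopen


line_number1 = 50  # 1-based line number of the first line to read
line_number2 = 1   # how many lines to advance after that for the second line

with urlopen("http://celestrak.com/NORAD/elements/cubesat.txt") as TLEDB:
    # Step through the response until the loop reaches line_number1.
    i = 1
    for line in TLEDB:
        if i == line_number1:
            break
        i += 1
    line1 = line.decode("utf-8")
    print(line1)

    # The same iterator resumes where it stopped, so this reads the next line.
    n = 1
    for line in TLEDB:
        if n == line_number2:
            break
        n += 1
    line2 = line.decode("utf-8")
    print(line2)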
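
The two counting loops can probably be condensed with itertools.islice, which skips ahead in any iterator. This is a minimal sketch, not a tested replacement: read_line_pair is a hypothetical helper name, and an io.BytesIO object stands in for the urlopen response so the example runs without a network connection:

```python
import io
from itertools import islice


def read_line_pair(stream, line_number):
    """Return lines `line_number` and `line_number + 1` (1-based) as text."""
    # islice skips the first line_number - 1 lines, then yields the next two.
    wanted = islice(stream, line_number - 1, line_number + 1)
    pair = [raw.decode("utf-8").rstrip("\r\n") for raw in wanted]
    if len(pair) < 2:
        raise ValueError("stream has fewer lines than requested")
    return pair[0], pair[1]


# Stand-in for the HTTP response: any binary file-like object can be iterated.
fake_response = io.BytesIO(b"NAME-A\r\nA1\r\nA2\r\nNAME-B\r\nB1\r\nB2\r\n")
line1, line2 = read_line_pair(fake_response, 5)
print(line1)
print(line2)
```

Against the real file this would be called as read_line_pair(TLEDB, 50) inside the with block. A fixed line number only keeps working while the file's ordering stays the same, so matching on the satellite name is likely the more robust choice.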

Thanks again for the help.


2 answers

solution1
2 ACCEPTED 2018-02-07 20:48:12

solution2
1 2018-02-07 20:44:32
