
Read a line in a file into a string

I want to replace "Song Title" and "Song Artist" in the call below

find_Lyrics("Song Title", "Song Artist")

with the song titles and artists stored in two txt files. The contents of /artistchart.txt:

DaBaby
Jack Harlow
DJ Khaled
The Weeknd
SAINt JHN
Megan Thee Stallion
Harry Styles
DJ Khaled
Juice WRLD
Chris Brown
Lil Mosey
Jawsh 685
Juice WRLD
Lady Gaga
Harry Styles
Gabby Barrett
Dua Lipa
Post Malone
Lewis Capaldi
Lil Baby
Doja Cat
Justin Bieber
Pop Smoke
StaySolidRocky
Luke Bryan
Miranda Lambert
Dua Lipa
Future
Powfu
Trevor Daniel
Maren Morris
Pop Smoke
Sam Hunt
Roddy Ricch
Maddie & Tae
Juice WRLD
Lil Baby
Juice WRLD
Morgan Wallen
Surfaces
Rod Wave
Juice WRLD
Lil Baby
Moneybagg Yo
Drake
Megan Thee Stallion
BENEE
NLE Choppa
Juice WRLD
LOCASH
Juice WRLD
JP Saxe
Jason Aldean
Florida Georgia Line
Pop Smoke
Chris Janson
Doja Cat
Ariana Grande
Thomas Rhett
Young T
Marshmello
Juice WRLD
Black Eyed Peas
Juice WRLD
Kane Brown
Saweetie
Keith Urban
Juice WRLD
Lee Brice
Pop Smoke
Justin Moore
Luke Combs
Kane Brown
THE SCOTTS
Pop Smoke
Migos
Juice WRLD
Juice WRLD
Juice WRLD
Morgan Wallen
Jhene Aiko
Don Toliver
Trevor Daniel
surf mesa
Rod Wave
HARDY
Lil Durk
Luke Combs
Juice WRLD
AJR
Ashley McBryde
Juice WRLD
Drake
Polo G
Juice WRLD
Gunna
Topic
Pop Smoke
Parker McCollum
J. Cole

And the contents of /songchart.txt:

Rockstar
Whats Poppin
Popstar
Blinding Lights
Roses
Savage
Watermelon Sugar
Greece
Come & Go
Go Crazy
Blueberry Faygo
Savage Love
Wishing Well
Rain On Me
Adore You
I Hope
Break My Heart
Circles
Before You Go
We Paid
Say So
Intentions
For The Night
Party Girl
One Margarita
Bluebird
Dont Start Now
Life Is Good
Death Bed
Falling
The Bones
The Woo
Hard To Forget
The Box
Die From A Broken Heart
Hate The Other Side
The Bigger Picture
Conversations
Chasin You
Sunday Best
Rags2Riches
Lifes A Mess
Emotionally Scarred
Said Sum
Toosie Slide
Girls In The Hood
Supalonely
Walk Em Down
Blood On My Jeans
One Big Country Song
Righteous
If The World Was Ending
Got What I Got
I Love My Country
Got It On Me
Done
Like That
Stuck With U
Be A Light
Dont Rush
Be Kind
Titanic
Mamacita
Stay High
Be Like That
Tap In
God Whispered Your Name
Bad Energy
One Of Them Girls
Mood Swings
Why We Drink
Lovin On You
Cool Again
The Scotts
Something Special
Need It
Tell Me U Luv Me
Up Up And Away
Fighting Demons
More Than My Hometown
B.S.
After Party
Past Life
ily
Girl Of My Dreams
One Beer
3 Headed Goat
Does To Me
Man Of The Year
Bang!
One Night Standards
Cant Die
Chicago Freestyle
Flex
Screw Juice
Dollaz On My Head
Breaking Me
Enjoy Yourself
Pretty Heart
The Climb Back

Here is my code:

import requests
from bs4 import BeautifulSoup as Parse


def make_soup(url):
    """
    Parse a web page info html
     """
    user_agent = {
        'User-Agent': "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36"
    }
    r = requests.get(url, headers=user_agent)
    html = Parse(r.content, "html.parser")
    return html


def format_url(string):
    """
    Replace les spaces with '%20'
    """
    return string.replace(" ", "%20")


def get_song_url(html):
    song_url = html.find("a", {"class": "title"})["href"]
    return song_url


def find_Lyrics(titre, artiste):
    url = f"https://www.musixmatch.com/fr/search/{artiste}%20{titre}/tracks"

    url = format_url(url)
    pageweb = make_soup(url)

    # Recupere le lien de la chanson
    song_url = pageweb.find("a", {"class": "title"})["href"]
    song_url = "https://www.musixmatch.com" + song_url


# Recupere les paroles
    pageweb = make_soup(song_url)
    paroles = list()
    for span in pageweb.find_all("span", {"class": "lyrics__content__ok"}):
        # open file and print to it
        file1 = open('newlyrics.txt', 'a')
    print(span.text, file=file1)


filepath1 = '/home/redapemusic35/VimWiki/subjects/projects/tutorial/songchart.txt'
filepath2 = '/home/redapemusic35/VimWiki/subjects/projects/tutorial/artistchart.txt'

with open(filepath1) as fb, open(filepath2) as hp:
    for song, artist in zip(fb, hp):
        find_Lyrics(song.strip(), artist.strip())

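If I cut the input files down to just the first few items, the code works the way I want. But if I try to run it on the whole txt files, I get this error:

Traceback (most recent call last):
  File "tutorial/spiders/musicmatchapi2.py", line 54, in <module>
    find_Lyrics(song.strip(), artist.strip())
  File "tutorial/spiders/musicmatchapi2.py", line 46, in find_Lyrics
    print(span.text, file=file1)
UnboundLocalError: local variable 'span' referenced before assignment

I'm fairly sure the error lies in one of my two input files, because the code works fine when each list contains only a few artists and songs. But I don't think it is caused by a song not matching its artist, because I get a different error when that happens.

Is there a way to find what is causing the error without running each artist and song combination individually?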

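Passing the file object into the find_Lyrics() function was the problem. So all I did was open both files at the same time, read them line by line, and pass the strings into the function.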

with open(filepath1) as fb, open(filepath2) as hp:
    for song, artist in zip(fb, hp):
        find_Lyrics(song.strip(), artist.strip())

So your scraper looks like this:

import requests
from bs4 import BeautifulSoup as Parse


def make_soup(url):
    """
    Parse a web page info html
     """
    user_agent = {
        'User-Agent': "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36"
    }
    r = requests.get(url, headers=user_agent)
    html = Parse(r.content, "html.parser")
    return html


def format_url(string):
    """
    Replace les spaces with '%20'
    """
    return string.replace(" ", "%20")


def get_song_url(html):
    song_url = html.find("a", {"class": "title"})["href"]
    return song_url


def find_Lyrics(titre, artiste):
    url = f"https://www.musixmatch.com/fr/search/{artiste}%20{titre}/tracks"

    url = format_url(url)
    pageweb = make_soup(url)

    # Get the link to the song
    song_url = pageweb.find("a", {"class": "title"})["href"]
    song_url = "https://www.musixmatch.com" + song_url

    # Fetch the lyrics and append every matching span to the output file
    pageweb = make_soup(song_url)
    with open('newlyrics.txt', 'a') as file1:
        for span in pageweb.find_all("span", {"class": "lyrics__content__ok"}):
            print(span.text, file=file1)


filepath1 = 'countrysongs.txt'
filepath2 = 'countryartists.txt'

with open(filepath1) as fb, open(filepath2) as hp:
    for song, artist in zip(fb, hp):
        find_Lyrics(song.strip(), artist.strip())

Hope it gives the expected output.
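As a side note on the traceback in the question: an UnboundLocalError on `span` is what Python raises when a `for` loop body never runs and the loop variable is used after the loop. That would be consistent with `find_all` returning no matching spans for some song, although this is an assumption, since the failing page is not shown. The sketch below reproduces the pattern without any network access; `last_item_text` and `write_all` are hypothetical names used only for illustration.

```python
import io


def last_item_text(items):
    """Mimic the pattern from the traceback: use the loop variable after the loop."""
    for item in items:
        pass
    return item  # raises UnboundLocalError when items is empty


def write_all(items, out):
    """Do the work inside the loop, so an empty list simply writes nothing."""
    count = 0
    for item in items:
        print(item, file=out)
        count += 1
    return count


buffer = io.StringIO()
print(write_all(["line one", "line two"], buffer))  # two lines written
print(write_all([], buffer))                        # nothing written, no error

try:
    last_item_text([])
except UnboundLocalError as exc:
    print("empty result:", type(exc).__name__)
```

Keeping the `print` inside the loop means a song with no lyrics spans produces no output instead of a crash.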


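Update

It seems that some entries in the artist and song lists don't match the song and artist names used on the website. So you need to update your lists, or you can handle the exception so that the program doesn't terminate.

Note: this is a temporary workaround with only basic exception handling. You will therefore need to update your lists manually, or write a program that scrapes the correct names from the website.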

with open(filepath1) as fb, open(filepath2) as hp:
    for song, artist in zip(fb, hp):
        try:
            find_Lyrics(song.strip(), artist.strip())
        except Exception:
            # Catch ordinary errors only, so Ctrl+C can still stop the run
            print("URL Not Found")
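To see which song and artist combination fails without running each one by hand, the same try/except idea can record the failing pair together with the exception. This is only a sketch: `run_all` and `fake_lookup` are hypothetical names, and `fake_lookup` stands in for find_Lyrics so the example runs offline.

```python
def run_all(pairs, lookup):
    """Call lookup(song, artist) for every pair and collect the ones that fail."""
    failures = []
    for song, artist in pairs:
        try:
            lookup(song, artist)
        except Exception as exc:
            # Keep the inputs next to the error so the bad line is easy to find
            failures.append((song, artist, repr(exc)))
    return failures


def fake_lookup(song, artist):
    # Hypothetical stand-in for find_Lyrics; fails for one made-up title
    if song == "Missing Song":
        raise TypeError("'NoneType' object is not subscriptable")


pairs = [("Rockstar", "DaBaby"), ("Missing Song", "Nobody")]
for song, artist, error in run_all(pairs, fake_lookup):
    print(f"{artist} - {song}: {error}")
```

With the real scraper, passing find_Lyrics as `lookup` would list every pair that raised, along with the error type for each.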

Cut down on the copy-pasting. The real problem here is how you pass the information into your find_Lyrics function.

artistlist = "artists.txt"
songlist = "songs.txt"

# Read each file once, stripping the trailing newline from every line
with open(artistlist) as al:
    artists = [a.strip() for a in al]
with open(songlist) as sl:
    songs = [s.strip() for s in sl]

# Pair each song with the artist on the same line number
tuples = list(zip(songs, artists))

for row in tuples:
    find_Lyrics(*row)
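Since the pairing relies on both files having the same number of lines, it may help to check that before scraping. The following is a minimal sketch, assuming blank lines should be ignored; `load_pairs` is a hypothetical helper, and the temporary files exist only to make the example self-contained.

```python
import os
import tempfile


def load_pairs(song_path, artist_path):
    """Read both files, skip blank lines, and pair entries by position."""
    with open(song_path) as sf:
        songs = [line.strip() for line in sf if line.strip()]
    with open(artist_path) as af:
        artists = [line.strip() for line in af if line.strip()]
    if len(songs) != len(artists):
        raise ValueError(f"{len(songs)} songs but {len(artists)} artists")
    return list(zip(songs, artists))


# Demo with two small throwaway files
with tempfile.TemporaryDirectory() as tmp:
    song_file = os.path.join(tmp, "songs.txt")
    artist_file = os.path.join(tmp, "artists.txt")
    with open(song_file, "w") as f:
        f.write("Rockstar\nWhats Poppin\n")
    with open(artist_file, "w") as f:
        f.write("DaBaby\nJack Harlow\n\n")  # trailing blank line is ignored
    demo_pairs = load_pairs(song_file, artist_file)
    print(demo_pairs)
```

Raising on a length mismatch surfaces a misaligned list right away, whereas plain `zip` would silently drop the extra lines.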
