Remove a specific character from every value in a Python list

Question

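I have this list of movies, and I want to remove the dot "." from each title.

I can't just remove the first character of every value, because not all values start with a dot ".".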

['Sueños de fuga(1994)',
 'El padrino(1972)',
 'Citizen Kane(1941)',
 '12 hombres en pugna(1957)',
 'La lista de Schindler(1993)',
 'Lo bueno, lo malo y lo feo(1966)',
 'El imperio contraataca(1980)',
 'El señor de los anillos: El retorno del rey(2003)',
 'Batman - El caballero de la noche(2008)',
 '.El padrino II(1974)',
 '.Tiempos violentos(1994)',
 '.El club de la pelea(1999)',
 '.Psicosis(1960)',
 '.2001: Odisea del espacio(1968)',
 '.Metropolis(1927)',
 '.La guerra de las galaxias(1977)',
]

Also, the list is being scraped, so simply removing the dots by hand won't work.

This is my code so far:

from bs4 import BeautifulSoup
import requests
import pandas as pd

url = "https://www.imdb.com/list/ls024149810/"
page = requests.get(url)
soup = BeautifulSoup(page.content, "html.parser")

# scrape movie names
scraped_movies = soup.find_all('h3', class_='lister-item-header')

# parse movie names
movies = []
for movie in scraped_movies:
    movie = movie.get_text().replace('\n', "")
    movie = movie.strip(" ")
    movies.append(movie)

# remove the first two characters of each value on the list
movies = [e[2:] for e in movies]  

# remove the remaining dots "."
while (movies.count(".")):
    movies.remove(".")

# print list
print (movies)

Answer 1

Try using the replace method to remove the dots while parsing each title:

movie = movie.get_text().replace('\n', "").replace('.', "")
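To illustrate why `str.replace` helps where the `movies.count(".")` loop from the question does not, here is a minimal sketch on a shortened copy of the list from the question. `list.count` and `list.remove` compare whole elements, so they only match an element that is exactly ".", whereas `str.replace` works on the characters inside each string. The variable names `whole_element_matches` and `cleaned` are only illustrative.

```python
# Sample titles taken from the list in the question.
movies = ['El padrino(1972)', '.El padrino II(1974)', '.Psicosis(1960)']

# list.count compares whole elements; no element is exactly ".",
# so the while loop in the question never runs.
whole_element_matches = movies.count(".")

# str.replace operates on the characters inside each title.
cleaned = [title.replace(".", "") for title in movies]

print(whole_element_matches)  # 0
print(cleaned)  # ['El padrino(1972)', 'El padrino II(1974)', 'Psicosis(1960)']
```

Note that `replace(".", "")` removes every dot in a title, not only a leading one.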

Answer 2

You can try this:

# remove the remaining dots "."
for i, word in enumerate(movies):
    if word.startswith("."):
        movies[i] = word.replace(".", "")

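Or use this instead: for every element that starts with a dot, it removes the dot; elements that do not start with a dot are left unchanged, so it also works when the list contains no elements starting with a dot.

# remove the remaining dots "."
movies = [word.replace(".", "") if word.startswith(".") else word for word in movies]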
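The same "leave the title alone if it has no leading dot" behaviour can also be had without an explicit condition. This is a minimal sketch, assuming Python 3.9 or later, where `str.removeprefix` is available; it returns the string unchanged when the prefix is absent. The sample lists are shortened or made up for illustration.

```python
# Shortened copy of the list from the question.
movies = ['El padrino(1972)', '.El padrino II(1974)', '.Psicosis(1960)']

# removeprefix drops one leading "." if present, otherwise returns the title as is.
cleaned = [title.removeprefix(".") for title in movies]

# A list without any dotted titles passes through unchanged.
no_dots = [title.removeprefix(".") for title in ['Citizen Kane(1941)']]

print(cleaned)  # ['El padrino(1972)', 'El padrino II(1974)', 'Psicosis(1960)']
print(no_dots)  # ['Citizen Kane(1941)']
```

On older Python versions, `title.lstrip(".")` is a close substitute, with the difference that it strips all leading dots rather than just one.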

Edited code:

from bs4 import BeautifulSoup
import requests
import pandas as pd

url = "https://www.imdb.com/list/ls024149810/"
page = requests.get(url)
soup = BeautifulSoup(page.content, "html.parser")

# scrape movie names
scraped_movies = soup.find_all('h3', class_='lister-item-header')

# parse movie names
movies = []
for movie in scraped_movies:
    movie = movie.get_text().replace('\n', "")
    movie = movie.strip(" ")
    movies.append(movie)

# remove the first two characters of each value on the list
movies = [e[2:] for e in movies]
print(movies)

# remove the remaining dots "."
movies = [word.replace(".", "") if word.startswith(".") else word for word in movies]

# print list
print(movies)

Output:

['The Shawshank Redemption(1994)', 'The Godfather(1972)', 'Citizen Kane(1941)', '12 Angry Men(1957)', "Schindler's List(1993)", 'Il buono, il brutto, il cattivo(1966)', 'The Empire Strikes Back(1980)', 'The Lord of the Rings: The Return of the King(2003)', 'The Dark Knight(2008)', '.The Godfather Part II(1974)', '.Pulp Fiction(1994)', '.Fight Club(1999)', '.Psycho(1960)', '.2001: A Space Odyssey(1968)', '.Metropolis(1927)', '.Star Wars(1977)', '.The Lord of the Rings: The Fellowship of the Ring(2001)', '.Terminator 2: Judgment Day(1991)', '.The Matrix(1999)', '.Raiders of the Lost Ark(1981)', '.Casablanca(1942)', '.The Wizard of Oz(1939)', '.Shichinin no samurai(1954)', '.Forrest Gump(1994)', '.Inception(2010)']
['The Shawshank Redemption(1994)', 'The Godfather(1972)', 'Citizen Kane(1941)', '12 Angry Men(1957)', "Schindler's List(1993)", 'Il buono, il brutto, il cattivo(1966)', 'The Empire Strikes Back(1980)', 'The Lord of the Rings: The Return of the King(2003)', 'The Dark Knight(2008)', 'The Godfather Part II(1974)', 'Pulp Fiction(1994)', 'Fight Club(1999)', 'Psycho(1960)', '2001: A Space Odyssey(1968)', 'Metropolis(1927)', 'Star Wars(1977)', 'The Lord of the Rings: The Fellowship of the Ring(2001)', 'Terminator 2: Judgment Day(1991)', 'The Matrix(1999)', 'Raiders of the Lost Ark(1981)', 'Casablanca(1942)', 'The Wizard of Oz(1939)', 'Shichinin no samurai(1954)', 'Forrest Gump(1994)', 'Inception(2010)']

Answer 3

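This should be very simple with a list comprehension. Once you have the list of movie titles, you can simply replace the dot with an empty string. This code strips the leading dot from each title and builds your movies list in one step.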

# assumes scraped_movies holds the title strings shown in the question, not bs4 tags
movies = [x.replace('.', '') for x in scraped_movies]

Output:

['Sueños de fuga(1994)', 'El padrino(1972)', 'Citizen Kane(1941)', '12 hombres en pugna(1957)', 'La lista de Schindler(1993)', 'Lo bueno, lo malo y lo feo(1966)', 'El imperio contraataca(1980)', 'El señor de los anillos: El retorno del rey(2003)', 'Batman - El caballero de la noche(2008)', 'El padrino II(1974)', 'Tiempos violentos(1994)', 'El club de la pelea(1999)', 'Psicosis(1960)', '2001: Odisea del espacio(1968)', 'Metropolis(1927)', 'La guerra de las galaxias(1977)']

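If you are worried that in some cases a dot appears elsewhere in the title rather than at the start, you can add an if check with string.startswith('.') to match more precisely.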
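A minimal sketch of that `startswith` check follows. The helper name `drop_leading_dot` is hypothetical, and the entry '.Dr. Strangelove(1964)' is a made-up example that is not in the scraped list; it is there only to show a dot inside the title surviving.

```python
# '.Dr. Strangelove(1964)' is an illustrative entry, not from the scraped list.
movies = ['El padrino(1972)', '.Psicosis(1960)', '.Dr. Strangelove(1964)']


def drop_leading_dot(title):
    """Return the title without its first character if that character is a dot."""
    if title.startswith('.'):
        return title[1:]
    return title


movies = [drop_leading_dot(title) for title in movies]
print(movies)  # ['El padrino(1972)', 'Psicosis(1960)', 'Dr. Strangelove(1964)']
```

With `replace('.', '')` the last title would instead come out as 'Dr Strangelove(1964)', because every dot is removed.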

3 answers

Answer 1
1 2022-07-27 16:30:56

Answer 2
0 2022-07-27 16:39:27

Answer 3
0 2022-07-27 16:49:57
