
How to extract digits separated by a dot followed by a specific set of words using regex in Python?

I am trying to extract the number "4.3" from the example string below. The number is always followed by the words "out of 5 stars":

('B01A0NB55A', 'Star Wars Mug, Lightsabers Appear With Heat (12 oz)4.3 out of 5 stars948$9.99$9.99Eligible for Shipping to United Arab EmiratesMore Buying Choices$6.91(2 used & new offers)')

How can I extract these numbers using a regular expression?

Here is my code:

import re
data_tup = [('B077T5MG5F', 'Star Wars: The Last Jedi (Theatrical Version)MPAA Rating: PG-13 (Parents Strongly Cautioned)|Closed Caption3.8 out of 5 stars4,738Prime Videofrom$2.99$2.99to rentStarring:Oscar Isaac,Mark Hamill,Daisy RidleyandJohn BoyegaDirected by:Rian JohnsonRuntime:151 minutes'),
('B079T2F3CY', 'Star Wars Poster Inspired Watercolor Wall Art Jedi Yoda Death Star Prints Decor Paper Set of 6 8x10 P49 by PGbureau4.8 out of 5 stars16$24.99$24.99Eligible for Shipping to United Arab Emirates'),
('B00VF0M7QE', 'Star Wars: Return of the Jedi (Theatrical Version)MPAA Rating: PG (Parental Guidance Suggested)|Closed Caption4.5 out of 5 stars1,055Prime Videofrom$19.99$19.99to buyStarring:Mark Hamill,Harrison Ford,Carrie Fisher, et al.Directed by:Richard MarquandRuntime:134 minutes'),
('B01J5GKX60', 'Star Wars Classic Space Battle Full Sheet Set4.5 out of 5 stars53$34.99$34.99Eligible for Shipping to United Arab EmiratesOnly 2 left in stock - order soon.More Buying Choices$26.99(3 new offers)'),
('B079MB31DY', "SponsoredThese are ads for products you'll find on Amazon.com.Clicking an ad will take you to the product's page.Learn more about Sponsored Products.Enjoy The Wood Star Wars Music Box Wooden Star Wars Fans Custom Gift for Boyfriend Gift for Brother4.9 out of 5 stars22$19.99$19.99Eligible for Shipping to United Arab Emirates"),
('B00ZYXVU7K', "SponsoredThese are ads for products you'll find on Amazon.com.Clicking an ad will take you to the product's page.Learn more about Sponsored Products.Star Wars Lightsaber Heat Change Mug4.1 out of 5 stars158$13.95$13.95Eligible for Shipping to United Arab EmiratesOnly 9 left in stock - order soon."),
('B014HPF5G2', 'Hasbro Gaming Star Wars Bop It Game4.7 out of 5 stars446$14.99$14.99$16.99$16.99Eligible for Shipping to United Arab EmiratesMore Buying Choices$7.99(16 used & new offers)Ages: 8 years and up'),
('B00VN0DLRA', 'Star Wars: A New HopeMPAA Rating: PG (Parental Guidance  Suggested)|Closed Caption4.5 out of 5 stars2,226Prime Videofrom$19.99$19.99to buyStarring:Mark Hamill,Harrison Ford,Carrie FisherandPeter CushingDirected by:George LucasRuntime:124 minutes'),
('B079MB31DY', 'Enjoy The Wood Star Wars Music Box Wooden Star Wars Fans Custom Gift for Boyfriend Gift for Brother4.9 out of 5 stars22$19.99$19.99Eligible for Shipping to United Arab Emirates'),
('B076FDK9TF', 'Lenovo Star Wars: Jedi Challenges, Smartphone Powered Augmented Reality ExperienceDec 1, 2017|by Lenovo4.0 out of 5 stars102iOS$64.99$64.99$99.99$99.99Eligible for Shipping to United Arab EmiratesMore Buying Choices$35.99(35 used & new offers)'),
('B015NFSC24', "Star Wars Classic Logo and Tie Fighter Men's Short Sleeve T-Shirt4.8 out of 5 stars52$15.89$15.89-$19.99$19.99")]

for tup in data_tup:
    number_of_stars = re.search(r'([0-9.,]*)out of 5 stars',tup[1])
    print(number_of_stars)

But I get this result:

<re.Match object; span=(111, 125), match='out of 5 stars'>
<re.Match object; span=(119, 133), match='out of 5 stars'>
<re.Match object; span=(114, 128), match='out of 5 stars'>
<re.Match object; span=(49, 63), match='out of 5 stars'>
<re.Match object; span=(252, 266), match='out of 5 stars'>
<re.Match object; span=(189, 203), match='out of 5 stars'>
<re.Match object; span=(39, 53), match='out of 5 stars'>
<re.Match object; span=(86, 100), match='out of 5 stars'>
<re.Match object; span=(103, 117), match='out of 5 stars'>
<re.Match object; span=(107, 121), match='out of 5 stars'>
<re.Match object; span=(69, 83), match='out of 5 stars'>

This is what I want to get:

3.8
4.8
4.5
4.5
4.9
4.1
4.7
4.5
4.9
4.0
4.8

There is a lot contained in that match object. The documentation is at https://docs.python.org/3/library/re.html#match-objects

Here is the solution:
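To see why the question's pattern prints only 'out of 5 stars', it can help to inspect the match object directly. The sketch below uses a shortened sample string (an assumption, modeled on the question's data) with the question's original pattern, which has no space between the group and "out". Because `*` allows zero repetitions, the capture group matches the empty string just before "out".

```python
import re

# Shortened sample string, made up for illustration in the style of data_tup.
text = "Star Wars Bop It Game4.7 out of 5 stars446$14.99"

# The question's pattern: the space before "out" is missing.
m = re.search(r'([0-9.,]*)out of 5 stars', text)

print(m.group(0))        # the whole matched text
print(repr(m.group(1)))  # the first capture group, which is empty here
print(m.span())          # (start, end) indexes of the whole match
```

The rating is never captured because the space that follows it is not in the character class and is not in the literal text either, so the match can only start at "out".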

import re
data_tup = [('B077T5MG5F', 'Star Wars: The Last Jedi (Theatrical Version)MPAA Rating: PG-13 (Parents Strongly Cautioned)|Closed Caption3.8 out of 5 stars4,738Prime Videofrom$2.99$2.99to rentStarring:Oscar Isaac,Mark Hamill,Daisy RidleyandJohn BoyegaDirected by:Rian JohnsonRuntime:151 minutes'),
('B079T2F3CY', 'Star Wars Poster Inspired Watercolor Wall Art Jedi Yoda Death Star Prints Decor Paper Set of 6 8x10 P49 by PGbureau4.8 out of 5 stars16$24.99$24.99Eligible for Shipping to United Arab Emirates'),
('B00VF0M7QE', 'Star Wars: Return of the Jedi (Theatrical Version)MPAA Rating: PG (Parental Guidance Suggested)|Closed Caption4.5 out of 5 stars1,055Prime Videofrom$19.99$19.99to buyStarring:Mark Hamill,Harrison Ford,Carrie Fisher, et al.Directed by:Richard MarquandRuntime:134 minutes'),
('B01J5GKX60', 'Star Wars Classic Space Battle Full Sheet Set4.5 out of 5 stars53$34.99$34.99Eligible for Shipping to United Arab EmiratesOnly 2 left in stock - order soon.More Buying Choices$26.99(3 new offers)'),
('B079MB31DY', "SponsoredThese are ads for products you'll find on Amazon.com.Clicking an ad will take you to the product's page.Learn more about Sponsored Products.Enjoy The Wood Star Wars Music Box Wooden Star Wars Fans Custom Gift for Boyfriend Gift for Brother4.9 out of 5 stars22$19.99$19.99Eligible for Shipping to United Arab Emirates"),
('B00ZYXVU7K', "SponsoredThese are ads for products you'll find on Amazon.com.Clicking an ad will take you to the product's page.Learn more about Sponsored Products.Star Wars Lightsaber Heat Change Mug4.1 out of 5 stars158$13.95$13.95Eligible for Shipping to United Arab EmiratesOnly 9 left in stock - order soon."),
('B014HPF5G2', 'Hasbro Gaming Star Wars Bop It Game4.7 out of 5 stars446$14.99$14.99$16.99$16.99Eligible for Shipping to United Arab EmiratesMore Buying Choices$7.99(16 used & new offers)Ages: 8 years and up'),
('B00VN0DLRA', 'Star Wars: A New HopeMPAA Rating: PG (Parental Guidance  Suggested)|Closed Caption4.5 out of 5 stars2,226Prime Videofrom$19.99$19.99to buyStarring:Mark Hamill,Harrison Ford,Carrie FisherandPeter CushingDirected by:George LucasRuntime:124 minutes'),
('B079MB31DY', 'Enjoy The Wood Star Wars Music Box Wooden Star Wars Fans Custom Gift for Boyfriend Gift for Brother4.9 out of 5 stars22$19.99$19.99Eligible for Shipping to United Arab Emirates'),
('B076FDK9TF', 'Lenovo Star Wars: Jedi Challenges, Smartphone Powered Augmented Reality ExperienceDec 1, 2017|by Lenovo4.0 out of 5 stars102iOS$64.99$64.99$99.99$99.99Eligible for Shipping to United Arab EmiratesMore Buying Choices$35.99(35 used & new offers)'),
('B015NFSC24', "Star Wars Classic Logo and Tie Fighter Men's Short Sleeve T-Shirt4.8 out of 5 stars52$15.89$15.89-$19.99$19.99")]

for tup in data_tup:
    number_of_stars = re.search(r'([0-9.,]*) out of 5 stars', tup[1]).group(1)
    print(number_of_stars)

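Calling .group(0) gives you the entire matched portion of the string (for example, 3.8 out of 5 stars). Calling .group(1) gives you only what matched the expression inside the first set of parentheses. Also note the space placed before the word "out": without it the pattern cannot match the rating at all, and keeping it outside the parentheses means it does not become part of the extracted number.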
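The difference between the two calls can be checked on a single string. This is a minimal sketch with a shortened sample string (an assumption for illustration); the pattern is the corrected one from the answer above.

```python
import re

# Shortened sample string, made up for illustration in the style of data_tup.
text = "Lightsaber Heat Change Mug4.1 out of 5 stars158$13.95"

m = re.search(r'([0-9.,]*) out of 5 stars', text)

whole = m.group(0)   # everything the pattern matched
rating = m.group(1)  # only the part inside the first parentheses

print(whole)
print(rating)
```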

>>> for tup in data_tup:
...     re.search('([0-9.,]*) out of 5 stars',tup[1]).group()[0:3]
...
'3.8'
'4.8'
'4.5'
'4.5'
'4.9'
'4.1'
'4.7'
'4.5'
'4.9'
'4.0'
'4.8'

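This works, but there is probably a cleaner way that gets rid of the [0:3], and you can convert the result to float if needed.

Following the other answer, you can change it to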
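One possible way to avoid the slice and get floats in one step is sketched below. It assumes two shortened rows in the same (id, description) shape as data_tup; the name `rows` is hypothetical. The pattern is a small variation on the one above: `+` instead of `*` so the group cannot be empty, and no comma in the character class so that `float()` does not fail on it.

```python
import re

# Two shortened rows in the same (id, description) shape as data_tup,
# abbreviated here for illustration.
rows = [
    ('B01J5GKX60', 'Star Wars Classic Space Battle Full Sheet Set4.5 out of 5 stars53$34.99'),
    ('B076FDK9TF', 'Jedi Challenges|by Lenovo4.0 out of 5 stars102iOS$64.99'),
]

# group(1) returns only the captured rating, so no [0:3] slice is needed.
ratings = [
    float(re.search(r'([0-9.]+) out of 5 stars', text).group(1))
    for _, text in rows
]

print(ratings)
print(sum(ratings) / len(ratings))  # floats can be used in arithmetic
```

This sketch assumes every row contains a rating; if one does not, `re.search` returns None and the `.group(1)` call raises AttributeError.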

re.search(r'([0-9.,]*) out of 5 stars', tup[1]).group(1)

for tup in data_tup:
    # \. matches a literal dot; an unescaped . would match any character
    number_of_stars = re.search(r'(\d\.\d) out of 5 stars', tup[1]).group(1)
    print(number_of_stars)
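  • Use .group(1) to extract what matched inside the first parentheses.
  • If you use group(0) or group(), it returns the entire matched string.
  • \d is equivalent to [0-9] for ASCII digits (for str patterns it also matches other Unicode decimal digits unless re.ASCII is set).
  • Remember to include the space before "out of 5 stars".
  • The extracted value is a string; cast it to float if you want to do calculations with it.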
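The points in the list above can be combined into a small helper. This is only a sketch: the names `RATING_RE` and `extract_rating` are hypothetical, and returning None for text without a rating is an assumption about the desired behavior, not something the answers specify.

```python
import re
from typing import Optional

# Compiled once and reused. \. matches a literal dot, whereas an
# unescaped . would match any character (so "4x8" would also match).
RATING_RE = re.compile(r'(\d\.\d) out of 5 stars')


def extract_rating(text: str) -> Optional[float]:
    """Return the star rating as a float, or None if no rating is found."""
    match = RATING_RE.search(text)
    # re.search returns None when nothing matches, so check before .group(1).
    return float(match.group(1)) if match else None


print(extract_rating('T-Shirt4.8 out of 5 stars52$15.89'))
print(extract_rating('No rating in this text'))
```

Checking the match before calling .group(1) avoids an AttributeError on rows that lack the "out of 5 stars" text.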
