How to extract dates and all of the data following them using re.findall in Python
I have a string, and I'm hoping to use regular expressions to separate some of the information from it.
The code below has two problems that I'm aware of: 1) it fails to capture the date "20120623", for a reason I don't quite understand (perhaps the compiled regex is not doing what I expect?), and 2) the pattern I'm using to mark where each match ends won't work for the last match in the string.
I may revise the title of this question, depending on what the problem/solution ends up being.
The current code:
import re
s='20120622 10.3 -84.8$Sabulodes colombiata;Clemensia cincinnata;Gn sp_san_luis_1037;Glena mopsaria;Euphyia sp_group_san_luis;Thysanopyga carfinia;Glena mopsaria;Eusarca minucia;Euphyia sp_group_san_luis;Scopula compensata;Gn sp_san_luis_1028;Gn sp_san_luis_7003;Trygodes amphion;Phyllodonta latrata;Gn sp_san_luis_1003;Idaea sp_group_san_luis;Gn sp_san_luis_1002;Melanolophia sp_san_luis;Hemiceras pernubila;Leucula meganira;Oospila athena;Gn sp_san_luis_6001;Epimecis semicompleta;Gn sp_san_luis_1004;Glena mopsaria;Euphyia sp_group_san_luis;Iridopsis validaria;Eusarca asteria--cayennaria;Disphragis proba;Trygodes amphion;Gn sp_san_luis_8022;Melanolophia sp_san_luis;Melanolophia sp_san_luis;Melanolophia sp_san_luis;Idaea sp_group_san_luis;Gn sp_san_luis_1013;Gn sp_san_luis_7001;Disphragis proba;Gn sp_san_luis_1003;Eois sp_san_luis_b;Prochoerodes striata;Nola sp_san_luis_a;Oospila venezuelata;Eois sp_san_luis_b;Oospila venezuelata;Glena mopsaria;Idaea sp_group_san_luis;Oospila venezuelata;Gn sp_san_luis_1002;Phyllodonta latrata;Virbia sp_san_luis_a;Oospila venezuelata;Hemiceras pernubila;Pantherodes conglomerata;Nephodia betala;Melanolophia sp_san_luis;Herbita medama--medona;Euphyia sp_group_san_luis;Idaea sp_group_san_luis;Glena mopsaria;Pantherodes conglomerata;Glena mopsaria;Eois sp_san_luis_k;Melanolophia sp_san_luis;Idaea sp_group_san_luis;Eois sp_san_luis_b;Idaea sp_group_san_luis;Oospila venezuelata;Gn sp_san_luis_1010;Gn sp_san_luis_7001;Eois sp_san_luis_k;Hemiceras pernubila;Dyspteris tenuivitta;Macaria pernicata;Hemiceras pernubila;Hymenia perspectalis;Oxydia sp_san_luis_c;Cliniodes opalalis;Gn sp_san_luis_1042;Oospila athena;Eois sp_san_luis_b;Hemiceras rufescens;Glena mopsaria;Gn sp_san_luis_1011;Phyllodonta latrata;Simopteryx torquataria;Perasia helvina;Nola sp_san_luis_a;Gn sp_san_luis_2023;Nephodia auxesia;Nephodia betala;Glena mopsaria;Ametris nitocris;Gn sp_san_luis_1003;Glena mopsaria;Hymenia perspectalis;Phyllodonta latrata;Eusarca 
asteria--cayennaria;Disphragis proba;Gn sp_san_luis_2028;Ptychamalia sp_san_luis_a;Nola sp_san_luis_a;Anticla antica;Nola sp_san_luis_a;Semaeopus sabuloides;Simopteryx torquataria;Gn sp_san_luis_1002;Phyllodonta latrata;Glena mopsaria;Gizma undilinealis;Eois sp_san_luis_d;Oxydia bilinea;Hemiceras pernubila;Hemiceras pernubila;Thysanopyga amarantha;Plusiodonta sp_san_luis_a;Gn sp_san_luis_1011;Cyclophora sp_san_luis_a;Bleptina caradrinalis;Opharus rudis;Melanolophia sp_san_luis;Schrankia macula;Leucula meganira;Iridopsis lurida--oberthuri--herse;Schrankia macula;Letis buteo;Scopula sp_group_san_luis;Herbita amicaria;Gn sp_san_luis_1013;Adhemarius ypsilon;Adhemarius ypsilon;Adhemarius ypsilon;Euphyia sp_group_san_luis;Idaea tacturata;Adhemarius ypsilon;Hemerophila gradella;Acrosemia vulpecularia;Pyrgion repanda;Epimecis matronaria;Scopula sp_group_san_luis;Eois sp_san_luis_b;Virbia sp_san_luis_a;Conchylodes erinalis;Eois sp_san_luis_b;Epimecis matronaria;Melanolophia sp_san_luis;Melese monima;Rhabdatomis laudamia;Lirimiris inopinata;Josia sp_group_san_luis;Pseudodirphia menander;Josia sp_group_san_luis;Disphragis proba;Cliniodes opalalis;Melanolophia sp_san_luis;Prorifrons rufescens;Antiblemma concinnula;Melanolophia sp_san_luis;Nola sp_san_luis_a;Melanolophia sp_san_luis;Cliniodes opalalis;Eois sp_san_luis_b;Renia vinasalis;Eucereon relegata;Colla rhodope;Conchylodes erinalis;Oospila venezuelata;Semaeopus illimitata;Eucereon aurantiaca;Macaria approximaria--gambarina--ostia;Hemiceras modesta;Microphysetica hermeasalis;Nola sp_san_luis_a;Melanolophia sp_san_luis;Oospila venezuelata;Cautethia spuria;Euphyia sp_group_san_luis;Gn sp_san_luis_8046;Polypoetes villia;Synnomos sp_san_luis_a;Hemiceras pernubila;Hemerophila gradella;Nola sp_san_luis_a;Iridopsis validaria;Sarsina purpurascens;Argyrotome prospecta;Gn sp_san_luis_2008;Eulepidotis alabastraria;Gn sp_san_luis_7001;Desmia bajulalis;Eusarca asteria--cayennaria;Nola sp_san_luis_a;Oospila venezuelata;Macaria 
approximaria--gambarina--ostia;Epimecis matronaria;Glena mopsaria;Euphyia sp_group_san_luis;Iridopsis lurida--oberthuri--herse;Gizma undilinealis;Dichomeris arotrosema;Iridopsis lurida--oberthuri--herse;Disphragis proba;Gn sp_san_luis_7003;Eusarca asteria--cayennaria;Acharia hyperoche;Thysanopyga amarantha;Oospila athena;Glena mopsaria;Crambidia myrlosea;Marimatha nigrofimbria;Eucereon aurantiaca;Euphyia sp_group_san_luis;Gn sp_san_luis_1023;Pyrgion repanda;Microphysetica hermeasalis;Lobocleta tenellata;Scopula sp_group_san_luis;Clepsis sp_san_luis_a;Iridopsis lurida--oberthuri--herse;Argyrotome prospecta;Phrygionis polita;Marimatha nigrofimbria;Iridopsis lurida--oberthuri--herse;Semaeopus viridiplaga;Euphyia sp_group_san_luis;Iridopsis lurida--oberthuri--herse;Nephodia auxesia;Josia sp_group_san_luis;Herbita amicaria;Melanolophia sp_san_luis;Semaeopus illimitata;Thysanopyga amarantha;Opisthoxia miletia;Hymenia perspectalis;Gn sp_san_luis_4001;Gn sp_san_luis_8057;Oospila athena;Hymenia perspectalis 20120623 10.3 -84.8$Amorbia emigratella;Idaea sp_group_san_luis;Oospila athena;Lomographa argentata;Idaea sp_group_san_luis;Dysodia oculatana;Ptychamalia sp_san_luis_a;Gizma undilinealis;Udea rubigalis;Antaeotricha sp_san_luis_b;Bryoptera friaria;Euphyia sp_group_san_luis;Gn sp_san_luis_1028;Glena mopsaria;Epicrisias eschara;Gn sp_san_luis_1037;Hemiceras pernubila;Melanolophia sp_san_luis;Eois sp_san_luis_b;Lonomia electra;Gn sp_san_luis_1027;Macaria approximaria--gambarina--ostia;Gn sp_san_luis_1002;Gn sp_san_luis_1001;Eusarca asteria--cayennaria;Eucereon tigrata;Simopteryx torquataria;Phyllodonta latrata;Eois sp_san_luis_b;Thysanopyga amarantha;Oospila athena;Scopula sp_group_san_luis;Oospila venezuelata;Lineodes sp_san_luis_a;Gn sp_san_luis_1001;Glena mopsaria;Glena mopsaria;Amorbia emigratella;Oospila venezuelata;Oospila venezuelata;Oospila venezuelata;Nola sp_san_luis_a;Oospila venezuelata;Gn sp_san_luis_1002;Glena mopsaria;Gn sp_san_luis_1002;Anomis 
texana;Melanolophia sp_san_luis;Paragonia cruraria;Euphyia sp_group_san_luis;Gn sp_san_luis_1013;Spodoptera eridania;Eois sp_san_luis_f;Euphyia sp_group_san_luis;Herbita amicaria;Disphragis proba;Anomis texana;Hemiceras pernubila;Thysanopyga amarantha;Physocleora pauper;Disphragis proba;Acrosemia vulpecularia;Melanolophia sp_san_luis;Glena mopsaria;Gn sp_san_luis_1023;Gn sp_san_luis_1023;Gn sp_san_luis_1001;Gn sp_san_luis_1003;Bagisara laverna;Clemensia leopardina;Conchylodes erinalis;Gn sp_san_luis_1001;Bleptina caradrinalis;Gn sp_san_luis_1001;Nephodia auxesia;Leucula meganira;Nola sp_san_luis_a;Thysanopyga amarantha;Eois sp_san_luis_e;Orthofidonia sp_san_luis_a;Isogona continua--natatrix;Oospila venezuelata;Manduca lucetius;Manduca lucetius;Rhabdatomis laudamia;Gn sp_san_luis_7003;Oxydia trychiata;Idaea sp_group_san_luis;Stenoma byssina;Melanolophia sp_san_luis;Amorbia emigratella;Thysanopyga amarantha;Marimatha nigrofimbria;Iridopsis lurida--oberthuri--herse;Clepsis sp_san_luis_a;Sabulodes colombiata;Scopula sp_group_san_luis;Idalus crinis;Macrocneme iole;Urodus sp_san_luis_c;Hymenia perspectalis;Gonodonta paraequalis;Disphragis proba;Rhabdatomis laudamia;Ethmia exornata;Hymenia perspectalis;Gn sp_san_luis_2021;Quentalia subumbrata;Disphragis proba;Oospila venezuelata;Amorbia emigratella;Glena mopsaria;Herbita amicaria;Hylesia continua;Hylesia continua;Hylesia continua;Gn sp_san_luis_1023;Ethmia exornata;Gn sp_san_luis_1003;Psilosetia pura;Psilosetia pura;Psilosetia pura;Thysanopyga casperia;Glena mopsaria;Nola sp_san_luis_a;Herbita amicaria;Idaea sp_group_san_luis;Gn sp_san_luis_4001;Clemensia leopardina;Gn sp_san_luis_1004;Thysanopyga amarantha;Conchylodes erinalis;Disphragis proba;Semaeopus illimitata;Gn sp_san_luis_1010;Idaea sp_group_san_luis;Renia vinasalis;Euphyia sp_group_san_luis;Glena mopsaria;Josia sp_group_san_luis;Eois sp_san_luis_e;Stenoma byssina;Pyrgion repanda;Glena mopsaria;Cimicodes albicosta;Phyllodonta latrata;Lineodes sp_san_luis_a;Hymenia 
perspectalis;Acharia hyperoche;Conchylodes erinalis;Oxydia bilinea;Iridopsis lurida--oberthuri--herse;Hemiceras pernubila;Oxydia masthala;Isogona continua--natatrix;Amorbia emigratella;Cliniodes opalalis;Hypena livia;Cecharismena sp_san_luis_a;Hemiceras nigricosta;Glenopteris oculifera;Melanolophia sp_san_luis;Antaeotricha sp_san_luis_a;Leucanopsis longa;Anomis texana;Crambidia cephalica;Ascalapha odorata;Ascalapha odorata 20120623 33.9 -83.3$Renia flavipunctalis;Peridea basitriens;Peridea basitriens;Idaea obfusaria;Eulithis diversilineata;Clemensia albata;Acrolophus texanella;Chytonix palliatricula;Idia rotundalis;Spilosoma congrua;Spilosoma congrua;Spilosoma congrua;Parapediasia decorellus;Aethiophysa lentiflualis;Datana major;Datana major;Spodoptera ornithogalli;Cisthene packardii;Idaea obfusaria;Nigetia formosalis;Clemensia albata;Eutrapela clemataria;Ectropis crepuscularia;Heterocampa obliqua;Heterocampa obliqua;Baileya dormitans;Datana drexelii;Datana drexelii;Eulithis diversilineata;Crambidia pallida--uniformis;Arta statalis;Megalopyge opercularis;Megalopyge opercularis;Heterocampa obliqua;Isochaetes beutenmuelleri;Eudryas grata;Eudryas grata;Parasa chloris;Parasa chloris;Palthis asopialis;Apoda biguttata;Apoda biguttata;Paectes abrostoloides;Clemensia albata;Loxostegopsis merrickalis;Loxostegopsis merrickalis;Hypagyrtis esther;Idia julia;Microcrambus elegans;Megalopyge opercularis;Melanolophia signataria;Ectropis crepuscularia;Nigetia formosalis 20120624 10.3 -84.8$Nola sp_san_luis_a;Nola sp_san_luis_a;Gn sp_san_luis_2009;Amorbia emigratella;Eois sp_san_luis_b;Eupithecia sp_group_san_luis;Eusarca asteria--cayennaria;Gn sp_san_luis_1027;Glena mopsaria;Iridopsis validaria;Macaria carpo;Glena mopsaria;Campatonema lineata;Eusarca asteria--cayennaria;Ametris nitocris;Melanolophia sp_san_luis;Iridopsis lurida--oberthuri--herse;Thysanopyga carfinia;Conchylodes erinalis;Gn sp_san_luis_1004;Phyllodonta latrata;Gn sp_san_luis_1001;Nola sp_san_luis_a;Gn 
sp_san_luis_1001;Glena mopsaria;Gn sp_san_luis_1003;Euphyia sp_group_san_luis;Eois sp_san_luis_b;Euphyia sp_group_san_luis;Pareuchaetes insulata;Phyllodonta latrata;Glena mopsaria;Xylophanes porcus;Trygodes amphion;Synnomos sp_san_luis_a;Nola sp_san_luis_a;Eucereon aroa;Nephodia betala;Eusarca asteria--cayennaria;Thysanopyga amarantha;Gn sp_san_luis_1024;Amorbia emigratella;Conchylodes erinalis;Sphacelodes vulneraria;Sabulodes arge;Nola sp_san_luis_a;Gn sp_san_luis_1042;Euphyia sp_group_san_luis;Simopteryx torquataria;Clepsis sp_san_luis_a;Oospila venezuelata;Oxydia bilinea;Oospila venezuelata;Eusarca asteria--cayennaria;Anomis texana;Eusarca crameraria--brown;Nola sp_san_luis_a;Melanolophia sp_san_luis;Nola sp_san_luis_a;Melanolophia sp_san_luis;Gn sp_san_luis_8097;Gn sp_san_luis_1001;Nola sp_san_luis_a;Semaeopus viridiplaga;Gn sp_san_luis_1023;Iridopsis lurida--oberthuri--herse;Isochromodes caleta;Isochromodes caleta;Antaeotricha sp_san_luis_d;Crambidia myrlosea;Oospila venezuelata;Gn sp_san_luis_8032;Phyllodonta latrata;Amorbia emigratella;Oospila venezuelata;Gn sp_san_luis_2022;Glena mopsaria;Oospila venezuelata;Glena mopsaria;Euphyia sp_group_san_luis;Gn sp_san_luis_1001;Macaria carpo;Hemiceras pernubila;Gn sp_san_luis_1003;Pyrgion repanda;Gn sp_san_luis_1002;Gn sp_san_luis_1023;Opharus rudis;Tricentrogyna vinacea;Trygodes amphion;Anticarsia gemmatalis;Eois sp_san_luis_b;Glena mopsaria;Euphyia sp_group_san_luis;Synnomos sp_san_luis_a;Nola sp_san_luis_a;Glena mopsaria;Trygodes amphion;Lobocleta tenellata;Gn sp_san_luis_7001;Thysanopyga amarantha;Synnomos sp_san_luis_a;Gn sp_san_luis_1002;Gn sp_san_luis_1002;Gn sp_san_luis_1002;Eusarca asteria--cayennaria;Cratoptera zarumata;Sabulodes colombiata;Synnomos sp_san_luis_a;Stenoma byssina;Sanys irrosea;Gn sp_san_luis_1001;Gn sp_san_luis_1011;Euphyia sp_group_san_luis;Sanys irrosea;Herbita aglausaria;Eois sp_san_luis_b;Bagisara laverna;Anticarsia gemmatalis;Melanolophia sp_san_luis;Eois sp_san_luis_e;Synnomos urota;Gn 
sp_san_luis_1003;Iridopsis lurida--oberthuri--herse;Gn sp_san_luis_1002;Gn sp_san_luis_1003;Eusarca asteria--cayennaria;Gn sp_san_luis_8097;Opharus rudis;Thysanopyga amarantha;Scopula sp_group_san_luis;Gn sp_san_luis_1042;Idaea sp_group_san_luis;Eusarca asteria--cayennaria;Pareuchaetes insulata;Iridopsis validaria;Glena mopsaria;Gn sp_san_luis_1003;Herminocala sabata;Gizma undilinealis;Nola sp_san_luis_a;Herminocala sabata;Josia sp_group_san_luis;Melanolophia sp_san_luis;Gn sp_san_luis_7001;Polypoetes villia;Gn sp_san_luis_2006;Gn sp_san_luis_8073;Nola sp_san_luis_a;Eois sp_san_luis_b;Glena mopsaria;Gn sp_san_luis_7001;Melanolophia sp_san_luis;Gn sp_san_luis_1004;Idaea sp_group_san_luis;Nola sp_san_luis_a;Eusarca sp_san_luis_a;Gn sp_san_luis_1002;Gn sp_san_luis_8005;Idaea sp_group_san_luis;Dichomeris arotrosema;Simopteryx torquataria;Idaea sp_group_san_luis;Nola sp_san_luis_a;Thysanopyga amarantha;Simopteryx torquataria;Thysanopyga amarantha;Gn sp_san_luis_1023;Trygodes amphion;Eois sp_san_luis_b;Semaeopus viridiplaga;Euphyia sp_group_san_luis;Hemiceras pernubila;Elaphria sp_san_luis_a;Rivula leucosticta;Scopula sp_group_san_luis;Thysanopyga amarantha;Gn sp_san_luis_7001;Semaeopus viridiplaga;Dichomeris arotrosema;Melanolophia sp_san_luis;Epimecis semicompleta;Phrygionis platinata;Gn sp_san_luis_1005;Opharus rudis;Melanolophia sp_san_luis;Pachylia syces;Pachylia syces;Pachylia syces;Pachylia syces;Pachylia syces;Thysanopyga amarantha;Eois sp_san_luis_b;Semaeopus sp_san_luis_a;Bertholdia specularis;Melese monima;Rhabdatomis laudamia;Tosale aucta;Phostria tedea;Ischnurges eudamidasalis;Semaeopus viridiplaga;Trygodes amphion;Opisthoxia miletia;Eusarca melenda;Conchylodes erinalis;Conchylodes erinalis;Glena mopsaria;Melanolophia sp_san_luis;Hymenia perspectalis;Tosale aucta;Gn sp_san_luis_7001;Euphyia sp_group_san_luis;Gn sp_san_luis_8001;Scopula sp_group_san_luis;Glena mopsaria;Amorbia emigratella;Eois sp_san_luis_f;Amorbia emigratella;Amorbia emigratella;Hemerophila 
gradella;Amorbia emigratella;Givira lineaeplena;Rhabdatomis laudamia;Hypena andraca;Macaria carpo;Clepsis sp_san_luis_a;Euphyia sp_group_san_luis;Gn sp_san_luis_2020;Clepsis sp_san_luis_a;Clepsis sp_san_luis_a;Disphragis proba;Cliniodes opalalis;Gn sp_san_luis_7003;Hymenia perspectalis;Micrathetis dasarada;Iridopsis lurida--oberthuri--herse;Diphthera festiva;Gn sp_san_luis_1001;Thysanopyga amarantha;Desmia bajulalis;Idaea sp_group_san_luis;Euphyia sp_group_san_luis;Conchylodes concinnalis;Hymenia perspectalis;Euphyia sp_group_san_luis;Tetanolita mynesalis;Pachylioides resumens;Cratoptera zarumata;Iridopsis lurida--oberthuri--herse;Glena mopsaria;Euphyia sp_group_san_luis;Gn sp_san_luis_2003;Udea rubigalis;Cimicodes albicosta;Euphyia sp_group_san_luis;Renodes curviluna;Trygodes amphion;Iridopsis lurida--oberthuri--herse;Lobocleta tenellata;Iridopsis lurida--oberthuri--herse;Gn sp_san_luis_1001;Oospila venezuelata;Xenosoma nigromarginatum;Eucereon relegata;Iridopsis lurida--oberthuri--herse;Renodes curviluna;Macaria approximaria--gambarina--ostia;Glena mopsaria;Lineodes sp_san_luis_a;Cosmosoma impar;Eucereon tigrata;Eusarca asteria--cayennaria;Sabulodes colombiata;Pseudodirphia menander;Hymenia perspectalis;Eucereon tigrata;Sabulodes colombiata;Oospila venezuelata;Pyrgion repanda;Ischnurges eudamidasalis 20120624 33.9 -83.3$Acrolophus popeanella;Feltia subterranea;Acrolophus texanella;Acrolophus texanella;Neodactria luteolella;Neodactria luteolella;Renia flavipunctalis;Lithacodes fasciola;Bleptina inferior;Amydria effrentella;Cisthene packardii;Baileya ophthalmica;Protoboarmia porcelaria;Glyphidocera lactiflosella;Clemensia albata;Spilosoma congrua;Spilosoma congrua;Spilosoma congrua;Rhyacionia rigidana;Diatraea lisetta'
CR_query=re.compile(r'(\d\d\d\d\d\d\d\d\s)10.3\s-84.8\$(.*?)\d\d\d\d\d\d\d\d')
x=re.findall(CR_query,s)
d=[i[0] for i in x]
print "d", d
c=[i[1] for i in x]
print "c", c
print "len(c)", len(c)
Try using the regex:
(\d{8}\s)10\.3\s-84\.8\$(.*?)(?=\d{8}|$)
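Your current regex was preventing successive matches because the matches overlapped: the trailing \d\d\d\d\d\d\d\d consumed the next date, so the following match could not start on it. I removed this by using a positive lookahead, (?= ... ), which is a zero-width assertion: it checks that the next date (or the end of the string) follows without consuming it.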
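A minimal sketch of the difference, assuming a short made-up string (sample is hypothetical, not your real data) with the same date / coordinates / "$" layout:

```python
import re

# Hypothetical short sample with the same layout as the real string.
sample = ('20120622 10.3 -84.8$A b;C d '
          '20120623 10.3 -84.8$E f;G h '
          '20120624 10.3 -84.8$I j')

# The trailing \d{8} consumes the next date, so the next match cannot start on it.
consuming = re.findall(r'(\d{8}\s)10\.3\s-84\.8\$(.*?)\d{8}', sample)

# The lookahead only checks for the next date (or end of string) without consuming it.
lookahead = re.findall(r'(\d{8}\s)10\.3\s-84\.8\$(.*?)(?=\d{8}|$)', sample)

dates = [d for d, _ in lookahead]
```

Here consuming holds only the first record, while lookahead holds all three, including the last one, which has no date after it.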
Also, I escaped your periods. They should be escaped if they are intended to be literal dots; an unescaped . matches any character except a newline.
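To illustrate why the escaping matters, here is a small sketch (the test strings are made up for the example):

```python
import re

unescaped = re.compile(r'10.3')   # "." matches any character except a newline
escaped = re.compile(r'10\.3')    # "\." matches only a literal dot

loose_hit = unescaped.search('10x3') is not None    # matches, probably not intended
strict_hit = escaped.search('10x3') is not None     # no match
literal_hit = escaped.search('10.3') is not None    # matches the real coordinate
```

With your data the unescaped version happens to work, but it would also accept strings that are not the coordinates you meant.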
It doesn't look like your regex will match your string.
Your regex looks for 8 digits followed by "10.3 -84.8$", then captures everything up to 8 more digits. However, it doesn't look like your string has those 8 digits anywhere else.