Impossible to get a specific table with `read_html`

I am trying to access a table at a URL with the pandas read_html function.

My problem is that read_html does not seem to find my table even though it exists. I don't know why, except that the table I want to read has the same attribute as another one.

First I did:

import pandas as pd

url = 'https://geco.amf-france.org/Bio/res_part.aspx?NomProd=&OrderBy=&OrderBySens=&NumAgr=&CodeISIN=&NomSoc=&selectNRJ=OPCVM&ClassProd=0&TypeProd=&npos='
df = pd.read_html(url)  # returns a list of DataFrames, one per table found
df

It returns three DataFrames, stored in df[0], df[1] and df[2], but none of them is the main table I see on the page. They are only small tables, not the 'big' one shown in the image:

[                                                   0
 0  Recherche Liste  Vendredi 2 octobre 2020  Prod...,
                    0             1                     2  \
 0          Recherche     Recherche             Recherche   
 1  Produit financier  Gestionnaire  Valeurs liquidatives   
 
                                3        4                       5  \
 0                      Recherche    Liste                   Liste   
 1  Associations Professionnelles  Encours  Gestionnaires agréés   
 
                                        6                        7  
 0                                  Liste  Vendredi 2 octobre 2020  
 1  Gestionnaires de l'EEE - Passeport IN   Recherche Documentaire  ,
                    0   1                         2      3
 0   Code ISIN part : NaN          Nom de Produit :    NaN
 1  N° d'agrément : NaN  Classification produit :    NaN
 2           Nom SG : NaN       Régime juridique :  OPCVM] 

I also tried:

df = pd.read_html(url, attrs = {'class' : 'ctcoltableresult2'})
df

but it returns nothing.

[Image: the main table I want to get]

Any ideas?

It looks like it works if you specify the flavor as bs4. I also added attrs so that only the 'big' table is returned.

url = 'https://geco.amf-france.org/Bio/res_part.aspx?NomProd=&OrderBy=&OrderBySens=&NumAgr=&CodeISIN=&NomSoc=&selectNRJ=OPCVM&ClassProd=0&TypeProd=&npos='
df = pd.read_html(url, flavor='bs4', attrs={'class':'ctcoltableresult2'})
df
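To check whether the choice of parser is really what makes the difference, it can help to run the same attrs filter through each flavor and compare how many tables come back. The following is only a rough sketch: the helper name count_tables_per_flavor and the inline SAMPLE_HTML (with made-up placeholder rows) are assumptions for illustration and are not taken from the real page. On this small, well-formed sample both parsers should agree; the point is to run the same check against the real page's HTML, where they may not.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the real page: two tables, only one of which
# carries the class used in the answer. The rows are placeholder data.
SAMPLE_HTML = """
<html><body>
<table><tr><td>Recherche</td><td>Liste</td></tr></table>
<table class="ctcoltableresult2">
  <tr><th>Code ISIN</th><th>Nom</th></tr>
  <tr><td>XX0000000001</td><td>Placeholder A</td></tr>
  <tr><td>XX0000000002</td><td>Placeholder B</td></tr>
</table>
</body></html>
"""


def count_tables_per_flavor(html, attrs, flavors=("lxml", "bs4")):
    """Map each flavor to the number of tables matching attrs.

    A value of None means the parser for that flavor is not installed;
    0 means pandas found no matching table with that flavor.
    """
    counts = {}
    for flavor in flavors:
        try:
            tables = pd.read_html(StringIO(html), flavor=flavor, attrs=attrs)
            counts[flavor] = len(tables)
        except ImportError:
            # lxml, or bs4 + html5lib, is missing in this environment
            counts[flavor] = None
        except ValueError:
            # pandas raises ValueError when no table matches
            counts[flavor] = 0
    return counts


counts = count_tables_per_flavor(SAMPLE_HTML, {"class": "ctcoltableresult2"})
print(counts)
```

If one flavor reports 0 on the real page while another reports 1, the page's markup is probably being interpreted differently by the two parsers, which would be consistent with flavor='bs4' succeeding where the default did not.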
