
Python - How to read a CSV file separated by commas when the values themselves contain commas?

The file has a URL that contains a comma. For example: ~oref=https://tuclothing.tests.co.uk/c/Girls/Girls_Underwear_Socks&Tights?INITD=GNav-CW-GrlsUnderwear&title=Underwear,+Socks+&+Tights

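There is a comma between "Underwear" and "Socks", and it is making my life difficult.

Is there a way to tell the reader (pandas, the csv reader, etc.) that the whole URL is a single value?

Here is a bigger sample with the columns and values: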

Event Time,User ID,Advertiser ID,TRAN Value,Other Data,ORD Value,Interaction Time,Conversion ID,Segment Value 1,Floodlight Configuration,Event Type,Event Sub-Type,DBM Auction ID,DBM Request Time,DBM Billable Cost (Partner Currency),DBM Billable Cost (Advertiser Currency),
1.47E+15,CAESEKoMzQamRFTrkbdTDT5F-gM,2934701,,~oref=https://tuclothing.tests.co.uk/c/NewIn/NewIn_Womens?q=%3AnewArrivals&page=2&size=24,4.60E+12,1.47E+15,1,0,940892,CONVERSION,POSTCLICK,,,0,0,
1.47E+15,CAESEKQhGXdLq0FitBKF5EPPfgs,2934701,,~oref=https://tuclothing.tests.co.uk/c/Women/Women_Accessories?INITD=GNav-WW-Accesrs&q=%3AnewArrivals&title=Accessories&mkwid=sv5biFf2y_dm&pcrid=90361315613&pkw=leather%20bag&pmt=e&med=Search&src=Google&adg=Womens_Accessories&kw=leather+bag&cmp=TU_Women_Accessories&adb_src=4,4.73E+12,1.47E+15,1,0,940892,CONVERSION,POSTCLICK,,,0,0,
1.47E+15,CAESEEpNRaLne21k6juip9qfAos,2934701,,num=16512910;~oref=https://tuclothing.tests.co.uk/,1,1.47E+15,1,0,940892,CONVERSION,POSTCLICK,,,0,0,
1.47E+15,CAESEJ3a2YRrPSSeeRUFHDSoXNQ,2934701,,~oref=https://tuclothing.tests.co.uk/c/Girls/Girls_Underwear_Socks&Tights?INITD=GNav-CW-GrlsUnderwear&title=Underwear,+Socks+&+Tights,8.12E+12,1.47E+15,1,0,940892,CONVERSION,POSTCLICK,,0,0,0
1.47E+15,CAESEGmwaNjTvIrQ3MoIvqiRC8U,2934701,,~oref=https://tuclothing.tests.co.uk/login/checkout,1.75E+12,1.47E+15,1,0,940892,CONVERSION,POSTCLICK,,,0,0,
1.47E+15,CAESEM3G-Nh6Q0OhboLyOhtmtiI,2934701,,~oref=https://3984747.fls.doubleclick.net/activityi;~oref=http%3A%2F%2Fwww.tests.co.uk%2Fshop%2Fgb%2Fgroceries%2Ffrozen-%2Fbeef--pork---lamb,3.74E+12,1.47E+15,1,0,940892,CONVERSION,POSTCLICK,,,0,0,
1.47E+15,CAESENlK7oc-ygl637Y2is3a90c,2934701,,~oref=https://tuclothing.tests.co.uk/,5.10E+12,1.47E+15,1,0,940892,CONVERSION,POSTCLICK,,,0,0,

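In this case, the only commas you are running into appear to be inside the URLs. You could run the CSV file through a preprocessing step that either strips the commas from the URLs or URL-encodes them.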

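Personally, I would go with the URL-encoding approach and convert the commas to %2C (the percent-encoding of a comma; %2E is a period). That way there are no commas left in the URL when you read the CSV row values, yet the URL still points to the same reference/target page.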
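As a small illustration of that encoding step, here is a sketch using only the standard library. The helper name `encode_commas` is hypothetical, and the URL is the one from the question.

```python
from urllib.parse import quote, unquote

# Hypothetical helper: percent-encode only the commas, leave the rest of the URL alone.
def encode_commas(url: str) -> str:
    return url.replace(",", quote(","))  # quote(",") is "%2C"

URL = ("https://tuclothing.tests.co.uk/c/Girls/Girls_Underwear_Socks&Tights"
       "?INITD=GNav-CW-GrlsUnderwear&title=Underwear,+Socks+&+Tights")

ENCODED = encode_commas(URL)   # no literal commas remain
DECODED = unquote(ENCODED)     # percent-decoding restores the original text
```

Only the commas are replaced here rather than quoting the whole URL, so the `&`, `?` and `+` characters keep their meaning in the query string.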

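If you were running into this in other (non-URL) fields, or at unknown/random positions in the CSV rows, there would be no easy solution at all. But since you know exactly where the problem occurs each time, you can do a static lookup for that character and replace it whenever it is found in that particular field.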
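One way to sketch that "known position" fix in Python: split each line on commas, and if a row has more fields than the header, assume the surplus pieces all belong to the URL column and glue them back together with %2C. The names `repair_line`, `read_repaired` and `URL_INDEX`, and the shortened sample data, are made up for illustration. The sketch assumes every undamaged row has exactly as many fields as the header and that no other column contains commas. The affected row in the sample above ends differently from the other rows, so it is worth checking field counts on the real file first.

```python
URL_INDEX = 4  # 0-based position of the "Other Data" column in the header

def repair_line(line, n_columns, url_index=URL_INDEX):
    """Split a raw CSV line and merge surplus fields back into the URL column."""
    fields = line.rstrip("\r\n").split(",")
    extra = len(fields) - n_columns
    if extra <= 0:
        return fields  # nothing to repair
    # The URL was split into extra + 1 pieces; rejoin them with encoded commas.
    url = "%2C".join(fields[url_index:url_index + extra + 1])
    return fields[:url_index] + [url] + fields[url_index + extra + 1:]

def read_repaired(text):
    """Return (header, rows) with every row repaired to the header's width."""
    lines = [line for line in text.splitlines() if line]
    header = lines[0].split(",")
    rows = [repair_line(line, len(header)) for line in lines[1:]]
    return header, rows

# Shortened, made-up sample in the same shape as the data in the question.
SAMPLE = (
    "Event Time,User ID,Advertiser ID,TRAN Value,Other Data,ORD Value,Event Type\n"
    "1.47E+15,abc,2934701,,~oref=https://example.com/?title=Underwear,+Socks+&+Tights,8.12E+12,CONVERSION\n"
    "1.47E+15,def,2934701,,~oref=https://example.com/,5.10E+12,CONVERSION\n"
)

header, rows = read_repaired(SAMPLE)
```

The repaired `rows` can then be written back out with `csv.writer`, or loaded with `pandas.DataFrame(rows, columns=header)` if pandas is available.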

