
Reading from certain rows in python with csv module

I have the following data from a CSV file, which I am reading in Python with the csv module.

ALPHABETICAL ORDER                                                  
        Positions               Classifications                         
Company Booth   Full-Time   Full-Time Visa Sponsor  Part-Time   Internship  Freshman    Sophomore   Junior  Senior  Post-Bacs   MS  PhD Alumni
AIG 10              Yes         Jr          MS      
Baylor�College�of�Medicine    19  Yes Yes                                     Recent
CGG 17  Yes Yes                             MS  PhD Recent
Citi    27/28   Yes         Yes         Jr  Sr              
ExxonMobil  11  Yes         Yes Fr  Soph    Jr  Sr  PB          
    ...                                             
Flow-Cal�Inc. 16  Yes         Yes         Jr  Sr              All
Global�Shop�Solutions   18  Yes         Yes             Sr  PB          All
Harris�County�CTS   22  Yes         Yes         Jr  Sr  PB  MS  PhD All
HCSS    29  Yes         Yes Fr  Soph    Jr  Sr  PB  MS      Recent
Hitachi�Consulting    13  Yes                         Sr      MS      
HP�Inc.   1   Yes         Yes         Jr          MS      Recent
INT�Inc.  20  Yes Yes     Yes         Jr  Sr      MS  PhD 
JPMorgan�Chase�&�Co   3   Yes         Yes         Jr  Sr              
Leidos  390 Yes         Yes Fr  Soph    Jr  Sr  PB  MS      
McKesson    26  Yes                         Sr              

MRE�Consulting�Ltd. 2   Yes                         Sr  PB  MS      All
NetIQ   7               Yes     Soph    Jr  Sr  PB          
PROS    21  Yes                         Sr      MS  PhD All
San�Jacinto�College��   14              Yes     Soph    Jr  Sr  PB  MS      
SAS 4   Yes         Yes Fr  Soph    Jr  Sr  PB  MS      Recent
Smartbridge 8   Yes                         Sr  PB  MS      
Sogeti�USA    15  Yes                         Sr  PB  MS      
Southwest�Research�Institute    12  Yes         Yes         Jr  Sr  PB  MS  PhD All
The�Reynolds�and�Reynolds�Company   23  Yes Yes     Yes Fr  Soph    Jr  Sr  PB          All
UH�Enterprise�Systems   9   Yes Yes Yes Yes Fr  Soph    Jr  Sr  PB  MS  PhD All
U.S.�Marine�Corps   25  Yes         Yes Fr  Soph    Jr  Sr  PB  MS      All
ValuD�Consuting�LLC 5   Yes                         Sr  PB          All
Wipro   24  Yes                         Sr  PB          
BOOTH ORDER                                                 
    Booth   Positions               Classifications                         
Company #   Full-Time   "Full-Time
Visa Sponsor"   Part-Time   Internship  Freshman    Sophomore   Junior  Senior  Post-Bacs   MS  PhD Alumni
HP�Inc.   1   Yes         Yes         Jr          MS      Recent
MRE�Consulting,�Ltd.    2   Yes                         Sr  PB  MS      All
JPMorgan�Chase�&�Co   3   Yes         Yes         Jr  Sr              
SAS 4   Yes         Yes Fr  Soph    Jr  Sr  PB  MS      Recent
ValuD�Consuting�LLC 5   Yes                         Sr  PB          All
NetIQ   7               Yes     Soph    Jr  Sr  PB          
Smartbridge 8   Yes                         Sr  PB  MS      
UH�Enterprise�Systems   9   Yes Yes Yes Yes Fr  Soph    Jr  Sr  PB  MS  PhD All
AIG 10              Yes         Jr          MS      
ExxonMobil  11  Yes         Yes Fr  Soph    Jr  Sr  PB          
Southwest�Research�Institute    12  Yes         Yes         Jr  Sr  PB  MS  PhD All
Hitachi�Consulting    13  Yes                         Sr      MS      
San�Jacinto�College��   14              Yes     Soph    Jr  Sr  PB  MS      
Sogeti�USA    15  Yes                         Sr  PB  MS      
Flow-Cal,�Inc.    16  Yes         Yes         Jr  Sr              All
CGG 17  Yes Yes                             MS  PhD Recent
Global�Shop�Solutions   18  Yes         Yes             Sr  PB          All
Baylor�College�of�Medicine    19  Yes Yes                                     Recent
INT,�Inc. 20  Yes Yes     Yes         Jr  Sr      MS  PhD 
PROS    21  Yes                         Sr      MS  PhD All
Harris�County�CTS   22  Yes         Yes         Jr  Sr  PB  MS  PhD All
The�Reynolds�and�Reynolds�Company   23  Yes Yes     Yes Fr  Soph    Jr  Sr  PB          All
Wipro   24  Yes                         Sr  PB          
U.S.�Marine�Corps   25  Yes         Yes Fr  Soph    Jr  Sr  PB  MS      All
McKesson    26  Yes                         Sr              
Citi    27/28   Yes         Yes         Jr  Sr              
HCSS    29  Yes         Yes Fr  Soph    Jr  Sr  PB  MS      Recent
Leidos  30  Yes         Yes Fr  Soph    Jr  Sr  PB  MS      

As you can see, there's quite a bit of data, and I am trying to get only this from it:

0 AIG,10,,,,Yes,,,Jr,,,MS,,
1 Baylor College of Medicine,19,Yes,Yes,,,,,,,,,,Recent
2 CGG,17,Yes,Yes,,,,,,,,MS,PhD,Recent
3 Citi,27/28,Yes,,,Yes,,,Jr,Sr,,,,
4 ExxonMobil,11,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,,,
5 Flow-Cal Inc.,16,Yes,,,Yes,,,Jr,Sr,,,,All
6 Global Shop Solutions,18,Yes,,,Yes,,,,Sr,PB,,,All
7 Harris County CTS,22,Yes,,,Yes,,,Jr,Sr,PB,MS,PhD,All
8 HCSS,29,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,Recent
9 Hitachi Consulting,13,Yes,,,,,,,Sr,,MS,,
10 HP Inc.,1,Yes,,,Yes,,,Jr,,,MS,,Recent
11 INT Inc.,20,Yes,Yes,,Yes,,,Jr,Sr,,MS,PhD,
12 JPMorgan Chase & Co,3,Yes,,,Yes,,,Jr,Sr,,,,
13 Leidos,390,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,
14 McKesson,26,Yes,,,,,,,Sr,,,,
15 MRE Consulting Ltd.,2,Yes,,,,,,,Sr,PB,MS,,All
16 NetIQ,7,,,,Yes,,Soph,Jr,Sr,PB,,,
17 PROS,21,Yes,,,,,,,Sr,,MS,PhD,All
18 San Jacinto College ,14,,,,Yes,,Soph,Jr,Sr,PB,MS,,
19 SAS,4,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,Recent
20 Smartbridge,8,Yes,,,,,,,Sr,PB,MS,,
21 Sogeti USA,15,Yes,,,,,,,Sr,PB,MS,,
22 Southwest Research Institute,12,Yes,,,Yes,,,Jr,Sr,PB,MS,PhD,All
23 The Reynolds and Reynolds Company,23,Yes,Yes,,Yes,Fr,Soph,Jr,Sr,PB,,,All
24 UH Enterprise Systems,9,Yes,Yes,Yes,Yes,Fr,Soph,Jr,Sr,PB,MS,PhD,All
25 U.S. Marine Corps,25,Yes,,,Yes,Fr,Soph,Jr,Sr,PB,MS,,All
26 ValuD Consuting LLC,5,Yes,,,

So the code should skip any rows that look like ",,," and give me only the rows that I need (0-26).

I've looked around and found a lot of help that uses the pandas module; however, I can't use pandas for this assignment.

I've tried doing it this way

import csv

with open('Spring.csv', 'r') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter='\t')
    # Skip the three header rows at the top of the file
    next(csv_reader)
    next(csv_reader)
    next(csv_reader)

    for line in csv_reader:
        print(line)

However, this isn't effective, and I am sure there is a better way to sort through the data. Any input or advice would be appreciated :)

This should do what you want. The problem is that your print also outputs the non-ASCII characters in the file. I have added re.sub so that only letters, digits, periods, and commas are kept; every other character is replaced with a space.
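To see what the re.sub call does on its own, here is a small sketch on a made-up row. The sample string is hypothetical, and it assumes the odd characters in the file are non-breaking spaces (U+00A0), which the question does not confirm. The pattern is the one used in this answer, written as a raw string.

```python
import re

# Pattern from the answer: matches anything that is NOT a letter,
# digit, newline, period, or comma.
NOT_ALLOWED = r'[^a-zA-Z0-9\n\.,]'

# Hypothetical row; \u00a0 (non-breaking space) is an assumption about the file.
row = "Flow-Cal\u00a0Inc.,16,Yes,27/28,JPMorgan\u00a0Chase\u00a0&\u00a0Co"

# Every matched character is replaced with a plain space.
cleaned = re.sub(NOT_ALLOWED, ' ', row)
print(cleaned)
```

Note that ordinary ASCII punctuation such as "-", "/", and "&" is replaced as well, so "Flow-Cal" becomes "Flow Cal" and "27/28" becomes "27 28".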

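import csv
import re

repeatCheck = []  # rows already printed, used to skip duplicates
with open('Spring.csv', 'r') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=' ')
    idx = 1
    # Skip the three header rows at the top of the file
    for line in list(csv_reader)[3:]:
        rowdata = ",".join(line)

        # Keep only non-empty rows that do not start with ",,," and have not been printed yet
        if (not rowdata.startswith(",,,")) and (rowdata not in repeatCheck) and (rowdata != ""):
            # Replace everything except letters, digits, newline, '.' and ',' with a space
            print(idx, re.sub(r'[^a-zA-Z0-9\n\.,]', ' ', rowdata))
            idx += 1
            repeatCheck.append(rowdata)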

Output:

1 AIG,10,,,,,,,,,,,,,,Yes,,,,,,,,,Jr,,,,,,,,,,MS
2 Baylor      College      of      Medicine,,,,19,,Yes,Yes,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,Recent
3 CGG,17,,Yes,Yes,,,,,,,,,,,,,,,,,,,,,,,,,,,,,MS,,PhD,Recent
4 Citi,,,,27 28,,,Yes,,,,,,,,,Yes,,,,,,,,,Jr,,Sr
5 ExxonMobil,,11,,Yes,,,,,,,,,Yes,Fr,,Soph,,,,Jr,,Sr,,PB
6 Flow Cal      Inc.,16,,Yes,,,,,,,,,Yes,,,,,,,,,Jr,,Sr,,,,,,,,,,,,,,All
7 Global      Shop      Solutions,,,18,,Yes,,,,,,,,,Yes,,,,,,,,,,,,,Sr,,PB,,,,,,,,,,All
8 Harris      County      CTS,,,22,,Yes,,,,,,,,,Yes,,,,,,,,,Jr,,Sr,,PB,,MS,,PhD,All
9 HCSS,,,,29,,Yes,,,,,,,,,Yes,Fr,,Soph,,,,Jr,,Sr,,PB,,MS,,,,,,Recent
10 Hitachi      Consulting,,,,13,,Yes,,,,,,,,,,,,,,,,,,,,,,,,,Sr,,,,,,MS
11 HP      Inc.,,,1,,,Yes,,,,,,,,,Yes,,,,,,,,,Jr,,,,,,,,,,MS,,,,,,Recent
12 INT      Inc.,,20,,Yes,Yes,,,,,Yes,,,,,,,,,Jr,,Sr,,,,,,MS,,PhD
13 JPMorgan      Chase             Co,,,3,,,Yes,,,,,,,,,Yes,,,,,,,,,Jr,,Sr
14 Leidos,,390,Yes,,,,,,,,,Yes,Fr,,Soph,,,,Jr,,Sr,,PB,,MS
15 McKesson,,,,26,,Yes,,,,,,,,,,,,,,,,,,,,,,,,,Sr
16 MRE      Consulting      Ltd.,2,,,Yes,,,,,,,,,,,,,,,,,,,,,,,,,Sr,,PB,,MS,,,,,,All
17 NetIQ,,,7,,,,,,,,,,,,,,,Yes,,,,,Soph,,,,Jr,,Sr,,PB
18 PROS,,,,21,,Yes,,,,,,,,,,,,,,,,,,,,,,,,,Sr,,,,,,MS,,PhD,All
19 San      Jacinto      College            ,,,14,,,,,,,,,,,,,,Yes,,,,,Soph,,,,Jr,,Sr,,PB,,MS
20 SAS,4,,,Yes,,,,,,,,,Yes,Fr,,Soph,,,,Jr,,Sr,,PB,,MS,,,,,,Recent
21 Smartbridge,8,,,Yes,,,,,,,,,,,,,,,,,,,,,,,,,Sr,,PB,,MS
22 Sogeti      USA,,,,15,,Yes,,,,,,,,,,,,,,,,,,,,,,,,,Sr,,PB,,MS
23 Southwest      Research      Institute,,,,12,,Yes,,,,,,,,,Yes,,,,,,,,,Jr,,Sr,,PB,,MS,,PhD,All
24 The      Reynolds      and      Reynolds      Company,,,23,,Yes,Yes,,,,,Yes,Fr,,Soph,,,,Jr,,Sr,,PB,,,,,,,,,,All
25 UH      Enterprise      Systems,,,9,,,Yes,Yes,Yes,Yes,Fr,,Soph,,,,Jr,,Sr,,PB,,MS,,PhD,All
26 U.S.      Marine      Corps,,,25,,Yes,,,,,,,,,Yes,Fr,,Soph,,,,Jr,,Sr,,PB,,MS,,,,,,All
27 ValuD      Consuting      LLC,5,,,Yes,,,,,,,,,,,,,,,,,,,,,,,,,Sr,,PB,,,,,,,,,,All
28 Wipro,,,24,,Yes,,,,,,,,,,,,,,,,,,,,,,,,,Sr,,PB
29 BOOTH,ORDER
30 Company, ,,,Full Time,,,Full Time
Visa Sponsor,,,Part Time,,,Internship,,Freshman,,,,Sophomore,,,Junior,,Senior,,Post Bacs,,,MS,,PhD,Alumni
31 MRE      Consulting,      Ltd.,,,,2,,,Yes,,,,,,,,,,,,,,,,,,,,,,,,,Sr,,PB,,MS,,,,,,All
32 Flow Cal,      Inc.,,,,16,,Yes,,,,,,,,,Yes,,,,,,,,,Jr,,Sr,,,,,,,,,,,,,,All
33 INT,      Inc.,20,,Yes,Yes,,,,,Yes,,,,,,,,,Jr,,Sr,,,,,,MS,,PhD
34 Leidos,,30,,Yes,,,,,,,,,Yes,Fr,,Soph,,,,Jr,,Sr,,PB,,MS
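The output above still contains rows from the BOOTH ORDER table (29 to 34) and many extra empty fields from splitting on spaces. As an alternative, here is a minimal sketch that uses only the csv module. It assumes the file is tab-delimited (as in the question's own code), that the first table ends at a row whose first cell is "BOOTH ORDER", and that the odd characters in company names are non-breaking spaces (U+00A0). The function name read_first_section and the sample data are hypothetical.

```python
import csv
import io

def read_first_section(lines, stop_marker="BOOTH ORDER", header_rows=3):
    """Return cleaned data rows from the first table only (hypothetical helper)."""
    reader = csv.reader(lines, delimiter="\t")
    rows = []
    for i, row in enumerate(reader):
        # Assumption: stray characters are non-breaking spaces; normalize and trim.
        cells = [cell.replace("\u00a0", " ").strip() for cell in row]
        if cells and cells[0] == stop_marker:
            break  # the second table starts here, so stop reading
        if i < header_rows:
            continue  # skip the header rows at the top
        if not cells or not cells[0]:
            continue  # skip blank lines and rows with an empty first cell
        rows.append(cells)
    return rows

# Small made-up sample in the same shape as the file in the question.
sample = (
    "ALPHABETICAL ORDER\t\t\n"
    "\tPositions\tClassifications\n"
    "Company\tBooth\tFull-Time\n"
    "AIG\t10\t\n"
    "Baylor\u00a0College\u00a0of\u00a0Medicine\t19\tYes\n"
    "\t...\t\n"
    "\n"
    "CGG\t17\tYes\n"
    "BOOTH ORDER\t\t\n"
    "HP\u00a0Inc.\t1\tYes\n"
)

rows = read_first_section(io.StringIO(sample))
for idx, row in enumerate(rows):
    print(idx, ",".join(row))
```

With the real file, the same function could be called on the object returned by open('Spring.csv', newline=''), possibly with an explicit encoding argument if the default does not match the file. Stopping at the marker row avoids the need for a duplicate check, since the second table is never read.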


 