
Python: read specific values from a text file and compute a total

I have a text file, Masterlist.txt, which looks something like this:

S1234567A|Jan Lee|Ms|05/10/1990|Software Architect|IT Department|98785432|PartTime|3500
S1234567B|Feb Tan|Mr|10/12/1991|Corporate Recruiter|HR Corporate Admin|98766432|PartTime|1500
S1234567C|Mark Lim|Mr|15/07/1992|Benefit Specialist|HR Corporate Admin|98265432|PartTime|2900
S1234567D|Apr Tan|Ms|20/01/1996|Payroll Administrator|HR Corporate Admin|91765432|FullTime|1600
S1234567E|May Ng|Ms|25/05/1994|Training Coordinator|HR Corporate Admin|98767432|Hourly|1200
S1234567Y|Lea Law|Ms|11/07/1994|Corporate Recruiter|HR Corporate Admin|94445432|PartTime|1600

For each line that contains "PartTime" and whose date (the fourth field) falls after 1995, I want to reduce the Salary (the number at the end of the line) by 50%, and then add the salaries up.

Currently I only know how to select the lines that contain "PartTime". My code looks like this:

with open("Masterlist.txt", "r") as f:
    for x in f:
        # keep only the lines that mention "PartTime"
        if "PartTime" in x:
            print(x)

How do I extract the Salary, reduce it by 50% only if the year is after 1995, and then add it all up?
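One way to approach this without extra libraries is to split each line on "|" and work with the individual fields. The following is only a minimal sketch: the function name `adjusted_total` and its `cutoff_year` parameter are hypothetical, the dates are assumed to be dd/mm/yyyy, and "add it up" is assumed to mean the sum over every row after the qualifying rows have been halved.

```python
import io

# Sample rows copied from the question; in practice pass an open file instead.
SAMPLE = """\
S1234567A|Jan Lee|Ms|05/10/1990|Software Architect|IT Department|98785432|PartTime|3500
S1234567B|Feb Tan|Mr|10/12/1991|Corporate Recruiter|HR Corporate Admin|98766432|PartTime|1500
S1234567C|Mark Lim|Mr|15/07/1992|Benefit Specialist|HR Corporate Admin|98265432|PartTime|2900
S1234567D|Apr Tan|Ms|20/01/1996|Payroll Administrator|HR Corporate Admin|91765432|FullTime|1600
S1234567E|May Ng|Ms|25/05/1994|Training Coordinator|HR Corporate Admin|98767432|Hourly|1200
S1234567Y|Lea Law|Ms|11/07/1994|Corporate Recruiter|HR Corporate Admin|94445432|PartTime|1600
"""


def adjusted_total(lines, cutoff_year=1995):
    """Sum all salaries, halving those of PartTime rows dated after cutoff_year."""
    total = 0.0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split("|")
        year = int(fields[3].split("/")[-1])  # assumes dd/mm/yyyy
        job_type = fields[7]
        salary = float(fields[8])
        if job_type == "PartTime" and year > cutoff_year:
            salary *= 0.5  # reduce by 50%
        total += salary
    return total


print(adjusted_total(io.StringIO(SAMPLE)))
```

With the sample rows above, none of the PartTime entries is dated after 1995, so nothing is halved and the result is simply the sum of all salaries, 12300.0. To total only the reduced rows, move the `total += salary` line inside the `if` block.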

Try using the pandas library. From your question, I assume you want to halve the Salary when the year is before 1995 and the type is PartTime, and otherwise increase it by 50%.

import pandas as pd

path = r'../Masterlist.txt' # path to your .txt file

df = pd.read_csv(path, sep='|', names=[0, 1, 2, 'Date', 4, 5, 6, 'Type', 'Salary'], parse_dates=['Date'], dayfirst=True)
# dayfirst=True because the dates are dd/mm/yyyy; the Date column is now parsed as datetime
print(df.head())

           0         1   2       Date                      4  \
0  S1234567A   Jan Lee  Ms 1990-10-05     Software Architect
1  S1234567B   Feb Tan  Mr 1991-12-10    Corporate Recruiter
2  S1234567C  Mark Lim  Mr 1992-07-15     Benefit Specialist
3  S1234567D   Apr Tan  Ms 1996-01-20  Payroll Administrator
4  S1234567E    May Ng  Ms 1994-05-25   Training Coordinator

                    5         6      Type  Salary
0       IT Department  98785432  PartTime    3500  
1  HR Corporate Admin  98766432  PartTime    1500  
2  HR Corporate Admin  98265432  PartTime    2900  
3  HR Corporate Admin  91765432  FullTime    1600  
4  HR Corporate Admin  98767432    Hourly    1200  

df.Salary = df.apply(lambda row: row.Salary * 0.5 if row['Date'].year < 1995 and row['Type'] == 'PartTime' else row.Salary * 1.5, axis=1)

print(df.Salary.head())

0    1750.0
1     750.0
2    1450.0
3    2400.0
4    1800.0
Name: Salary, dtype: float64

Adjust the if/else expression inside the apply call if you want different behavior.
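As an illustration of such an adjustment, here is a sketch that follows the rule as literally stated in the question: halve the Salary only for PartTime rows dated after 1995, leave every other row unchanged, then sum. The column names and the `adjusted_total` helper are hypothetical, the dates are assumed to be dd/mm/yyyy, and a boolean mask is used here in place of `apply`.

```python
import io

import pandas as pd

# Sample rows copied from the question; in practice pass the path to Masterlist.txt.
SAMPLE = """\
S1234567A|Jan Lee|Ms|05/10/1990|Software Architect|IT Department|98785432|PartTime|3500
S1234567B|Feb Tan|Mr|10/12/1991|Corporate Recruiter|HR Corporate Admin|98766432|PartTime|1500
S1234567C|Mark Lim|Mr|15/07/1992|Benefit Specialist|HR Corporate Admin|98265432|PartTime|2900
S1234567D|Apr Tan|Ms|20/01/1996|Payroll Administrator|HR Corporate Admin|91765432|FullTime|1600
S1234567E|May Ng|Ms|25/05/1994|Training Coordinator|HR Corporate Admin|98767432|Hourly|1200
S1234567Y|Lea Law|Ms|11/07/1994|Corporate Recruiter|HR Corporate Admin|94445432|PartTime|1600
"""

# Hypothetical column names; the file itself has no header row.
COLUMNS = ["ID", "Name", "Title", "Date", "Job", "Dept", "Phone", "Type", "Salary"]


def adjusted_total(df, cutoff_year=1995):
    """Halve Salary for PartTime rows dated after cutoff_year, then sum all rows."""
    qualifies = (df["Type"] == "PartTime") & (df["Date"].dt.year > cutoff_year)
    salary = df["Salary"].astype(float)
    # Series.mask replaces values where the condition is True
    adjusted = salary.mask(qualifies, salary * 0.5)
    return adjusted.sum()


df = pd.read_csv(io.StringIO(SAMPLE), sep="|", names=COLUMNS)
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")  # explicit day-first format

print(adjusted_total(df))
```

On the sample rows no PartTime entry is dated after 1995, so no salary is halved and the total is the plain sum, 12300.0.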
