
Trying to add a specific column from a split list

I have an input file that I need to split. The file can have any number of lines, and each line has four fields: a region code, the number of fiction books sold in that region, the number of non-fiction books sold in that region, and the tax rate for that region (example: TX 493 515 0.055). I've figured out everything else the program needs to do except summing the totals for fiction books, non-fiction books and total sales. Say there were just three lines, and the fiction books sold for each region were 493, 500 and 489, each on a separate line. Here is what I wrote; what am I doing wrong?

while (myFile != ""):

    myFile = myFile.split()
    sumFiction = 0
    for i in range(myFile):
        sumFiction = sumFiction + eval(myFile[1])

If I split a line such as (CO 493 515 0.055), wouldn't CO be myFile[0], 493 be myFile[1], etc.? Any help would be appreciated.

Edit: Sorry, I should have been a bit more specific. I'm reading from a file, and let's say this file has 3 lines (but my code needs to work for any number of lines):

TX 415 555 0.55
MN 330 999 0.78
HA 401 674 0.99

First is the region code, then the number of fiction books sold, then the number of non-fiction books sold, and then the tax rate for that region. I need to figure out the total number of books sold in each region etc., which I've done. The only thing I can't figure out is how to sum up the fiction books sold across all three lines (e.g. 415, 330, 401). Here is the code so far:

def ComputeSales(fictionBooks,nonFictionBooks,areaTax):
    total = (fictionBooks * 14.95) + (nonFictionBooks * 9.95)
    tax = total * areaTax
    totalSales = total + tax

    return total,tax,totalSales

def main():
    #inFile = input("Please enter name of book data file:  ")
    #inFile = open(inFile,"r")
    inFile = open("pa7.books","r")
    myFile = inFile.readline()

    print()
    print("{0:14}{1:10}".format("","Units Sold"))
    print("{0:10}{1:11}{2:17}{3:12}{4:8}{5:11}".format(
                "Region","Fiction","Non-Fiction","Total","Tax","Total Sales"))
    print("---------------------------------------------------------------------")

    while (myFile != ""):
        myFile = myFile.split()
        sumFiction = 0
        #for i in range(myFile):
            #sumFiction = sumFiction + eval(myFile[1])

        total,tax,totalSales = ComputeSales(eval(myFile[1]),eval(myFile[2]),eval(myFile[3]))

        print("{0:2}{1:10}{2:13}{3:4}{4:14.2f}{5:10.2f}{6:16.2f}".format(
                   "",myFile[0],myFile[1],myFile[2],total,tax,totalSales))

        myFile = inFile.readline()

    print("---------------------------------------------------------------------")
    #print("{0:11}{1:13}{2:34}{3:2}{4:8}".format(
    #             "Total","15035","3155","$","272843.41"))
    print(sumFiction)

main()

Edit: Okay, my previous answer assumed that myFile was actually a file object, not a line of a file.

Your main problem seems to be that you're trying to do a loop inside the other loop, which doesn't really make sense: you only need one loop here, over the lines of the file, and add up to the total for each line.
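As a minimal sketch of that single-loop idea, here is the accumulation on its own, using the three sample lines from the question held in a list instead of a file (the name sample_lines is only for illustration). The total is initialised once, before the loop, and each line adds its own fiction count:

```python
# Sample data from the question, held in memory for illustration only.
sample_lines = [
    "TX 415 555 0.55",
    "MN 330 999 0.78",
    "HA 401 674 0.99",
]

sum_fiction = 0  # initialise once, before the loop, not inside it
for line in sample_lines:
    fields = line.split()          # e.g. ['TX', '415', '555', '0.55']
    sum_fiction += int(fields[1])  # fields[1] is the fiction count

print(sum_fiction)  # 1146
```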

Here's an edited version of your main function that does that. I've also:

  • switched to a for loop over the file, because it's much more natural.
  • used float instead of eval, as recommended in the comments, so that a malicious or mistaken data file will just crash your program instead of running arbitrary code.
  • switched to using a with statement to open the file: that guarantees that the file will be closed even if your program crashes halfway through, which is a good habit to get into, though it doesn't make a big difference here.
  • switched to the standard Python snake_case style for variable names instead of camelCase. (Also, ComputeSales would typically be compute_sales; CamelCase names are usually only for class names.)
  • changed the filename to an argument, so that you can call it with e.g. main(sys.argv[1] if len(sys.argv) > 1 else "pa7.books") to support command-line arguments.

Here it is:

def main(filename="pa7.books"):
    sum_fiction = 0
    sum_nonfiction = 0
    sum_total = 0

    with open(filename) as in_file:
        for line in in_file:
            if not line.strip():
                continue  # skip any blank lines

            fields = line.split()
            region = fields[0]
            fiction, nonfiction, area_tax = [float(x) for x in fields[1:]]

            total, tax, total_sales = ComputeSales(fiction, nonfiction, area_tax)

            sum_fiction += fiction
            sum_nonfiction += nonfiction
            sum_total += total_sales

            # Print this region's row inside the loop so every line of the file is shown.
            print("{0:2}{1:10}{2:13}{3:4}{4:14.2f}{5:10.2f}{6:16.2f}".format(
                   "", region, fiction, nonfiction, total, tax, total_sales))

    print("---------------------------------------------------------------------")
    print("{0:11}{1:13}{2:34}{3:2}{4:8.2f}".format(
           "Total", sum_fiction, sum_nonfiction, "$", sum_total))

If you don't understand any of the changes I've suggested, feel free to ask!
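As an illustration of the float-versus-eval change, here is a small sketch with a hypothetical helper, parse_line (it is not part of the code above). A malformed or hostile field makes float raise ValueError, whereas eval would have executed it:

```python
def parse_line(line):
    """Split one data line into (region, fiction, nonfiction, area_tax).

    Hypothetical helper for illustration; float() only accepts numeric text.
    """
    fields = line.split()
    region = fields[0]
    fiction, nonfiction, area_tax = [float(x) for x in fields[1:]]
    return region, fiction, nonfiction, area_tax


print(parse_line("TX 415 555 0.55"))

try:
    # eval() would have run this expression; float() rejects it instead.
    parse_line("TX __import__('os').getcwd() 555 0.55")
except ValueError as err:
    print("bad data:", err)
```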

Ugh. This is Ugly. Beautiful is Better Than Ugly. I'm not primarily a Python programmer anymore, so there may be better tooling for this. But let's attack the problem from a conceptual level.

This is standard imperative programming that overcomplicates the problem. It makes it easy to get lost in implementation noise, such as the problem you're having. It keeps you from seeing the forest for the trees. Let's try another approach.

Let's focus on what we need to do, and let the implementation arise from that. First we know we need to read from a file.

Read From File

Scenario: Calculate totals within region database
Feature: Read from database

As a user, in order to be able to view the total sales of my books and differentiate them by fiction and nonfiction, I want to be able to read data from a file.

Given: I have a file that has region data, for example regions.txt
When: I load data from it
Then: I should have associated region data available in my program.

Here is the Python implementation as a test case:

import unittest

class RegionTests(unittest.TestCase):
    def testLoadARegionDatabase(self):
        """Given a region file,when I load it, then it should be stored in memory"""
        # Given region database
        regionDatabase = []
        # When I load it
        with open('./regions.txt','r') as f:
            regionDatabase = f.readlines()
        # Then contents should be available
        self.assertTrue(len(regionDatabase) > 0)
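The test above only passes if ./regions.txt already exists. One way to make it self-contained, sketched here on the assumption that a throwaway file is acceptable, is to write the question's sample data to a temporary file first. The load_region_database helper and the RegionFileTests class name are hypothetical:

```python
import os
import tempfile
import unittest


def load_region_database(path):
    """Return the lines of the region file at `path` (hypothetical helper)."""
    with open(path, "r") as f:
        return f.readlines()


class RegionFileTests(unittest.TestCase):
    def setUp(self):
        # Write the question's sample data to a temporary file.
        handle, self.path = tempfile.mkstemp(suffix=".txt")
        with os.fdopen(handle, "w") as f:
            f.write("TX 415 555 0.55\nMN 330 999 0.78\nHA 401 674 0.99\n")

    def tearDown(self):
        # Remove the temporary file after each test.
        os.remove(self.path)

    def testLoadARegionDatabase(self):
        """Given a region file, when I load it, then its lines are in memory"""
        regionDatabase = load_region_database(self.path)
        self.assertEqual(3, len(regionDatabase))
```

Saved in a file, this can be run with python -m unittest followed by the file name.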

Get Region Data from File

We know, conceptually, that every line in that file has a meaning. Every row is, fundamentally, a Region. We have stored a code, fiction sales, non-fiction sales, and tax rates within our file. The concept of a Region should have an explicit, first class representation within our system, because Explicit is Better Than Implicit.

Feature: Create a Region

As a user, in order to be able to know a region's information, including non-fiction sales, fiction sales, and tax rate, I want to be able to create a Region.

Given: I have data for fiction sales, non-fiction sales, and tax rate
When:  I create a Region
Then:  Its fiction sales, non-fiction sales, and tax rate should be set accordingly

Here is the Python implementation as a test case:

def testCreateRegionFromData(self):
    """Given a set of data, when I create a region, then its non-fiction sales, fiction sales, and tax rate should be set"""
    # Given a set of data
    texas = { "regionCode": "TX", "fiction" : 415, "nonfiction" : 555, "taxRate" : 0.55 }
    # When I create a region
    region = Region(texas["regionCode"], texas["fiction"], texas["nonfiction"], texas["taxRate"])
    # Then its attributes should be set
    # (assertEqual replaces the deprecated assertEquals alias, which was removed in Python 3.12)
    self.assertEqual("TX", region.code)
    self.assertEqual(415, region.fiction)
    self.assertEqual(555, region.nonfiction)
    self.assertEqual(0.55, region.taxRate)

This fails. Let's make it pass.

class Region:
    def __init__(self, code, fiction, nonfiction,rate):
        self.code = code
        self.fiction = fiction
        self.nonfiction = nonfiction
        self.taxRate = rate

Analyze totals

Now we know that our system can represent regions. We want something that can analyze a bunch of regions and give us summary statistics on sales. Let's call that something an Analyst.

Feature: Calculate Total Sales

As a user, in order to be able to know what is going on, I want to be able to ask an Analyst what the total sales are for my regions.

Given: I have a set of regions
When : I ask my Analyst what the total sales are
Then : The analyst should return me the correct answers

Here is the Python implementation as a test case.

def testAnalyzeRegionsForTotalNonFictionSales(self):
    """Given a set of Regions, when I ask an Analyst for total non-fiction sales, then I should get the sum of non-fiction sales"""
    # Given a set of regions
    regions = [ Region("TX", 415, 555, 0.55), Region("MN", 330, 999, 0.78), Region("HA", 401, 674, 0.99) ]
    # When I ask my analyst for the total non-fiction sales
    analyst = Analyst(regions)
    result = analyst.calculateTotalNonFictionSales()
    # Then I should get the sum of the non-fiction sales
    self.assertEqual(2228, result)

This fails. Let's make it pass.

class Analyst:
    def __init__(self,regions):
        self.regions = regions

    def calculateTotalNonFictionSales(self):
        return sum([reg.nonfiction for reg in self.regions])

You should be able to extrapolate for fiction sales from here.
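For instance, the fiction counterpart could look like the sketch below. The method name calculateTotalFictionSales is the one the final program calls; Region and Analyst are repeated here only so the snippet runs on its own.

```python
class Region:
    def __init__(self, code, fiction, nonfiction, rate):
        self.code = code
        self.fiction = fiction
        self.nonfiction = nonfiction
        self.taxRate = rate


class Analyst:
    def __init__(self, regions):
        self.regions = regions

    def calculateTotalNonFictionSales(self):
        return sum(reg.nonfiction for reg in self.regions)

    def calculateTotalFictionSales(self):
        # Same pattern as the non-fiction total, reading a different attribute.
        return sum(reg.fiction for reg in self.regions)


regions = [Region("TX", 415, 555, 0.55),
           Region("MN", 330, 999, 0.78),
           Region("HA", 401, 674, 0.99)]
print(Analyst(regions).calculateTotalFictionSales())  # 1146
```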

Decisions, decisions

There's an interesting design decision to be made when it comes to total sales.

  • Should we have the Analyst directly read the fiction and non-fiction attributes of a Region and sum them up?

We could do it this way:

def calculateTotalSales(self):
    return sum([reg.fiction + reg.nonfiction for reg in self.regions])

But then, what happens if we add "historical drama" (fiction & non-fiction) or some other attribute? Then every time we change Region, we have to change our Analyst in order to take the new structure of Region into account.

No. That is a bad design decision. Region already knows all it needs to know about its total sales. Region should be able to report its totals.

Make Good Choices!

Feature: Report Total Sales
Given: I have a region with fiction and non-fiction sales
When : I ask the region for its total sales
Then: The region should tell me its total sales

Here is the Python implementation as a test case:

def testGetTotalSalesForRegion(self):
    """Given a region with fiction and nonfiction sales, when I ask for its total sales, then I should get the result"""
    # Given a region with fiction and non-fiction sales
    region = Region("TX", 415, 555, 0.55)
    # When I ask the region for its total sales
    result = region.totalSales()
    # Then I should get the sum of the sales
    self.assertEqual(970, result)
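The answer doesn't show the change to Region that makes this test pass. A minimal sketch, assuming total sales means units of fiction plus non-fiction (which is what the expected value of 970 implies):

```python
class Region:
    def __init__(self, code, fiction, nonfiction, rate):
        self.code = code
        self.fiction = fiction
        self.nonfiction = nonfiction
        self.taxRate = rate

    def totalSales(self):
        # Assumed definition: total units sold, fiction plus non-fiction.
        return self.fiction + self.nonfiction


print(Region("TX", 415, 555, 0.55).totalSales())  # 970
```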

The Analyst should Tell, Don't Ask

def calculateTotalSales(self):
    return sum([reg.totalSales() for reg in self.regions])

Now you have all you need in order to write this application. PLUS, you have an automated regression suite that you can use if you make changes later. It can tell you exactly what you've broken, and the tests explicitly specify what the application is and what it can do.

Result

Here's the resultant program:

from region import Region
from analyst import Analyst

def main():
    text = readFromRegionFile()
    regions = createRegionsFromText(text)
    analyst = Analyst(regions)
    printResults(analyst)

def readFromRegionFile():
    regionDatabase = []
    with open('./regions.txt','r') as f:
        regionDatabase = f.readlines()
    return regionDatabase

def createRegionsFromText(text):
    regions = []
    for line in text:
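        data = line.split()
        if not data:
            continue  # skip blank lines
        # split() returns strings, so convert the counts and tax rate before storing them
        regions.append(Region(data[0], int(data[1]), int(data[2]), float(data[3])))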
    return regions

def printResults(analyst):
    totSales = analyst.calculateTotalSales()
    totFic = analyst.calculateTotalFictionSales()
    totNon = analyst.calculateTotalNonFictionSales()
    for r in analyst.regions:
        print("{0:2}{1:10}{2:13}{3:4}{4:14.2f}{5:10.2f}".format(
           "", r.code, r.fiction, r.nonfiction, r.totalSales(), r.taxRate))

    print("---------------------------------------------------------------------")
    print("{0:11}{1:13}{2:34}{3:2}{4:8}".format(
           "Total", totFic, totNon, "$", totSales))

if __name__ == "__main__":
    main()

Compare what you wrote with this. Which one is more understandable? Concise? What would you have to change in the two if:

  • You added music sales to each region?
  • You moved from a text file to a MySQL database or web service call?
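To make the first question concrete, here is a sketch of adding a hypothetical music attribute. Only Region changes; the Analyst method is untouched because it asks each region for its own total. The music field and its default of 0 are assumptions for illustration.

```python
class Region:
    def __init__(self, code, fiction, nonfiction, rate, music=0):
        self.code = code
        self.fiction = fiction
        self.nonfiction = nonfiction
        self.taxRate = rate
        self.music = music  # hypothetical new category

    def totalSales(self):
        # The region is the only place that knows which categories exist.
        return self.fiction + self.nonfiction + self.music


class Analyst:
    def __init__(self, regions):
        self.regions = regions

    def calculateTotalSales(self):
        # Unchanged: it never looks at individual categories.
        return sum(reg.totalSales() for reg in self.regions)


regions = [Region("TX", 415, 555, 0.55, music=30),
           Region("MN", 330, 999, 0.78)]
print(Analyst(regions).calculateTotalSales())  # 2329
```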

Make your concepts manifest. Be clear, concise, and expressive with your code.
