简体   繁体   中英

Sqlalchemy - How to execute this query and is this right?

I'm working with a legacy data set that was provided to me in text files. I have a table with businesses:

class Business(Base):
    __tablename__ = 'businesses'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    county_fips = Column(String)
    state_code = Column(String(2), ForeignKey('states.code'))
    state_fips = Column(String(2), ForeignKey('states.state_fips'))
    city = Column(String)
    [... a bunch of other columns ...]

And I have two tables with categories of differing specificity.

class Category(Base):
    __tablename__ = 'categories'
    id = Column(Integer, primary_key=True)
    name = Column(String)

class SubCategory(Base):
    __tablename__ = 'sub_categories'
    id = Column(Integer, primary_key=True)
    name = Column(String)

I have a giant text file that matches Category and SubCategory ID's to Business ID's. I've mapped this as:

class BusinessHeadings(Base):
    __tablename__ = 'business_headings'
    id = Column(ForeignKey('businesses.id'), primary_key=True)
    cat_id = Column(ForeignKey('categories.id')
    subcat_id = Column(ForeignKey('sub_categories.id')
    [... another integer column ...]

On the geography side, I have a model for City, USCounty, and State.

City has a ForeignKey to states via the two-letter code - CA for California, so on. The only reference to county is by name, which is not able to have a unique constraint since there are collisions. USCounty has a primary key as county_FIPS, state_FIPS, and state_code (two letters). The state_FIPS and state_code are both ForeignKeys pointing to the state object.

MY TASK:

Select all businesses belonging to a specific category in a specific city in a specific county in a specific state.

If you've worked with a full list of US cities, states, and counties... there are towns with the same name in the same state, counties with the same name in different states, and so on.

Right now, I pull the specific county and state with:

specific_county = session.query(USCounty).filter_by(name=city.county).\
filter_by(state_code=city.state_code).first()

With this I can pull all the businesses in a specific city, county, state with no problem. Now it's picking through that to only pull out ones that match either a category or subcategory.

Is there a query that works, or have I defined the table incorrectly (specifically BusinessHeadings)?

So to answer your question at face value, you need to perform a JOIN to connect your business location data with your business category data. I think something like this might work:

session.query(Business, BusinessHeadings).
    join(BusinessHeadings).
    filter(Business.county_fips == county_fips).
    filter(Business.state_code == state_code).
    filter(Business.city == city).
    filter(BusinessHeadings.cat_id==cat_id).
    filter(BusinessHeadings.subcat_id==subcat_id).
    all()

I'm assuming you have access to all the codes you need prior to making this query. But thinking about how to construct that JOIN led me to deeper questions about your database setup.

Maybe there are business reasons I'm not aware of, but it seems like you have a lot of tables going on. It just seems like with these scheme you're going to have to do a lot of JOINS for even a basic query. Maybe this is the best layout but might be worth reconsidering.

Also, for the BusinessHeadings table I think you are using a foreign key as a primary key which I don't think is recommended:

Is it fine to have foreign key as primary key?

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM