
Import CSV with Ruby on Rails

I want to import an Outlook CSV file and check for duplicates based on email (the child record), ideally before creating or saving any of the models (either Contact, the parent, or Email, the child).

Steps (how it currently works, although it is a flawed solution):

  1. Import the file (I save the file).
  2. Parse each row into fields.
  3. Check the email address for uniqueness. To do this, I currently save the email record before saving the parent record (Contact).

Issues:

  1. At the point where I save the email address record, it is missing the contact ID (its owner), because the child (Email) is saved before the Contact. If the process fails before the Contact is created, this can leave an orphaned record. I don't think I should be doing this, and I don't want to.

  2. However, if I save the Contact first, based only on name (e.g. David Smith), I run into other problems:

    • If I check for duplicates by name, I know of cases where two people share the same name (e.g. two David Smiths), so two different people would be merged into one contact.
    • If I save all Contacts and then check email uniqueness, I will have created a lot of extra Contacts.

  3. As it currently works, the duplicate-email check runs against my entire database, because I don't have the contact_id (which would tie the email to the user_id, aka owner_id).

  4. I tried saving the Contact first, but then realized this leaves me with a lot of extra records (very messy).

Here is my code for the initial processing of a row:


  def process_row(smart_row)
    new_contact, existing_records = smart_row.to_contact

    self.contact = ContactMergingService.new(csv_file.user, new_contact, existing_records).perform
    log_processed_contacts new_contact
    init_contact_info self.contact
    self.contact.required_salutations_to_set = true # will be used for the envelope/letter salutation
    if contact.first_name || contact.last_name || contact.email_addresses.first || contact.phone_numbers.first
      self.contact.save!
      csv_file.increment!(:total_imported_records)
    end
  end

This is the first method called above (it saves the Email before the Contact is saved):


  def to_contact
    existing_emails = existing_phone_numbers = nil
    contact = Contact.new.tap do |contact|
      initiate_instance(contact, CONTACT_MAPPING)
      address = initiate_instance(Address.new, ADDRESS_MAPPING)
      contact.addresses << address if address
      email_addresses, existing_emails = initialize_emails(EMAIL_ADDRESS_FIELDS)
      contact.email_addresses << email_addresses
      phone_numbers, existing_phone_numbers = initialize_phone_numbers(PHONE_TYPE_MAPPINGS)
      contact.phone_numbers << phone_numbers
      contact
    end
    existing_records = []
    existing_records << existing_emails
    existing_records << existing_phone_numbers
    existing_records.flatten!
    existing_records.compact!
    [contact, existing_records]
  end

Here is the code where I save the email address (after this, I save the Contact):


def initialize_emails email_fields
  email_addresses = []
  email_fields.each do |field|
    value = evaluate_value field
    if value.present?
      new_email = EmailAddress.find_or_create_by(email: value, primary: (primary_email_field?(field)))
      if new_email.save
        email_addresses << new_email
      end
    end
  end
  existing_emails = email_addresses.select{ |email_address| email_address.owner_id.present?}
  [email_addresses, existing_emails]
end

I have 3 models:

User:
  has_many :contacts
  has_many :email_campaigns
  has_many :email_messages

Contact (first name and last name):
  belongs_to :user, counter_cache: true
  has_many :addresses, as: :owner, dependent: :destroy
  has_many :phone_numbers, as: :owner, dependent: :destroy
  has_many :email_addresses, as: :owner, dependent: :destroy
  accepts_nested_attributes_for :email_addresses, allow_destroy: true

Email (email_address, polymorphic):
  belongs_to :owner, polymorphic: true, touch: true

So my questions are:

  • Do I need to save the records (either contact or email) to be able to check for duplicates?
  • Is there a way to process the CSV file so that I can check for duplicates based on email_address before creating either the Contact record or the Email_address record? I want to check for duplicates against my existing database and against the other records in the file, based on contact.first_name, contact.last_name and email.address.

Any thoughts? Many thanks. Annie

Do I need to save the records (either contact or email) to be able to check for duplicates?

No, just store them in a variable and compare them there before saving anything.
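A minimal sketch of that idea in plain Ruby, with no Rails models involved: parse the file, normalize each email, and track what has been seen in a Set, so duplicates are found before anything is saved. The column names (`First Name`, `Last Name`, `E-mail Address`), the `partition_rows` helper and the `existing_emails` list (standing in for the addresses already stored for the importing user) are assumptions for illustration, not part of the code in the question.

```ruby
require "csv"
require "set"

# Hypothetical helper: splits parsed CSV rows into new rows and duplicates
# without touching the database. `existing_emails` stands in for the
# addresses already stored for the importing user.
def partition_rows(csv_text, existing_emails)
  seen = Set.new(existing_emails.map { |e| e.strip.downcase })
  fresh = []
  duplicates = []

  CSV.parse(csv_text, headers: true).each do |row|
    email = row["E-mail Address"].to_s.strip.downcase

    # Rows without an email cannot be compared, so they pass through as new.
    # Set#add? returns nil when the value is already in the set.
    if email.empty? || seen.add?(email)
      fresh << row.to_h
    else
      duplicates << row.to_h
    end
  end

  [fresh, duplicates]
end

SAMPLE = <<~CSV
  First Name,Last Name,E-mail Address
  David,Smith,david@example.com
  David,Smith,other.david@example.com
  Dave,Smith,DAVID@example.com
  Ann,Lee,ann@example.com
CSV

fresh, duplicates = partition_rows(SAMPLE, ["ann@example.com"])
puts "new: #{fresh.size}, duplicates: #{duplicates.size}"
```

Here the two David Smiths with different addresses are kept apart, while the row that repeats an address (in the file or in the existing list) is set aside. Only the rows in `fresh` would then be turned into Contact and Email records.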

Is there a way to process the CSV file so that I can check for duplicates based on email_address before creating either the Contact record or the Email_address record? I want to check for duplicates against my existing database and against the other records in the file, based on contact.first_name, contact.last_name and email.address.

find_or_create_by is meant to do specifically this. From the docs:

Finds the first record with the given attributes, or creates a record with the attributes if one is not found

That is, you can pass an email address and the method will either find a record with that value or create a new one.
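One caveat worth checking against the code in the question: the lookup uses every attribute passed in, so `find_or_create_by(email: value, primary: ...)` only finds an existing row when both `email` and `primary` match. The sketch below is a plain-Ruby stand-in that mimics this behaviour in memory. The `EmailStore` class and its `find_or_add` method are hypothetical and are not ActiveRecord; they only illustrate the difference between matching on the email alone and matching on email plus `primary`.

```ruby
# Hypothetical in-memory stand-in for a table of email records.
# It is NOT ActiveRecord; it only mimics "find by all given attributes,
# otherwise add a new record".
class EmailStore
  Record = Struct.new(:email, :primary)

  attr_reader :records

  def initialize
    @records = []
  end

  # Looks up by every attribute in `attrs`; adds a record when none match.
  # An optional block can set extra fields on a newly added record only.
  def find_or_add(attrs)
    found = @records.find { |r| attrs.all? { |key, value| r[key] == value } }
    return found if found

    record = Record.new(attrs[:email], attrs[:primary])
    yield record if block_given?
    @records << record
    record
  end
end

store = EmailStore.new
store.find_or_add(email: "david@example.com", primary: true)

# Same address, different flag: nothing matches on both attributes,
# so a second record is added.
store.find_or_add(email: "david@example.com", primary: false)
puts store.records.size

# Matching on the email alone, and setting `primary` only for new records,
# reuses the existing entry instead.
by_email = EmailStore.new
by_email.find_or_add(email: "david@example.com") { |r| r.primary = true }
by_email.find_or_add(email: "david@example.com") { |r| r.primary = false }
puts by_email.records.size
```

In Rails terms, the second style corresponds to passing only the email to `find_or_create_by` and assigning the other attributes in its block, which runs only when a new record is created.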
