How do I index multiple blobs under a master record in Azure Search?

Question

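I followed the steps described in this tutorial. My situation is a little different:

Instead of indexing hotels and rooms, I am indexing candidates and resumes.
Instead of CosmosDB, I am using an Azure SQL database.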

Following the tutorial, I was able to create the index, two indexers (one for the SQL DB, one for Blob storage), and two data sources.

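The SQL DB contains all my candidates, and the storage container holds all their resumes (files in PDF/DOC/DOCX format). Each blob has a metadata entry "ResumeCandidateId" that holds the same value as the candidate's "CandidateId".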

My index has the following fields:

    [SerializePropertyNamesAsCamelCase]
    public partial class Candidate
    {
        [Key]
        [IsFilterable, IsRetrievable(true), IsSearchable]
        public string CandidateId { get; set; }

        [IsFilterable, IsRetrievable(true), IsSearchable, IsSortable]
        public string LastName { get; set; }

        [IsFilterable, IsRetrievable(true), IsSearchable, IsSortable]
        public string FirstName { get; set; }

        [IsFilterable, IsRetrievable(true), IsSearchable, IsSortable]
        public string Notes { get; set; }

        public ResumeBlob[] ResumeBlobs { get; set; }
    }

    [SerializePropertyNamesAsCamelCase]
    public class ResumeBlob
    {
        [IsRetrievable(true), IsSearchable]
        [Analyzer(AnalyzerName.AsString.StandardLucene)]
        public string content { get; set; }

        [IsRetrievable(true)]
        public string metadata_storage_content_type { get; set; }

        public long metadata_storage_size { get; set; }

        public DateTime metadata_storage_last_modified { get; set; }

        public string metadata_storage_name { get; set; }

        [Key]
        [IsRetrievable(true)]
        public string metadata_storage_path { get; set; }

        [IsRetrievable(true)]
        public string metadata_content_type { get; set; }

        public string metadata_author { get; set; }

        public DateTime metadata_creation_date { get; set; }

        public DateTime metadata_last_modified { get; set; }

        public string ResumeCandidateId { get; set; }
    }

As you can see, a candidate can have multiple resumes. The challenge is how to populate the ResumeBlobs property...

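The SQL indexer correctly indexes and maps the data from the SQL DB. When I run the blobs indexer, it loads the documents but does not map them, and they never show up in search results (ResumeBlobs is always empty). Here is the code used to create the blobs indexer: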

    var blobDataSource = DataSource.AzureBlobStorage(
        name: "azure-blob-test02",
        storageConnectionString: "DefaultEndpointsProtocol=https;AccountName=yyy;AccountKey=xxx;EndpointSuffix=core.windows.net",
        containerName: "2019");

    await searchService.DataSources.CreateOrUpdateAsync(blobDataSource);

    List<FieldMapping> map = new List<FieldMapping> {
        new FieldMapping("ResumeCandidateId", "CandidateId")
    };

    Indexer blobIndexer = new Indexer(
        name: "hotel-rooms-blobs-indexer",
        dataSourceName: blobDataSource.Name,
        targetIndexName: indexName,
        fieldMappings: map,
        //parameters: new IndexingParameters().SetBlobExtractionMode(BlobExtractionMode.ContentAndMetadata).IndexFileNameExtensions(".DOC", ".DOCX", ".PDF", ".HTML", ".HTM"),
        schedule: new IndexingSchedule(TimeSpan.FromDays(1)));

    bool exists = await searchService.Indexers.ExistsAsync(blobIndexer.Name);
    if (exists)
    {
        await searchService.Indexers.ResetAsync(blobIndexer.Name);
    }
    await searchService.Indexers.CreateOrUpdateAsync(blobIndexer);

    try
    {
        await searchService.Indexers.RunAsync(blobIndexer.Name);
    }
    catch (CloudException e) when (e.Response.StatusCode == (HttpStatusCode)429)
    {
        Console.WriteLine("Failed to run indexer: {0}", e.Response.Content);
    }

I commented out the blobIndexer's parameters, but I get the same result even with them uncommented.

When I run a search, here is an example of what I get:

    {
        "@odata.context": "https://yyy.search.windows.net/indexes('index-test01')/$metadata#docs(*)",
        "value": [
            {
                "@search.score": 1.2127206,
                "candidateId": "363933d1-7e81-4ed2-b82e-d7496d98db50",
                "lastName": "LAMLAST",
                "firstName": "ZFIRST",
                "notes": "MGA ; SQL ; T-SQL",
                "resumeBlobs": []
            }
        ]
    }

"resumeBlobs" is empty. Any idea how to do this kind of mapping?

Answer 1

AFAIK, Azure Search does not support the collection-merging capability that your scenario would need.

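One alternative is to create a separate index for the resumes and point the resume indexer at that index. This means that some of your search scenarios will have to hit two indexes, but it is a path forward.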
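
If you go the two-index route, the join between candidates and resumes has to happen on the client. Below is a minimal sketch of that step, assuming you have already queried both indexes and projected the hits into simple objects. The `CandidateHit`, `ResumeHit`, `CandidateWithResumes` and `ResultMerger` names are hypothetical and are not part of the Azure Search SDK, and the blob paths in `Main` are made up for illustration. Only the `CandidateId` / `ResumeCandidateId` pairing comes from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down result shapes for illustration; these are not SDK types.
public record CandidateHit(string CandidateId, string LastName, string FirstName);
public record ResumeHit(string ResumeCandidateId, string MetadataStoragePath);
public record CandidateWithResumes(CandidateHit Candidate, IReadOnlyList<ResumeHit> Resumes);

public static class ResultMerger
{
    // Attaches each resume hit to the candidate whose CandidateId
    // equals the resume's ResumeCandidateId.
    public static List<CandidateWithResumes> Merge(
        IEnumerable<CandidateHit> candidates, IEnumerable<ResumeHit> resumes)
    {
        // A lookup returns an empty sequence for unknown keys, so candidates
        // without any resume simply end up with an empty list.
        ILookup<string, ResumeHit> byCandidate = resumes.ToLookup(r => r.ResumeCandidateId);

        return candidates
            .Select(c => new CandidateWithResumes(c, byCandidate[c.CandidateId].ToList()))
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        // Stand-ins for the hits returned by the candidate index and the resume index.
        var candidates = new[]
        {
            new CandidateHit("363933d1-7e81-4ed2-b82e-d7496d98db50", "LAMLAST", "ZFIRST")
        };
        var resumes = new[]
        {
            // Made-up blob paths, for illustration only.
            new ResumeHit("363933d1-7e81-4ed2-b82e-d7496d98db50", "https://yyy.blob.core.windows.net/2019/resume-a.pdf"),
            new ResumeHit("363933d1-7e81-4ed2-b82e-d7496d98db50", "https://yyy.blob.core.windows.net/2019/resume-b.docx")
        };

        foreach (CandidateWithResumes m in ResultMerger.Merge(candidates, resumes))
        {
            Console.WriteLine($"{m.Candidate.LastName}: {m.Resumes.Count} resume(s)");
        }
    }
}
```

In practice you would probably query the candidate index first, then filter the resume index on the returned candidate IDs, so that only the resumes you need are fetched before merging.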

Solution 1
1 2019-09-13 20:59:53
