Importing and storing URLs to images in SQL database - advice needed

I am working for a company that has thousands of documents, mostly PDFs, stored in folders on their web server. They are mostly user manuals and technical documents for the products of the different brands they carry. They have a webpage that currently displays links to all of the documents by iterating through the folders recursively, then generating a URL for each image based on its file path.

The manager is concerned that any time someone renames a top-level folder on the server where the images are kept, it basically "breaks" the code, because those top-level names are hard-coded in the app. He wants all of the URLs stored in the database to alleviate this issue, and has tasked me with basically replicating the current folder structure on the web server in a SQL database, and then getting all of the URLs into that database. From the research I have done, implementing a hierarchical structure like that in a relational database is no trivial task, and I am not a DBA - I'm a web developer.

So my question now is really: how can I get the URLs to all of the thousands of images currently on the web server into the database? I was thinking of creating a simple table called "Brands" that holds the root URLs for the brands, then another table called "Image links" or something like that, then writing a little utility to iterate through all of the image URLs and insert them into that table. Does that sound like the way to go?

Seems to me like you could import all of the folder and file names pretty easily with a simple command-line program written in C# or PowerShell, which starts at the root and iterates over all of the folders and files. You could have tables like these:

CREATE TABLE dbo.Folders
(
  FolderID   INT IDENTITY(1,1) PRIMARY KEY,
  ParentID   INT NULL FOREIGN KEY REFERENCES dbo.Folders(FolderID),
  FolderName NVARCHAR(255) NOT NULL
  -- , ... other columns ...
);

CREATE TABLE dbo.Documents
(
  DocumentID INT IDENTITY(1,1) PRIMARY KEY,
  FolderID   INT NOT NULL FOREIGN KEY REFERENCES dbo.Folders(FolderID),
  DocName    NVARCHAR(255) NOT NULL
  -- , ... other columns ...
);

You could populate the tables with something like Directory.GetFiles(), which lets you traverse the folders and files. You could also write a function to flatten out the path so you don't have to walk the whole hierarchy yourself when building it. Based on the above, it should be pretty trivial to rename a folder and still always generate the correct path without updating each file (though you will have to rename the folder on disk and update the database together to keep them in sync). Just an example with some fictional data:

INSERT dbo.Folders(ParentID, FolderName) SELECT NULL, 'root';
INSERT dbo.Folders(ParentID, FolderName) SELECT 1, 'sub1';
INSERT dbo.Folders(ParentID, FolderName) SELECT 1, 'sub2';
INSERT dbo.Folders(ParentID, FolderName) SELECT 2, 'subsub1';
INSERT dbo.Folders(ParentID, FolderName) SELECT 4, 'subsubsub1';

INSERT dbo.Documents(FolderID, DocName) SELECT 5, 'foo.pdf';
INSERT dbo.Documents(FolderID, DocName) SELECT 5, 'bar.pdf';
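
The population step described above could be sketched roughly as follows. This is only an illustration in Python rather than the C# or PowerShell the answer suggests: it uses an in-process SQLite database as a stand-in for SQL Server, and the function name `import_tree` is hypothetical. The table and column names follow the definitions above.

```python
import os
import sqlite3


def import_tree(conn, root_dir):
    """Mirror root_dir into Folders/Documents; return (folder_count, document_count).

    Assumption: SQLite stands in for SQL Server here, so the DDL is a
    simplified translation of the dbo.Folders / dbo.Documents tables.
    """
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS Folders (
            FolderID   INTEGER PRIMARY KEY,
            ParentID   INTEGER NULL REFERENCES Folders(FolderID),
            FolderName TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS Documents (
            DocumentID INTEGER PRIMARY KEY,
            FolderID   INTEGER NOT NULL REFERENCES Folders(FolderID),
            DocName    TEXT NOT NULL
        );
    """)
    root_dir = os.path.abspath(root_dir)
    folder_ids = {}  # absolute folder path -> FolderID
    doc_count = 0
    # os.walk is top-down by default, so a parent folder is always
    # inserted before any of its children.
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        # The root's own parent is not in the map, so it gets ParentID NULL.
        parent_id = folder_ids.get(os.path.dirname(dirpath))
        cur = conn.execute(
            "INSERT INTO Folders (ParentID, FolderName) VALUES (?, ?)",
            (parent_id, os.path.basename(dirpath)),
        )
        folder_id = cur.lastrowid
        folder_ids[dirpath] = folder_id
        for name in filenames:
            conn.execute(
                "INSERT INTO Documents (FolderID, DocName) VALUES (?, ?)",
                (folder_id, name),
            )
            doc_count += 1
    conn.commit()
    return len(folder_ids), doc_count
```

Calling `import_tree(sqlite3.connect(":memory:"), "some/root/folder")` would load one row per folder and one row per file. Against SQL Server the same loop would apply, with the inserts going through whatever client library the app already uses.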

Here's a function that can flatten out the path, given a FolderID and a document name:

CREATE FUNCTION dbo.ShowFullPath
(
    @FolderID INT,
    @DocName  NVARCHAR(255)
)
RETURNS NVARCHAR(MAX)
AS
BEGIN
    DECLARE @FullPath NVARCHAR(MAX);

    -- Walk up from the document's folder to the root; rn = 1 is the
    -- deepest folder and the root ends up with the highest rn.
    WITH cte AS 
    (
      SELECT FolderID, FolderName, ParentID, rn = 1
        FROM dbo.Folders
        WHERE FolderID = @FolderID
      UNION ALL
      SELECT parent.FolderID, parent.FolderName, parent.ParentID, child.rn + 1
        FROM dbo.Folders AS parent
        INNER JOIN cte AS child
        ON parent.FolderID = child.ParentID
    )
    -- Concatenate root-first; STUFF strips the leading '/'.
    -- TYPE + .value() keeps characters such as & from being XML-escaped.
    SELECT @FullPath = STUFF((SELECT '/' + FolderName 
       FROM cte ORDER BY rn DESC 
       FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 1, '')
       + '/' + @DocName;

    RETURN(@FullPath);
END
GO

So you could always call that at runtime when you need to retrieve the path of a given document (and you could change the function, or add a wrapper, to take the DocumentID instead and look up the FolderID and DocName itself), but it may make more sense to just use a computed column for that (unfortunately you won't be able to persist the column, because the function reads from a table).

ALTER TABLE dbo.Documents ADD FullPath 
  AS CONVERT(NVARCHAR(MAX), dbo.ShowFullPath(FolderID, DocName));

SELECT DocumentID, FolderID, DocName, FullPath FROM dbo.Documents;

Results:

DocumentID FolderID DocName FullPath
1          5        foo.pdf root/sub1/subsub1/subsubsub1/foo.pdf
2          5        bar.pdf root/sub1/subsub1/subsubsub1/bar.pdf

Or you could create a view:

CREATE VIEW dbo.vDocuments
AS
  -- Every view column needs a name, so the function call is aliased.
  SELECT DocumentID, FolderID, DocName,
         FullPath = dbo.ShowFullPath(FolderID, DocName)
  FROM dbo.Documents;
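
To illustrate the point about renames, here is a small self-contained sketch of the wrapper idea mentioned above (a lookup that takes a DocumentID). It is an assumption-laden stand-in rather than the T-SQL itself: it runs against SQLite from Python, the names `full_path` and `demo` are hypothetical, and it loads the same fictional data as the inserts above.

```python
import sqlite3

# Recursive CTE: start at the document's folder and climb to the root.
# rn = 1 is the deepest folder, so ordering by rn DESC lists root first.
FOLDER_CHAIN_SQL = """
WITH RECURSIVE cte(FolderID, FolderName, ParentID, rn) AS (
    SELECT f.FolderID, f.FolderName, f.ParentID, 1
      FROM Folders AS f
      JOIN Documents AS d ON d.FolderID = f.FolderID
     WHERE d.DocumentID = ?
    UNION ALL
    SELECT parent.FolderID, parent.FolderName, parent.ParentID, child.rn + 1
      FROM Folders AS parent
      JOIN cte AS child ON parent.FolderID = child.ParentID
)
SELECT FolderName FROM cte ORDER BY rn DESC
"""


def full_path(conn, document_id):
    """Return 'root/.../DocName' for a DocumentID, or None if it does not exist."""
    row = conn.execute(
        "SELECT DocName FROM Documents WHERE DocumentID = ?", (document_id,)
    ).fetchone()
    if row is None:
        return None
    folders = [r[0] for r in conn.execute(FOLDER_CHAIN_SQL, (document_id,))]
    return "/".join(folders + [row[0]])


def demo():
    """Build the fictional data, rename one folder, return (before, after) paths."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Folders (
            FolderID   INTEGER PRIMARY KEY,
            ParentID   INTEGER NULL REFERENCES Folders(FolderID),
            FolderName TEXT NOT NULL
        );
        CREATE TABLE Documents (
            DocumentID INTEGER PRIMARY KEY,
            FolderID   INTEGER NOT NULL REFERENCES Folders(FolderID),
            DocName    TEXT NOT NULL
        );
        INSERT INTO Folders (ParentID, FolderName) VALUES
            (NULL, 'root'), (1, 'sub1'), (1, 'sub2'),
            (2, 'subsub1'), (4, 'subsubsub1');
        INSERT INTO Documents (FolderID, DocName) VALUES
            (5, 'foo.pdf'), (5, 'bar.pdf');
    """)
    before = full_path(conn, 1)
    # Renaming a folder touches exactly one row; no document rows change.
    conn.execute("UPDATE Folders SET FolderName = 'manuals' WHERE FolderID = 2")
    after = full_path(conn, 1)
    conn.close()
    return before, after
```

Here `demo()` resolves the path of foo.pdf, renames the folder `sub1` to the made-up name `manuals` with a single-row update, and resolves the path again. The second path picks up the new folder name without any change to the Documents table, which is the property the manager is after.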
