
How to find a string in a column that contains a lengthy “word” or set of characters

I am looking for an unusually long word or grouping of characters in a specific column of data that contains notes written by users. For example, if something like this

I am looking for an unusuallylongwordorgroupingofcharactersina specific column

exists, I need to find it so I can add spaces if necessary. My question is: how do I find a word or set of characters that exceeds a certain number of characters?

The problem is that somewhere in this data, an unusually long word or grouping of characters is being parsed and causing an OutOfMemoryException, so I need to find the source and fix it.

You could use a regex in C# if the raw string fits in memory: \w{15,} (written as "\\w{15,}" in a regular C# string literal) matches every run of at least 15 word characters, that is, letters, digits, and underscores. There are many ways to tweak this (lookahead, lookbehind, more specific character classes, etc.).
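As a rough illustration of the pattern above, here is a minimal sketch in Python rather than C# (the quantifier syntax is the same in both regex flavors). The function name find_long_words and the min_len parameter are made up for this example, and the sample note is the one from the question.

```python
import re

def find_long_words(text, min_len=15):
    """Return every run of at least min_len word characters found in text."""
    # Builds the pattern \w{15,} when min_len is 15.
    pattern = re.compile(rf"\w{{{min_len},}}")
    return pattern.findall(text)

note = "I am looking for an unusuallylongwordorgroupingofcharactersina specific column"
print(find_long_words(note))
# prints ['unusuallylongwordorgroupingofcharactersina']
```

Because \w only covers letters, digits, and underscores, a long run that contains punctuation is reported as several shorter pieces. Using \S{15,} instead would match any run of non-whitespace characters.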

You can write a C# stored procedure that runs against the column in question. It would split each value in the column into an array of words, after which you can easily find the longest word in the column.

See http://msdn.microsoft.com/en-us/library/vstudio/zxsa8hkf%28v=vs.100%29.aspx for details on how to write, install, and debug a C# stored procedure in SQL Server.
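The split-and-measure idea in this answer can be sketched outside the database as well. The following is a hedged Python illustration, not a CLR stored procedure. It assumes the column values have already been fetched into a list, and the name longest_word is hypothetical.

```python
def longest_word(values):
    """Return the longest whitespace-delimited word across all values ("" if none)."""
    longest = ""
    for value in values:
        if value is None:  # treat NULL column values as empty
            continue
        for word in value.split():  # split on any run of whitespace
            if len(word) > len(longest):
                longest = word
    return longest

notes = [
    "customer called twice",
    None,
    "I am looking for an unusuallylongwordorgroupingofcharactersina specific column",
]
print(longest_word(notes))
# prints unusuallylongwordorgroupingofcharactersina
```

Processing one value at a time keeps only a single row's words in memory, which may matter given the OutOfMemoryException mentioned in the question.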

Using the answers given, I created a program that pulls the data and tosses each word into a list. It then pulls out the words above a given length (in my case, longer than 20 characters), and that is how I found the bad "word". Now I can fix the data.
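The program itself was not posted. A minimal sketch of the approach described, in Python and under the assumption that each row arrives as an (id, note) pair, might look like the following. The name rows_with_long_words and the sample rows are invented for illustration.

```python
def rows_with_long_words(rows, max_len=20):
    """Yield (row_id, offending_words) for each row with a word longer than max_len."""
    for row_id, note in rows:
        words = (note or "").split()  # a NULL note yields no words
        offenders = [w for w in words if len(w) > max_len]
        if offenders:
            yield row_id, offenders

rows = [
    (1, "customer called twice"),
    (2, "I am looking for an unusuallylongwordorgroupingofcharactersina specific column"),
    (3, None),
]
for row_id, offenders in rows_with_long_words(rows):
    print(row_id, offenders)
# prints 2 ['unusuallylongwordorgroupingofcharactersina']
```

Reporting the row id alongside each offending word makes it easier to locate and correct the record afterwards.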

I appreciate all your help, guys.


 