
Justify text in Java

I have to read in an integer which will be the length of the succeeding lines. (The lines of text will never be longer than the length provided.)

I then have to read in each line of text and convert the spaces to underscores, padded out as evenly as possible. For example:

I would enter the line length of 30, then a line of text: Hello this is a test string. Then all of the spaces will be converted to underscores and padded out so that the text fills the given line length, like so: Hello__this__is__a_test_string. As you can see, the original text had a length of 27 characters, so to pad it out to 30 characters I had to add 3 extra spaces to the original text and then convert those spaces to the underscore character.

Please can you advise a way that I can go about this?

What I do is split the sentence into words. Then figure out how many spaces need to be added. Then iterate over the words and add a space to each one until you run out of spaces to add. If you need to add more than one space per word (say you have 5 words but need to add 13 spaces), simply divide the number of spaces left by the number of words, and add that number to each word first. Then you can take the remainder and iterate across the words adding a space until you're done. Also make sure that you add spaces to all but the last word in the sentence.
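The approach described above could be sketched roughly as follows. This is only an illustration, not the answerer's actual code: the class name WordPadding and the method justify are made up for the example, and it assumes the words fit within the requested width. Because the last word never receives spaces, the sketch divides by the number of words minus one.

```java
class WordPadding {
    // Appends spaces to every word except the last so the joined line is `width` characters long.
    static String justify(String sentence, int width) {
        String[] words = sentence.trim().split(" +");
        int slots = words.length - 1;            // words that may receive spaces
        if (slots == 0) return words[0];         // a single word cannot be justified

        int letters = 0;
        for (String w : words) letters += w.length();
        int spaces = width - letters;            // every space the finished line must contain
        int each = spaces / slots;               // given to each word first
        int remainder = spaces % slots;          // then handed out one at a time, left to right

        StringBuilder out = new StringBuilder();
        for (int i = 0; i < slots; i++) {
            out.append(words[i]);
            int pad = each + (i < remainder ? 1 : 0);
            for (int k = 0; k < pad; k++) out.append(' ');
        }
        return out.append(words[slots]).toString();
    }
}
```

Calling WordPadding.justify("Hello this is a test string", 30) pads the first three words with two spaces and the next two with one; replacing the spaces with underscores afterwards gives the string from the question.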

I had to do something similar to this in Java recently. The code itself is relatively straightforward. What took the longest was getting my head around the justification process.

I started by writing out, step by step, how I would justify text manually.

  1. Find out how long the line is
  2. Find out how long the string on said line is
  3. Calculate the number of spaces that must be added to the string to equal the line length
  4. Find out how many gaps there are between the words in the string
  5. Calculate how many spaces to add to each gap in the string
  6. Add the result to each gap
  7. Calculate how many extra spaces there are to serially add to each gap (if the number of spaces to add is not divisible by the number of gaps. For example, if you have 5 gaps but 6 spaces to add)
  8. Add the extra spaces to the gaps
  9. Convert spaces to underscores
  10. Return string

Doing this made coding the algorithm much simpler for me!

Finding out how long the line and the string on said line are

You said you have already read in the line length and the text on the line, so steps 1 and 2 are already done, with 2 being a simple string.length() call.

Calculating the number of spaces that must be added to the string to equal the line length is simply taking the line length and subtracting the length of the string.

int noofspacestoadd = lineLength - string.length();

Finding out how many gaps there are between all the words in the string

There is probably more than one way of doing this. I found that the easiest way was to convert the string into a char[] and then iterate through the characters checking for ' ', incrementing a count each time one is found.
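A minimal sketch of that counting loop, assuming the words are separated by single spaces (GapCounter and countGaps are names chosen for this example only):

```java
class GapCounter {
    // Counts the gaps in a line by walking its characters and counting the spaces.
    static int countGaps(String line) {
        int gaps = 0;
        for (char c : line.toCharArray()) {
            if (c == ' ') {
                gaps++;
            }
        }
        return gaps;
    }
}
```

Note that consecutive spaces would each be counted as a separate gap, so input with irregular spacing would need to be normalised first.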

Calculating how many spaces to add to each gap

This is a simple division calculation!

int noofspacestoaddtoeachgap = noofspacestoadd / noofgaps;

Note: make sure you do this division with integers! For example, 5 / 2 = 2.5, so you know you have to add 2 spaces to each gap between the words; division using ints truncates the fractional part to give an integer.

Add the result to each gap

Before you can add the required number of spaces to each gap, you need to convert this number into a string of spaces. So you need to write a method that converts a given integer into a string of that many spaces. Again, this can be done in different ways. The way I did it was something like this:

String s = "";
for(int i=noofspacestoaddtoeachgap; i>0; i--)
{
    s+= " ";
}

return s;

To add it to each gap, I converted the string into an array of substrings, with each substring being one word of the string. If you look up the String class in the javadoc you should find the methods you can use to achieve this!

When you have your array of substrings, you can then add the string of spaces to the end of each substring to form your new substring!

Calculating how many extra spaces there are

This is again a simple calculation. Using the % operator you can do a remainder division similar to the division we did earlier.

int noofextraspaces = noofspacestoadd % noofgaps;

The result of this calculation gives us the number of extra spaces required to justify the text.
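To check the quotient and remainder against the numbers in the question, here is a small sketch. SpaceMath and split are hypothetical names, and the method assumes there is at least one gap (zero gaps would cause a division by zero).

```java
class SpaceMath {
    // Returns { spaces to add to every gap, number of gaps that receive one more }.
    static int[] split(int lineLength, int textLength, int gaps) {
        int toAdd = lineLength - textLength;          // spaces missing from the line
        return new int[] { toAdd / gaps, toAdd % gaps };
    }
}
```

For the question's sample (line length 30, text length 27, 5 gaps) this gives 0 spaces for every gap and 3 gaps that get one extra. With 13 spaces to add over 5 gaps it gives 2 per gap and 3 extras.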

Add the extra spaces serially to each gap

This is probably the most difficult part of the algorithm, as you have to work out a way of iterating through each gap between the words and adding an extra space until there are no more extra spaces left to add!
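One possible way to put the steps together, including the serial hand-out of the extra spaces, is sketched below. It is an assumption-laden illustration rather than the code this answer was based on: StepwiseJustifier is an invented name, words are assumed to be separated by single spaces, and the text is assumed to be no longer than the line.

```java
class StepwiseJustifier {
    static String justify(String text, int lineLength) {
        String[] words = text.split(" ");              // one substring per word
        int gaps = words.length - 1;                   // step 4
        if (gaps == 0) return text;                    // nothing to pad between

        int toAdd = lineLength - text.length();        // step 3
        int perGap = toAdd / gaps;                     // step 5
        int extra = toAdd % gaps;                      // step 7

        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < gaps; i++) {
            sb.append(words[i]).append(' ');           // the space that was already there
            for (int k = 0; k < perGap; k++) {
                sb.append(' ');                        // step 6
            }
            if (extra > 0) {                           // step 8: one extra per gap, left to right
                sb.append(' ');
                extra--;
            }
        }
        sb.append(words[gaps]);                        // last word gets no padding
        return sb.toString().replace(' ', '_');        // step 9
    }
}
```

Because the remainder is always smaller than the number of gaps, a single left-to-right pass is enough to use up the extra spaces.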

Return string

return string;

The hardest thing about this problem is defining "as evenly as possible".

Your example:

 Hello__this__is__a_test_string

... makes all the longer gaps be at the left. Wouldn't:

 Hello__this_is__a_test__string

... fit the imprecise description of the problem better, with the longer gaps spread evenly through the output string?
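For illustration only, one way to get that alternating layout is to round the cumulative share of underscores at each gap instead of front-loading the remainder. This is not part of the answer's solution; EvenSpread is an invented name, and the sketch assumes single spaces between words and a line that fits.

```java
class EvenSpread {
    static String justify(String text, int width) {
        String[] words = text.split(" ");
        int gaps = words.length - 1;
        if (gaps == 0) return text;
        int letters = text.length() - gaps;       // characters that are not spaces
        int total = width - letters;              // underscores the finished line must contain

        StringBuilder sb = new StringBuilder(words[0]);
        int placed = 0;
        for (int i = 1; i <= gaps; i++) {
            // Integer form of round(i * total / gaps): underscores due after i gaps.
            int target = (2 * i * total + gaps) / (2 * gaps);
            for (int k = placed; k < target; k++) sb.append('_');
            placed = target;
            sb.append(words[i]);
        }
        return sb.toString();
    }
}
```

With the sample line and a width of 30 the cumulative targets are 2, 3, 5, 6 and 8, so the gaps come out as 2, 1, 2, 1, 2 underscores.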

However, let's solve it so it gives the sample answer.

  • First you need to know how many extra characters you need to insert -- numNewChars == lengthWanted minus inputString.length()
  • Next you need to count how many gaps there are to distribute these new characters between -- call that numGaps -- it's the number of words minus one.
  • In each gap you will insert either n or n+1 new spaces. n is numNewChars / numGaps -- integer division; rounds down.
  • Now, how many times do you need to insert n+1 new spaces instead of n? It's the remainder: plusOnes = numNewChars % numGaps

That's all the numbers you need. Now using whatever method you've been taught (since this is evidently a homework problem, you don't want to use language features or libraries that haven't been covered in your lessons), go through the string:

  • For the first plusOnes spaces, insert n+1 spaces, in addition to the space that's already there.
  • For the remaining spaces, insert n spaces.

One very basic method would be as follows:

String output= "";
for(int i=0; i<input.length(); i++) {
    char c = input.charAt(i);
    if(c == ' ') {
        output += ...; // appropriate number of "_" chars
    } else {
        output += "" + c; // "" + just turns the char into a String.
    }
}
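As a hedged illustration of how the elided part of such a loop might be completed, the sketch below keeps a running gap index so the first plusOnes gaps receive the larger count. CharLoopJustifier is an invented name; the variable names follow the description above, and the input is assumed to use single spaces and to fit in lengthWanted.

```java
class CharLoopJustifier {
    static String justify(String input, int lengthWanted) {
        int numGaps = 0;
        for (int i = 0; i < input.length(); i++) {
            if (input.charAt(i) == ' ') numGaps++;
        }
        if (numGaps == 0) return input;                  // nothing to distribute

        int numNewChars = lengthWanted - input.length();
        int n = numNewChars / numGaps;                   // added to every gap
        int plusOnes = numNewChars % numGaps;            // gaps that get n + 1 instead

        StringBuilder output = new StringBuilder();
        int gapIndex = 0;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == ' ') {
                // The existing space, plus n new ones, plus one more for the first plusOnes gaps.
                int count = 1 + n + (gapIndex < plusOnes ? 1 : 0);
                for (int k = 0; k < count; k++) output.append('_');
                gapIndex++;
            } else {
                output.append(c);
            }
        }
        return output.toString();
    }
}
```

A StringBuilder is used here instead of repeated String concatenation; if that class has not been covered yet, the += form from the skeleton works the same way.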

I wrote a simple method to justify text. It's not 100% accurate, but works for the most part (since it ignores punctuation completely, and there might be some edge cases missing too). Also, Word justifies text in a richer manner (not by adding spaces to fill up the gap, but by evenly distributing the width of the whitespace, which is tricky to do here).

public static void justifyText (String text) {
    int STR_LENGTH = 80;
    int end=STR_LENGTH, extraSpacesPerWord=0, spillOverSpace=0;
    String[] words;

    System.out.println("Original Text: \n" + text);
    System.out.println("Justified Text: ");

    while (text.length() > STR_LENGTH) {

        // Break at the last space that still fits on the line (index STR_LENGTH at most).
        end = text.lastIndexOf(" ", STR_LENGTH);
        if (end <= 0) {
            // No space to break at (a word longer than the line): print the rest as-is below.
            break;
        }
        words = text.substring(0, end).split(" ");
        int gaps = Math.max(words.length - 1, 1);
        extraSpacesPerWord = (STR_LENGTH - end) / gaps;   // added to every gap
        spillOverSpace = (STR_LENGTH - end) % gaps;       // gaps that get one more

        for (int i = 0; i < words.length; i++) {
            System.out.print(words[i]);
            if (i == words.length - 1) break;             // no padding after the last word
            System.out.print(" ");
            for (int k = 0; k < extraSpacesPerWord; k++) System.out.print(" ");
            if (spillOverSpace-- > 0) System.out.print(" ");
        }
        System.out.print("\n");
        text = text.substring(end + 1);

    }
    System.out.println(text);

}

You just need to call the fullJustify() method, passing in the list of words along with the maximum width of each line you want in the output.

public List<String> fullJustify(String[] words, int maxWidth) {
    int n = words.length;
    List<String> justifiedText = new ArrayList<>();
    int currLineIndex = 0;
    int nextLineIndex = getNextLineIndex(currLineIndex, maxWidth, words);
    while (currLineIndex < n) {
        StringBuilder line = new StringBuilder();
        for (int i = currLineIndex; i < nextLineIndex; i++) {
            line.append(words[i] + " ");
        }
        currLineIndex = nextLineIndex;
        nextLineIndex = getNextLineIndex(currLineIndex, maxWidth, words);
        justifiedText.add(line.toString());
    }
    for (int i = 0; i < justifiedText.size() - 1; i++) {
        String fullJustifiedLine = getFullJustifiedString(justifiedText.get(i).trim(), maxWidth);
        justifiedText.remove(i);
        justifiedText.add(i, fullJustifiedLine);
    }
    String leftJustifiedLine = getLeftJustifiedLine(justifiedText.get(justifiedText.size() - 1).trim(), maxWidth);
    justifiedText.remove(justifiedText.size() - 1);
    justifiedText.add(leftJustifiedLine);
    return justifiedText;
}

public static int getNextLineIndex(int currLineIndex, int maxWidth, String[] words) {
    int n = words.length;
    int width = 0;
    while (currLineIndex < n && width < maxWidth) {
        width += words[currLineIndex++].length() + 1;
    }
    if (width > maxWidth + 1)
        currLineIndex--;
    return currLineIndex;
}

public String getFullJustifiedString(String line, int maxWidth) {
    StringBuilder justifiedLine = new StringBuilder();
    String[] words = line.split(" ");
    int occupiedCharLength = 0;
    for (String word : words) {
        occupiedCharLength += word.length();
    }
    int remainingSpace = maxWidth - occupiedCharLength;
    int spaceForEachWordSeparation = words.length > 1 ? remainingSpace / (words.length - 1) : remainingSpace;
    int extraSpace = remainingSpace - spaceForEachWordSeparation * (words.length - 1);
    for (int j = 0; j < words.length - 1; j++) {
        justifiedLine.append(words[j]);
        for (int i = 0; i < spaceForEachWordSeparation; i++)
            justifiedLine.append(" ");
        if (extraSpace > 0) {
            justifiedLine.append(" ");
            extraSpace--;
        }
    }
    justifiedLine.append(words[words.length - 1]);
    for (int i = 0; i < extraSpace; i++)
        justifiedLine.append(" ");
    return justifiedLine.toString();
}

public String getLeftJustifiedLine(String line, int maxWidth) {
    int lineWidth = line.length();
    StringBuilder justifiedLine = new StringBuilder(line);
    for (int i = 0; i < maxWidth - lineWidth; i++)
        justifiedLine.append(" ");
    return justifiedLine.toString();
}

Below is a sample conversion where maxWidth was 80 characters. The following paragraph contains exactly 115 words, and it took 55 ms to write the converted text to an external file.

I've tested this code on a paragraph of about 70k+ words and it took approx 400 ms to write the converted text to a file.

Input

These features tend to make legal writing formal. This formality can take the form of long sentences, complex constructions, archaic and hyper-formal vocabulary, and a focus on content to the exclusion of reader needs. Some of this formality in legal writing is necessary and desirable, given the importance of some legal documents and the seriousness of the circumstances in which some legal documents are used. Yet not all formality in legal writing is justified. To the extent that formality produces opacity and imprecision, it is undesirable. To the extent that formality hinders reader comprehension, it is less desirable. In particular, when legal content must be conveyed to nonlawyers, formality should give way to clear communication.

Output

These  features  tend  to make legal writing formal. This formality can take the
form   of  long  sentences,  complex  constructions,  archaic  and  hyper-formal
vocabulary,  and  a  focus  on content to the exclusion of reader needs. Some of
this formality in legal writing is necessary and desirable, given the importance
of  some  legal documents and the seriousness of the circumstances in which some
legal  documents  are used. Yet not all formality in legal writing is justified.
To   the   extent  that  formality  produces  opacity  and  imprecision,  it  is
undesirable.  To  the  extent that formality hinders reader comprehension, it is
less   desirable.  In  particular,  when  legal  content  must  be  conveyed  to
nonlawyers, formality should give way to clear communication.                   

I followed Shahroz Saleem's answer (but my rep is too low to comment :/) - however, I needed one minor change as it does not take into account words longer than the line length (such as URLs in the text).

import java.util.ArrayList;
import java.util.List;

public class Utils {

    public static List<String> fullJustify(String words, int maxWidth) {

        return fullJustify(words.split(" "), maxWidth);
    }

    public static List<String> fullJustify(String[] words, int maxWidth) {
        int n = words.length;
        List<String> justifiedText = new ArrayList<>();
        int currLineIndex = 0;
        int nextLineIndex = getNextLineIndex(currLineIndex, maxWidth, words);
        while (currLineIndex < n) {
            StringBuilder line = new StringBuilder();
            for (int i = currLineIndex; i < nextLineIndex; i++) {
                line.append(words[i] + " ");
            }
            currLineIndex = nextLineIndex;
            nextLineIndex = getNextLineIndex(currLineIndex, maxWidth, words);
            justifiedText.add(line.toString());
        }
        for (int i = 0; i < justifiedText.size() - 1; i++) {
            String fullJustifiedLine = getFullJustifiedString(justifiedText.get(i).trim(), maxWidth);
            justifiedText.remove(i);
            justifiedText.add(i, fullJustifiedLine);
        }
        String leftJustifiedLine = getLeftJustifiedLine(justifiedText.get(justifiedText.size() - 1).trim(), maxWidth);
        justifiedText.remove(justifiedText.size() - 1);
        justifiedText.add(leftJustifiedLine);
        return justifiedText;
    }

    public static int getNextLineIndex(int currLineIndex, int maxWidth, String[] words) {
        int n = words.length;
        int width = 0;
        int count = 0;
        while (currLineIndex < n && width < maxWidth) {
            width += words[currLineIndex++].length() + 1;
            count++;
        }
        if (width > maxWidth + 1 && count > 1)
            currLineIndex--;

        return currLineIndex;
    }

    public static String getFullJustifiedString(String line, int maxWidth) {
        StringBuilder justifiedLine = new StringBuilder();
        String[] words = line.split(" ");
        int occupiedCharLength = 0;
        for (String word : words) {
            occupiedCharLength += word.length();
        }
        int remainingSpace = maxWidth - occupiedCharLength;
        int spaceForEachWordSeparation = words.length > 1 ? remainingSpace / (words.length - 1) : remainingSpace;
        int extraSpace = remainingSpace - spaceForEachWordSeparation * (words.length - 1);
        for (int j = 0; j < words.length - 1; j++) {
            justifiedLine.append(words[j]);
            for (int i = 0; i < spaceForEachWordSeparation; i++)
                justifiedLine.append(" ");
            if (extraSpace > 0) {
                justifiedLine.append(" ");
                extraSpace--;
            }
        }
        justifiedLine.append(words[words.length - 1]);
        for (int i = 0; i < extraSpace; i++)
            justifiedLine.append(" ");
        return justifiedLine.toString();
    }

    public static String getLeftJustifiedLine(String line, int maxWidth) {
        int lineWidth = line.length();
        StringBuilder justifiedLine = new StringBuilder(line);
        //for (int i = 0; i < maxWidth - lineWidth; i++)
        //    justifiedLine.append(" ");
        return justifiedLine.toString();
    }
}

Note I also commented out the space padding for the last line of each paragraph (in getLeftJustifiedLine) and made the methods static.

Let's try to break the problem down:

Subtract the length of the string from 30 - that's the number of extra spaces you'll be adding somewhere (3 in this case).

Count the number of existing spaces (5 in this case).

Now you know that you need to distribute that first number of extra spaces into the existing spaces as evenly as possible (in this case, distribute 3 into 5).

Think about how you would distribute something like this in real life, say balls into buckets. You would probably rotate through your buckets, dropping a ball in each one until you ran out. So consider how you might achieve this in your Java code (hint: look at the different kinds of loops).
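The balls-into-buckets idea could look something like the sketch below, where each gap is a bucket that already holds its one original space. RoundRobinSpaces and distribute are hypothetical names, and the method assumes there is at least one gap.

```java
import java.util.Arrays;

class RoundRobinSpaces {
    // Returns the final width of each gap after dealing `extra` spaces out one at a time.
    static int[] distribute(int gaps, int extra) {
        int[] widths = new int[gaps];
        Arrays.fill(widths, 1);                    // every gap starts with its existing space
        // Rotate through the gaps, dropping one space in each, until none are left.
        for (int i = 0; extra > 0; i = (i + 1) % gaps, extra--) {
            widths[i]++;
        }
        return widths;
    }
}
```

Distributing 3 extra spaces over 5 gaps yields widths 2, 2, 2, 1, 1. The rotation also covers the case where there are more extra spaces than gaps, since the index wraps around.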

The way I would go about this is to use a loop with regular-expression replacements.

  1. Replace all spaces with underscores.
  2. For each char necessary to get the length up to the desired length, replace a single underscore with two underscores. Use regular expressions to make sure that these replacements only happen where the desired number of underscores does not already exist. See the JavaDoc for String.replaceFirst(). You'll also need to account for the possibility that you have to replace double underscores with triples.

After you do the initial replacement, I'd suggest you use a while loop, bounded on the length of the string being less than the target size. Initialize int numUnderscores = 1; outside of the while loop. Then the steps inside the loop will be:

  1. Build the replacement pattern. This should be something like "[^_](_{" + numUnderscores + "})[^_]" which says "any char that is not an underscore, followed by numUnderscores instances of the underscore char, followed by any char that is not an underscore"
  2. Call replaceFirst() to perform the replacement
  3. Check to see if the string contains any remaining instances of the current number of underscores; if it does not, then you must increment numUnderscores

Obviously, since this is a homework problem, I'm leaving the actual process of writing the code as an exercise. If you have specific questions about some piece of it, or about some component of the logic structure I described, just ask in comments!

The benefit of doing things this way is that it will work for any size string, and is very configurable for different situations.
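For readers who want to see the shape of that loop, here is one possible sketch. It deliberately swaps the character classes in the pattern above for lookarounds, (?<!_) and (?!_), so that the neighbouring letters are not consumed by the match. RegexJustifier is an invented name and the code is an illustration, not the answerer's implementation.

```java
import java.util.regex.Pattern;

class RegexJustifier {
    static String justify(String text, int target) {
        String s = text.replace(' ', '_');
        if (!s.contains("_")) return s;                // no gap to widen
        int numUnderscores = 1;
        while (s.length() < target) {
            // A run of exactly numUnderscores underscores, not touching another underscore.
            Pattern exactRun = Pattern.compile("(?<!_)_{" + numUnderscores + "}(?!_)");
            if (!exactRun.matcher(s).find()) {
                numUnderscores++;                      // every run has already been widened
                continue;
            }
            s = exactRun.matcher(s).replaceFirst("$0_");   // widen the first such run by one
        }
        return s;
    }
}
```

Each pass widens the leftmost run that still has the current width, so the longer gaps end up on the left, as in the question's sample output.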

The first part of this presentation contains a dynamic programming algorithm for text justification.
