How to split a text file into blocks with 10+ characters without dividing words using sed in Linux?

Question

I want to come up with a sed command that, after every 10 characters, looks for the nearest following space and replaces it with "|".

I tried sed -E -e 's/ /|/\( *?[0-9a-zA-Z]*\)\{10,\}' new.file, but it reports errors.

Example input:

Hello there! How are you? I am trying to figure this out.

Expected Output:

Hello there!|How are you?|I am trying|to figure this|out.

Answer 1

This works for the given sample:

$ sed -E 's/(.{10}[^ ]*) /\1|/g' ip.txt
Hello there!|How are you?|I am trying|to figure this|out.

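(.{10}[^ ]*) matches exactly 10 characters, followed by zero or more non-space characters, so the captured block always ends at the end of a word
then a single space is matched
\1| puts back the captured portion, followed by a | character in place of the space; the g flag repeats this along the rest of the line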
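
As a minimal sketch of the behavior described above, the same command can be wrapped in a shell function and fed a couple of edge cases. The function name split_blocks and the sample inputs are hypothetical and not part of the answer. A line whose spaces all fall within its first 10 characters should pass through unchanged, and a final block with no space after it gets no pipe.

```shell
# Hypothetical helper wrapping the command from this answer.
split_blocks() {
  sed -E 's/(.{10}[^ ]*) /\1|/g'
}

# The only space is within the first 10 characters: nothing to replace.
printf '%s\n' 'short one' | split_blocks
# prints: short one

# The second block has no space after it, so only one pipe is inserted.
printf '%s\n' 'Hello there! How are you?' | split_blocks
# prints: Hello there!|How are you?
```

This is why the trailing "out." in the sample keeps its own block: fewer than 10 characters remain, so no further match is possible.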

Answer 2

Building upon Sundeep's solution, you may:

Add support for any whitespace by replacing spaces with [[:space:]] and non-space with [^[:space:]]
Replace any run of one or more whitespace characters with a single pipe by adding + (POSIX ERE) or \{1,\} (POSIX BRE).

You can use either of these:

sed 's/\(.\{10\}[^[:space:]]*\)[[:space:]]\{1,\}/\1|/g' ip.txt
sed -E 's/(.{10}[^[:space:]]*)[[:space:]]+/\1|/g' ip.txt

See the online demo:

#!/bin/bash
s='Hello there! How are you? I am trying to figure this out.'
sed 's/\(.\{10\}[^[:space:]]*\)[[:space:]]\{1,\}/\1|/g' <<< "$s"
sed -E 's/(.{10}[^[:space:]]*)[[:space:]]+/\1|/g' <<< "$s"

Output:

Hello there!|How are you?|I am trying|to figure this|out.
Hello there!|How are you?|I am trying|to figure this|out.
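
To see what the whitespace changes buy, here is a small sketch that feeds the ERE variant an input with tabs and repeated spaces at the break points. The function name split_ws and the sample text are assumptions made for illustration only.

```shell
# Hypothetical helper wrapping the ERE command from this answer.
split_ws() {
  sed -E 's/(.{10}[^[:space:]]*)[[:space:]]+/\1|/g'
}

# Two tabs and three spaces each collapse into a single pipe.
printf 'Hello there!\t\tHow are you?   I am trying\n' | split_ws
# prints: Hello there!|How are you?|I am trying
```

Note that only the whitespace at a break point is replaced; whitespace that falls inside the first 10 characters of a block is left as it is.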

2 answers

solution1
2 ACCEPTED 2021-06-09 13:38:24

solution2
2 2021-06-09 20:56:26