Dedupe, Sort, and Clean Lines of Text in One Pass

Clean up blank lines and duplicates in a list without Excel. Learn how whole-list and adjacent dedupe differ, and the order in which the operations run.

A list of emails or a log lands in your lap with blank lines mixed in, the same entry repeated, and invisible spaces clinging to the ends. Dropping it into a spreadsheet to run a dedupe, or writing a quick script, is more friction than the job deserves. Cleaning text by line takes a paste and a couple of checkboxes.

Clear blanks and duplicates in one go

Line cleaning rests on four operations: remove blank lines, trim whitespace, remove duplicates, and sort. They are far more useful together than apart.

Say you have five lines: hello with a trailing space, a blank line, world, hello, and a whitespace-only line. Turn on trim, remove blanks, dedupe, and sort ascending in the PiPi Worlds line tool, and the result collapses to hello and world. A live count shows how many lines remain, so you can see at a glance that the input went from five lines down to two.
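The same walk-through can be sketched in a few lines of Python. This is only an illustration of the four operations, not the tool's own code; the function name clean_lines is made up for the example.

```python
def clean_lines(text: str) -> list[str]:
    """Hypothetical helper: trim, drop blanks, dedupe, then sort ascending."""
    lines = [line.strip() for line in text.split("\n")]  # trim whitespace
    lines = [line for line in lines if line]              # remove blank lines
    lines = list(dict.fromkeys(lines))                    # dedupe, keeping first occurrences
    return sorted(lines)                                  # sort ascending


# "hello " (trailing space), a blank line, "world", "hello", a whitespace-only line
sample = "hello \n\nworld\nhello\n   "
result = clean_lines(sample)
print(result)       # ['hello', 'world']
print(len(result))  # 2
```

dict.fromkeys keeps the first occurrence of each key in insertion order, which is why it works as an order-preserving dedupe here.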

Whole and adjacent dedupe are not the same

There are two kinds of dedupe, and the results differ more than you might expect.

Take the four lines apple, apple, banana, apple.

Mode                 Result
Whole-list dedupe    apple, banana (2 lines)
Adjacent dedupe      apple, banana, apple (3 lines)

Whole dedupe keeps the first occurrence of any line, no matter where the repeat sits. Adjacent dedupe removes a line only when it matches the one right above it, so the final apple survives because the line above it is banana. Reach for adjacent dedupe when you want to collapse runs without changing the order.
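To see the two modes side by side, here is a minimal Python sketch. The function names dedupe_whole and dedupe_adjacent are invented for illustration and are not taken from the tool.

```python
from itertools import groupby


def dedupe_whole(lines: list[str]) -> list[str]:
    """Keep the first occurrence of each line, wherever the repeats sit."""
    return list(dict.fromkeys(lines))


def dedupe_adjacent(lines: list[str]) -> list[str]:
    """Drop a line only when it equals the line directly above it."""
    return [line for line, _run in groupby(lines)]


fruits = ["apple", "apple", "banana", "apple"]
print(dedupe_whole(fruits))     # ['apple', 'banana']
print(dedupe_adjacent(fruits))  # ['apple', 'banana', 'apple']
```

The adjacent version behaves like the Unix uniq command, which also compares each line only with its immediate neighbor.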

The order of operations decides the result

This is the part people miss. The operations run in a fixed order.

The order is trim, then remove blanks, then dedupe, then sort, then numbering. It matters because of whitespace: compared as raw text, apple and apple followed by a trailing space are different lines, so the duplicate would slip through. Trimming first removes that trap. To treat Apple and apple as the same line, turn on ignore case; the result keeps whichever spelling appeared first.
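A small Python sketch shows why trimming has to come before dedupe, and how an ignore-case option can keep the first spelling. The dedupe function and its trim and ignore_case parameters are hypothetical, and str.casefold is just one reasonable way to compare without case; the tool may do it differently.

```python
def dedupe(lines: list[str], trim: bool = True, ignore_case: bool = False) -> list[str]:
    """Whole-list dedupe that optionally trims first and compares case-insensitively."""
    seen = set()
    kept = []
    for line in lines:
        if trim:
            line = line.strip()                          # trim before comparing
        key = line.casefold() if ignore_case else line   # comparison key only
        if key not in seen:
            seen.add(key)
            kept.append(line)                            # keep the first spelling seen
    return kept


print(dedupe(["apple", "apple "], trim=False))       # ['apple', 'apple ']
print(dedupe(["apple", "apple "]))                   # ['apple']
print(dedupe(["Apple", "apple"], ignore_case=True))  # ['Apple']
```

Without the trim step, the trailing space makes the second line look unique and the duplicate survives.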

Finish with sort and numbering

Two operations put the final shape on a cleaned list: sort and numbering.

Sort runs ascending (A→Z) or descending (Z→A). Turn on numbering and each line gets a 1., 2., 3. prefix in turn. Dedupe, sort, and number together, and a clean numbered list ready to paste into a doc or an email comes out in a single pass.
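The finishing step might look like this in Python. It is a sketch under assumptions: the name sort_and_number is invented, the space after the "1." prefix is a guess at the formatting, and Python compares strings by code point, which may not match the tool's ordering for accented or mixed-case text.

```python
def sort_and_number(lines: list[str], descending: bool = False) -> list[str]:
    """Sort the lines, then prefix each with '1. ', '2. ', and so on."""
    ordered = sorted(lines, reverse=descending)
    return [f"{n}. {line}" for n, line in enumerate(ordered, start=1)]


items = ["pear", "apple", "fig"]
print(sort_and_number(items))
# ['1. apple', '2. fig', '3. pear']
print(sort_and_number(items, descending=True))
# ['1. pear', '2. fig', '3. apple']
```

Numbering runs last, after the sort, so the prefixes always follow the final order.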

Where are you cleaning that list?

The last point is about where the data sits. The lines you paste in to clean often carry email addresses, internal IDs, or log entries that should not wander off your machine.

The PiPi Worlds line tool runs every operation in your browser, so the input is never transmitted. Text copied from a mix of systems, with CRLF and LF endings tangled together, is normalized in one pass. To get a character count of the cleaned list, pass it to the word counter; if each line holds JSON, pass it to the JSON formatter.
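Line-ending normalization is simple to sketch. The Python below is one possible approach, not the tool's implementation, and normalize_newlines is a made-up name. It rewrites CRLF and bare CR to LF before the text is split into lines.

```python
import re


def normalize_newlines(text: str) -> str:
    """Convert CRLF and lone CR line endings to LF."""
    # "\r\n" is listed first so a CRLF pair becomes one newline, not two.
    return re.sub(r"\r\n|\r", "\n", text)


mixed = "a\r\nb\rc\nd"
print(normalize_newlines(mixed).split("\n"))  # ['a', 'b', 'c', 'd']
```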

Frequently asked questions

What is the difference between whole and adjacent dedupe?
Whole-list dedupe removes every repeat anywhere and keeps the first occurrence of each line. Adjacent dedupe removes a line only when it matches the line directly above it. For example, apple, apple, banana, apple becomes apple, banana with whole dedupe, but apple, banana, apple with adjacent dedupe. Adjacent dedupe is handy for collapsing runs without reordering.
In what order are the operations applied?
Trim, then remove blank lines, then dedupe, then sort, then optional numbering. Because whitespace is cleaned first, a hidden trailing space never stops a duplicate from being detected.
Can lines that differ only in case be treated as the same?
Yes. Turn on 'ignore case' and Apple and apple count as the same line for dedupe and sort. The first spelling that appeared is the one kept in the result.
Does it handle text with Windows line endings?
Yes. CRLF, CR, and LF are all recognized, and the output is normalized to LF newlines. Text copied from a mix of Windows and Mac sources is cleaned in one pass.
Is the text I paste sent to a server?
No. The PiPi Worlds line tool runs every operation in your browser. Paste email lists, employee IDs, or log lines and the data never leaves your device.

Sources

Written by the PiFl Labs content team from public sources and reviewed in-house before publishing.
