Remove duplicate lines from a file but leave 1 occurrence
I'm looking to remove duplicate lines from a file but leave 1 occurrence in the file.
Example of the file:
this is a string
test line
test line 2
this is a string
From the above example, I would want to remove 1 occurrence of "this is a string".
Best way to do this?
linux
asked May 19 '18 at 13:09, edited May 19 '18 at 18:33
– Tom Bailey
With such questions you should always provide example input and output.
– Hauke Laging
May 19 '18 at 13:12
Possibly related: Remove duplicate lines while keeping the order of the lines
– steeldriver
May 19 '18 at 13:12
Are the duplicated lines adjacent to one another? Is the output to remain in the same order or would it be ok to sort the data?
– Kusalananda
May 19 '18 at 13:14
Keep one occurrence of a duplicate (ie two identical lines per match) or simply "remove all duplicate lines, leaving only one line per set of duplicates"? Does the final order matter?
– roaima
May 19 '18 at 13:17
If it is not a problem for you that the lines will be sorted, then a sort file | uniq will do what you want.
– peterh
May 19 '18 at 19:03
2 Answers
This leaves the first occurrence:
awk '! a[$0]++' inputfile
The array a counts how many times each line has been seen: a[$0]++ is 0 (false) the first time a line appears, so ! a[$0]++ is true only then and awk applies its default action, printing the line.
$ echo 'this is a string
> test line
> test line 2
> this is a string' | awk '! a[$0]++'
this is a string
test line
test line 2
answered May 19 '18 at 13:16, edited May 19 '18 at 20:16
– Hauke Laging
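As the comments below point out, this awk command only prints the deduplicated result to standard output; it does not modify inputfile. A minimal sketch for updating the file itself (the temporary file name inputfile.tmp is only an illustration):
# write the deduplicated lines to a temporary file, then replace the original
awk '! a[$0]++' inputfile > inputfile.tmp && mv inputfile.tmp inputfile
# alternatively, GNU awk 4.1+ ships an "inplace" extension that edits the file directly
gawk -i inplace '! a[$0]++' inputfile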
It seems to just print the result and not actually make any changes in the file.
– Tom Bailey
May 19 '18 at 15:49
@TomBailey That's why I told you to provide example input and output. I did test it and it works fine for me.
– Hauke Laging
May 19 '18 at 16:49
I have edited it now.
– Tom Bailey
May 19 '18 at 19:29
@TomBailey works fine for me.
– Hauke Laging
May 19 '18 at 20:16
Demo file stuff.txt contains:
one
two
three
one
two
four
five
Remove duplicate lines from a file, assuming you don't mind that the lines end up sorted:
$ sort -u stuff.txt
five
four
one
three
two
Explanation: the -u flag tells sort to sort the lines of the file and output each distinct line only once.
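For comparison, piping a plain sort through uniq (the approach suggested in the comments on the question) gives the same output here, since sorting makes duplicate lines adjacent and uniq then collapses them:
$ sort stuff.txt | uniq
five
four
one
three
two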
Remove duplicate lines from a file, preserve original ordering, keep the first:
$ cat -n stuff.txt | sort -uk2 | sort -nk1 | cut -f2-
one
two
three
four
five
Explanation: cat -n prefixes every line with its line number followed by a tab; the first sort (-u -k2) sorts on the original line content (field 2 onward) and keeps only the first of each set of duplicates; the second sort (-n -k1) uses the line numbers added in the first step to restore the original order; finally, cut -f2- strips off the line-number field.
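Applied to the example file from the question (saved here as file.txt, a name chosen only for this illustration), the pipeline keeps the first "this is a string" and drops the later duplicate:
$ cat -n file.txt | sort -uk2 | sort -nk1 | cut -f2-
this is a string
test line
test line 2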
Remove duplicate lines from a file, preserve original ordering, keep the last:
tac stuff.txt > stuff2.txt; cat -n stuff2.txt | sort -uk2 | sort -nk1 | cut -f2- > stuff3.txt; tac stuff3.txt > stuff4.txt; cat stuff4.txt
three
one
two
four
five
Explanation: Same as before, but tac reverses the file before and after the deduplication, so the last occurrence of each line is the one that is kept.
answered 13 mins ago
– Eric Leschinski
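For reference, the same keep-last result can be produced without the intermediate stuff2.txt, stuff3.txt and stuff4.txt files by chaining everything into one pipeline (a sketch checked only against the demo file above):
$ tac stuff.txt | cat -n | sort -uk2 | sort -nk1 | cut -f2- | tac
three
one
two
four
five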