Grep doesn't work when trying to match a word from a file in a second file












I have two files, each with many lines that contain a single number. I'm trying to see whether any number from file1 matches a number in file2. This is what I tried, and for some reason it doesn't work:

    for i in $(cat file1); do grep ${i} file2; done

For reference, here is the data from file1 and file2:

    file1   file2
    2134    1251
    2135    5626
    5342    4327
    6456    8453
    3413    4537
    4525    3533
    2347    5738
    1235    1235
    7453    3462

So shouldn't this command take each line from file1 and grep it against the whole of file2? In that case, shouldn't a match be printed on screen?







bash grep






asked 7 hours ago by user323587 (edited 7 hours ago by Rui F Ribeiro)













  • It should - but you would probably be advised to use something like grep -Fwf file1 file2 instead

    – steeldriver
    7 hours ago











  • It should, but it consistently doesn't. Even if I run it against the same file, as in for i in $(cat file1); do grep ${i} file1; done, it still doesn't work. I'll try your advice.

    – user323587
    7 hours ago











  • Note: I just tried your code and it works for me. Is there any chance that file1 contains hidden characters, maybe \n, \r, or tabs?

    – Juan
    7 hours ago











  • If what you really want is to compare the files, you might want to use sort, uniq and diff (or kompare, kdiff3, or any other file comparison tool).

    – Juan
    7 hours ago











  • With those columns of numbers in two files, and that command, I get 1235 as output; it seems to be the lone duplicate. In other words, I can't see an issue with the result here. Of course, if the data is broken, for example with CRLF line endings in file1 but not in file2, then you'd have problems.

    – ilkkachu
    5 hours ago



















2 Answers
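Given two ordinary Unix text files, your shell loop prints

    1235

since this is the only line that occurs in both files. If it does not, then one of your files may be a DOS text file (lines ending in CRLF rather than LF), in which case the stray carriage return at the end of each line takes part in the comparison. You can convert DOS text files into Unix text files with the dos2unix utility.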
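To check whether line endings are the culprit, something like the following sketch may help. It builds two small throwaway files (the names and contents are made up for the demonstration), with CRLF endings in file1 only, and uses tr -d '\r' as a stand-in for dos2unix in case that utility is not installed.

```shell
# Scratch directory so that no real files are touched (hypothetical test data).
tmpdir=$(mktemp -d)

# file1 has DOS (CRLF) line endings, file2 has Unix (LF) line endings.
printf '2134\r\n1235\r\n' >"$tmpdir/file1"
printf '1251\n1235\n' >"$tmpdir/file2"

# Count the lines of file1 that contain a carriage return.
cr=$(printf '\r')
crlf_lines=$(grep -c "$cr" "$tmpdir/file1")

# The original loop: each pattern carries a trailing CR, so nothing matches.
before=$(for i in $(cat "$tmpdir/file1"); do grep ${i} "$tmpdir/file2"; done)

# Strip the carriage returns (stand-in for dos2unix) and run the loop again.
tr -d '\r' <"$tmpdir/file1" >"$tmpdir/file1.unix"
after=$(for i in $(cat "$tmpdir/file1.unix"); do grep ${i} "$tmpdir/file2"; done)

printf 'lines with CR: %s\nbefore: [%s]\nafter: [%s]\n' "$crlf_lines" "$before" "$after"
rm -rf "$tmpdir"
```

A non-zero count of lines with a carriage return is a strong hint that the file came from a DOS/Windows editor.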



There is nothing seriously wrong with your loop given the type of data that you have, apart from the fact that it calls grep once for every line in file1. It would also match substrings, for example 100 in 1001, and, if any line in file1 contained spaces or tabs, it would split that line into multiple words (because the $(cat ...) in for i in $(cat ...) is unquoted).

If you want to solve your issue this way (with a loop), you had better use

    while IFS= read -r word; do
        grep -xF -e "$word" file2
    done <file1

The -x and -F options are explained later in this answer, and -e signifies that the next argument is the pattern to match (otherwise, a pattern starting with a dash (-) may be taken as a command-line option).

This would still execute grep once for each line in file1, but it would do it correctly.
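The word-splitting difference between the two loops can be seen with a small experiment. This is only a sketch on made-up data: file1 here holds a single line containing a space, which is not what the question's files look like.

```shell
# Scratch directory with hypothetical test data.
tmpdir=$(mktemp -d)
printf '%s\n' '12 35' >"$tmpdir/file1"             # one line containing a space
printf '%s\n' '12' '35' '12 35' >"$tmpdir/file2"

# Unquoted $(cat ...) splits "12 35" into the two words 12 and 35, so grep
# runs twice, each time with a pattern that is not a full line of file1.
split_out=$(for i in $(cat "$tmpdir/file1"); do grep ${i} "$tmpdir/file2"; done)

# read -r with an empty IFS keeps the line intact, and -x only accepts
# whole-line matches.
whole_out=$(while IFS= read -r word; do
    grep -xF -e "$word" "$tmpdir/file2"
done <"$tmpdir/file1")

printf 'for loop:\n%s\n\nwhile loop:\n%s\n' "$split_out" "$whole_out"
rm -rf "$tmpdir"
```

The for loop reports four lines (two per split word), while the while loop reports only the line that is really shared.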





To extract the lines in file2 that exactly match a line in file1, without using a shell loop, you would use

    $ grep -xF -f file1 file2
    1235

This assumes that file1 contains a reasonable number of lines, not too many (what counts as "too many" depends on the amount of memory that you have).

The command uses grep with -x, which forces matches across full lines only (no substring matches), and with -F, which changes grep to do string comparisons rather than regular expression matches.

The -f file1 instructs grep to read the patterns (the strings to match with) from file1.
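To see what -x and -F change in practice, here is a small sketch. The file names patterns and data, and their contents, are invented for the illustration; one pattern contains a dot so that the regular-expression behaviour becomes visible.

```shell
# Scratch directory with hypothetical test data.
tmpdir=$(mktemp -d)
printf '%s\n' '100' '1.35' >"$tmpdir/patterns"
printf '%s\n' '1001' '1235' '100' >"$tmpdir/data"

# Plain grep -f: patterns are regular expressions matched anywhere in a line.
# 100 matches both 1001 and 100; 1.35 matches 1235 because "." matches any
# single character.
loose=$(grep -f "$tmpdir/patterns" "$tmpdir/data")

# -x: whole-line matches only; -F: patterns are fixed strings, not regexes.
strict=$(grep -xF -f "$tmpdir/patterns" "$tmpdir/data")

printf 'without -xF:\n%s\n\nwith -xF:\n%s\n' "$loose" "$strict"
rm -rf "$tmpdir"
```

Without the options all three data lines are printed; with them only 100 is.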





For really massive amounts of data, though, it would be hugely inefficient to use grep. Instead, for this task and with this type of data (single words on individual lines), it would be better to do a relational join operation between the files:

    $ join file1 file2
    1235

Assuming that both files are lexicographically sorted, this returns the numbers that occur in both files.
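The sample data in the question is not sorted as shown, so it would need sorting first. A minimal sketch, assuming the files can be sorted into temporary copies (the .sorted names are just illustrative), and forcing the C locale so that sort and join agree on the ordering:

```shell
# Scratch directory holding the sample data from the question.
tmpdir=$(mktemp -d)
printf '%s\n' 2134 2135 5342 6456 3413 4525 2347 1235 7453 >"$tmpdir/file1"
printf '%s\n' 1251 5626 4327 8453 4537 3533 5738 1235 3462 >"$tmpdir/file2"

# join expects both inputs to be sorted on the join field.
LC_ALL=C sort "$tmpdir/file1" >"$tmpdir/file1.sorted"
LC_ALL=C sort "$tmpdir/file2" >"$tmpdir/file2.sorted"

# Lines whose first field occurs in both sorted files.
common=$(LC_ALL=C join "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")

printf '%s\n' "$common"
rm -rf "$tmpdir"
```

In bash, the temporary files could be avoided with process substitution, as in join <(sort file1) <(sort file2).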





Using comm:

    $ comm -1 -2 file1 file2
    1235

comm also compares sorted files and can easily handle very large datasets. It prints three columns by default:

  1. lines that occur in the first file only
  2. lines that occur in the second file only
  3. lines that occur in both files

With -1 we turn off the output of the first column, and with -2 we disable the second column, leaving comm to only output the lines that are the same in both files.
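The three columns are easier to picture with a tiny example. The files a and b below are hypothetical and already sorted; the second and third columns are indented by one and two tab characters respectively.

```shell
# Scratch directory with two small, already sorted, hypothetical files.
tmpdir=$(mktemp -d)
printf '%s\n' 1235 2134 >"$tmpdir/a"
printf '%s\n' 1235 5626 >"$tmpdir/b"

# Default output: column 1 (only in a), column 2 (only in b, one tab),
# column 3 (in both, two tabs).
all_columns=$(LC_ALL=C comm "$tmpdir/a" "$tmpdir/b")

# -12 is shorthand for -1 -2: keep only the lines common to both files.
both=$(LC_ALL=C comm -12 "$tmpdir/a" "$tmpdir/b")

# -23 suppresses columns 2 and 3: keep only the lines unique to a.
only_a=$(LC_ALL=C comm -23 "$tmpdir/a" "$tmpdir/b")

printf 'all columns:\n%s\n\nboth: %s\nonly in a: %s\n' "$all_columns" "$both" "$only_a"
rm -rf "$tmpdir"
```

The same option scheme gives the other set operations, so comm is handy when the lines unique to one file are also of interest.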







answered 4 hours ago by Kusalananda (edited 4 hours ago)

























You simply need to use grep -f file1 file2, or you may also use cat file1 | grep -f /dev/stdin file2.







answered 7 hours ago by user335735 (edited 5 hours ago by Philippos)








  • Thank you for contributing. Please note that a) a good answer explains what you do, so that others can not just use it but also learn from it, and b) without specifying -x or -w to grep, you can get unwanted results if not all numbers are four-digit numbers like in the example (for instance, 234 in file1 would match 1234 in file2). That was probably the reason for someone to downvote your answer (sadly, without leaving a comment).

    – Philippos
    5 hours ago






