Getting a match count of objects in a file












I have a large file that has entries that look like this:



entry-id: 1
sn: John
cn: Smith
empType: A
ADID: 123456

entry-id: 2
sn: James
cn: Smith
empType: B
ADID: 123456

entry-id: 3
sn: Jobu
cn: Smith
empType: A
ADID: 123456

entry-id: 4
sn: Jobu
cn: Smith
empType: A
ADID:


Entries are separated by a blank line. I need a count of the entries that have an empType of A and ALSO have a value after ADID (2 in the sample above). I've tried awk, grep and egrep, and I'm still having no luck. Any ideas?










  • What exactly did you try in awk? I would think something like awk -vRS= '/empType: A/ && /ADID: [0-9]+/ {n++} END {print n}' file should work

    – steeldriver
    Dec 14 '17 at 3:39











  • running your command, I got "awk: record `smapsHistory: [NDSEn...' too long record number 213244" there are only like 100 records with an employeeType of C, and it's going crazy....

    – King of NES
    Dec 14 '17 at 3:54













  • You did include the correct filename to read as input?

    – bu5hman
    Dec 14 '17 at 4:15











  • it was the correct file...

    – King of NES
    Dec 14 '17 at 4:18
















linux text-processing command-line






asked Dec 14 '17 at 3:29 by King of NES
edited 11 mins ago by PRY













6 Answers
Awk solution:

awk '/empType: /{ f=($2=="A"? 1:0) }f && /ADID: [0-9]+/{ c++ }END{ print c }' file

  • f - flag that is 1 while the current entry has empType: A (this relies on empType appearing before ADID within each entry, as in the sample)

  • c - count of empType: A entries whose ADID key has a numeric value

The output:

2





Here is an alternative awk solution that uses the empty string "" as the record separator RS (so that entries separated by blank lines become records) and the newline \n as the field separator FS:

BEGIN {RS=""; FS="\n"}
{
    split($4,a,": ")
    split($5,b,": ")
}
a[2]=="A" && b[2]!="" {c++}
END {print c}

The script can be executed with

awk -f main.awk file
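
The script above assumes that empType and ADID are always the fourth and fifth lines of an entry. As a rough sketch of the same positional logic in Python (not part of the original answer; `value_of`, `count_positional` and the shortened `SAMPLE` text are names made up here for illustration), it may help to see why an empty ADID ends up as an empty string:

```python
# Two entries taken from the question: one with an ADID value, one without.
SAMPLE = (
    "entry-id: 1\nsn: John\ncn: Smith\nempType: A\nADID: 123456\n"
    "\n"
    "entry-id: 4\nsn: Jobu\ncn: Smith\nempType: A\nADID:\n"
)


def value_of(field):
    """Return the second piece after splitting on ': ', or '' if there is none.

    This mirrors awk, where an array element that split() never filled
    compares equal to the empty string.
    """
    parts = field.split(": ")
    return parts[1] if len(parts) > 1 else ""


def count_positional(text):
    """Count entries whose 4th line is 'empType: A' and whose 5th line has a value."""
    count = 0
    for record in text.strip().split("\n\n"):  # like RS=""
        fields = record.split("\n")            # like FS="\n"
        if len(fields) < 5:
            continue  # entry too short to have a 4th and 5th line
        if value_of(fields[3]) == "A" and value_of(fields[4]) != "":
            count += 1
    return count


print(count_positional(SAMPLE))  # prints 1: only entry-id 1 qualifies
```

If the real file has extra attributes or a different attribute order, the fixed positions no longer line up, and a lookup by key name is the safer choice.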





Simple two-grep method, where data is the input file:

grep -A1 'empType: A' data | grep -c 'ADID: .\+'

Output:

2
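
Because -A1 only adds the single line after each match, this pipeline counts an entry only when the ADID line comes directly after the empType line. A minimal Python sketch of roughly the same adjacency logic (an illustration only; `count_adjacent` is a hypothetical name, not something from the answer) makes that assumption visible:

```python
def count_adjacent(lines):
    """Count 'empType: A' lines that are immediately followed by a filled ADID line."""
    prefix = "ADID: "
    count = 0
    # Pair every line with the line that follows it.
    for current, following in zip(lines, lines[1:]):
        if (
            "empType: A" in current
            and following.startswith(prefix)
            and len(following) > len(prefix)  # at least one character after "ADID: "
        ):
            count += 1
    return count


in_order = ["entry-id: 1", "empType: A", "ADID: 123456"]
reordered = ["entry-id: 1", "ADID: 123456", "empType: A"]

print(count_adjacent(in_order))   # prints 1
print(count_adjacent(reordered))  # prints 0: ADID does not follow empType
```

For the sample in the question the two lines are always adjacent, so the count comes out right; for a file with other attributes in between it would undercount.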





I like the idea of getting the records that satisfy your requirements (better for e.g. testing) and counting them with wc -l. So here is an awk script that does just that:

#!/usr/bin/env awk
# getids.awk

BEGIN{
    RS="";
    FS="\n"
}

/ADID: [0-9]/ && /empType: A/{print $1}

And here it is in action:

user@host:~$ awk -f getids.awk data.txt
entry-id: 1
entry-id: 3

user@host:~$ awk -f getids.awk data.txt | wc -l
2

Of course if you just want the count we can do that too:

#!/usr/bin/env awk
# count.awk

BEGIN {
    RS="";
    FS="\n";
    count=0;
}

/ADID: [0-9]/ && /empType: A/{count++}

END {
    print count
}

And because I love Python, here is a Python script that does the same thing:

#!/usr/bin/env python2
# -*- coding: ascii -*-
"""getids.py"""

import sys

# Create a list to store the parsed records
records = []

# Iterate over the lines of the input file
with open(sys.argv[1]) as data:
    for line in data:

        # When an "entry-id" is reached, create a new record
        if line.startswith('entry-id'):
            entry_id = line.split(':')[1].strip()
            records.append({'entry-id': entry_id})

        # For other non-blank lines, update the current record
        elif line.strip():
            key = line.partition(':')[0].strip()
            value = line.partition(':')[2].strip()
            records[-1][key] = value

# Extract the list of records meeting the desired criteria
# (.get avoids a KeyError when an entry lacks one of the keys)
matches = [record for record in records if record.get('empType') == 'A' and record.get('ADID')]

# Print out the entry-ids for all of the matches
for match in matches:
    print('entry-id: ' + match['entry-id'])

And here's the Python script in action:

user@host:~$ python getids.py data.txt
entry-id: 1
entry-id: 3

user@host:~$ python getids.py data.txt | wc -l
2

And if we really do just want the counts:

#!/usr/bin/env python2
# -*- coding: ascii -*-
"""count.py"""

import sys

# Keep a count of the number of matches
count = 0

# Use flags to keep track of the current record
emptype_flag = False
adid_flag = False

# Iterate over the lines of the input file
with open(sys.argv[1]) as data:
    for line in data:

        # When an "entry-id" is reached, reset the flags
        if line.startswith('entry-id'):
            emptype_flag = False
            adid_flag = False
        elif line.strip() == "empType: A":
            emptype_flag = True
        elif line.startswith("ADID") and line.strip().split(':')[1]:
            adid_flag = True

        # If both conditions hold then increment the counter
        # and reset the flags
        if emptype_flag and adid_flag:
            count = count + 1
            emptype_flag = False
            adid_flag = False

# Print the number of matches
print(count)

And, while we're at it, how about a pure Bash script? Here's one:

#!/usr/bin/env bash

# getids.bash

# IFS= and -r keep each line exactly as it appears in the file
while IFS= read -r line; do
    if [[ "${line}" =~ "entry-id:" ]]; then
        entry_id="${line}"
        emptype=false
        adid=false
    elif [[ "${line}" =~ "empType: A" ]]; then
        emptype=true
    elif [[ "${line}" =~ ADID:\ [0-9] ]]; then
        adid=true
    fi
    if [[ "${emptype}" == true && "${adid}" == true ]]; then
        echo "${entry_id}"
        emptype=false
        adid=false
    fi
done < "$1"

And running the bash script:

user@host:~$ bash getids.bash data.txt
entry-id: 1
entry-id: 3

And finally, here's something using just grep and wc:

user@host:~$ cat data.txt | grep -A1 'empType: A' | grep "ADID: \S" | wc -l

2





With perl, that could be:

perl -l -00ne '
  my %f = /(.*?):\s*(.*)/g;
  ++$n if $f{empType} eq "A" && $f{ADID} ne "";
  END {print 0+$n}' < file

  • -n causes the code given to -e to be applied to each input record.

  • -00 makes the records paragraphs.

  • We build a %f associative array whose keys and values come from each (key):spaces(value) match in the record.

  • We increment $n when the conditions are met.

  • We print $n in the END block (adding 0 to make sure we get 0 and not an empty string if there's no match).
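
For readers more at home in Python, the idea of turning each paragraph into a key/value map can be sketched as follows. This is only an illustration, not the answer's code: `parse_entry` and `count_keyed` are hypothetical names, and the pattern uses `[ \t]*` rather than `\s*` so that an empty value stays on its own line.

```python
import re

# One "key: value" pair per line; the value may be empty.
PAIR = re.compile(r"^([^:\n]+):[ \t]*(.*)$", re.MULTILINE)


def parse_entry(entry):
    """Turn one blank-line-separated entry into a {key: value} dictionary."""
    return dict(PAIR.findall(entry))


def count_keyed(text):
    """Count entries with empType equal to A and a non-empty ADID."""
    entries = [parse_entry(p) for p in text.split("\n\n") if p.strip()]
    return sum(
        1 for fields in entries
        if fields.get("empType") == "A" and fields.get("ADID", "") != ""
    )


sample = "empType: A\nADID: 1\n\nempType: A\nADID:\n\nempType: B\nADID: 2\n"
print(count_keyed(sample))  # prints 1: only the first entry qualifies
```

Because the lookup is by key name, the order of the attributes inside an entry does not matter here.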






I wasn't able to get anywhere with -A on grep, and the other answers returned "label too long" or some other error.
What I did find that worked was:

perl -000 -ne 'print if/empType: A/' file.ldif|grep -i -c "^ADID: [0-9A-Za-z]"

I didn't know what perl -000 does, but as far as I can tell:

  • -000 reads the input one paragraph at a time, so the match is made against a whole entry
  • -n wraps the code in a loop over the input records
  • -e gives the program as a one-liner
  • print if/empType: A/ prints the paragraph if it contains empType: A
  • | pipes those matched paragraphs to grep
  • grep -i -c "^ADID: [0-9A-Za-z]" ignores case and counts the ADID lines that have a value

I'm not sure whether the other commands failed because of my Linux version, but the above command worked pretty well. I'm not sure how to make the empType match ignore case, though.
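
As a hedged sketch of the same two-stage idea in Python (paragraphs first, then the ADID check), with both matches made case-insensitive: the names `paragraphs`, `count_entries`, `EMP_A` and `ADID_SET` are invented for this example, and the empType pattern is anchored to the whole line, which is slightly stricter than the perl match above. It reads one entry at a time, so a large file never has to fit in memory.

```python
import re
from itertools import groupby

# Both patterns ignore case, so "emptype: a" and "adid: x" also match.
EMP_A = re.compile(r"^empType: A$", re.IGNORECASE)
ADID_SET = re.compile(r"^ADID: [0-9A-Za-z]", re.IGNORECASE)


def paragraphs(lines):
    """Yield each blank-line-separated entry as a list of lines."""
    for is_blank, group in groupby(lines, key=lambda line: line.strip() == ""):
        if not is_blank:
            yield [line.rstrip("\n") for line in group]


def count_entries(lines):
    """Count entries that have an empType of A and an ADID with a value."""
    count = 0
    for entry in paragraphs(lines):
        has_type = any(EMP_A.match(line) for line in entry)
        has_adid = any(ADID_SET.match(line) for line in entry)
        if has_type and has_adid:
            count += 1
    return count


demo = ["emptype: a\n", "ADID: x1\n", "\n", "empType: A\n", "ADID:\n"]
print(count_entries(demo))  # prints 1
```

Any iterable of lines works, so an open file object can be passed in directly, for example `count_entries(open("file.ldif"))`, where file.ldif is the file name used in the command above.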






– RomanPerekhrest (awk solution with the f flag)
answered Dec 14 '17 at 6:00, edited Dec 14 '17 at 6:12

























– etopylight (awk solution with RS="" and main.awk)
answered Dec 14 '17 at 6:39























– agc (two-grep method)
answered Dec 14 '17 at 7:09, edited Dec 14 '17 at 7:15























                                      0














                                      I like the idea of get the records that satisfy your requirements (better for e.g. testing) and counting them with wc -l. So here is an awk script that does just that:



                                      #!/usr/bin/env awk
                                      # getids.awk

                                      BEGIN{
                                      RS="";
                                      FS="n"
                                      }

                                      /ADID: [0-9]/ && /empType: A/{print $1}


                                      And here it is in action:



                                      user@host:~$ awk -f getids.awk data.txt
                                      entry-id: 1
                                      entry-id: 3

                                      user@host:~$ awk -f getids.awk data.txt | wc -l
                                      2


                                      Of course if you just want the count we can do that too:



                                      #!/usr/bin/env awk
                                      # count.awk

                                      BEGIN {
                                      RS="";
                                      FS="n";
                                      count=0;
                                      }

                                      /ADID: [0-9]/ && /empType: A/{count++}

                                      END {
                                      print count
                                      }


                                      And because I love Python, here is a Python script that does the same thing:



                                      #!/usr/bin/env python2
                                      # -*- coding: ascii -*-
                                      """getids.py"""

                                      import sys

                                      # Create a list to store the matched records
                                      records =

                                      # Iterate over the lines of the input file
                                      with open(sys.argv[1]) as data:
                                      for line in data:

                                      # When an "entry-id" is reached, create a new record
                                      if line.startswith('entry-id'):
                                      entry_id = line.split(':')[1].strip()
                                      records.append({'entry-id': entry_id})

                                      # For other lines, update the current record
                                      elif line.strip():
                                      key = line.partition(':')[0].strip()
                                      value = line.partition(':')[2].strip()
                                      records[-1][key] = value

                                      # Extract the list of records meeting the desired critera
                                      matches = [record for record in records if record['empType'] == 'A' and record['ADID']]

                                      # Print out the entry-ids for all of the matches
                                      for match in matches:
                                      print('entry-id: ' + match['entry-id'])


                                      And here's the Python script in action:



                                      user@host:~$ python getids.py data.txt
                                      entry-id: 1
                                      entry-id: 3

                                      user@host:~$ python getids.py data.txt | wc -l
                                      2


                                      And if we really do just want the counts:



                                      #!/usr/bin/env python2
                                      # -*- coding: ascii -*-
                                      """count.py"""

                                      import sys

                                      # Keep a count of the number of matches
                                      count = 0

                                      # Use flags to keep track of the current record
                                      emptype_flag = False
                                      adid_flag = False

                                      # Iterate over the lines of the input file
                                      with open(sys.argv[1]) as data:
                                      for line in data:

                                      # When an "entry-id" is reached, reset the flags
                                      if line.startswith('entry-id'):
                                      emptype_flag = False
                                      adid_flag = False
                                      elif line.strip() == "empType: A":
                                      emptype_flag = True
                                      elif line.startswith("ADID") and line.strip().split(':')[1]:
                                      adid_flag = True

                                      # If both conditions hold the increment the counter
                                      # and reset the flags
                                      if emptype_flag and adid_flag:
                                      count = count + 1
                                      emptype_flag = False
                                      adid_flag = False

                                      # Print the number of matches
                                      print(count)


                                      And, while we're at it, how about a pure Bash script? Here's one:



                                      #!/usr/bin/env bash

                                      # getids.bash

                                      while read line; do
                                      if [[ "${line}" =~ "entry-id:" ]]; then
                                      entry_id="${line}"
                                      emptype=false
                                      adid=false
                                      elif [[ "${line}" =~ "empType: A" ]]; then
                                      emptype=true
                                      elif [[ "${line}" =~ ADID: [0-9] ]]; then
                                      adid=true
                                      fi
                                      if [[ "${emptype}" == true && "${adid}" == true ]]; then
                                      echo "${entry_id}"
                                      emptype=false
                                      adid=false
                                      fi
                                      done < "$1"


                                      And running the bash script:



                                      user@host:~$ bash getids.bash data.txt
                                      entry-id: 1
                                      entry-id: 3


                                      And finally, here's something using just grep and wc:



                                      user@host:~$ cat data.txt | grep -A1 'empType: A' | grep "ADID: S" | wc -l

                                      2





                                          edited Dec 14 '17 at 13:36

                                          answered Dec 14 '17 at 5:39

                                          igal

                                              With perl, that could be:

                                              perl -l -00ne '
                                              my %f = /(.*?):\s*(.*)/g;
                                              ++$n if $f{empType} eq "A" && $f{ADID} ne "";
                                              END {print 0+$n}' < file

                                              • -n causes the code given to -e to be applied to each input record.
                                              • -00 makes the records paragraphs (blocks of lines separated by blank lines).
                                              • We build a %f associative array whose keys and values are taken from each (key):spaces(value) pair in the record.
                                              • We increment $n where the conditions are met.
                                              • We print $n in the END block (adding 0 to make sure we get 0 and not an empty string if there's no match).
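
For readers who prefer Python, the same paragraph-at-a-time idea can be sketched as follows. This is only an illustration, not part of the perl answer: the function name `count_matches` and the embedded `SAMPLE` string are assumptions made for the example, and the sample simply repeats the four entries from the question.

```python
import re

# The four sample entries from the question (hypothetical inline copy).
SAMPLE = """\
entry-id: 1
sn: John
cn: Smith
empType: A
ADID: 123456

entry-id: 2
sn: James
cn: Smith
empType: B
ADID: 123456

entry-id: 3
sn: Jobu
cn: Smith
empType: A
ADID: 123456

entry-id: 4
sn: Jobu
cn: Smith
empType: A
ADID:
"""


def count_matches(text, emp_type="A"):
    """Count entries whose empType equals emp_type and whose ADID is non-empty."""
    count = 0
    # Split on blank lines, roughly what perl -00 (paragraph mode) does.
    for record in re.split(r"\n\s*\n", text.strip()):
        fields = {}
        for line in record.splitlines():
            key, sep, value = line.partition(":")
            if sep:  # skip lines that have no "key: value" shape
                fields[key.strip()] = value.strip()
        if fields.get("empType") == emp_type and fields.get("ADID", ""):
            count += 1
    return count


print(count_matches(SAMPLE))
```

Parsing each line separately means an empty ADID: line is stored as an empty string wherever it appears in the entry, so the test does not depend on field order.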






                                                  edited Dec 14 '17 at 14:57

                                                  answered Dec 14 '17 at 14:14

                                                  Stéphane Chazelas

                                                      I wasn't able to get anything useful out of grep -A, and the other answers returned "label too long" or some other error.
                                                      What I did find that worked was:

                                                      perl -000 -ne 'print if /empType: A/' file.ldif | grep -i -c "^ADID: [0-9A-Za-z]"

                                                      I didn't know what perl -000 does, but I think it works like this:
                                                      • -000 reads the file a paragraph at a time, so a match can look across the several lines of one entry.
                                                      • -n wraps the program in a while loop over those paragraphs.
                                                      • -e supplies the one-line program on the command line.
                                                      • print if /empType: A/ prints a paragraph if it contains empType: A.
                                                      • The matched paragraphs are piped to grep -i -c "^ADID: [0-9A-Za-z]", which ignores case and counts the lines where ADID: is followed by a value.
                                                      I'm not sure if the other commands failed because of my Linux version, but the above command worked pretty well. I'm not sure how to make the empType match case-insensitive, though.
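
On the open question of case: in the perl one-liner itself, adding the /i modifier to the match (/empType: A/i) should make that test case-insensitive. The sketch below shows the same two-stage filter in Python with both tests ignoring case. It is an illustration under stated assumptions, not the command above: the names `EMP_TYPE_A`, `ADID_SET`, `count_entries`, and `DEMO` are made up for the example, and unlike the perl pattern the empType test here is anchored to the whole line, so a value such as AB does not count.

```python
import re

# Hypothetical patterns; both ignore case and match line by line.
EMP_TYPE_A = re.compile(r"^empType: A[ \t\r]*$", re.IGNORECASE | re.MULTILINE)
ADID_SET = re.compile(r"^ADID: [0-9A-Za-z]", re.IGNORECASE | re.MULTILINE)

# Made-up demo data: only the first entry satisfies both tests.
DEMO = """\
entry-id: 1
emptype: a
adid: 9

entry-id: 2
empType: A
ADID:

entry-id: 3
empType: AB
ADID: 1
"""


def count_entries(text):
    """Count blank-line-separated entries that pass both tests."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return sum(
        1 for p in paragraphs if EMP_TYPE_A.search(p) and ADID_SET.search(p)
    )


print(count_entries(DEMO))
```

Testing both conditions on the same paragraph avoids the second grep stage, so the count is per entry, not per ADID line.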






                                                      share|improve this answer




























                                                        0














                                                        I wasn't able to do anything with the -A on a grep, and the other answers returned label too long or some other error.
                                                        What I did find that worked was



                                                        perl -000 -ne 'print if/empType: A/' file.ldif|grep -i -c "^ADID: [0-9A-Za-z]"


                                                        Now I didn't know what perl -000 does, but i think it's saying search multiple lines within a paragraph,
                                                        -n while loop
                                                        e one line of program??
                                                        print paragraph if you find empType: A
                                                        now pipe those matched paragraphs to |
                                                        grep -i -c "^ADID:" find ignore cased and count number of ADIDs.
                                                        I'm not sure if the other commands failed because of my Linux version, but the above command worked pretty well, not sure how to make the empType an ignored case though....






                                                        share|improve this answer


























                                                          0












                                                          0








                                                          0







                                                          I wasn't able to do anything with the -A on a grep, and the other answers returned label too long or some other error.
                                                          What I did find that worked was



                                                          perl -000 -ne 'print if/empType: A/' file.ldif|grep -i -c "^ADID: [0-9A-Za-z]"


                                                          Now I didn't know what perl -000 does, but i think it's saying search multiple lines within a paragraph,
                                                          -n while loop
                                                          e one line of program??
                                                          print paragraph if you find empType: A
                                                          now pipe those matched paragraphs to |
                                                          grep -i -c "^ADID:" find ignore cased and count number of ADIDs.
                                                          I'm not sure if the other commands failed because of my Linux version, but the above command worked pretty well, not sure how to make the empType an ignored case though....
















                                                          answered Dec 14 '17 at 16:13









                                                          King of NES





























