Getting a match count of objects in a file


I have a large file that has entries that look like this:

entry-id: 1
sn: John
cn: Smith
empType: A
ADID: 123456

entry-id: 2
sn: James
cn: Smith
empType: B
ADID: 123456

entry-id: 3
sn: Jobu
cn: Smith
empType: A
ADID: 123456

entry-id: 4
sn: Jobu
cn: Smith
empType: A

Each entry is separated by a blank line. I need a count of the entries that have an empType of A and that also have a value after ADID (a total of 2 in this sample). I've tried awk, grep and egrep, and I'm still having no luck. Any ideas?


  • What exactly did you try in awk? I would think something like awk -vRS= '/empType: A/ && /ADID: [0-9]+/ {n++} END {print n}' file should work

    – steeldriver
    Dec 14 '17 at 3:39

  • running your command, I got "awk: record `smapsHistory: [NDSEn...' too long record number 213244" there are only like 100 records with an employeeType of C, and it's going crazy....

    – King of NES
    Dec 14 '17 at 3:54

  • You did include the correct filename to read as input?

    – bu5hman
    Dec 14 '17 at 4:15

  • it was the correct file...

    – King of NES
    Dec 14 '17 at 4:18
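
The "record too long" error above may mean that this awk build limits the size of a single record. As a minimal sketch of one way around that, assuming Python 3 is available and that records are separated by blank lines as in the sample, the file can be read one line at a time. The names iter_records and count_matches are illustrative and do not come from the thread.

```python
import io
from itertools import groupby


def iter_records(lines):
    """Yield each blank-line-separated record as a list of stripped lines."""
    for is_blank, group in groupby(lines, key=lambda line: not line.strip()):
        if not is_blank:
            yield [line.strip() for line in group]


def count_matches(lines):
    """Count records with 'empType: A' and a non-empty ADID value."""
    count = 0
    for record in iter_records(lines):
        # Split each "key: value" line once, at the first colon.
        fields = dict(line.split(":", 1) for line in record if ":" in line)
        has_type_a = fields.get("empType", "").strip() == "A"
        has_adid = bool(fields.get("ADID", "").strip())
        if has_type_a and has_adid:
            count += 1
    return count


# The sample data from the question.
SAMPLE = """\
entry-id: 1
sn: John
cn: Smith
empType: A
ADID: 123456

entry-id: 2
sn: James
cn: Smith
empType: B
ADID: 123456

entry-id: 3
sn: Jobu
cn: Smith
empType: A
ADID: 123456

entry-id: 4
sn: Jobu
cn: Smith
empType: A
"""

if __name__ == "__main__":
    print(count_matches(io.StringIO(SAMPLE)))
```

For a real file, an open file object can be passed in place of the StringIO, for example count_matches(open(path)), so that only one record is held in memory at a time.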


linux text-processing command-line


edited 11 mins ago




asked Dec 14 '17 at 3:29

King of NES




6 Answers





Awk solution:

awk '/empType: /{ f=($2=="A"? 1:0) }f && /ADID: [0-9]+/{ c++ }END{ print c }' file

  • f - flag indicating empType: A section processing

  • c - count of empType: A entries with filled ADID key

The output:

2

answered Dec 14 '17 at 6:00, edited Dec 14 '17 at 6:12


    Here is an alternative awk solution that uses a blank line ("") as the record separator RS and a newline (\n) as the field separator FS:

    # assumes empType is the 4th line and ADID the 5th line of each record
    BEGIN {RS=""; FS="\n"}
    {
        split($4,a,": ")
        split($5,b,": ")
    }
    a[2]=="A" && b[2]!="" {c++}
    END {print c}

    The script can be executed with

    awk -f main.awk file

    answered Dec 14 '17 at 6:39


      Simple two grep method, where data is the input file:

      grep -A1 'empType: A' data | grep -c 'ADID: .\+'
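
grep -A1 prints each matching line plus the one line after it, so this pipeline only counts an ADID line that sits directly below empType: A. A rough Python sketch of the same logic makes that assumption explicit. The helper name count_adjacent is made up for illustration, and the sketch approximates the pipeline rather than reproducing grep exactly.

```python
import re


def count_adjacent(lines):
    """Count 'ADID: <value>' lines that directly follow an 'empType: A' line."""
    count = 0
    previous = ""
    for line in lines:
        line = line.rstrip("\n")
        # Roughly mirrors: grep -A1 'empType: A' | grep -c 'ADID: .\+'
        if "empType: A" in previous and re.search(r"ADID: .+", line):
            count += 1
        previous = line
    return count
```

If another attribute can appear between empType and ADID in the real file, this approach (like the grep pipeline) would miss that record.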



      answered Dec 14 '17 at 7:09, edited Dec 14 '17 at 7:15


        I like the idea of getting the records that satisfy your requirements (better for e.g. testing) and counting them with wc -l. So here is an awk script that does just that:

        #!/usr/bin/env awk
        # getids.awk

        # Read blank-line-separated records; each line is one field
        BEGIN { RS = ""; FS = "\n" }

        /ADID: [0-9]/ && /empType: A/{print $1}

        And here it is in action:

        user@host:~$ awk -f getids.awk data.txt
        entry-id: 1
        entry-id: 3

        user@host:~$ awk -f getids.awk data.txt | wc -l
        2

        Of course if you just want the count we can do that too:

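        #!/usr/bin/env awk
        # count.awk

        BEGIN {
            # Read blank-line-separated records
            RS = ""
        }

        /ADID: [0-9]/ && /empType: A/{count++}

        END {
            # Adding 0 prints 0 rather than an empty line when nothing matched
            print count + 0
        }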

        And because I love Python, here is a Python script that does the same thing:

        #!/usr/bin/env python2
        # -*- coding: ascii -*-

        import sys

        # Create a list to store the parsed records
        records = []

        # Iterate over the lines of the input file
        with open(sys.argv[1]) as data:
            for line in data:

                # When an "entry-id" is reached, create a new record
                if line.startswith('entry-id'):
                    entry_id = line.split(':')[1].strip()
                    records.append({'entry-id': entry_id})

                # For other non-blank lines, update the current record
                elif line.strip():
                    key = line.partition(':')[0].strip()
                    value = line.partition(':')[2].strip()
                    records[-1][key] = value

        # Extract the list of records meeting the desired criteria
        # (.get() avoids a KeyError for records that have no ADID line)
        matches = [record for record in records
                   if record.get('empType') == 'A' and record.get('ADID')]

        # Print out the entry-ids for all of the matches
        for match in matches:
            print('entry-id: ' + match['entry-id'])

        And here's the Python script in action (assuming it is saved as getids.py):

        user@host:~$ python getids.py data.txt
        entry-id: 1
        entry-id: 3

        user@host:~$ python getids.py data.txt | wc -l
        2

        And if we really do just want the counts:

        #!/usr/bin/env python2
        # -*- coding: ascii -*-

        import sys

        # Keep a count of the number of matches
        count = 0

        # Use flags to keep track of the current record
        emptype_flag = False
        adid_flag = False

        # Iterate over the lines of the input file
        with open(sys.argv[1]) as data:
            for line in data:

                # When an "entry-id" is reached, reset the flags
                if line.startswith('entry-id'):
                    emptype_flag = False
                    adid_flag = False
                elif line.strip() == "empType: A":
                    emptype_flag = True
                elif line.startswith("ADID") and line.strip().split(':')[1]:
                    adid_flag = True

                # If both conditions hold then increment the counter
                # and reset the flags
                if emptype_flag and adid_flag:
                    count = count + 1
                    emptype_flag = False
                    adid_flag = False

        # Print the number of matches
        print(count)

        And, while we're at it, how about a pure Bash script? Here's one:

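        #!/usr/bin/env bash

        # getids.bash

        while IFS= read -r line; do
            if [[ "${line}" =~ "entry-id:" ]]; then
                # New record: remember its id and reset the flags
                entry_id="${line}"
                emptype=false
                adid=false
            elif [[ "${line}" =~ "empType: A" ]]; then
                emptype=true
            elif [[ "${line}" =~ ADID:\ [0-9] ]]; then
                adid=true
            fi
            if [[ "${emptype}" == true && "${adid}" == true ]]; then
                echo "${entry_id}"
                # Reset so the record is reported only once
                emptype=false
                adid=false
            fi
        done < "$1"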

        And running the bash script:

        user@host:~$ bash getids.bash data.txt
        entry-id: 1
        entry-id: 3

        And finally, here's something using just grep and wc:

        user@host:~$ cat data.txt | grep -A1 'empType: A' | grep "ADID: \S" | wc -l
        2




          With perl, that could be:

          perl -l -00ne '
          my %f = /(.*?):\s*(.*)/g;
          ++$n if $f{empType} eq "A" && $f{ADID} ne "";
          END {print 0+$n}' < file

          • -n causes the code given to -e to be applied to each input record

          • -00 for records to be paragraphs.

          • We build a %f associative array where key and values are mapped to each (key):spaces(value) in the record.

          • and increment $n where the conditions are met.

          • we print $n in the END (adding 0 to make sure we get 0 and not an empty string if there's no match).
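
For readers less familiar with Perl, roughly the same idea can be sketched in Python: split the input into paragraphs, turn each one into a dict with a regex, then test the two keys. This is an illustrative equivalent rather than a translation from the thread, and count_paragraph_matches is a made-up name. It matches only spaces and tabs after the colon, on the assumption that an empty value should stay empty instead of absorbing the following line.

```python
import re


def count_paragraph_matches(text):
    """Count paragraphs whose empType is 'A' and whose ADID is non-empty."""
    count = 0
    # Paragraphs are separated by one or more blank lines, as with perl -00.
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        # Each "key: value" line becomes one dict entry.
        pairs = re.findall(r"^(.*?):[ \t]*(.*)$", paragraph, flags=re.MULTILINE)
        fields = dict(pairs)
        if fields.get("empType") == "A" and fields.get("ADID", "") != "":
            count += 1
    return count
```

Like the Perl version, it returns 0 rather than nothing when no paragraph matches.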



            I wasn't able to get anything working with grep -A, and the other answers returned "label too long" or some other error.
            What I did find that worked was

            perl -000 -ne 'print if/empType: A/' file.ldif|grep -i -c "^ADID: [0-9A-Za-z]"

            I didn't know what perl -000 does, but I think it makes perl read the input a paragraph at a time, so a match can span the lines of one entry. As far as I can tell, the rest breaks down like this:
            -n wraps the code in a while loop over the input
            -e supplies a one-line program
            print if /empType: A/ prints the paragraph if it contains empType: A
            | pipes those matched paragraphs to grep
            grep -i -c "^ADID: [0-9A-Za-z]" ignores case and counts the ADID lines that have a value.
            I'm not sure whether the other commands failed because of my Linux version, but the above command worked pretty well. I'm not sure how to make the empType match ignore case, though.
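
On that last point: in Perl, adding the i modifier to the match makes it case-insensitive, so the first half of the pipeline would become

perl -000 -ne 'print if /empType: A/i' file.ldif

The same check can be sketched in Python with re.IGNORECASE on both conditions. This is a minimal sketch, assuming blank-line-separated records; count_ci and the two pattern names are hypothetical. Note that it anchors empType to exactly A, which is stricter than the substring match in the one-liner above.

```python
import re

# Both patterns ignore case and are anchored to the start of a line.
EMPTYPE_A = re.compile(r"^empType:[ \t]*A[ \t]*$", re.IGNORECASE | re.MULTILINE)
ADID_SET = re.compile(r"^ADID:[ \t]*\S", re.IGNORECASE | re.MULTILINE)


def count_ci(text):
    """Count records with empType A and a non-empty ADID, ignoring case."""
    records = re.split(r"\n\s*\n", text.strip())
    return sum(1 for r in records if EMPTYPE_A.search(r) and ADID_SET.search(r))
```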



              6 Answers




              6 Answers











              Awk solution:

              awk '/empType: /{ f=($2=="A"? 1:0) }f && /ADID: [0-9]+/{ c++ }END{ print c }' file

              • f - flag indicating empType: A section processing

              • c - count of empType: A entries with filled ADID key

              The output:


              share|improve this answer


                Awk solution:

                awk '/empType: /{ f=($2=="A"? 1:0) }f && /ADID: [0-9]+/{ c++ }END{ print c }' file

                • f - flag indicating empType: A section processing

                • c - count of empType: A entries with filled ADID key

                The output:


                share|improve this answer




                  Awk solution:

                  awk '/empType: /{ f=($2=="A"? 1:0) }f && /ADID: [0-9]+/{ c++ }END{ print c }' file

                  • f - flag indicating empType: A section processing

                  • c - count of empType: A entries with filled ADID key

                  The output:


                  share|improve this answer

                  Awk solution:

                  awk '/empType: /{ f=($2=="A"? 1:0) }f && /ADID: [0-9]+/{ c++ }END{ print c }' file

                  • f - flag indicating empType: A section processing

                  • c - count of empType: A entries with filled ADID key

                  The output:


                  share|improve this answer

                  share|improve this answer

                  share|improve this answer

                  edited Dec 14 '17 at 6:12

                  answered Dec 14 '17 at 6:00





                      Here is an alternative awk solution that uses blank line "" as record separator RS and new line n as field separator FS

                      BEGIN {RS=""; FS="n"}
                      split($4,a,": ")
                      split($5,b,": ")
                      a[2]=="A" && b[2]!="" {c++}
                      END {print c}

                      the script can be executed with

                      awk -f main.awk file

                      share|improve this answer


                        Here is an alternative awk solution that uses blank line "" as record separator RS and new line n as field separator FS

                        BEGIN {RS=""; FS="n"}
                        split($4,a,": ")
                        split($5,b,": ")
                        a[2]=="A" && b[2]!="" {c++}
                        END {print c}

                        the script can be executed with

                        awk -f main.awk file

                        share|improve this answer




                          Here is an alternative awk solution that uses blank line "" as record separator RS and new line n as field separator FS

                          BEGIN {RS=""; FS="n"}
                          split($4,a,": ")
                          split($5,b,": ")
                          a[2]=="A" && b[2]!="" {c++}
                          END {print c}

                          the script can be executed with

                          awk -f main.awk file

                          share|improve this answer

                          Here is an alternative awk solution that uses blank line "" as record separator RS and new line n as field separator FS

                          BEGIN {RS=""; FS="n"}
                          split($4,a,": ")
                          split($5,b,": ")
                          a[2]=="A" && b[2]!="" {c++}
                          END {print c}

                          the script can be executed with

                          awk -f main.awk file

                          share|improve this answer

                          share|improve this answer

                          share|improve this answer

                          answered Dec 14 '17 at 6:39





                              Simple two grep method, where data is the input file:

                              grep -A1 'empType: A' data | grep -c 'ADID: .+'



                              share|improve this answer


                                Simple two grep method, where data is the input file:

                                grep -A1 'empType: A' data | grep -c 'ADID: .+'



                                share|improve this answer




                                  Simple two grep method, where data is the input file:

                                  grep -A1 'empType: A' data | grep -c 'ADID: .+'



                                  share|improve this answer

                                  Simple two grep method, where data is the input file:

                                  grep -A1 'empType: A' data | grep -c 'ADID: .+'



                                  share|improve this answer

                                  share|improve this answer

                                  share|improve this answer

                                  edited Dec 14 '17 at 7:15

                                  answered Dec 14 '17 at 7:09





                                      I like the idea of get the records that satisfy your requirements (better for e.g. testing) and counting them with wc -l. So here is an awk script that does just that:

                                      #!/usr/bin/env awk
                                      # getids.awk


                                      /ADID: [0-9]/ && /empType: A/{print $1}

                                      And here it is in action:

                                      user@host:~$ awk -f getids.awk data.txt
                                      entry-id: 1
                                      entry-id: 3

                                      user@host:~$ awk -f getids.awk data.txt | wc -l

                                      Of course if you just want the count we can do that too:

                                      #!/usr/bin/env awk
                                      # count.awk

                                      BEGIN {

                                      /ADID: [0-9]/ && /empType: A/{count++}

                                      END {
                                      print count

                                      And because I love Python, here is a Python script that does the same thing:

                                      #!/usr/bin/env python2
                                      # -*- coding: ascii -*-

                                      import sys

                                      # Create a list to store the matched records
                                      records =

                                      # Iterate over the lines of the input file
                                      with open(sys.argv[1]) as data:
                                      for line in data:

                                      # When an "entry-id" is reached, create a new record
                                      if line.startswith('entry-id'):
                                      entry_id = line.split(':')[1].strip()
                                      records.append({'entry-id': entry_id})

                                      # For other lines, update the current record
                                      elif line.strip():
                                      key = line.partition(':')[0].strip()
                                      value = line.partition(':')[2].strip()
                                      records[-1][key] = value

                                      # Extract the list of records meeting the desired critera
                                      matches = [record for record in records if record['empType'] == 'A' and record['ADID']]

                                      # Print out the entry-ids for all of the matches
                                      for match in matches:
                                      print('entry-id: ' + match['entry-id'])

                                      And here's the Python script in action:

                                      user@host:~$ python data.txt
                                      entry-id: 1
                                      entry-id: 3

                                      user@host:~$ python data.txt | wc -l

                                      And if we really do just want the counts:

                                      #!/usr/bin/env python2
                                      # -*- coding: ascii -*-

                                      import sys

                                      # Keep a count of the number of matches
                                      count = 0

                                      # Use flags to keep track of the current record
                                      emptype_flag = False
                                      adid_flag = False

                                      # Iterate over the lines of the input file
                                      with open(sys.argv[1]) as data:
                                      for line in data:

                                      # When an "entry-id" is reached, reset the flags
                                      if line.startswith('entry-id'):
                                      emptype_flag = False
                                      adid_flag = False
                                      elif line.strip() == "empType: A":
                                      emptype_flag = True
                                      elif line.startswith("ADID") and line.strip().split(':')[1]:
                                      adid_flag = True

                                      # If both conditions hold the increment the counter
                                      # and reset the flags
                                      if emptype_flag and adid_flag:
                                      count = count + 1
                                      emptype_flag = False
                                      adid_flag = False

                                      # Print the number of matches

                                      And, while we're at it, how about a pure Bash script? Here's one:

                                      #!/usr/bin/env bash

                                      # getids.bash

                                      while read line; do
                                      if [[ "${line}" =~ "entry-id:" ]]; then
                                      elif [[ "${line}" =~ "empType: A" ]]; then
                                      elif [[ "${line}" =~ ADID: [0-9] ]]; then
                                      if [[ "${emptype}" == true && "${adid}" == true ]]; then
                                      echo "${entry_id}"
                                      done < "$1"

                                      And running the bash script:

                                      user@host:~$ bash getids.bash data.txt
                                      entry-id: 1
                                      entry-id: 3

                                      And finally, here's something using just grep and wc:

                                      user@host:~$ cat data.txt | grep -A1 'empType: A' | grep "ADID: S" | wc -l


                                      share|improve this answer


                                        I like the idea of get the records that satisfy your requirements (better for e.g. testing) and counting them with wc -l. So here is an awk script that does just that:

                                        #!/usr/bin/env awk
                                        # getids.awk


                                        /ADID: [0-9]/ && /empType: A/{print $1}

                                        And here it is in action:

                                        user@host:~$ awk -f getids.awk data.txt
                                        entry-id: 1
                                        entry-id: 3

                                        user@host:~$ awk -f getids.awk data.txt | wc -l

                                        Of course if you just want the count we can do that too:

                                        #!/usr/bin/env awk
                                        # count.awk

                                        BEGIN {

                                        /ADID: [0-9]/ && /empType: A/{count++}

                                        END {
                                        print count

                                        And because I love Python, here is a Python script that does the same thing:

                                        #!/usr/bin/env python2
                                        # -*- coding: ascii -*-

                                        import sys

                                        # Create a list to store the matched records
                                        records =

                                        # Iterate over the lines of the input file
                                        with open(sys.argv[1]) as data:
                                        for line in data:

                                        # When an "entry-id" is reached, create a new record
                                        if line.startswith('entry-id'):
                                        entry_id = line.split(':')[1].strip()
                                        records.append({'entry-id': entry_id})

                                        # For other lines, update the current record
                                        elif line.strip():
                                        key = line.partition(':')[0].strip()
                                        value = line.partition(':')[2].strip()
                                        records[-1][key] = value

                                        # Extract the list of records meeting the desired critera
                                        matches = [record for record in records if record['empType'] == 'A' and record['ADID']]

                                        # Print out the entry-ids for all of the matches
                                        for match in matches:
                                        print('entry-id: ' + match['entry-id'])

                                        And here's the Python script in action:

                                        user@host:~$ python data.txt
                                        entry-id: 1
                                        entry-id: 3

                                        user@host:~$ python data.txt | wc -l

                                        And if we really do just want the counts:

                                        #!/usr/bin/env python2
                                        # -*- coding: ascii -*-

                                        import sys

                                        # Keep a count of the number of matches
                                        count = 0

                                        # Use flags to keep track of the current record
                                        emptype_flag = False
                                        adid_flag = False

                                        # Iterate over the lines of the input file
                                        with open(sys.argv[1]) as data:
                                        for line in data:

                                        # When an "entry-id" is reached, reset the flags
                                        if line.startswith('entry-id'):
                                        emptype_flag = False
                                        adid_flag = False
                                        elif line.strip() == "empType: A":
                                        emptype_flag = True
                                        elif line.startswith("ADID") and line.strip().split(':')[1]:
                                        adid_flag = True

                                        # If both conditions hold the increment the counter
                                        # and reset the flags
                                        if emptype_flag and adid_flag:
                                        count = count + 1
                                        emptype_flag = False
                                        adid_flag = False

                                        # Print the number of matches

                                        And, while we're at it, how about a pure Bash script? Here's one:

                                        #!/usr/bin/env bash

                                        # getids.bash

                                        while read line; do
                                        if [[ "${line}" =~ "entry-id:" ]]; then
                                        elif [[ "${line}" =~ "empType: A" ]]; then
                                        elif [[ "${line}" =~ ADID: [0-9] ]]; then
                                        if [[ "${emptype}" == true && "${adid}" == true ]]; then
                                        echo "${entry_id}"
                                        done < "$1"

                                        And running the bash script:

                                        user@host:~$ bash getids.bash data.txt
                                        entry-id: 1
                                        entry-id: 3

                                        And finally, here's something using just grep and wc:

                                        user@host:~$ cat data.txt | grep -A1 'empType: A' | grep "ADID: S" | wc -l


                                        share|improve this answer




                                          I like the idea of get the records that satisfy your requirements (better for e.g. testing) and counting them with wc -l. So here is an awk script that does just that:

                                          #!/usr/bin/env awk
                                          # getids.awk


                                          /ADID: [0-9]/ && /empType: A/{print $1}

                                          And here it is in action:

                                          user@host:~$ awk -f getids.awk data.txt
                                          entry-id: 1
                                          entry-id: 3

                                          user@host:~$ awk -f getids.awk data.txt | wc -l

                                          Of course if you just want the count we can do that too:

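    #!/usr/bin/env awk
    # count.awk

    BEGIN {
        RS = ""    # paragraph mode: blank-line-separated records
    }

    /ADID: [0-9]/ && /empType: A/ { count++ }

    END {
        print count + 0    # "+ 0" prints 0 rather than an empty line when nothing matched
    }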

                                          And because I love Python, here is a Python script that does the same thing:

    #!/usr/bin/env python2
    # -*- coding: ascii -*-
    # getids.py

    import sys

    # Create a list to store the parsed records
    records = []

    # Iterate over the lines of the input file
    with open(sys.argv[1]) as data:
        for line in data:

            # When an "entry-id" is reached, create a new record
            if line.startswith('entry-id'):
                entry_id = line.split(':')[1].strip()
                records.append({'entry-id': entry_id})

            # For other non-blank lines, update the current record
            elif line.strip() and records:
                key, _, value = line.partition(':')
                records[-1][key.strip()] = value.strip()

    # Extract the list of records meeting the desired criteria;
    # .get() avoids a KeyError for records that lack a field
    matches = [record for record in records
               if record.get('empType') == 'A' and record.get('ADID')]

    # Print out the entry-ids for all of the matches
    for match in matches:
        print('entry-id: ' + match['entry-id'])

                                          And here's the Python script in action:

    user@host:~$ python getids.py data.txt
    entry-id: 1
    entry-id: 3

    user@host:~$ python getids.py data.txt | wc -l
    2
                                          And if we really do just want the counts:

    #!/usr/bin/env python2
    # -*- coding: ascii -*-

    import sys

    # Keep a count of the number of matches
    count = 0

    # Use flags to keep track of the current record
    emptype_flag = False
    adid_flag = False

    # Iterate over the lines of the input file
    with open(sys.argv[1]) as data:
        for line in data:

            # When an "entry-id" is reached, reset the flags
            if line.startswith('entry-id'):
                emptype_flag = False
                adid_flag = False
            elif line.strip() == "empType: A":
                emptype_flag = True
            elif line.startswith("ADID") and line.partition(':')[2].strip():
                adid_flag = True

            # If both conditions hold then increment the counter
            # and reset the flags
            if emptype_flag and adid_flag:
                count = count + 1
                emptype_flag = False
                adid_flag = False

    # Print the number of matches
    print(count)

                                          And, while we're at it, how about a pure Bash script? Here's one:

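    #!/usr/bin/env bash

    # getids.bash

    while read -r line; do
        if [[ "${line}" =~ "entry-id:" ]]; then
            # New record: remember its id and reset the flags
            entry_id="${line}"; emptype=false; adid=false
        elif [[ "${line}" =~ "empType: A" ]]; then
            emptype=true
        elif [[ "${line}" =~ ADID:\ [0-9] ]]; then
            adid=true
        fi

        # Print the id once both conditions have been seen
        if [[ "${emptype}" == true && "${adid}" == true ]]; then
            echo "${entry_id}"
            emptype=false; adid=false
        fi
    done < "$1"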

                                          And running the bash script:

                                          user@host:~$ bash getids.bash data.txt
                                          entry-id: 1
                                          entry-id: 3

And finally, here's something using just grep and wc (it assumes the ADID line immediately follows the empType line, as it does in the sample data):

    user@host:~$ grep -A1 'empType: A' data.txt | grep 'ADID: \S' | wc -l
    2



                                          edited Dec 14 '17 at 13:36

                                          answered Dec 14 '17 at 5:39





                                              With perl, that could be:

                                              perl -l -00ne '
                                              my %f = /(.*?):\s*(.*)/g;
                                              ++$n if $f{empType} eq "A" && $f{ADID} ne "";
                                              END {print 0+$n}' < file

                                              • -n causes the code given to -e to be applied to each input record

                                              • -00 for records to be paragraphs.

                                              • We build a %f associative array where key and values are mapped to each (key):spaces(value) in the record.

                                              • and increment $n where the conditions are met.

                                              • we print $n in the END (adding 0 to make sure we get 0 and not an empty string if there's no match).
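For readers who do not use perl, the same idea (read the input a paragraph at a time, turn each paragraph into a key/value mapping, then test the two fields) can be sketched in Python. This is only an illustration under stated assumptions: the names `parse_records`, `count_matches` and `SAMPLE` are made up for this sketch, and the sample text is the four entries from the question.

```python
import re

# Sample data copied from the question (an assumption for this sketch).
SAMPLE = """\
entry-id: 1
sn: John
cn: Smith
empType: A
ADID: 123456

entry-id: 2
sn: James
cn: Smith
empType: B
ADID: 123456

entry-id: 3
sn: Jobu
cn: Smith
empType: A
ADID: 123456

entry-id: 4
sn: Jobu
cn: Smith
empType: A
"""


def parse_records(text):
    """Split text into blank-line-separated paragraphs, one dict per paragraph."""
    records = []
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        fields = {}
        for line in paragraph.splitlines():
            key, sep, value = line.partition(":")
            if sep:  # skip lines that have no "key: value" shape
                fields[key.strip()] = value.strip()
        records.append(fields)
    return records


def count_matches(text):
    """Count records whose empType is A and whose ADID is present and non-empty."""
    return sum(
        1
        for record in parse_records(text)
        if record.get("empType") == "A" and record.get("ADID", "") != ""
    )


if __name__ == "__main__":
    print(count_matches(SAMPLE))  # prints 2 for the sample above
```

Unlike the grep approach, this does not depend on the order of the lines inside an entry, because each entry is parsed as a whole before it is tested.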


                                                  edited Dec 14 '17 at 14:57

                                                  answered Dec 14 '17 at 14:14

                                                  Stéphane Chazelas




I wasn't able to get anywhere with grep -A, and the other answers returned "label too long" or some other error.
What I did find that worked was:

    perl -000 -ne 'print if /empType: A/' file.ldif | grep -i -c "^ADID: [0-9A-Za-z]"

I didn't know what perl -000 does, but I think it makes perl read the file a paragraph at a time, so that a match is tested against a whole entry. Here is how I read the rest:
-n wraps the program in a while loop over the input
-e gives a one-line program on the command line
print if /empType: A/ prints the paragraph if it contains empType: A
| pipes those matched paragraphs to grep
grep -i -c "^ADID: [0-9A-Za-z]" ignores case and counts the ADID lines that have a value.
I'm not sure whether the other commands failed because of my Linux version, but the above command worked pretty well. I'm not sure how to make the empType match case-insensitive, though.
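On the case-insensitivity question: in perl, adding the i modifier to the match (/empType: A/i) makes it ignore case. The same per-paragraph test can also be sketched in Python with both field names matched case-insensitively. This is a hedged illustration, not the command used above: `count_entries`, `EMPTYPE_A` and `HAS_ADID` are names invented for the sketch, and it assumes entries are separated by blank lines.

```python
import re

# Both patterns ignore case and are anchored to whole lines within a paragraph.
EMPTYPE_A = re.compile(r"^empType:[ \t]*A[ \t]*$", re.IGNORECASE | re.MULTILINE)
HAS_ADID = re.compile(r"^ADID:[ \t]*\S", re.IGNORECASE | re.MULTILINE)


def count_entries(text):
    """Count blank-line-separated entries with empType A and a non-empty ADID."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return sum(1 for p in paragraphs if EMPTYPE_A.search(p) and HAS_ADID.search(p))


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python count_entries.py file.ldif
    with open(sys.argv[1]) as handle:
        print(count_entries(handle.read()))
```

Anchoring the empType pattern to the end of the line means a value such as "AB" is not counted as "A", which a plain substring match would do.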


                                                          answered Dec 14 '17 at 16:13

                                                          King of NES


