Getting a match count of objects in a file

I have a large file that has entries that look like this:

entry-id: 1
sn: John
cn: Smith
empType: A
ADID: 123456

entry-id: 2
sn: James
cn: Smith
empType: B
ADID: 123456

entry-id: 3
sn: Jobu
cn: Smith
empType: A
ADID: 123456

entry-id: 4
sn: Jobu
cn: Smith
empType: A
ADID:

Each entry is separated from the next by a blank line. I need a count of the entries that have an empType of A and also have a value after ADID (a total of 2 in the sample above). I've tried awk, grep and egrep, but I'm still having no luck. Any ideas?

edited 11 mins ago

PRY

2,43031025

asked Dec 14 '17 at 3:29

King of NES

1163

What exactly did you try in awk? I would think something like awk -vRS= '/empType: A/ && /ADID: [0-9]+/ {n++} END {print n}' file should work

– steeldriver
Dec 14 '17 at 3:39

running your command, I got "awk: record `smapsHistory: [NDSEn...' too long record number 213244" there are only like 100 records with an employeeType of C, and it's going crazy....

– King of NES
Dec 14 '17 at 3:54

You did include the correct filename to read as input?

– bu5hman
Dec 14 '17 at 4:15

it was the correct file...

– King of NES
Dec 14 '17 at 4:18

linux text-processing command-line


6 Answers

Awk solution:

awk '/empType: /{ f=($2=="A"? 1:0) }f && /ADID: [0-9]+/{ c++ }END{ print c }' file

f - flag that is set while the current entry has empType: A

c - count of empType: A entries with a non-empty ADID value

The output:

2

edited Dec 14 '17 at 6:12

answered Dec 14 '17 at 6:00

RomanPerekhrest

23.1k12447


Here is an alternative awk solution that uses a blank line (RS="") as the record separator and a newline (\n) as the field separator FS, so each entry is one record and each line is one field:

# main.awk
BEGIN {RS=""; FS="\n"}
{
    # $4 is the empType line and $5 is the ADID line
    split($4,a,": ")
    split($5,b,": ")
}
a[2]=="A" && b[2]!="" {c++}
END {print c}

The script can be executed with

awk -f main.awk file
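For readers more comfortable with Python, the same record-and-field idea can be sketched roughly as below. This is only an illustration and not part of the answer above: the function name count_by_position and the SAMPLE string are made up here, and, like the awk script, the sketch assumes that empType is the fourth line and ADID the fifth line of every entry.

```python
import re

# Hypothetical sample data mirroring the entries in the question.
SAMPLE = """entry-id: 1
sn: John
cn: Smith
empType: A
ADID: 123456

entry-id: 2
sn: James
cn: Smith
empType: B
ADID: 123456

entry-id: 3
sn: Jobu
cn: Smith
empType: A
ADID: 123456

entry-id: 4
sn: Jobu
cn: Smith
empType: A
ADID:
"""


def count_by_position(text):
    """Count entries whose 4th line is empType A and whose 5th line has a value."""
    count = 0
    # Similar to RS="": one or more blank lines separate the records.
    for record in re.split(r"\n\s*\n", text.strip()):
        # Similar to FS="\n": every line of the record is one field.
        fields = record.split("\n")
        if len(fields) < 5:
            continue  # too short to carry both fields at the assumed positions
        emp_type = fields[3].partition(":")[2].strip()
        adid = fields[4].partition(":")[2].strip()
        if emp_type == "A" and adid:
            count += 1
    return count


if __name__ == "__main__":
    print(count_by_position(SAMPLE))
```

If the field order varies between entries, a position-based check like this will miscount, so looking fields up by name is the safer choice for real data.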

answered Dec 14 '17 at 6:39

etopylight

383127


Simple two-grep method, where data is the input file:

grep -A1 'empType: A' data | grep -c 'ADID: .\+'

Output:

2
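The "match a line, then look at the line right after it" idea behind grep -A1 can be approximated in Python as in the sketch below. It is not a translation of grep itself: count_adjacent and SAMPLE_LINES are names invented for this illustration, and the sketch shares the same assumption as the pipeline, namely that the ADID line directly follows the empType line.

```python
import re

# Hypothetical, shortened sample data (one list item per input line).
SAMPLE_LINES = [
    "entry-id: 1", "empType: A", "ADID: 123456", "",
    "entry-id: 2", "empType: B", "ADID: 123456", "",
    "entry-id: 3", "empType: A", "ADID:",
]


def count_adjacent(lines):
    """Count 'empType: A' lines that are directly followed by a filled ADID line."""
    count = 0
    # Pair every line with the line that comes right after it.
    for current, following in zip(lines, lines[1:]):
        if "empType: A" in current and re.search(r"ADID: .+", following):
            count += 1
    return count


if __name__ == "__main__":
    print(count_adjacent(SAMPLE_LINES))
```

If another attribute can appear between empType and ADID, both this sketch and the grep pipeline will undercount.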

edited Dec 14 '17 at 7:15

answered Dec 14 '17 at 7:09

agc

4,71111137


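I like the idea of getting the records that satisfy your requirements (better for e.g. testing) and counting them with wc -l. So here is an awk script that does just that:

#!/usr/bin/env awk
# getids.awk

# Paragraph mode: blank lines separate records, each line is a field
BEGIN{
  RS="";
  FS="\n"
}

# Print the first line (the entry-id) of every matching record
/ADID: [0-9]/ && /empType: A/{print $1}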

And here it is in action:

user@host:~$ awk -f getids.awk data.txt
entry-id: 1
entry-id: 3

user@host:~$ awk -f getids.awk data.txt | wc -l
2

Of course, if you just want the count, we can do that too:

#!/usr/bin/env awk
# count.awk

BEGIN {
  RS="";
  FS="\n";
  count=0;
}

/ADID: [0-9]/ && /empType: A/{count++}

END {
  print count
}

And because I love Python, here is a Python script that does the same thing:

#!/usr/bin/env python2
# -*- coding: ascii -*-
"""getids.py"""

import sys

# Create a list to store the parsed records
records = []

# Iterate over the lines of the input file
with open(sys.argv[1]) as data:
    for line in data:

        # When an "entry-id" is reached, create a new record
        if line.startswith('entry-id'):
            entry_id = line.split(':')[1].strip()
            records.append({'entry-id': entry_id})

        # For other lines, update the current record
        elif line.strip():
            key = line.partition(':')[0].strip()
            value = line.partition(':')[2].strip()
            records[-1][key] = value

    # Extract the list of records meeting the desired criteria
    # (.get avoids a KeyError for records that lack one of the fields)
    matches = [record for record in records if record.get('empType') == 'A' and record.get('ADID')]

    # Print out the entry-ids for all of the matches
    for match in matches:
        print('entry-id: ' + match['entry-id'])

And here's the Python script in action:

user@host:~$ python getids.py data.txt
entry-id: 1
entry-id: 3

user@host:~$ python getids.py data.txt | wc -l
2

And if we really do just want the count:

#!/usr/bin/env python2
# -*- coding: ascii -*-
"""count.py"""

import sys

# Keep a count of the number of matches
count = 0

# Use flags to keep track of the current record
emptype_flag = False
adid_flag = False

# Iterate over the lines of the input file
with open(sys.argv[1]) as data:
    for line in data:

        # When an "entry-id" is reached, reset the flags
        if line.startswith('entry-id'):
            emptype_flag = False
            adid_flag = False
        elif line.strip() == "empType: A":
            emptype_flag = True
        elif line.startswith("ADID") and line.strip().split(':')[1]:
            adid_flag = True

        # If both conditions hold, then increment the counter
        # and reset the flags
        if emptype_flag and adid_flag:
            count = count + 1
            emptype_flag = False
            adid_flag = False

    # Print the number of matches
    print(count)

And, while we're at it, how about a pure Bash script? Here's one:

#!/usr/bin/env bash

# getids.bash

# Read the file line by line; -r keeps backslashes intact
while IFS= read -r line; do
    if [[ "${line}" =~ "entry-id:" ]]; then
        # A new entry starts: remember its id and reset the flags
        entry_id="${line}"
        emptype=false
        adid=false
    elif [[ "${line}" =~ "empType: A" ]]; then
        emptype=true
    elif [[ "${line}" =~ ADID:\ [0-9] ]]; then
        adid=true
    fi
    if [[ "${emptype}" == true && "${adid}" == true ]]; then
        echo "${entry_id}"
        emptype=false
        adid=false
    fi
done < "$1"

And running the bash script:

user@host:~$ bash getids.bash data.txt
entry-id: 1
entry-id: 3

And finally, here's something using just grep and wc:

user@host:~$ cat data.txt | grep -A1 'empType: A' | grep "ADID: \S" | wc -l
2

edited Dec 14 '17 at 13:36

answered Dec 14 '17 at 5:39

igal

5,2211233


With perl, that could be:

perl -l -00ne '
  my %f = /(.*?):\s*(.*)/g;
  ++$n if $f{empType} eq "A" && $f{ADID} ne "";
  END {print 0+$n}' < file

-n causes the code given to -e to be applied to each input record.

-00 makes the records paragraphs.

We build a %f associative array whose keys and values come from each (key):spaces(value) pair in the record,

and increment $n where the conditions are met.

We print $n in the END block (adding 0 to make sure we get 0 and not an empty string if there's no match).
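The "one associative array per paragraph" technique described above can also be sketched in Python with a dictionary. This is a loose illustration rather than an exact port of the perl one-liner: count_with_dict and SAMPLE are hypothetical names, and the pattern deliberately uses [ \t]* instead of \s* so that an empty value cannot swallow the following line.

```python
import re

# Hypothetical, shortened sample data in the same "key: value" layout.
SAMPLE = """entry-id: 1
empType: A
ADID: 123456

entry-id: 2
empType: B
ADID: 123456

entry-id: 3
empType: A
ADID:
"""

# One "key: value" line; the value may be empty.
PAIR = re.compile(r"^(.*?):[ \t]*(.*)$", re.MULTILINE)


def count_with_dict(text):
    """Count paragraphs whose empType is A and whose ADID is non-empty."""
    count = 0
    # Blank lines separate paragraphs, as with perl -00.
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        # Map every key in the paragraph to its (stripped) value.
        fields = {key.strip(): value.strip() for key, value in PAIR.findall(paragraph)}
        if fields.get("empType") == "A" and fields.get("ADID", "") != "":
            count += 1
    return count


if __name__ == "__main__":
    print(count_with_dict(SAMPLE))
```

Because the fields are looked up by name, this variant does not depend on the order of the lines inside an entry.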

edited Dec 14 '17 at 14:57

answered Dec 14 '17 at 14:14

Stéphane Chazelas

306k57577932


I wasn't able to do anything with -A on grep, and the other answers returned "label too long" or some other error.
What I did find that worked was

perl -000 -ne 'print if/empType: A/' file.ldif|grep -i -c "^ADID: [0-9A-Za-z]"

I don't know exactly what perl -000 does, but I think it tells perl to read the input a paragraph at a time, so the match can span multiple lines within a paragraph.
-n wraps the code in a while loop over the input.
-e supplies a one-line program (I think).
print if/empType: A/ prints the paragraph if it contains empType: A.
| pipes those matched paragraphs to grep.
grep -i -c "^ADID: [0-9A-Za-z]" ignores case and counts the ADID lines that have a value.
I'm not sure if the other commands failed because of my Linux version, but the above command worked pretty well. I'm not sure how to make the empType match case-insensitive, though.
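On the case-insensitivity question: in perl the usual route is the i modifier on the match, for example /empType: A/i. The same idea is sketched below in Python, purely as an illustration. The names count_ignoring_case, EMPTYPE_A, ADID_SET and SAMPLE are invented for this example, and the patterns are a little stricter than the one-liner above (the empType value must be exactly A, and any non-blank ADID value counts).

```python
import re

# Hypothetical sample data with mixed-case attribute names and values.
SAMPLE = """entry-id: 1
emptype: a
adid: 123456

entry-id: 2
empType: B
ADID: 123456

entry-id: 3
EMPTYPE: A
ADID:
"""

# re.IGNORECASE plays the role of grep -i or perl's /i modifier.
EMPTYPE_A = re.compile(r"^empType:[ \t]*A[ \t]*$", re.IGNORECASE | re.MULTILINE)
ADID_SET = re.compile(r"^ADID:[ \t]*\S", re.IGNORECASE | re.MULTILINE)


def count_ignoring_case(text):
    """Count paragraphs with empType A and a filled ADID, ignoring letter case."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return sum(1 for p in paragraphs if EMPTYPE_A.search(p) and ADID_SET.search(p))


if __name__ == "__main__":
    print(count_ignoring_case(SAMPLE))
```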

answered Dec 14 '17 at 16:13

King of NES

1163

add a comment |

Your Answer

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "106"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f410792%2fgetting-a-match-count-of-objects-in-a-file%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

6 Answers
6

active

oldest

votes

6 Answers
6

active

oldest

votes

Awk solution:

awk '/empType: /{ f=($2=="A"? 1:0) }f && /ADID: [0-9]+/{ c++ }END{ print c }' file

f - flag indicating empType: A section processing

c - count of empType: A entries with filled ADID key

The output:

edited Dec 14 '17 at 6:12

answered Dec 14 '17 at 6:00

RomanPerekhrest

23.1k12447

add a comment |

Awk solution:

awk '/empType: /{ f=($2=="A"? 1:0) }f && /ADID: [0-9]+/{ c++ }END{ print c }' file

f - flag indicating empType: A section processing

c - count of empType: A entries with filled ADID key

The output:

edited Dec 14 '17 at 6:12

answered Dec 14 '17 at 6:00

RomanPerekhrest

23.1k12447

add a comment |

Awk solution:

awk '/empType: /{ f=($2=="A"? 1:0) }f && /ADID: [0-9]+/{ c++ }END{ print c }' file

f - flag indicating empType: A section processing

c - count of empType: A entries with filled ADID key

The output:

edited Dec 14 '17 at 6:12

answered Dec 14 '17 at 6:00

RomanPerekhrest

23.1k12447

Awk solution:

awk '/empType: /{ f=($2=="A"? 1:0) }f && /ADID: [0-9]+/{ c++ }END{ print c }' file

f - flag indicating empType: A section processing

c - count of empType: A entries with filled ADID key

The output:

edited Dec 14 '17 at 6:12

answered Dec 14 '17 at 6:00

RomanPerekhrest

23.1k12447

edited Dec 14 '17 at 6:12

answered Dec 14 '17 at 6:00

RomanPerekhrest

23.1k12447

answered Dec 14 '17 at 6:00

RomanPerekhrest

23.1k12447

answered Dec 14 '17 at 6:00

RomanPerekhrest

23.1k12447

add a comment |

Here is an alternative awk solution that uses blank line "" as record separator RS and new line n as field separator FS

BEGIN {RS=""; FS="n"}

{

    split($4,a,": ")

    split($5,b,": ")

}

a[2]=="A" && b[2]!="" {c++}

END {print c}

the script can be executed with

awk -f main.awk file

answered Dec 14 '17 at 6:39

etopylight

383127

add a comment |

Here is an alternative awk solution that uses blank line "" as record separator RS and new line n as field separator FS

BEGIN {RS=""; FS="n"}

{

    split($4,a,": ")

    split($5,b,": ")

}

a[2]=="A" && b[2]!="" {c++}

END {print c}

the script can be executed with

awk -f main.awk file

answered Dec 14 '17 at 6:39

etopylight

383127

add a comment |

Here is an alternative awk solution that uses blank line "" as record separator RS and new line n as field separator FS

BEGIN {RS=""; FS="n"}

{

    split($4,a,": ")

    split($5,b,": ")

}

a[2]=="A" && b[2]!="" {c++}

END {print c}

the script can be executed with

awk -f main.awk file

answered Dec 14 '17 at 6:39

etopylight

383127

Here is an alternative awk solution that uses blank line "" as record separator RS and new line n as field separator FS

BEGIN {RS=""; FS="n"}

{

    split($4,a,": ")

    split($5,b,": ")

}

a[2]=="A" && b[2]!="" {c++}

END {print c}

the script can be executed with

awk -f main.awk file

answered Dec 14 '17 at 6:39

etopylight

383127

answered Dec 14 '17 at 6:39

etopylight

383127

answered Dec 14 '17 at 6:39

etopylight

383127

answered Dec 14 '17 at 6:39

etopylight

383127

add a comment |

Simple two grep method, where data is the input file:

grep -A1 'empType: A' data | grep -c 'ADID: .+'

Output:

edited Dec 14 '17 at 7:15

answered Dec 14 '17 at 7:09

agc

4,71111137

add a comment |

Simple two grep method, where data is the input file:

grep -A1 'empType: A' data | grep -c 'ADID: .+'

Output:

edited Dec 14 '17 at 7:15

answered Dec 14 '17 at 7:09

agc

4,71111137

add a comment |

Simple two grep method, where data is the input file:

grep -A1 'empType: A' data | grep -c 'ADID: .+'

Output:

edited Dec 14 '17 at 7:15

answered Dec 14 '17 at 7:09

agc

4,71111137

Simple two grep method, where data is the input file:

grep -A1 'empType: A' data | grep -c 'ADID: .+'

Output:

edited Dec 14 '17 at 7:15

answered Dec 14 '17 at 7:09

agc

4,71111137

edited Dec 14 '17 at 7:15

answered Dec 14 '17 at 7:09

agc

4,71111137

answered Dec 14 '17 at 7:09

agc

4,71111137

answered Dec 14 '17 at 7:09

agc

4,71111137

add a comment |

I like the idea of get the records that satisfy your requirements (better for e.g. testing) and counting them with wc -l. So here is an awk script that does just that:

#!/usr/bin/env awk

# getids.awk



BEGIN{

  RS="";

  FS="n"

}



/ADID: [0-9]/ && /empType: A/{print $1}

And here it is in action:

user@host:~$ awk -f getids.awk data.txt

entry-id: 1

entry-id: 3



user@host:~$ awk -f getids.awk data.txt | wc -l

2

Of course if you just want the count we can do that too:

#!/usr/bin/env awk

# count.awk



BEGIN {

  RS="";

  FS="n";

  count=0;

}



/ADID: [0-9]/ && /empType: A/{count++}



END {

  print count

}

And because I love Python, here is a Python script that does the same thing:

#!/usr/bin/env python2

# -*- coding: ascii -*-

"""getids.py"""



import sys



# Create a list to store the matched records

records =  



# Iterate over the lines of the input file

with open(sys.argv[1]) as data:

    for line in data:



        # When an "entry-id" is reached, create a new record

        if line.startswith('entry-id'):

            entry_id = line.split(':')[1].strip()

            records.append({'entry-id': entry_id})



        # For other lines, update the current record

        elif line.strip():

            key = line.partition(':')[0].strip()

            value = line.partition(':')[2].strip()

            records[-1][key] = value



    # Extract the list of records meeting the desired critera

    matches = [record for record in records if record['empType'] == 'A' and record['ADID']]



    # Print out the entry-ids for all of the matches

    for match in matches:

        print('entry-id: ' + match['entry-id'])

And here's the Python script in action:

user@host:~$ python getids.py data.txt

entry-id: 1

entry-id: 3



user@host:~$ python getids.py data.txt | wc -l

2

And if we really do just want the counts:

#!/usr/bin/env python2

# -*- coding: ascii -*-

"""count.py"""



import sys



# Keep a count of the number of matches 

count = 0



# Use flags to keep track of the current record

emptype_flag = False

adid_flag = False



# Iterate over the lines of the input file

with open(sys.argv[1]) as data:

    for line in data:



        # When an "entry-id" is reached, reset the flags 

        if line.startswith('entry-id'):

            emptype_flag = False

            adid_flag = False

        elif line.strip() == "empType: A":

            emptype_flag = True

        elif line.startswith("ADID") and line.strip().split(':')[1]:

            adid_flag = True



        # If both conditions hold the increment the counter

        # and reset the flags

        if emptype_flag and adid_flag:

            count = count + 1

            emptype_flag = False

            adid_flag = False



    # Print the number of matches

    print(count)

And, while we're at it, how about a pure Bash script? Here's one:

#!/usr/bin/env bash



# getids.bash



while read line; do

if [[ "${line}" =~ "entry-id:" ]]; then

    entry_id="${line}"

    emptype=false

    adid=false

elif [[ "${line}" =~ "empType: A" ]]; then

    emptype=true

elif [[ "${line}" =~ ADID: [0-9] ]]; then

    adid=true

fi

if [[ "${emptype}" == true && "${adid}" == true ]]; then

    echo "${entry_id}"

    emptype=false

    adid=false

fi

done < "$1"

And running the bash script:

user@host:~$ bash getids.bash data.txt

entry-id: 1

entry-id: 3

And finally, here's something using just grep and wc:

user@host:~$ cat data.txt | grep -A1 'empType: A' | grep "ADID: S" | wc -l



2

edited Dec 14 '17 at 13:36

answered Dec 14 '17 at 5:39

igal

5,2211233

add a comment |

I like the idea of get the records that satisfy your requirements (better for e.g. testing) and counting them with wc -l. So here is an awk script that does just that:

#!/usr/bin/env awk

# getids.awk



BEGIN{

  RS="";

  FS="n"

}



/ADID: [0-9]/ && /empType: A/{print $1}

And here it is in action:

user@host:~$ awk -f getids.awk data.txt

entry-id: 1

entry-id: 3



user@host:~$ awk -f getids.awk data.txt | wc -l

2

Of course if you just want the count we can do that too:

#!/usr/bin/env awk

# count.awk



BEGIN {

  RS="";

  FS="n";

  count=0;

}



/ADID: [0-9]/ && /empType: A/{count++}



END {

  print count

}

And because I love Python, here is a Python script that does the same thing:

#!/usr/bin/env python2

# -*- coding: ascii -*-

"""getids.py"""



import sys



# Create a list to store the matched records

records =  



# Iterate over the lines of the input file

with open(sys.argv[1]) as data:

    for line in data:



        # When an "entry-id" is reached, create a new record

        if line.startswith('entry-id'):

            entry_id = line.split(':')[1].strip()

            records.append({'entry-id': entry_id})



        # For other lines, update the current record

        elif line.strip():

            key = line.partition(':')[0].strip()

            value = line.partition(':')[2].strip()

            records[-1][key] = value



    # Extract the list of records meeting the desired critera

    matches = [record for record in records if record['empType'] == 'A' and record['ADID']]



    # Print out the entry-ids for all of the matches

    for match in matches:

        print('entry-id: ' + match['entry-id'])

And here's the Python script in action:

user@host:~$ python getids.py data.txt

entry-id: 1

entry-id: 3



user@host:~$ python getids.py data.txt | wc -l

2

And if we really do just want the counts:

#!/usr/bin/env python2

# -*- coding: ascii -*-

"""count.py"""



import sys



# Keep a count of the number of matches 

count = 0



# Use flags to keep track of the current record

emptype_flag = False

adid_flag = False



# Iterate over the lines of the input file

with open(sys.argv[1]) as data:

    for line in data:



        # When an "entry-id" is reached, reset the flags 

        if line.startswith('entry-id'):

            emptype_flag = False

            adid_flag = False

        elif line.strip() == "empType: A":

            emptype_flag = True

        elif line.startswith("ADID") and line.strip().split(':')[1]:

            adid_flag = True



        # If both conditions hold the increment the counter

        # and reset the flags

        if emptype_flag and adid_flag:

            count = count + 1

            emptype_flag = False

            adid_flag = False



    # Print the number of matches

    print(count)

And, while we're at it, how about a pure Bash script? Here's one:

#!/usr/bin/env bash



# getids.bash



while read line; do

if [[ "${line}" =~ "entry-id:" ]]; then

    entry_id="${line}"

    emptype=false

    adid=false

elif [[ "${line}" =~ "empType: A" ]]; then

    emptype=true

elif [[ "${line}" =~ ADID: [0-9] ]]; then

    adid=true

fi

if [[ "${emptype}" == true && "${adid}" == true ]]; then

    echo "${entry_id}"

    emptype=false

    adid=false

fi

done < "$1"

And running the bash script:

user@host:~$ bash getids.bash data.txt

entry-id: 1

entry-id: 3

And finally, here's something using just grep and wc:

user@host:~$ cat data.txt | grep -A1 'empType: A' | grep "ADID: S" | wc -l



2

edited Dec 14 '17 at 13:36

answered Dec 14 '17 at 5:39

igal

5,2211233

add a comment |

I like the idea of get the records that satisfy your requirements (better for e.g. testing) and counting them with wc -l. So here is an awk script that does just that:

#!/usr/bin/env awk

# getids.awk



BEGIN{

  RS="";

  FS="n"

}



/ADID: [0-9]/ && /empType: A/{print $1}

And here it is in action:

user@host:~$ awk -f getids.awk data.txt

entry-id: 1

entry-id: 3



user@host:~$ awk -f getids.awk data.txt | wc -l

2

Of course if you just want the count we can do that too:

#!/usr/bin/env awk

# count.awk



BEGIN {

  RS="";

  FS="n";

  count=0;

}



/ADID: [0-9]/ && /empType: A/{count++}



END {

  print count

}

And because I love Python, here is a Python script that does the same thing:

#!/usr/bin/env python2

# -*- coding: ascii -*-

"""getids.py"""



import sys



# Create a list to store the matched records

records =  



# Iterate over the lines of the input file

with open(sys.argv[1]) as data:

    for line in data:



        # When an "entry-id" is reached, create a new record

        if line.startswith('entry-id'):

            entry_id = line.split(':')[1].strip()

            records.append({'entry-id': entry_id})



        # For other lines, update the current record

        elif line.strip():

            key = line.partition(':')[0].strip()

            value = line.partition(':')[2].strip()

            records[-1][key] = value



    # Extract the list of records meeting the desired critera

    matches = [record for record in records if record['empType'] == 'A' and record['ADID']]



    # Print out the entry-ids for all of the matches

    for match in matches:

        print('entry-id: ' + match['entry-id'])

And here's the Python script in action:

user@host:~$ python getids.py data.txt

entry-id: 1

entry-id: 3



user@host:~$ python getids.py data.txt | wc -l

2

And if we really do just want the counts:

#!/usr/bin/env python2

# -*- coding: ascii -*-

"""count.py"""



import sys



# Keep a count of the number of matches 

count = 0



# Use flags to keep track of the current record

emptype_flag = False

adid_flag = False



# Iterate over the lines of the input file

with open(sys.argv[1]) as data:

    for line in data:



        # When an "entry-id" is reached, reset the flags 

        if line.startswith('entry-id'):

            emptype_flag = False

            adid_flag = False

        elif line.strip() == "empType: A":

            emptype_flag = True

        elif line.startswith("ADID") and line.strip().split(':')[1]:

            adid_flag = True



        # If both conditions hold the increment the counter

        # and reset the flags

        if emptype_flag and adid_flag:

            count = count + 1

            emptype_flag = False

            adid_flag = False



    # Print the number of matches

    print(count)

And, while we're at it, how about a pure Bash script? Here's one:

#!/usr/bin/env bash



# getids.bash



while read line; do

if [[ "${line}" =~ "entry-id:" ]]; then

    entry_id="${line}"

    emptype=false

    adid=false

elif [[ "${line}" =~ "empType: A" ]]; then

    emptype=true

elif [[ "${line}" =~ ADID: [0-9] ]]; then

    adid=true

fi

if [[ "${emptype}" == true && "${adid}" == true ]]; then

    echo "${entry_id}"

    emptype=false

    adid=false

fi

done < "$1"

And running the bash script:

user@host:~$ bash getids.bash data.txt

entry-id: 1

entry-id: 3

And finally, here's something using just grep and wc:

user@host:~$ cat data.txt | grep -A1 'empType: A' | grep "ADID: S" | wc -l



2

edited Dec 14 '17 at 13:36

answered Dec 14 '17 at 5:39

igal

5,2211233

I like the idea of get the records that satisfy your requirements (better for e.g. testing) and counting them with wc -l. So here is an awk script that does just that:

#!/usr/bin/env awk

# getids.awk



BEGIN{

  RS="";

  FS="n"

}



/ADID: [0-9]/ && /empType: A/{print $1}

And here it is in action:

user@host:~$ awk -f getids.awk data.txt

entry-id: 1

entry-id: 3



user@host:~$ awk -f getids.awk data.txt | wc -l

2

Of course if you just want the count we can do that too:

#!/usr/bin/env awk

# count.awk



BEGIN {

  RS="";

  FS="n";

  count=0;

}



/ADID: [0-9]/ && /empType: A/{count++}



END {

  print count

}

And because I love Python, here is a Python script that does the same thing:

#!/usr/bin/env python2

# -*- coding: ascii -*-

"""getids.py"""



import sys



# Create a list to store the matched records

records =  



# Iterate over the lines of the input file

with open(sys.argv[1]) as data:

    for line in data:



        # When an "entry-id" is reached, create a new record

        if line.startswith('entry-id'):

            entry_id = line.split(':')[1].strip()

            records.append({'entry-id': entry_id})



        # For other lines, update the current record

        elif line.strip():

            key = line.partition(':')[0].strip()

            value = line.partition(':')[2].strip()

            records[-1][key] = value



    # Extract the list of records meeting the desired critera

    matches = [record for record in records if record['empType'] == 'A' and record['ADID']]



    # Print out the entry-ids for all of the matches

    for match in matches:

        print('entry-id: ' + match['entry-id'])

And here's the Python script in action:

user@host:~$ python getids.py data.txt

entry-id: 1

entry-id: 3



user@host:~$ python getids.py data.txt | wc -l

2

And if we really do just want the counts:

#!/usr/bin/env python2

# -*- coding: ascii -*-

"""count.py"""



import sys



# Keep a count of the number of matches 

count = 0



# Use flags to keep track of the current record

emptype_flag = False

adid_flag = False



# Iterate over the lines of the input file

with open(sys.argv[1]) as data:

    for line in data:



        # When an "entry-id" is reached, reset the flags 

        if line.startswith('entry-id'):

            emptype_flag = False

            adid_flag = False

        elif line.strip() == "empType: A":

            emptype_flag = True

        elif line.startswith("ADID") and line.strip().split(':')[1]:

            adid_flag = True



        # If both conditions hold the increment the counter

        # and reset the flags

        if emptype_flag and adid_flag:

            count = count + 1

            emptype_flag = False

            adid_flag = False



    # Print the number of matches

    print(count)

And, while we're at it, how about a pure Bash script? Here's one:

#!/usr/bin/env bash



# getids.bash



while read line; do

if [[ "${line}" =~ "entry-id:" ]]; then

    entry_id="${line}"

    emptype=false

    adid=false

elif [[ "${line}" =~ "empType: A" ]]; then

    emptype=true

elif [[ "${line}" =~ ADID: [0-9] ]]; then

    adid=true

fi

if [[ "${emptype}" == true && "${adid}" == true ]]; then

    echo "${entry_id}"

    emptype=false

    adid=false

fi

done < "$1"

And running the bash script:

user@host:~$ bash getids.bash data.txt

entry-id: 1

entry-id: 3

And finally, here's something using just grep and wc:

user@host:~$ cat data.txt | grep -A1 'empType: A' | grep "ADID: S" | wc -l



2

edited Dec 14 '17 at 13:36

answered Dec 14 '17 at 5:39

igal

5,2211233

edited Dec 14 '17 at 13:36

answered Dec 14 '17 at 5:39

igal

5,2211233

answered Dec 14 '17 at 5:39

igal

5,2211233

answered Dec 14 '17 at 5:39

igal

5,2211233

add a comment |

With perl, that could be:

perl -l -00ne '

  my %f = /(.*?):s*(.*)/g;

  ++$n if $f{empType} eq "A" && $f{ADID} ne "";

  END {print 0+$n}' < file

-n causes the code given to -e to be applied to each input record

-00 for records to be paragraphs.

We build a %f associative array where key and values are mapped to each (key):spaces(value) in the record.

and increment $n where the conditions are met.

we print $n in the END (adding 0 to make sure we get 0 and not an empty string if there's no match).

edited Dec 14 '17 at 14:57

answered Dec 14 '17 at 14:14

Stéphane Chazelas

306k57577932


I couldn't get anything to work with grep's -A option, and the other answers returned "label too long" or some other error.
What I did find that worked was:

perl -000 -ne 'print if/empType: A/' file.ldif|grep -i -c "^ADID: [0-9A-Za-z]"

answered Dec 14 '17 at 16:13

King of NES

1163

