Number of occurrences of letters in a word?

I would like to find the number of occurrences of each letter of the alphabet in a word. For example:

input

aabbbddd

output

a@2 b@3 c@0 d@3

How can I do this using a shell script?

asked Nov 17 '14 at 6:14 by Rakesh R Nair

edited Nov 17 '14 at 6:34 by slm♦

Is your input sorted by default?

– cuonglm
Nov 17 '14 at 6:28

Please clarify your question. Do you need c@0, since that is not a letter within the word?

– slm♦
Nov 17 '14 at 6:35

Tags: bash, shell, shell-script


3 Answers

These solutions are case-insensitive:

    start cmd:> echo aabbbddd |
      awk -v FS= '{for (i=1;i<=NF;i++) a[tolower($i)]++;};
        END {for (key in a) print key ": " a[key];}'
    a: 2
    b: 3
    d: 3

Or for the complete alphabet:

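    start cmd:> echo Aabbbddd |
      awk -v FS= '{for (i=1;i<=NF;i++) a[tolower($i)]++;};
        END {chars="abcdefghijklmnopqrstuvwxyz";
        for (i=1;i<27;i++) { key=substr(chars,i,1);print key ": " (a[key]+0)};}'
    a: 2
    b: 3
    c: 0
    d: 3
    e: 0
    f: 0
    g: 0
    h: 0
    i: 0
    j: 0
    k: 0
    l: 0
    m: 0
    n: 0
    o: 0
    p: 0
    q: 0
    r: 0
    s: 0
    t: 0
    u: 0
    v: 0
    w: 0
    x: 0
    y: 0
    z: 0

Adding 0 to a[key] makes letters that never occur print as 0 instead of as an empty string.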
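
If the exact a@2 b@3 c@0 d@3 layout from the question is wanted, the same awk idea can be adapted. The following is only a sketch: the function name letter_counts is made up for illustration, and it assumes the output should run from a up to the highest letter that occurs in the word, which the question does not state explicitly. It walks the word with substr() rather than relying on an empty FS, whose behaviour POSIX leaves unspecified.

```shell
#!/bin/sh
# Hypothetical helper (not from the answer above): print counts as
# letter@count pairs, from 'a' up to the highest letter present.
letter_counts() {
    printf '%s\n' "$1" | awk '
    {
        word = tolower($0)
        # Count every character of the line.
        for (i = 1; i <= length(word); i++)
            count[substr(word, i, 1)]++
    }
    END {
        chars = "abcdefghijklmnopqrstuvwxyz"
        # Find the position of the highest letter that occurred.
        last = 0
        for (i = 1; i <= 26; i++)
            if (substr(chars, i, 1) in count)
                last = i
        # Build one line of letter@count pairs; +0 turns "unset" into 0.
        out = ""
        sep = ""
        for (i = 1; i <= last; i++) {
            key = substr(chars, i, 1)
            out = out sep key "@" (count[key] + 0)
            sep = " "
        }
        print out
    }'
}

letter_counts aabbbddd
```

To print all 26 letters instead, set last to 26 unconditionally.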

answered Nov 17 '14 at 6:27 by Hauke Laging

edited Nov 17 '14 at 15:42

The output is missing c, which the OP's desired output includes.

– cuonglm
Nov 17 '14 at 6:31

You could use sed, uniq, and sort:

    $ echo -n "aabbbddd" | sed 's/\(.\)/\1\n/g' | sort | uniq -c
      2 a
      3 b
      3 d

The above uses sed to replace each character with itself followed by a newline (\n). With each character on its own line (and sorted), uniq -c can then count the characters.

NOTE: This method will not show any of the characters in between that have zero occurrences.
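
The \n in the sed replacement is, as far as I know, a GNU sed feature, so on other systems the split may need a different tool. A minimal sketch of the same split, sort, count idea using fold, assuming the word contains only single-byte characters and no whitespace; the function name char_histogram is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: put one character per line with fold, then let
# sort and uniq -c do the counting, and tidy the layout with awk.
char_histogram() {
    printf '%s\n' "$1" |
        fold -w 1 |                    # one character per line
        sort |                         # group identical characters together
        uniq -c |                      # prefix each group with its count
        awk '{ print $2 ": " $1 }'     # print as "char: count"
}

char_histogram aabbbddd
```

Like the sed version, this only lists characters that actually occur.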

Alternatively, to show every letter's count:

    $ s="aabbbddd"; for i in {a..z}; do
         v=$(echo -n "$s" | grep -oi "$i" | wc -l); echo "$i : $v"; done
    a : 2
    b : 3
    c : 0
    d : 3
    e : 0
    f : 0
    g : 0
    h : 0
    i : 0
    j : 0
    k : 0
    l : 0
    m : 0
    n : 0
    o : 0
    p : 0
    q : 0
    r : 0
    s : 0
    t : 0
    u : 0
    v : 0
    w : 0
    x : 0
    y : 0
    z : 0

This works by looping through all the letters of the alphabet:

    for i in {a..z}; do .... ; done

On each iteration of the loop we grep through the string for one specific character, using grep's -o option to print only the matches, one per line, and -i to ignore case. We then use wc -l to count how many occurrences of that letter were found and store the result in the variable v. Finally, each iteration displays its result:

    echo "$i : $v"

NOTE: This approach also handles input whose letters are not in sorted order.
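
Since grep -o and the {a..z} brace expansion are not part of POSIX, the same per-letter loop can be sketched with tr and wc for plain sh. This is an illustrative variant rather than the answer's method; the function name count_with_tr is made up, and its variables are global because POSIX sh has no local.

```shell
#!/bin/sh
# Hypothetical helper: for each letter, delete everything else with
# tr -cd and count the bytes that remain.
count_with_tr() {
    # Lowercase the word once so the count is case-insensitive.
    word=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    for letter in a b c d e f g h i j k l m n o p q r s t u v w x y z; do
        # $(( ... )) strips the padding some wc implementations add.
        n=$(( $(printf '%s' "$word" | tr -cd "$letter" | wc -c) ))
        echo "$letter : $n"
    done
}

count_with_tr Aabbbddd
```

It still starts a few processes per letter, so it is no faster than the grep loop; the gain is only portability.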

answered Nov 17 '14 at 6:32 by slm♦

edited Nov 17 '14 at 6:51

Using only the shell (faster for short strings):

    #! /bin/bash -
    input=${*:-'aabbbddd'}

    tmp=$input
    arr=()
    maxlen=0
    maxchar=''
    # Take the first character, strip every occurrence of it, and use the
    # drop in length as that character's count; repeat until nothing is left.
    while ((${#tmp})); do
        firstchar=${tmp:0:1}
        next=${tmp//"$firstchar"}
        len=$((${#tmp}-${#next}))
        arr+=("$firstchar: $len")
        if ((maxlen<len)); then
            maxlen=$len
            maxchar=$firstchar
        fi
        tmp=$next
    done

    printf '%s\n' "${arr[@]}"
    echo "The char \"$maxchar\" appears $maxlen times in \"$input\""

Called as:

    $ ./script
    a: 2
    b: 3
    d: 3
    The char "b" appears 3 times in "aabbbddd"
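
The loop above rescans the remaining string once per distinct character. For longer input, a single pass with an associative array may be worth trying. The following is a sketch only, assuming bash 4 or later (for declare -A style arrays) and input made of ordinary letters; the function name count_chars is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: count characters in one pass over the word.
# Requires bash 4+ for associative arrays.
count_chars() {
    local word=$1 char i
    local -A count=()   # character -> number of occurrences
    local order=()      # characters in order of first appearance
    for (( i = 0; i < ${#word}; i++ )); do
        char=${word:i:1}
        # Remember the character the first time it is seen.
        if [[ -z ${count[$char]+set} ]]; then
            order+=("$char")
        fi
        count[$char]=$(( ${count[$char]:-0} + 1 ))
    done
    for char in "${order[@]}"; do
        printf '%s: %d\n' "$char" "${count[$char]}"
    done
}

count_chars aabbbddd
```

Like the script above, it is case-sensitive and reports only the characters that occur, in order of first appearance.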

answered 16 mins ago by Isaac
