[Discuss] Weird awk processing
Jerry Feldman
gaf at blu.org
Thu May 2 15:52:33 EDT 2013
On 05/02/2013 01:11 AM, David Rosenstrauch wrote:
> Just stumbled upon the most bizarre awk problem. mawk and gawk are
> showing 2 different results for the same code. Can anyone shed any
> light?
>
> TIA!
>
> DR
>
> ---
>
> sense at ip-10-98-190-45.job:/sense/work/feature-summary/debugging$ cat
> sample.txt
> 32e49398e024dcb79a319c62ceb213ae3e824f77 2
> 32e4cb91fdefe6103d73f1d6e43ecd8430f85334 2
> 32e4cb91fdefe6103d73f1d6e43ecd8430f85334 132
> 32e5434c41a8e2f0178fd19bd868758af6eb67c0 2
> 32e56067 10
> 32e56067 79
> 32e56122 59
> 32e57aacfd27f7fde61184052cb35551213c7cd6 5
>
> sense at ip-10-98-190-45.job:/sense/work/feature-summary/debugging$ cat
> ../totals-by-label.awk
> #!/usr/bin/awk -f
>
> BEGIN {FS="\t"; prev_label = "";}
> {
> curr_label=$1;
> count=$2;
> if (prev_label != "" && curr_label != prev_label) {
> output();
> }
> prev_label=curr_label;
> tot += count;
> }
> END { output(); }
>
> function output() {
> print prev_label"\t"tot;
> tot = 0;
> }
>
> sense at ip-10-98-190-45.job:/sense/work/feature-summary/debugging$ cat
> sample.txt | mawk -f ../totals-by-label.awk
> 32e49398e024dcb79a319c62ceb213ae3e824f77 2
> 32e4cb91fdefe6103d73f1d6e43ecd8430f85334 134
> 32e5434c41a8e2f0178fd19bd868758af6eb67c0 2
> 32e56122 148
> 32e57aacfd27f7fde61184052cb35551213c7cd6 5
>
> sense at ip-10-98-190-45.job:/sense/work/feature-summary/debugging$ cat
> sample.txt | gawk -f ../totals-by-label.awk
> 32e49398e024dcb79a319c62ceb213ae3e824f77 2
> 32e4cb91fdefe6103d73f1d6e43ecd8430f85334 134
> 32e5434c41a8e2f0178fd19bd868758af6eb67c0 2
> 32e56067 89
> 32e56122 59
> 32e57aacfd27f7fde61184052cb35551213c7cd6 5
> _
>
I get the same results from both. The first thing I did was run
sample.txt through sed to convert the spaces to tabs, since the script
sets FS="\t". The gawk result you got is certainly the correct one.
[gaf at gaf awk]$ cat sample.txt | gawk -f totals-by-label.awk
2e49398e024dcb79a319c62ceb213ae3e824f77 2
32e4cb91fdefe6103d73f1d6e43ecd8430f85334 134
32e5434c41a8e2f0178fd19bd868758af6eb67c0 2
32e56067 89
32e56122 59
32e57aacfd27f7fde61184052cb35551213c7cd6 5
[gaf at gaf awk]$ cat sample.txt | mawk -f totals-by-label.awk
2e49398e024dcb79a319c62ceb213ae3e824f77 2
32e4cb91fdefe6103d73f1d6e43ecd8430f85334 134
32e5434c41a8e2f0178fd19bd868758af6eb67c0 2
32e56067 89
32e56122 59
32e57aacfd27f7fde61184052cb35551213c7cd6 5
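
For anyone who wants to repeat the space-to-tab step, here is a rough
sketch of one way to do it. The function name spaces_to_tabs is made up
for illustration, and it assumes each line holds a label, one or more
spaces, then a count, as in the sample above. It may not be the exact
sed command I ran.

```shell
# Hypothetical helper: replace the first run of spaces on each line with a tab.
# A literal tab is built with printf because "\t" in a sed replacement
# is not portable across sed implementations.
spaces_to_tabs() {
    tab=$(printf '\t')
    sed "s/ \{1,\}/${tab}/"
}

# Feed three of the sample lines through the helper, then let awk split on
# tabs and report the field count, to confirm each line now has two fields.
printf '%s\n' '32e56067 10' '32e56067 79' '32e56122 59' |
    spaces_to_tabs |
    awk -F'\t' '{ print NF ": " $1 " -> " $2 }'
```

With a real file, the same helper can sit in front of the script, for
example: spaces_to_tabs < sample.txt | gawk -f totals-by-label.awk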
--
Jerry Feldman <gaf at blu.org>
Boston Linux and Unix
PGP key id:3BC1EB90
PGP Key fingerprint: 49E2 C52A FC5A A31F 8D66 C0AF 7CEA 30FC 3BC1 EB90