
This sounds interesting. Could you give an example where you rewrote a pipeline in awk?


Not the op but here is an example: TOKEN=$(kubectl describe secret -n kube-system $(kubectl get secrets -n kube-system | grep default | cut -f1 -d ' ') | grep -E '^token' | cut -f2 -d':' | tr -d '\t' | tr -d " ")

This pipeline can be shortened considerably by replacing the cut calls with awk, folding the grep into awk's pattern matching, and using awk's gsub in place of tr.
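A rough sketch of what that rewrite could look like. The kubectl output below is made-up sample text (the secret name and token value are placeholders, not real output), so the awk parts can be tried without a cluster:

```shell
# Made-up sample of `kubectl get secrets -n kube-system` output (assumption).
get_output='NAME                  TYPE                                  DATA   AGE
default-token-abcde   kubernetes.io/service-account-token   3      10d'

# Made-up sample of `kubectl describe secret ...` output (assumption).
describe_output='Name:         default-token-abcde
Type:  kubernetes.io/service-account-token

ca.crt:     1025 bytes
namespace:  11 bytes
token:      eyJhbGciOi.example'

# grep default | cut -f1 -d " "   becomes one awk call:
SECRET=$(printf '%s\n' "$get_output" | awk '/default/ { print $1 }')

# grep -E ^token | cut -f2 -d: | tr -d "\t" | tr -d " "   becomes one awk call:
TOKEN=$(printf '%s\n' "$describe_output" |
  awk -F: '/^token/ { gsub(/[ \t]/, "", $2); print $2 }')

echo "$SECRET"
echo "$TOKEN"

# Against a real cluster the two stages would nest the same way as the original:
# TOKEN=$(kubectl describe secret -n kube-system \
#   "$(kubectl get secrets -n kube-system | awk '/default/ { print $1 }')" |
#   awk -F: '/^token/ { gsub(/[ \t]/, "", $2); print $2 }')
```

That takes the original six helper processes (two grep, two cut, two tr) down to two awk calls.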


Example of replacing grep+cut with a single awk invocation:

    $ echo token:abc:def | grep -E ^token | cut -d: -f2
    abc
    
    $ echo token:abc:def | awk -F: '/^token/ { print $2 }'
    abc
Conditions don't have to be regular expressions. For example:

    $ echo "$CSV"
    foo:24
    bar:15
    baz:49
    
    $ echo "$CSV" | awk -F: '$2 > 20 { print $1 }'
    foo
    baz


Somebody wanted to set breakpoints in their C code by marking them with a comment (note “d” for “debugger”):

  //d
You can get a list of them with a single Awk line.

  awk -F'//d[[:space:]]*' 'NF > 1 {print FILENAME ":" FNR " " $2}' source/*.c
You can even create a GDB script, pretty easily.
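For instance, a minimal sketch of generating such a script, assuming GDB's usual `break FILE:LINE` command form. The file demo.c and the name breakpoints.gdb are invented for the illustration:

```shell
# Throwaway directory with a tiny, made-up C file carrying two //d markers.
tmpdir=$(mktemp -d)
cat > "$tmpdir/demo.c" <<'EOF'
int main(void) {
    int x = 1; //d check x
    return x;  //d
}
EOF

# Same field-splitting trick as the listing one-liner, but emit one
# "break FILE:LINE" command per marked line instead of a plain listing.
(
  cd "$tmpdir" &&
  awk -F'//d[[:space:]]*' 'NF > 1 { print "break " FILENAME ":" FNR }' *.c > breakpoints.gdb
)

gdb_script=$(cat "$tmpdir/breakpoints.gdb")
echo "$gdb_script"
```

The resulting file can then be passed to GDB with something like `gdb -x breakpoints.gdb ./your-program`.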

(IMO, easier still to configure your editor to support breakpoints, but I’m not the one who chose to do it this way.)


Why are you using the locale-specific [:space:] on source code? In your C source code, are you using spaces other than ASCII 0x20?

Would you have //d<0xA0>rest of comment?

Or some fancy Unicode space made using several UTF-8 bytes?


Tab characters can also be found in source code.


Since you control the //d format, why would you allow or support anything but a space as a separator? The separator is only there to distinguish the marker from a comment like "//delete empty nodes" that is not the //d debug notation.

If tabs are supported,

  [ \t]
is still shorter than

  [[:space:]]
and if we include all the "isspace" characters from ASCII (vertical tab, form feed, embedded carriage return) except for the line feed that would never occur due to separating lines, we just break even on pure character count:

  [_\t\v\f\r]
(The _ stands for a literal space.) T, V, F and R all fall under the left hand, backslash under the right, and nothing requires Shift.

The resulting character class does exactly the same thing under any locale.
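To make the "//delete" point concrete, here is a small sketch with made-up sample lines. The stricter pattern is just one possible way to require a separator (space, tab, or end of line) after the marker; it is not something the original one-liner does:

```shell
# Four made-up lines: a marker with a note, a lookalike comment,
# a bare marker, and a marker followed by a tab.
sample=$(printf 'a(); //d check a\nb(); //delete empty nodes\nc(); //d\nd(); //d\ttabbed note\n')

# Optional whitespace after //d: the "//delete" comment matches too.
loose=$(printf '%s\n' "$sample" | awk '/\/\/d[[:space:]]*/ { print NR }' | paste -s -d ' ' -)

# Require a space, a tab, or end of line after //d: "//delete" is skipped.
strict=$(printf '%s\n' "$sample" | awk '/\/\/d([ \t]|$)/ { print NR }' | paste -s -d ' ' -)

echo "loose:  $loose"
echo "strict: $strict"
```

The loose pattern reports all four lines, while the strict one skips line 2.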


There's also [:blank:], which is just space and tab. Both I think are perfectly readable and reasonable options that communicate intent nicely.


ISO C99 says, of the isblank function (to which [:blank:] is related):

The isblank function tests for any character that is a standard blank character or is one of a locale-specific set of characters for which isspace is true and that is used to separate words within a line of text. The standard blank characters are the following: space (' '), and horizontal tab ('\t'). In the "C" locale, isblank returns true only for the standard blank characters.

[:blank:] is only the same thing as [\t ] (tab space) if you run your scripts and Awk and everything in the "C" locale.
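If you do want the named class, one way to pin the behavior down is to set the locale for just that one command. A small sketch with made-up input lines, assuming an awk that supports POSIX character classes:

```shell
# Three made-up lines: no indent, tab indent, space indent.
lines=$(printf 'no indent\n\ttab indent\n  space indent\n')

# LC_ALL=C applies only to this awk invocation, so [:blank:] is
# limited to the two standard blank characters, space and tab.
blank_count=$(printf '%s\n' "$lines" |
  LC_ALL=C awk '/^[[:blank:]]/ { n++ } END { print n + 0 }')

# The explicit class needs no locale setting at all.
plain_count=$(printf '%s\n' "$lines" |
  awk '/^[ \t]/ { n++ } END { print n + 0 }')

echo "$blank_count $plain_count"
```

Both counts agree on this input; the explicit class just gets there without depending on the environment.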


Interesting, the GNU Grep manual describes both character classes as behaving as if you are in the C locale. I shouldn't have assumed it was the same as in the C standard!


> Why are you using the locale-specific [:space:] on source code?

Because it’s the one I remembered first, it worked, and I didn’t think that it needed any improvement. In fact, I still don’t think it needs any improvement.



