A Problem

I think I have a Bash problem.  What follows is an actual command from my history.

cat /usr/share/dict/words | fgrep -v "'" | perl -ne 'chomp($_); @b=split(//,$_); print join("", sort(@b))." ".$_."\n";' | tee lookup.txt | perl -pe 's/^([^ ]+) .*/\1/g' | awk '{ print length, $0 }' | sort -n | awk '{$1=""; print $0}' | uniq -c | sort -nr | egrep "^[^0-9]+2 " | awk '{ print length, $0 }' | sort -n | awk '{$1=""; print $0}' | perl -pe 's/[ 0-9]//g' | xargs -i grep {} lookup.txt | perl -pe 's/[^ ]+ //g' | tail -n2

It’s just so hard to bite the bullet, admit that the problem has grown in scope, and move it to its own Perl/Python script.  (P.S. The Guinness Book is wrong.  “Conservationalists” is not a real word.)
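
For anyone weighing that move, here is a rough Python sketch of what the pipeline above appears to do: key each word by its sorted letters, keep the keys shared by exactly two words, and report the longest such pair. The function name `longest_anagram_pair` and the sample word list are made up for illustration; this is one reading of the one-liner, not a drop-in replacement for it.

```python
from collections import defaultdict


def longest_anagram_pair(words):
    """Return the longest two words that are anagrams of each other, or None.

    Mirrors the pipeline's filters: words containing an apostrophe are
    skipped, and only letter sets shared by exactly two words count.
    """
    groups = defaultdict(list)
    for word in words:
        word = word.strip()
        if not word or "'" in word:
            continue
        key = "".join(sorted(word))  # same key the first perl stage builds
        groups[key].append(word)

    pairs = [tuple(g) for g in groups.values() if len(g) == 2]
    if not pairs:
        return None
    return max(pairs, key=lambda pair: len(pair[0]))


if __name__ == "__main__":
    sample = ["listen", "silent", "stop", "pots", "tops", "cat's", "act"]
    print(longest_anagram_pair(sample))
```

Pointing it at a real list would look like `longest_anagram_pair(open('/usr/share/dict/words'))`.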

Edit: to those who are competing in the comments to improve (shorten) the above command: when pasting code, use the <code> tag to override Wordpress quote formatting.

Joey Comeau has a new book out based on Overqualified, which has long been one of my favorite things on the internet.  He writes cover letters to companies.  They each sound businesslike enough for the first paragraph or so, and then you gradually realize you are reading something that is in no way a normal cover letter.  An excerpt from one to Nintendo:

We need a new Mario game, where you rescue the princess in the first ten minutes, and for the rest of the game you try and push down that sick feeling in your stomach that she’s “damaged goods”, a concept detailed again and again in the profoundly sex negative instruction booklet, and when Luigi makes a crack about her and Bowser, you break his nose and immediately regret it. When Peach asks you, in the quiet of her mushroom castle bedroom “do you still love me?” you pretend to be asleep. You press the A button rhythmically, to control your breath, keep it even.

#2 (NeoPost), #28 (Phone surveys) and #58 (MySpace) are three of my favorites.

99 Responses to “A Problem”

  1. Mason says:

    @Aaron A: Yeah, some do, in the sense that they have a lot of extra fields in which you can put pretty much whatever you want. I think the comic is just referring to really specific genres though.

  2. The Interobang Guy says:

    spirov92 : “I wouldn’t risk losing this if something happens to ~/.bash_history .”
    Am i the only person who reads that with a menacing tone?
    I can just picture the mafia bots in futurama: “hey, you best not be messing around with python, cuz that looks like it took an awful lot of work, I wouldn’t risk losing this if something happens to ~/.bash_history”

    actually typing that out i just realized i pronounce ~ “home”, i think there’s something wrong with me ?

  3. Matt Hickford says:

    I pronounce ~ as home, / as root, we all should.

    /etc as ‘et cetera’, fstab as footstab. fsck is harder.

    ls is list.

  4. Crimson says:

    Updating xkcd every day?
    I’m so thrilled. It’s great.

    Thanks! You’re the best.

  5. Cole Erickson says:

    @Matt Hickford:

    lol. Reminds me of how I pronounce HTML: hitmull, XML: ex-ihm-ull, SHTML: well. you can guess, SQL: Skewl (even though I’m well aware it’s sequel), as well as many other things like this.

  6. smcwhtdtmc says:

    /etc is quite clearly pronounced “et-see”. And ~ is “squiggle”. I feel so adamant about this that I’m willing to start a quasi-religious propaganda war. Any takers?

    …and I will do it from emacs.

  7. just kidding says:

    Is this safe to run or do I need to worry about overwriting the kernel with a list of words from the dictionary?

    hahaha

  8. RiotingPacifist says:

    (please imagine me leading a charge while shouting nano, as my post will have much more effect that way)

    How can you say ~ is “squiggle” (at least it should be tilde), that’s like saying that … is “three dots” and :) is “colon bracket”. While i won’t argue over /etc (ekd) or /usr (user) and honestly don’t care, i will see you on the battlefield regarding ~, oh and i compose all my posts in nano!!!!

  9. James says:

    Hi all. I didn’t quite shorten the code, but I did make it faster. :-)


    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define MAXWORDLEN 30
    #define MAXWORDS 1000000

    struct line {
        char word[MAXWORDLEN];
        char sorted[MAXWORDLEN];
        int length;
    };

    struct pair {
        char first[MAXWORDLEN];
        char second[MAXWORDLEN];
        int length;
    };

    int comp_chars(const void *a, const void *b);
    int comp_sorted(const void *a, const void *b);
    int comp_len(const void *a, const void *b);

    int main(int argc, char *argv[])
    {
        /* Check correct no of arguments */
        if (argc != 2 && argc != 3) {
            printf("Usage: uniq_anag words_list number_of_pairs\n");
            printf("If no number given, default is one.\n");
            return -1;
        }

        /* Declarations */
        FILE *wordlist;
        struct line *lines = calloc(MAXWORDS, sizeof(struct line));
        int i, j;
        int c; /* int, not char, so EOF can be told apart from data */
        int no_of_words;
        int no_to_return = (argc == 2 ? 1 : atoi(argv[2]));
        struct pair *pairs = calloc(MAXWORDS, sizeof(struct pair));

        /* Open the file */
        wordlist = fopen(argv[1], "r");
        if (wordlist == NULL) {
            perror(argv[1]);
            return -1;
        }
        /* Get the words */
        i = j = 0;
        while ((c = getc(wordlist)) != EOF) {
            if (c != '\n') {
                lines[i].word[j] = c;
                lines[i].sorted[j++] = c;
            }
            else {
                j = 0;
                ++i;
            }
        }
        no_of_words = i;

        /* Order the letters in the second copy */
        for (i = 0; i < no_of_words; ++i) {
            lines[i].length = strlen(lines[i].word);
            qsort(lines[i].sorted, lines[i].length, sizeof(char), comp_chars);
        }

        /* Sort the lines by the sorted words */
        qsort(lines, no_of_words, sizeof(struct line), comp_sorted);

        /* Look for pairs */
        char to_comp[2][MAXWORDLEN];
        strcpy(to_comp[0], lines[0].sorted);
        int matches = 0;
        j = 0;
        for (i = 1; i < no_of_words; ++i) {
            strcpy(to_comp[i%2], lines[i].sorted);
            if (strcmp(to_comp[0],to_comp[1]) == 0)
                matches++;
            else {
                if (matches == 1) {
                    strcpy(pairs[j].first, lines[i-2].word);
                    strcpy(pairs[j].second, lines[i-1].word);
                    pairs[j++].length = lines[i-1].length;
                }
                matches = 0;
            }
        }
        int no_of_pairs = j;

        /* Sort the pairs we've found */
        qsort(pairs, no_of_pairs, sizeof(struct pair), comp_len);

        /* Print results */
        for (i = 0; i < no_to_return; i++) {
            if (pairs[i].length == 0)
                break;
            printf("%s %s\n", pairs[i].first, pairs[i].second);
        }

        /* Clean up */
        free((void *)lines);
        free((void *)pairs);
        fclose(wordlist);
        return 0;
    }

    /* Compare two characters, for sorting the letters of a word */
    int comp_chars(const void *a, const void *b)
    {
        if (*(char*)a < *(char*)b)
            return -1;
        else if (*(char*)a == *(char*)b)
            return 0;
        else
            return 1;
    }

    /* Compare two lines by their sorted-letter keys */
    int comp_sorted(const void *a, const void *b)
    {
        char sorted[2][MAXWORDLEN];
        strcpy(sorted[0],(*(struct line *)a).sorted);
        strcpy(sorted[1],(*(struct line *)b).sorted);
        return strcmp(sorted[0],sorted[1]);
    }

    /* Compare two pairs by length, longest first */
    int comp_len(const void *a, const void *b)
    {
        int lengths[2];
        lengths[0] = (*(struct pair *)a).length;
        lengths[1] = (*(struct pair *)b).length;
        if (lengths[0] < lengths[1])
            return 1;
        else if (lengths[0] == lengths[1])
            return 0;
        else
            return -1;
    }

  10. stu says:

    maybe you should replace the fancy apostrophes like: ‘{ print length, $0 }’ (and elsewhere) with simple ‘ ‘ ones… this would allow one to copy/paste into a shell without error :)

  11. stu says:

    hmmm. your blag borks my input! I enter basic apostrophes and it makes them fancy. bad blag!

  12. benjam says:

    just thought you’d be interested to know that my AVG 8.5 Free blocks the first page of comments as having a virus called BAT/Deleter.

    I know there is nothing malicious (or is there <_<), just thought it was funny.

  13. Henrie Schnee says:

    Have you considered a “Random”-functionality for the blag similar to the one in the comic-section… there’s several years worth of text, so giving the uninformed reader the option to perceive it in a more fatalistic way would be… awesome.

    Respect from Germany & its liberal arts majors!

  14. Useless use of cat; tsk, tsk.

    fgrep -v "'" </usr/share/dict/words | …

    Probably not the greatest of your worries, but hey, it’s a start.

    -Josh

  15. Viadd says:

    If I were trying to use minimal lines I could make this much less clear

    def sortuple(word) :
        letters = [l for l in word.strip()]
        letters.sort()
        return tuple(letters)

    bytuple={}
    pairs=[]

    for w in open('/usr/share/dict/words') :
        w = w.strip("\n")
        tup = sortuple(w)
        if not "'" in tup :
            if tup in bytuple :
                pairs.append( (len(w),bytuple[tup],w) )
            else :
                bytuple[tup] = w

    pairs.sort()
    for tup in pairs[-1:-100:-1] :
        print("%2d %s %s" % tup)

    Which gives

    % time python /Users/palmer/sameletters.py
    22 hydropneumopericardium pneumohydropericardium
    22 cholecystoduodenostomy duodenocholecystostomy
    21 glossolabiopharyngeal labioglossopharyngeal
    21 duodenopancreatectomy pancreatoduodenectomy
    ...
    19 incontrovertibility introconvertibility
    ...
    17 misrepresentation representationism
    ...
    real 0m2.766s
    user 0m2.695s
    sys 0m0.063s

  16. Viadd says:

    And of course, the whitespace is lost even with code tags, so you can’t just cut and paste.

    I am using OS X so it is looking at almost a quarter million words with average length > 9.5 letters.

  17. Ian says:

    How much fs would a fsck ck if a fsck could ck fs?

  18. Aniviller says:

    Even more, your usage of

    perl -pe 's/^([^ ]+) .*/\1/g'

    for extracting the first column is overkill (the same goes for extraction of the last column later in the code).

    There’s a nice utility which does just that.


    cut -d' ' -f1

    p.s. sorry for not using “code” before. Here’s the repost:


    fgrep -v "'" /usr/share/dict/words | perl -pne 'chomp;$_=join("",sort(split//))." $_\n"'

  19. Caitlin says:

    The day I read this post about Overqualified I bought it. I bought it because you said it was funny, and it seemed pretty neat to me.. but the book is so much bigger than just funny cover letters.
    As you read all the cover letters, it slowly reveals a story about the author’s brother’s death in a car accident and it’s really uniquely written.

    I love it and I’m sharing it with everyone I know. Thanks!

  20. Theyain says:

    At least you’ve never walked in on your roomie masturbating to a bash prompt.

  21. sfink says:

    I know it really, really doesn’t matter, but I can’t resist. You don’t just have a bash problem; you have a perl and awk problem too.

    I know that each of these commands has a zillion options and it’s pointless to memorize each one. But some are worth your while: perl -l, for example. chomp and “\n” are both annoying nuisances having nothing to do with the problem at hand.

    perl -a is the other. perl -pe 's/^([^ ]+) .*/\1/' makes me cringe. You’re friendly with awk; why didn’t you use that? Anyway, I think you meant to say perl -ane 'print $F[0]'.

    Your awk problem is that you use it. Clean out that corner of your brain, stick with Perl, and maybe you’ll have space to remember -a and -l.

    That’s not to mention your regexp problem. What’s all this [^ ]+ nonsense? Seems like you probably meant \w+ anyway.

    I confess I didn’t actually spend the time to figure out what all of that is trying to accomplish. Something like perl -lne '$s=join("",sort split(//)); push @{ $w{$s} }, $_; END { foreach (sort { length($w{$b}[0]) <=> length($w{$a}[0]) } keys %w) { next unless @{ $w{$_ } } > 1; print length($w{$_}[0]), " ", join(" ", @{ $w{$_} }) } }' /usr/share/dict/words | head I guess? (Or what that was before the blag ate it, I mean.)

    Argh! I promised myself not to do that. Evil geekteasing slut! You knew we couldn’t help ourselves…

    But yes, somewhere around the 5th or 6th stage in the pipeline, that should have been rewritten as a single script. I’ve committed similar sins, but I think you’ve beat my worst by a factor of 3 or 4. Probably something to do with my attention span.

  22. Aniviller says:

    @sfink: I believe that this was put together in a hurry (like every oneliner is). So one automatically uses a tool he’s most comfortable with. But I do suggest usage of cut/paste/join/sed/grep if possible. Mostly because oneliners are more readable and portable if you can see what it’s doing. Perl is only for the steps that require serious string work.

    I posted my suggested changes 2 posts above.

    awk is IMO a fairly good tool, but pretty useless, because by the time there’s a need for it, the problem is already out of hand and should be moved to perl/python/…

  23. Bob LaBla says:

    That’s not really a Bash script, it’s a bunch of Awk and Perl strung together with a minimal amount of shell commands. If you like Perl so much, why not just write a Perl script rather than this mess of Perl one-liners strung together?

    You can write perfectly readable, clean code in Bash, but it takes a bit of discipline, and skill.

  24. reCAPTCHA says:

    ly comments

  25. Hey Randal! I want to use one of your comics on my blog…well…uh…I kind of already did. Is the citation the way you want it? I changed the rollover text, problem? Hit me back.

    wastelandamerica.

  26. Dan says:

    There’s nothing wrong with using Bash to string together this much manipulation, as long as:
    1) Each portion of the pipe is an independent logical functional unit (which I am led to believe might not be the case by your “tee”, but it’s still possible).
    2) The command line grew organically, as you tested each portion of the manipulation along the way and thought of the correct next step.
    3) Speed of execution is not an issue.

    For example, here’s a quick script I worked up the other day to reset a password, check accounts and group memberships, and verify sudoers settings:

    FAILSTRING='\033[1;31mFAILURE\033[0;0m'; for user in user1 user2 user3 user4 anotheruser; do (grep $user /etc/passwd && groups $user | grep staff >> /dev/null) || /bin/echo -e $user: $FAILSTRING; done; grep -E '^%staff' /etc/sudoers || /bin/echo -e "Sudoers: $FAILSTRING"; if [ -z "$pwreset" ]; then passwd anotheruser; fi; export pwreset='n'

    It works like a charm, and for what I needed it for, it was ideal. For example, by storing state in the shell, I could re-run it after I fixed any issues it identified and it wouldn’t repeat any destructive action.

  28. SuccessfullyWasted2Hours says:

    This is not at all shorter by any means, I admit. But my angle was different:
    You only return the (two) longest matches. For languages with a more complex grammar (French, German), those are commonly of no use, because they are just two different inflections of the very same verb (different tenses etc.).

    So, after I figured out what you were doing (given that I know almost nothing about perl, that was simply “first command, check output, add next, check output, …”), I dropped all but the first perl command (which I wouldn’t know how to do in pure shell…) and built up the rest in the command structures I am familiar with. :)
    And when I was there, I added args to play around… min. length, min. multiplicity, sort by length or by multiplicity.
    The result is then grouped by the key (making it more readable, as the groups are more obvious).

    It’s also a bit faster (~9s vs. ~35s for the original), but uses more files…
    (Yeah, and it misses the whole “do in one line” thing as it is written now, but it *could* be crammed into one, I believe.)


    #!/bin/bash
    # sort behaves not as expected:
    # it treats international chars as separators? if LC_ALL is not set to C
    export LC_ALL=C

    # now that this is a script, might as well give options...
    arginvalid=""
    sort="-bylength"
    cntmin=3
    lenmin=8
    while [ $# -gt 0 ]; do
        if [ "$1" == "-bycount" -o "$1" == "-bylength" ]; then
            sort=$1
        else
            if [ "$1" == "-countmin" ]; then
                if [ "$2" -gt 1 -a "$2" -lt 10 ]; then
                    cntmin=$2
                else
                    echo Invalid argument to '-countmin': Must be followed by a number between 1 and 10.
                fi
                # taking 2 args away
                shift
            else
                if [ "$1" == "-lenmin" ]; then
                    if [ "$2" -gt 1 ]; then
                        lenmin=$2
                    else
                        echo Invalid argument to '-lenmin': Must be followed by a number over 1.
                    fi
                    # taking 2 args away
                    shift
                else
                    arginvalid=$1
                fi
            fi
        fi

        shift
    done

    if [ "$arginvalid" != "" ]; then
        echo Invalid argument given: "$arginvalid"
        echo Valid choices: '-bycount', '-bylength', '-countmin <number>', '-lenmin <number>'
    fi

    echo Running with: Sorting \= "$sort", min. multiplicity \= "$cntmin", min. length \= "$lenmin"

    # generate the lookup table and the sorted list of all keys in it
    echo Generating tables...
    cat /usr/share/dict/words \
        | fgrep -v "'" \
        | perl -ne 'chomp($_); @b=split(//,$_); print join("", sort(@b))." ".$_."\n";' \
        | sort \
        | tee lookup.txt \
        | cut -f1 -d\  >keys.txt

    # only at least triplicate (or $cntmin if overridden) keys are kept, result sorted by multiplicity descending
    # min. accepted key length is ~8 (or $lenmin if overridden) chars (international chars are counted twice or more!)
    echo Analyzing keys...
    i=0;
    multifilter=":["
    while [ $i -lt $cntmin ]; do
        multifilter=${multifilter}$i
        i=$(($i + 1))
    done
    multifilter=${multifilter}"]"
    lenfilter="^[^:]\{"${lenmin}",\}:"

    ( \
        count=0; \
        j=""; \
        for i in `cat keys.txt`; do \
            if [ "$j" == "$i" ]; then \
                count=$(($count + 1)); \
            else \
                echo $j:$count; \
                count=1; \
                j="$i"; \
            fi; \
        done; \
        echo $j:$count \
    ) | grep -v $multifilter | grep $lenfilter >keys_len8.txt

    if [ "$sort" == "-bylength" ]; then
        # I don't know how to tell awk to count length only until the :...
        awk '{ print length, $0 }' <keys_len8.txt | sort -nr >keys_len8_sorted.txt
    else
        sort -k2 -t: -r <keys_len8.txt >keys_len8_sorted.txt
    fi

    # list the keys and the list of the matches
    total=`grep -c : keys_len8_sorted.txt`
    j=0
    pct=0
    echo >&2 Building result: \(one dot equals 1%\)
    ( \
        for i in `cut <keys_len8_sorted.txt -f2 -d\  | cut -f1 -d:`; do \
            echo $i:; \
            grep "^$i " lookup.txt; \
            echo; \
            j=$(($j + 1)); \
            pctnow=$(($j * 100 / $total)); \
            if [ $pctnow -ne $pct ]; then \
                echo >&2 -n .; \
                pct=$pctnow; \
            fi; \
        done \
    ) >values_len8plus.txt
    echo >&2

    echo --- Complete result following ---
    cat values_len8plus.txt

  29. words says:

    I pronounce / as “slash”, /etc as “ettckk” (like you have something in your mouth), ~ as “squiggly” (or occasionally “the little squiggly”). fsck is “eff suck”.

  30. hemflit says:

    For someone who doesn’t speak Perl and isn’t motivated to learn, can anyone please explain in English (or in clear pseudocode) what the original shell command sausage was trying to accomplish?

    (Captcha: “10:30 blanche”. 10:30?)

  31. Skips says:

    Am I the only one that pronounces everything (except WWW, which I pronounce as World Wide Web) as the actual letters?

    SQL – Ess Que El
    HTTP – Aich Tee Tee Pee

    and then I pronounce the ones that aren’t letters the way I imagine they sound.

    / – Fshhhp
    \ – Shhhhpf
    . – Poit
    ~ – ooOOooeeeEEoo

  32. Scmb says:

    In relation to the pronunciation of symbols I’m sure this has already come up:
    http://lists.ding.net/geeks/96/dec/msg00005.html

  33. DeathAnchor says:

    Simply written in a few lines of Perl code without insanity:


    #!/usr/local/bin/perl

    use strict;
    use warnings;

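    my $store;
    while(<>){
        chomp;
        # key each word by its sorted letters
        push @{$store->{join '', sort split //}}, $_;
    }

    # shortest keys first, so the longest anagram group prints last
    foreach (sort {length($a) <=> length($b) } keys %{$store}){
        print "@{$store->{$_}}\n" if @{$store->{$_}} > 1;
    }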

    And you just do: script.pl /usr/share/dict/words | tail -1

    You can have it remove the names by ignoring anything with a capital in the first loop.
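
As a rough illustration of that filter (in Python rather than Perl, with a made-up helper name `drop_proper_nouns`, and assuming "a capital" means a capital first letter), skipping such words could look like this:

```python
def drop_proper_nouns(words):
    """Yield only words that do not start with an uppercase letter."""
    for word in words:
        word = word.strip()
        if word and not word[0].isupper():
            yield word


if __name__ == "__main__":
    print(list(drop_proper_nouns(["Alice", "listen", "Bob", "silent"])))
```

The generator can be placed in front of whatever grouping step follows, so the names never reach the anagram table.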

  34. DeathAnchor says:

    Oooo damn you webforms suck with coding:

    inside the while (){ should be the wakas (or angle brackets)

    if this works:
    while(<>){

  35. Katie says:

    I think of ‘etc’ as “etts” for some reason. I guess my brain is lazy.
    ‘usr’ is “user.” I don’t really pronounce ‘/’ at all, I just say “user bin python” or “etts X11 X org dot conf.” Similarly, ‘~/.mozilla’ is just “dot mozilla” and ‘/home/katie’ is “home katie.”

    Even more confusingly, ‘$HOME’ is just “home.” It’s a good thing I don’t often talk to people about Linux in person.

    As for BASH scripts, I pretty much give up and move it to a Python script if it’s longer than one line or involves more than three pipes.

    Fun times:

    fortune | sed 's/you/& and Gary Busey/i' | cowsay -n

    P.S.: reCAPTCHA: “cuddling York” – I’m not sure why this amuses me, but it does.

  36. Katie says:

    Related:

    cowsay -n "STACK OVERFLOW" | cowsay -n | cowsay -n | cowsay -n

  38. Curator says:

    I don’t know what it is, but the real answer couldn’t possibly be as interesting as this discussion.

  39. FoolishOwl says:

    @Katie:

    I believe you intended
    cowsay -n "STACK OVERFLOW" | cowsay -n | cowsay -n | cowsay -n

  40. FoolishOwl says:

    Oops.
    cowsay "STACK OVERFLOW" | cowsay -n | cowsay -n | cowsay -n

  41. Sweth says:

    Many months later, this discussion came up in another discussion, and so I came back to look at it again, and saw Randall’s response to me. So, belatedly, I clarify: that “47” was actually a “backslash 0 4 7”, which is the pcre octal for the single quote; the “backslash 0” part apparently got eaten. If you put that back in, it works (I believe) as Randall intended.

  42. David says:

    This should do it in Haskell (composed at the interactive prompt, much easier to play with than Bash), unless I got the problem description wrong:


    import Data.Ord
    import Data.List

    main = interact (unlines . map show . anagrams . lines)

    anagrams = reverse
             . sortBy (comparing (length.head))
             . filter (not.null.tail)
             . groupBy (\a b -> sort a == sort b)
             -- groupBy only merges neighbours, so bring anagrams together first
             . sortBy (comparing sort)

    Save as anagrams.hs and cat /usr/share/dict/words | runhaskell anagrams.hs

  43. Scott says:

    oh a couple of weeks late.. oh well:

    perl -l15 -ne 'push @{$x{lc join"",sort split//}},$_;END{map{print join(" ",length,@{$x{$_}})."\n"}sort{length($a)<=>length($b)}grep{@{$x{$_}}>1}keys%x;}'</usr/share/dict/words|tail

    perl -e'print map{@{$x{$_}}>1&&join(" ",length,@{$x{$_}})."\n"}sort{length($a)<=>length($b)}map{chomp;push@{$x{$k=lc join"",sort split//}},$_;@{$x{$k}}<2&&$k}<>;'</usr/share/dict/words|tail

  44. Porges says:

    Here’s my variant: getting rid of almost all the Perl and doing the rest with Unixy programs… only thing left that Perl is required for is sorting letters within a string, which I couldn’t figure out how to do simply :)

    </usr/share/dict/words grep -v \' | perl -ne 'chomp;print length."\t$_\t".(join "", sort split "")."\n"' | sort -k 3 | uniq -Df 2 | sort -srnk 1 | head -n 20 | cut -f 2
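
The same sort-then-group idea can be sketched in Python, with `itertools.groupby` standing in for `sort -k 3 | uniq -D`. The names `letter_key` and `anagram_groups_sorted` and the sample words are assumptions for illustration, not part of the command above.

```python
from itertools import groupby


def letter_key(word):
    """Sorted letters of a word, the key the perl step computes."""
    return "".join(sorted(word))


def anagram_groups_sorted(words):
    """Group anagrams by sorting on the key, then grouping adjacent rows.

    groupby only merges neighbouring items, which is why the sort must
    come first (just as uniq needs sorted input).
    """
    ordered = sorted((w for w in words if "'" not in w), key=letter_key)
    groups = [list(g) for _, g in groupby(ordered, key=letter_key)]
    groups = [g for g in groups if len(g) > 1]  # like uniq -D: repeated keys only
    # Longest words first, like the final sort on the length column.
    return sorted(groups, key=lambda g: len(g[0]), reverse=True)


if __name__ == "__main__":
    sample = ["pots", "listen", "act", "stop", "silent", "cat"]
    for group in anagram_groups_sorted(sample):
        print(" ".join(group))
```

Unlike the dictionary-based versions elsewhere in the thread, this one never builds a lookup table; it relies entirely on the sort to bring matching keys together.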

  45. Yamthief says:

    Wow. This turned in to a proper ‘geek-off’ really quickly…

    …I like it here :)

  46. Brandon says:

    interesting rhetoric on the marios…seems everyone has failed to notice that part.

  47. Gonzalo says:

    @James:

    int no_to_return = (argc == 2 ? 1 : atoi(argv[2]));

    A ternary definitely is not faster.

  48. jo says:

    Here’s the Ruby way of finding anagrams:

    Anagram finder in Ruby

    http://snippets.dzone.com/posts/show/5593

  49. Sindwiller says:

    @ Matt Hickford

    fsck is just filesystem check :) mtab is metatab, I’d pronounce usr as user, even though it actually means ‘unix system resources’.
