I think I have a Bash problem. What follows is an actual command from my history.
cat /usr/share/dict/words | fgrep -v "'" | perl -ne 'chomp($_); @b=split(//,$_); print join("", sort(@b))." ".$_."\n";' | tee lookup.txt | perl -pe 's/^([^ ]+) .*/\1/g' | awk '{ print length, $0 }' | sort -n | awk '{$1=""; print $0}' | uniq -c | sort -nr | egrep "^[^0-9]+2 " | awk '{ print length, $0 }' | sort -n | awk '{$1=""; print $0}' | perl -pe 's/[ 0-9]//g' | xargs -i grep {} lookup.txt | perl -pe 's/[^ ]+ //g' | tail -n2
It’s just so hard to bite the bullet, admit that the problem has grown in scope, and move it to its own Perl/Python script. (P.S. The Guinness Book is wrong. “Conservationalists” is not a real word.)
Edit: to those who are competing in the comments to improve (shorten) the above command: when pasting code, use the <code> tag to override Wordpress quote formatting.
Joey Comeau has a new book out based on Overqualified, which has long been one of my favorite things on the internet. He writes cover letters to companies. They each sound businesslike enough for the first paragraph or so, and then you gradually realize you are reading something that is in no way a normal cover letter. An excerpt from one to Nintendo:
We need a new Mario game, where you rescue the princess in the first ten minutes, and for the rest of the game you try and push down that sick feeling in your stomach that she’s “damaged goods”, a concept detailed again and again in the profoundly sex negative instruction booklet, and when Luigi makes a crack about her and Bowser, you break his nose and immediately regret it. When Peach asks you, in the quiet of her mushroom castle bedroom “do you still love me?” you pretend to be asleep. You press the A button rhythmically, to control your breath, keep it even.
#2 (NeoPost), #28 (Phone surveys) and #58 (MySpace) are three of my favorites.
@Aaron A: Yeah, some do, in the sense that they have a lot of extra fields in which you can put pretty much whatever you want. I think the comic is just referring to really specific genres though.
spirov92 : “I wouldn’t risk losing this if something happens to ~/.bash_history .”
Am I the only person who reads that with a menacing tone?
I can just picture the mafia bots in futurama: “hey, you best not be messing around with python, cuz that looks like it took an awful lot of work, I wouldn’t risk losing this if something happens to ~/.bash_history”
actually, typing that out I just realized I pronounce ~ “home”, I think there’s something wrong with me ?
I pronounce ~ as home, / as root, we all should.
/etc as ‘et cetera’, fstab as footstab. fsck is harder.
ls is list.
Updating xkcd every day?
I’m so thrilled. It’s great.
Thanks! You’re the best.
@Matt Hickford:
lol. Reminds me of how I pronounce HTML: hitmull, XML: ex-ihm-ull, SHTML: well. you can guess, SQL: Skewl (even though I’m well aware it’s sequel), as well as many other things like this.
/etc is quite clearly pronounced “et-see”. And ~ is “squiggle”. I feel so adamant about this that I’m willing to start a quasi-religious propaganda war. Any takers?
…and I will do it from emacs.
Is this safe to run or do I need to worry about overwriting the kernel with a list of words from the dictionary?
hahaha
(please imagine me leading a charge while shouting nano, as my post will have much more effect that way)
How can you say ~ is “squiggle” (at least it should be tilde)? That’s like saying that … is “three dots” and :) is “colon bracket”. While I won’t argue over /etc (ekd) or /usr (user) and honestly don’t care, I will see you on the battlefield regarding ~. Oh, and I compose all my posts in nano!!!!
Hi all. I didn’t quite shorten the code, but I did make it faster. :-)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXWORDLEN 30
#define MAXWORDS 1000000
struct line {
char word[MAXWORDLEN];
char sorted[MAXWORDLEN];
int length;
};
struct pair {
char first[MAXWORDLEN];
char second[MAXWORDLEN];
int length;
};
int comp_chars(const void *a, const void *b);
int comp_sorted(const void *a, const void *b);
int comp_len(const void *a, const void *b);
int main(int argc, char *argv[])
{
/* Check correct no of arguments */
if (argc != 2 && argc != 3) {
printf("Usage: uniq_anag words_list number_of_pairs\n");
printf("If no number given, default is one.\n");
return -1;
}
/* Declarations */
FILE *wordlist;
struct line *lines = calloc(MAXWORDS, sizeof(struct line));
int i, j;
int c; /* int rather than char, so EOF can be told apart from real characters */
int no_of_words;
int no_to_return = (argc == 2 ? 1 : atoi(argv[2]));
struct pair *pairs = calloc(MAXWORDS, sizeof(struct pair));
/* Open the file, bailing out if it cannot be read */
wordlist = fopen(argv[1], "r");
if (wordlist == NULL) {
perror(argv[1]);
return -1;
}
/* Get the words */
i = j = 0;
while ((c = getc(wordlist)) != EOF) {
if (c != '\n') {
lines[i].word[j] = c;
lines[i].sorted[j++] = c;
}
else {
j = 0;
++i;
}
}
no_of_words = i;
/* Order the letters in the second copy */
for (i = 0; i < no_of_words; ++i) {
lines[i].length = strlen(lines[i].word);
qsort(lines[i].sorted, lines[i].length, sizeof(char), comp_chars);
}
/* Sort the lines by the sorted words */
qsort(lines, no_of_words, sizeof(struct line), comp_sorted);
/* Look for pairs */
char to_comp[2][MAXWORDLEN];
strcpy(to_comp[0], lines[0].sorted);
int matches = 0;
j = 0;
for (i = 1; i < no_of_words; ++i) {
strcpy(to_comp[i%2], lines[i].sorted);
if (strcmp(to_comp[0],to_comp[1]) == 0)
matches++;
else {
if (matches == 1) {
strcpy(pairs[j].first, lines[i-2].word);
strcpy(pairs[j].second, lines[i-1].word);
pairs[j++].length = lines[i-1].length;
}
matches = 0;
}
}
int no_of_pairs = j;
/* Sort the pairs we've found */
qsort(pairs, no_of_pairs, sizeof(struct pair), comp_len);
/* Print results */
for (i = 0; i < no_to_return; i++) {
if (pairs[i].length == 0)
break;
printf("%s %s\n", pairs[i].first, pairs[i].second);
}
/* Clean up */
free((void *)lines);
free((void *)pairs);
fclose(wordlist);
return 0;
}
int comp_chars(const void *a, const void *b)
{
if (*(char*)a < *(char*)b)
return -1;
else if (*(char*)a == *(char*)b)
return 0;
else
return 1;
}
int comp_sorted(const void *a, const void *b)
{
char sorted[2][MAXWORDLEN];
strcpy(sorted[0],(*(struct line *)a).sorted);
strcpy(sorted[1],(*(struct line *)b).sorted);
return strcmp(sorted[0],sorted[1]);
}
int comp_len(const void *a, const void *b)
{
int lengths[2];
lengths[0] = (*(struct pair *)a).length;
lengths[1] = (*(struct pair *)b).length;
if (lengths[0] < lengths[1])
return 1;
else if (lengths[0] == lengths[1])
return 0;
else
return -1;
}
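For anyone who would rather not read the C: as far as I can tell, the program above keys each word by its sorted letters, sorts the whole list by that key so anagrams end up next to each other, and then scans for runs of exactly two. Here is a rough Python sketch of the same sort-then-scan idea. The function name `anagram_pairs` is made up for illustration, and this is not a line-for-line port.

```python
from itertools import groupby


def anagram_pairs(words):
    """Return (first, second) tuples for letter sets shared by exactly two words, longest first."""
    # Key each word by its letters in sorted order, then sort so anagrams become adjacent.
    keyed = sorted(("".join(sorted(w)), w) for w in words)
    pairs = []
    # Walk the runs of identical keys and keep only runs of length two.
    for _, run in groupby(keyed, key=lambda kw: kw[0]):
        members = [word for _, word in run]
        if len(members) == 2:
            pairs.append(tuple(members))
    # Longest pairs first (the sort is stable, so ties keep key order).
    pairs.sort(key=lambda p: len(p[0]), reverse=True)
    return pairs


if __name__ == "__main__":
    for first, second in anagram_pairs(["tea", "eat", "ate", "listen", "silent", "dog", "god"]):
        print(first, second)
```

Unlike the C scan, `groupby` also handles the final run in the list, so a pair that sorts last is not missed.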
maybe you should replace the fancy apostrophes like: ‘{ print length, $0 }’ (and elsewhere) with simple ‘ ‘ ones… this would allow one to copy/paste into a shell without error :)
hmmm. your blag borks my input! I enter basic apostrophes and it makes them fancy. bad blag!
just thought you’d be interested to know that my AVG 8.5 Free blocks the first page of comments as having a virus called BAT/Deleter.
I know there is nothing malicious (or is there <_<), just thought it was funny.
Have you considered a “Random”-functionality for the blag similar to the one in the comic-section… there’s several years worth of text, so giving the uninformed reader the option to perceive it in a more fatalistic way would be… awesome.
Respect from Germany & its liberal arts majors!
Useless use of cat; tsk, tsk.
fgrep -v "'" </usr/share/dict/words | …
Probably not the greatest of your worries, but hey, it’s a start.
-Josh
If I were trying to use minimal lines I could make this much less clear
def sortuple(word) :
    letters = [l for l in word.strip()]
    letters.sort()
    return tuple(letters)
bytuple={}
pairs=[]
for w in open('/usr/share/dict/words') :
    w = w.strip("\n")
    tup = sortuple(w)
    if not "'" in tup :
        if tup in bytuple :
            pairs.append( (len(w),bytuple[tup],w) )
        else :
            bytuple[tup] = w
pairs.sort()
for tup in pairs[-1:-100:-1] :
    print("%2d %s %s" % tup)
Which gives
% time python /Users/palmer/sameletters.py
22 hydropneumopericardium pneumohydropericardium
22 cholecystoduodenostomy duodenocholecystostomy
21 glossolabiopharyngeal labioglossopharyngeal
21 duodenopancreatectomy pancreatoduodenectomy
...
19 incontrovertibility introconvertibility
...
17 misrepresentation representationism
...
real 0m2.766s
user 0m2.695s
sys 0m0.063s
And of course, the whitespace is lost even with code tags, so you can’t just cut and paste.
I am using OS X so it is looking at almost a quarter million words with average length > 9.5 letters.
How much fs would a fsck ck if a fsck could ck fs?
Even more, your usage of
perl -pe 's/^([^ ]+) .*/\1/g'
for extracting the first column is overkill (the same goes for extracting the last column later in the code).
There’s a nice utility which does just that.
cut -d' ' -f1
p.s. sorry for not using “code” before. Here’s repost:
fgrep -v "'" /usr/share/dict/words | perl -pne 'chomp;$_=join("",sort(split//))." $_\n"'
The day I read this post about Overqualified I bought it. I bought it because you said it was funny, and it seemed pretty neat to me.. but the book is so much bigger than just funny cover letters.
As you read all the cover letters, it slowly reveals a story about the author’s brother’s death in a car accident and it’s really uniquely written.
I love it and I’m sharing it with everyone I know. Thanks!
At least you’ve never walked in on your roomie masturbating to a bash prompt.
I know it really, really doesn’t matter, but I can’t resist. You don’t just have a bash problem; you have a perl and awk problem too.
I know that each of these commands has a zillion options and it’s pointless to memorize each one. But some are worth your while:
perl -l, for example. chomp and “\n” are both annoying nuisances having nothing to do with the problem at hand. perl -a is the other. perl -pe 's/^([^ ]+) .*/\1/' makes me cringe. You’re friendly with awk; why didn’t you use that? Anyway, I think you meant to say perl -ane 'print $F[0]'.
Your awk problem is that you use it. Clean out that corner of your brain, stick with Perl, and maybe you’ll have space to remember -a and -l.
That’s not to mention your regexp problem. What’s all this [^ ]+ nonsense? Seems like you probably meant \w+ anyway.
I confess I didn’t actually spend the time to figure out what all of that is trying to accomplish. Something like
perl -lne '$s=join("",sort split(//)); push @{ $w{$s} }, $_; END { foreach (sort { length($w{$b}[0]) <=> length($w{$a}[0]) } keys %w) { next unless @{ $w{$_ } } > 1; print length($w{$_}[0]), " ", join(" ", @{ $w{$_} }) } }' /usr/share/dict/words | head
I guess? (Or what that was before the blag ate it, I mean.)
Argh! I promised myself not to do that. Evil geekteasing slut! You knew we couldn’t help ourselves…
But yes, somewhere around the 5th or 6th stage in the pipeline, that should have been rewritten as a single script. I’ve committed similar sins, but I think you’ve beat my worst by a factor of 3 or 4. Probably something to do with my attention span.
@sfink: I believe that this was put together in a hurry (like every oneliner is). So one automatically uses a tool he’s most comfortable with. But I do suggest usage of cut/paste/join/sed/grep if possible. Mostly because oneliners are more readable and portable if you can see what it’s doing. Perl is only for the steps that require serious string work.
I posted my suggested changes 2 posts above.
awk is IMO fairly good tool, but pretty useless because when there’s need for it, the problem is already out of hand and should be moved to perl/python/…
That’s not really a Bash script, it’s a bunch of Awk and Perl strung together with minimal amount of shell commands. If you like Perl so much, why not just write a Perl script rather than this mess of Perl one-liners strung together?
You can write perfectly readable, clean code in Bash, but it takes a bit of discipline, and skill.
Hey Randal! I want to use one of your comics on my blog…well…uh…I kind of already did. Is the citation the way you want it? I changed the rollover text, problem? Hit me back.
wastelandamerica.
There’s nothing wrong with using Bash to string together this much manipulation, as long as:
1) Each portion of the pipe is an independent logical functional unit (which I am led to believe might not be the case by your “tee”, but it’s still possible).
2) The command line grew organically, as you tested each portion of the manipulation along the way and thought of the correct next step.
3) Speed of execution is not an issue.
For example, here’s a quick script I worked up the other day to reset a password, check accounts and group memberships, and verify sudoers settings:
FAILSTRING='\033[1;31mFAILURE\033[0;0m'; for user in user1 user2 user3 user4 anotheruser; do (grep $user /etc/passwd && groups $user | grep staff >> /dev/null) || /bin/echo -e $user: $FAILSTRING; done; grep -E '^%staff' /etc/sudoers || /bin/echo -e "Sudoers: $FAILSTRING"; if [ -z $pwreset ]; then passwd anotheruser; fi; export pwreset='n'
It works like a charm, and for what I needed it for, it was ideal. For example, by storing state in the shell, I could re-run it after I fixed any issues it identified and it wouldn’t repeat any destructive action.
This is not at all shorter by any means, I admit. But my angle was different:
You only return the (two) longest matches. For languages with a more complex grammar (French, German), those are commonly of no use, because they are just two different inflections of the very same verb (different tenses etc.).
So, after I figured out what you were doing (given that I know almost nothing about perl, that was simply “first command, check output, add next, check output, …”), I dropped all but the first perl command (which I wouldn’t know how to do in pure shell…) and built up the rest in the command structures I am familiar with. :)
And when I was there, I added args to play around… min. length, min. multiplicity, sort by length or by multiplicity.
The result is then grouped by the key (making it better readable as the groups are more obvious).
It’s also a bit faster (~9s vs. ~35s for the original), but uses more files…
(Yeah, and it misses the whole “do in one line” thing as it is written now, but it *could* be crammed into one, I believe.)
#!/bin/bash
# sort behaves not as expected:
# it treats international chars as separators? if LC_ALL is not set to C
export LC_ALL=C
# now that this is a script, might as well give options...
arginvalid=""
sort="-bylength"
cntmin=3
lenmin=8
while [ $# -gt 0 ]; do
if [ "$1" == "-bycount" -o "$1" == "-bylength" ]; then
sort=$1
else
if [ "$1" == "-countmin" ]; then
if [ "$2" -gt 1 -a "$2" -lt 10 ]; then
cntmin=$2
else
echo Invalid argument to '-countmin': Must be followed by a number between 1 and 10.
fi
# taking 2 args away
shift
else
if [ "$1" == "-lenmin" ]; then
if [ "$2" -gt 1 ]; then
lenmin=$2
else
echo Invalid argument to '-lenmin': Must be followed by a number over 1.
fi
# taking 2 args away
shift
else
arginvalid=$1
fi
fi
fi
shift
done
if [ "$arginvalid" != "" ]; then
echo Invalid argument given: "$arginvalid"
echo Valid choices: '-bycount', '-bylength', '-countmin <number>', '-lenmin <number>'
fi
echo Running with: Sorting \= "$sort", min. multiplicity \= "$cntmin", min. length \= "$lenmin"
# generate the lookup table and the sorted list of all keys in it
echo Generating tables...
cat /usr/share/dict/words \
| fgrep -v "'" \
| perl -ne 'chomp($_); @b=split(//,$_); print join("", sort(@b))." ".$_."\n";' \
| sort \
| tee lookup.txt \
| cut -f1 -d\ >keys.txt
# only at least triplicate (or $cntmin if overridden) keys are kept, result sorted by multiplicity descending
# min. accepted key length is ~8 (or $lenmin if overridden) chars (international chars are counted twice or more!)
echo Analyzing keys...
i=0;
multifilter=":["
while [ $i -lt $cntmin ]; do
multifilter=${multifilter}$i
i=$(($i + 1))
done
multifilter=${multifilter}"]"
lenfilter="^[^:]\{"${lenmin}",\}:"
( \
count=0; \
j=""; \
for i in `cat keys.txt`; do \
if [ "$j" == "$i" ]; then \
count=$(($count + 1)); \
else \
echo $j:$count; \
count=1; \
j="$i"; \
fi; \
done; \
echo $j:$count \
) | grep -v $multifilter | grep $lenfilter >keys_len8.txt
if [ "$sort" == "-bylength" ]; then
# I don't know how to tell awk to count length only until the :...
awk '{ print length, $0 }' <keys_len8.txt | sort -nr >keys_len8_sorted.txt
else
sort -k2 -t: -r <keys_len8.txt >keys_len8_sorted.txt
fi
# list the keys and the list of the matches
total=`grep -c : keys_len8_sorted.txt`
j=0
pct=0
echo >&2 Building result: \(one dot equals 1%\)
( \
for i in `cut <keys_len8_sorted.txt -f2 -d\ | cut -f1 -d:`; do \
echo $i:; \
grep "^$i " lookup.txt; \
echo; \
j=$(($j + 1)); \
pctnow=$(($j * 100 / $total)); \
if [ $pctnow -ne $pct ]; then \
echo >&2 -n .; \
pct=$pctnow; \
fi; \
done \
) >values_len8plus.txt
echo >&2
echo --- Complete result following ---
cat values_len8plus.txt
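For comparison, here is a minimal Python sketch of what the script above appears to do: group words by their sorted letters, keep groups with at least a minimum key length and a minimum number of members, and order them by key length or by group size. The function name `anagram_groups` and its parameters are invented for illustration. The defaults mirror `lenmin=8`, `cntmin=3` and `-bylength` from the script, but this is an approximation, not a faithful port (it counts characters where the script, under `LC_ALL=C`, counts bytes).

```python
from collections import defaultdict


def anagram_groups(words, min_len=8, min_count=3, by="length"):
    """Group words by sorted letters and return [(key, [words...]), ...]."""
    groups = defaultdict(list)
    for word in words:
        # Same filter as the fgrep -v "'" stage: skip words with an apostrophe.
        if "'" not in word:
            groups["".join(sorted(word))].append(word)

    # Keep only keys that are long enough and shared by enough words.
    kept = {k: v for k, v in groups.items()
            if len(k) >= min_len and len(v) >= min_count}

    if by == "length":
        order = sorted(kept, key=len, reverse=True)
    else:  # any other value sorts by group size, like -bycount
        order = sorted(kept, key=lambda k: len(kept[k]), reverse=True)
    return [(k, kept[k]) for k in order]


if __name__ == "__main__":
    sample = ["alerting", "altering", "integral", "relating", "triangle",
              "listen", "silent", "enlist"]
    for key, members in anagram_groups(sample, min_len=6, by="count"):
        print(key + ":", " ".join(members))
```

Everything stays in memory here, so there are no intermediate files like lookup.txt or keys.txt to clean up.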
I pronounce / as “slash”, /etc as “ettckk” (like you have something in your mouth), ~ as “squiggly” (or occasionally “the little squiggly”). fsck is “eff suck”.
For someone who doesn’t speak Perl and isn’t motivated to learn, can anyone please explain in English (or in clear pseudocode) what the original shell command sausage was trying to accomplish?
(Captcha: “10:30 blanche”. 10:30?)
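Since the question was asked: as best I can read it, the original command finds the longest pair of words that are anagrams of each other, where exactly two words in the dictionary share that set of letters. Below is a rough Python sketch that follows the pipeline stage by stage. The function name `pipeline_sketch` is hypothetical and the stage comments are my reading of the one-liner, not the author's own description.

```python
from collections import Counter


def pipeline_sketch(words):
    """Return the words behind the longest letter-set shared by exactly two words."""
    # fgrep -v "'": drop words containing an apostrophe.
    words = [w for w in words if "'" not in w]

    # First perl stage + tee lookup.txt: pair each word with its letters sorted.
    lookup = [("".join(sorted(w)), w) for w in words]

    # uniq -c, then egrep "^[^0-9]+2 ": keep keys that occur exactly twice.
    counts = Counter(key for key, _ in lookup)
    twice = [key for key, n in counts.items() if n == 2]
    if not twice:
        return []

    # awk length / sort -n: pick the longest such key.
    longest = max(twice, key=len)

    # xargs grep in lookup.txt, strip the key, tail -n2: print the two words.
    return [w for key, w in lookup if key == longest]


if __name__ == "__main__":
    print(pipeline_sketch(["tea", "eat", "ate", "listen", "silent", "it's", "dog", "god"]))
```

The sketch compares keys exactly, whereas the original `grep {} lookup.txt` matches the key anywhere in a line, so the two can differ on real dictionaries.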
Am I the only one that pronounces everything (except WWW, which I pronounce as World Wide Web) as the actual letters?
SQL – Ess Que El
HTTP – Aich Tee Tee Pee
and then I pronounce the ones that aren’t letters the way I imagine they sound.
/ – Fshhhp
\ – Shhhhpf
. – Poit
~ – ooOOooeeeEEoo
In relation to the pronunciation of symbols I’m sure this has already come up:
http://lists.ding.net/geeks/96/dec/msg00005.html
Simply written in a few lines of Perl code without insanity:
#!/usr/local/bin/perl
use strict;
use warnings;
my $store;
while(<>){
chomp;
push @{$store->{join '', sort split //}}, $_;
}
foreach (sort {length($a) <=> length($b) } keys %{$store}){
print "@{$store->{$_}}\n" if @{$store->{$_}} > 1;
}
And you just do:
script.pl /usr/share/dict/words | tail -1
You can have it remove the names by ignoring anything with a capital in the first loop.
Oooo damn you webforms suck with coding:
inside the while (){ should be the wakas (or angle brackets)
if this works:
while(<>){
I think of ‘etc’ as “etts” for some reason. I guess my brain is lazy.
‘usr’ is “user.” I don’t really pronounce ‘/’ at all, I just say “user bin python” or “etts X11 X org dot conf.” Similarly, ‘~/.mozilla’ is just “dot mozilla” and ‘/home/katie’ is “home katie.”
Even more confusingly, ‘$HOME’ is just “home.” It’s a good thing I don’t often talk to people about Linux in person.
As for BASH scripts, I pretty much give up and move it to a Python script if it’s longer than one line or involves more than three pipes.
Fun times:
fortune | sed 's/you/& and Gary Busey/i' | cowsay -n
P.S.: reCAPTCHA: “cuddling York” – I’m not sure why this amuses me, but it does.
Related:
cowsay -n "STACK OVERFLOW" | cowsay -n | cowsay -n | cowsay -n
I don’t know what it is, but the real answer couldn’t possibly be as interesting as this discussion.
@Katie:
I believe you intended
cowsay -n "STACK OVERFLOW" | cowsay -n | cowsay -n | cowsay -n
Oops.
cowsay "STACK OVERFLOW" | cowsay -n | cowsay -n | cowsay -n
Many months later, this discussion came up in another discussion, and so I came back to look at it again, and saw Randall’s response to me. So, belatedly, I clarify: that “47” was actually a “backslash 0 4 7”, which is the pcre octal for the single quote; the “backslash 0” part apparently got eaten. If you put that back in, it works (I believe) as Randall intended.
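A quick way to check the octal claim, sketched in Python since its string literals also accept backslash-octal escapes. The variable names are just for illustration.

```python
# "\047" is an octal escape: 0o47 == 39, the code point of the single quote.
single_quote = "\047"
print(single_quote == "'")   # True

# Another octal escape that comment forms like to eat: "\033" is ESC (27),
# the character that starts ANSI terminal colour sequences.
escape = "\033"
print(ord(escape))           # 27
```

So a pattern containing backslash 0 4 7 matches a literal apostrophe without needing one inside the shell quoting.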
This should do it in Haskell (composed at the interactive prompt, much easier to play with than Bash), unless I got the problem description wrong:
import Data.Ord
import Data.List
main = interact (unlines . map show . anagrams . lines)
anagrams = reverse
  . sortBy (comparing (length.head))
  . filter (not.null.tail)
  . groupBy (\a b -> sort a == sort b)
  . sortBy (comparing sort)  -- bring anagrams together first; groupBy only merges adjacent words
Save as anagrams.hs and
cat /usr/share/dict/words | runhaskell anagrams.hs
oh a couple of weeks late.. oh well:
perl -l15 -ne 'push @{$x{lc join"",sort split//}},$_;END{map{print join(" ",length,@{$x{$_}})."\n"}sort{length($a)<=>length($b)}grep{@{$x{$_}}>1}keys%x;}'</usr/share/dict/words|tail
perl -e'print map{@{$x{$_}}>1&&join(" ",length,@{$x{$_}})."\n"}sort{length($a)<=>length($b)}map{chomp;push@{$x{$k=lc join"",sort split//}},$_;@{$x{$k}}<2&&$k}<>;'</usr/share/dict/words|tail
Here’s my variant: getting rid of almost all the Perl and doing the rest with Unixy programs… only thing left that Perl is required for is sorting letters within a string, which I couldn’t figure out how to do simply :)
</usr/share/dict/words grep -v \' | perl -ne 'chomp;print length."\t$_\t".(join "", sort split "")."\n"' | sort -k 3 | uniq -Df 2 | sort -srnk 1 | head -n 20 | cut -f 2
Wow. This turned into a proper ‘geek-off’ really quickly…
…I like it here :)
interesting rhetoric on the marios…seems everyone has failed to notice that part.
@James:
int no_to_return = (argc == 2 ? 1 : atoi(argv[2]));
A ternary definitely is not faster.
Here’s the Ruby way of finding anagrams:
Anagram finder in Ruby
http://snippets.dzone.com/posts/show/5593
@ Matt Hickford
fsck is just filesystem check :) mtab is metatab, I’d pronounce usr as user, even though it actually means ‘unix system resources’.