Saturday, August 25, 2007

Quad terminal laser!

I find, not too infrequently, that I need to perform operations across several directories, or across several servers, at once. To best facilitate this, I usually open 3-5 terminals, and then move them around so that I can see them all at once. Why not automate this, I thought. Instead of hard coding sizes and positions for the terminals and assigning the sequence to a keypress, I thought I would do something more useful and write a Python script to find out this information based on the current display, and launch the terminals appropriately.

This has been completed, but the script still leaves a lot to be desired. You still have to set as a variable whether dual monitors are being used (I could not find a way to determine this programmatically yet), and it only works with a single monitor or TwinView in Linux. You can also set what sort of terminal you want to use (gnome-terminal is default), what side you want them to open on for dual monitor setups (right is default), and what directory they should be in (user's home dir is default). It gets your screen's dimensions, calculates where the terminals should be and how they should be separated, and launches them.

I then added this to my .fluxbox/keys file:
Mod1 q :ExecCommand python /home/sam/code_homerepo/code/userextensions/quadterm.py
So now I just press "Alt + q", and four gnome-terminals launch perfectly spaced on my right monitor!

The code is quite ugly right now, but as it stands:
#!/usr/bin/python
#####################################################
# Creates 4 terminals evenly quartering a screen
#
# Linux only right now. Map it to a keystroke for
# maximum convenience.
#
# If you don't want to use gnome-terminal, change the
# next variable. Warning: The width and height
# specified are in characters and rows for gnome-
# terminal; they might have to be in pixels for other
# apps.
#####################################################
terminal = "gnome-terminal"
#####################################################
# If you are not using TwinView, toggle the following:
#####################################################
twinview = True
#####################################################
# Which monitor (if Dual) you want them on:
#####################################################
side = "right"
#####################################################
# Dir the terminals will be in:
#####################################################
workingdir = "~"
#####################################################
import commands
# Get the screen dimensions:
dimeninfo = commands.getoutput("xdpyinfo | grep dimensions")
# Read out the needed numbers:
dimeninfosplit = dimeninfo.split()
dimensions = dimeninfosplit[1]
prex = dimensions.split("x")
x = int(prex[0])
prey = dimensions.split("x")
y = int(prey[1])
# Take some visual buffers into account:
widthsansborder = x - 50
heightsansborder = y - 50
# Calculate dimensions in pixels:
termwidthpixels = widthsansborder / 2
termheightpixels = heightsansborder / 2
# Convert to characters/rows for most apps (like gnome-terminal).
# No idea what the conversion rate should be...
termwidth = termwidthpixels / 16
termheight = termwidthpixels / 60
# Constant width/height for all 4.
t1width = t2width = t3width = t4width = termwidth
t1height = t2height = t3height = t4height = termheight
# Find the offset position for appropriate pairs (in pixels):
t1posx = t3posx = 50
t2posx = t4posx = (50 * 2) + termwidthpixels
t1posy = t2posy = 50
t3posy = t4posy = 50 + termheightpixels
# If TwinView is being used, we need to halve some offsets:
if twinview is True:
    t2posx = (((50 * 2) + termwidthpixels) / 2)
    t4posx = (((50 * 2) + termwidthpixels) / 2)
    if side == "right":
        t1posx = 50 + (x / 2)
        t2posx = x * .75
        t3posx = 50 + (x / 2)
        t4posx = x * .75
# Spawn the 4 terminals, with needed positions and sizes, then exit quietly:
commands.getoutput("%s --geometry=%dx%d+%d+%d --working-directory=%s" % \
(terminal, t1width, t1height, t1posx, t1posy, workingdir))
commands.getoutput("%s --geometry=%dx%d+%d+%d --working-directory=%s" % \
(terminal, t2width, t2height, t2posx, t2posy, workingdir))
commands.getoutput("%s --geometry=%dx%d+%d+%d --working-directory=%s" % \
(terminal, t3width, t3height, t3posx, t3posy, workingdir))
commands.getoutput("%s --geometry=%dx%d+%d+%d --working-directory=%s" % \
(terminal, t4width, t4height, t4posx, t4posy, workingdir))
# For debugging, print what we have calculated:
#print "%s at: %d x %d + %d + %d" % (terminal, t1width, t1height, t1posx, t1posy)
#print "%s at: %d x %d + %d + %d" % (terminal, t2width, t2height, t2posx, t2posy)
#print "%s at: %d x %d + %d + %d" % (terminal, t3width, t3height, t3posx, t3posy)
#print "%s at: %d x %d + %d + %d" % (terminal, t4width, t4height, t4posx, t4posy)
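
The position arithmetic above can also be written as a pair of small functions. This is only a sketch, written for Python 3 rather than the Python 2 of the script, and it uses a slightly simplified layout (one margin on the top and left, no extra gap between terminals). The names parse_dimensions and quadrant_origins and the x_offset parameter are my own inventions for illustration; they are not part of quadterm.py.

```python
import re


def parse_dimensions(xdpyinfo_line):
    """Extract (width, height) in pixels from an xdpyinfo 'dimensions:' line."""
    match = re.search(r"(\d+)x(\d+) pixels", xdpyinfo_line)
    if match is None:
        raise ValueError("no dimensions found in %r" % xdpyinfo_line)
    return int(match.group(1)), int(match.group(2))


def quadrant_origins(width, height, margin=50, x_offset=0):
    """Return the top-left pixel corner of four windows tiling one screen.

    Order is top-left, top-right, bottom-left, bottom-right, matching
    terminals 1 to 4 in the script. x_offset shifts everything right,
    e.g. by one monitor's width for the right half of a TwinView pair.
    """
    cell_w = (width - margin) // 2
    cell_h = (height - margin) // 2
    return [
        (x_offset + margin + col * cell_w, margin + row * cell_h)
        for row in (0, 1)
        for col in (0, 1)
    ]
```

For a single 1680x1050 screen, quadrant_origins(1680, 1050) gives (50, 50), (865, 50), (50, 550) and (865, 550). Each pair can then be dropped into the same "--geometry=%dx%d+%d+%d" string the script already builds.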

Thursday, August 23, 2007

All your base...

Master Ian (whom I would link to, if his blog were not continually DOWN), showed me a very simple yet useful command: basename.

You give it the path to a file, it gives you the filename and extension. If you also give it the extension, it just gives you the filename. Thus:
sam@ZenSam:~$ basename unique-ip-list.csv
unique-ip-list.csv
sam@ZenSam:~$ basename unique-ip-list.csv .csv
unique-ip-list
This could be quite useful in shell scripting, to grab a list of desired file names for various purposes. One implementation I thought of right off: If you wanted to throw an error message for the usage of a command in a script, instead of having it display "See /usr/bin/zip for details" or something similar, you could run it through basename, and get a better looking name.
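
Since I have been poking at Python lately, here is roughly the same behaviour there, as a sketch in Python 3. os.path.basename is the standard library call; the wrapper name strip_suffix_basename is made up for this example. Like the shell command, it leaves the name alone if the suffix would swallow the whole thing.

```python
import os.path


def strip_suffix_basename(path, suffix=""):
    """Return the last path component, optionally without a trailing suffix."""
    name = os.path.basename(path)
    # Only strip when the suffix matches and something would be left over.
    if suffix and name.endswith(suffix) and name != suffix:
        name = name[: -len(suffix)]
    return name


print(strip_suffix_basename("/usr/bin/zip"))                # zip
print(strip_suffix_basename("unique-ip-list.csv", ".csv"))  # unique-ip-list
```

One difference I am aware of: for a path with a trailing slash, os.path.basename returns an empty string, so this sketch assumes the path names a file.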

Sunday, August 12, 2007

Domainspotter Project Homepage Launched!

I have spent some time thinking about what to do with my Domainspotter program. I believe that it can be further developed into something quite interesting, and somewhat useful. As a result, I decided to create a site to house its documentation, feature roadmap, source code, and other related items. This can now be viewed at domainspotter.org.

There is not a whole lot there right now, but I hope to keep adding to it. This will minimally include the feature map as I have it planned now, and access to the SVN code base. I am still working on the content for the site, but it will mainly link to my roadmap in Trac for the project, as well as the code browser. It will be the home for all things related to Domainspotter.

In the longer term, I will also create a frontend to view results from domainspotter, updated weekly. In addition, I will also be adding various statistics and graphing displays to show the results in different ways.

So check there soon, hopefully things of interest will be found!

Saturday, August 4, 2007

Fun with Python: Domainspotter

In an effort to learn Python better, I will be writing a number of relatively small scripts to perform set tasks. I find that giving myself some goal and having to code towards it is more effective in learning syntax than going through coding exercises in a tutorial.

For the first such project, I wanted to determine how many words in a current English dictionary are not taken as domain names. This could have become a rather complicated exercise, but I limited it to the following:
There are many domains made with acronyms, combinations of names, made up words, and many other forms. But I was curious just how many of the more common terms in the language were still available.

The code as it stands is at the end of this post.

Before you look at how I did it, however (which is far from optimal anyway), I would encourage you to try out the exercise on your own, given the basic requirements stated above. I found that this program was perfect for learning more Python, since it required learning how to perform a number of disparate activities, such as reading and parsing files, using external modules, performing web lookups, and displaying desired data. These are, of course, some of the most general tasks most programs of use must perform.

A few notes. The whois lookup was the only section of the program that I assumed would require something outside of core Python. I came across rwhois.py, a module for performing whois lookups. I still have the code to implement this listed in the program as it stands for illustration, but found it did not meet my needs. It could not locate many domains I confirmed were taken using the simple command whois. So, I decided to fall back on something I knew: the command whois! This seemed to work fine for my needs.

Additionally, while I know the code could stand for optimization all over the place, I was curious how long the web lookups would take. To do this, I modified the lookup function to the following:

import time
self.totalchecked = 0
self.availdomains = []
times = []
for potdomain in self.potdomains:
    start = time.time()
    runthis = "whois %s" % potdomain
    check = commands.getoutput(runthis).split('\n')
    if check[8] == 'No match for "%s".' % potdomain.upper():
        self.availdomains.append(potdomain)
    else:
        pass
    self.totalchecked+=1
    end = time.time()
    timeforquery = end - start
    times.append(timeforquery)
import pdb
pdb.set_trace()
I ran this against a much shortened dictionary file (the top 10 lines of the real one). Once the debugger launched, I did the following:
(Pdb) print sum(times) / len(times)
0.884103870392
This gave me the average time per web query. Now, note that I found this out after I started the final run for the first time... And since the actual dictionary file has 112,505 entries, I estimated the entire series should be done in around 27.6 hours. So I started my final run at around 7:32pm Saturday evening, and expected it to run until around 11pm Sunday evening. But instead I got an error during the night:

  File "domainspotter.py", line 125, in lookup
if check[8] == 'No match for "%s".' % potdomain.upper():
IndexError: list index out of range

This, I am pretty certain, was due to my internet connection dropping for a moment. This not being an unlikely event, I should add in handling for such situations, and perhaps write to the results output in chunks, instead of all at once! So the complete list will have to wait. But, running against the beginning of the list showed there are some exciting domains still out there:
Domains that seem to be available:

AMusD.net
Abelmoschus.net
Aberdonian.net
Abkhas.com
Abkhas.net
Abkhasian.com
Abkhasian.net
Abkhazian.net
Abnaki.net
Abramis.net

One last note you might already be asking yourself about: The AGID-4 dictionary contains proper and common nouns. I left the proper ones capitalized. It could be argued I should lower() them first, but I don't think one is able to buy a domain name with merely differing capitalization.



Current code of domainspotter.py:

[EDIT] The code base of domainspotter has changed around a lot since I posted this. Instead of having a big block of code here that is static, check out the latest form on my Trac site.

Thursday, August 2, 2007

Beginner Python Debugging

While I won't go into all the details and best practices of Python debugging here (because I do not know them), Chris Blunck shared a very easy and useful method with me that can be used in many situations. First, the code:
import pdb
pdb.set_trace()
That's it. All you have to do is put those two lines just before a line where, for example, an error is being thrown, or where you are not sure what the state of things is. pdb is the "Python Debugger", appropriately enough. There are several ways to invoke and use this module, other than what I listed above. The format I gave is most useful for setting a breakpoint. When your code runs and reaches where you have placed those two lines, you enter the debugging command line. You can check variable values as they exist at the breakpoint, execute code, whatever you want to try to understand the error you are examining.

An example:
>>> list = [1,2,3,4,5,6]
>>> import pdb
>>> pdb.set_trace()
--Return--
> <stdin>(1)<module>()->None
(Pdb) print list[2]
3
As can be easily seen in this very simple case, once the breakpoint is reached and one enters the debugging command line, it is trivially easy to, for example, verify variable values as compared to what you were expecting them to be. If desired, you can also continue executing the code after the breakpoint by pressing "c" on the pdb command line.

For details on other ways to use pdb, see the official doc. This separate guide has very useful information for Python debugging, specifically with pdb, for beginners.

Wednesday, August 1, 2007

An awesome xargs option and cleaning up SVN accidents

I started using a wonderfully helpful option for xargs recently, after Master Ian showed me its use in a particularly thorny case. Before seeing the cool option, I will give the case as an example of its use:

I am still learning about the best (or even proper) way to use some aspects of SVN. In this particular case, I needed to take all the files in a directory that was in the repo, move them into a new subdirectory of that same directory, add another new subdirectory, and copy new files into that.

To better explain the example, here is the structure I had in place, initially:
  • Directory 1 (In SVN)
    • FilesetA
And this is what I wanted in the end:
  • Directory 1
    • Directory1.1
      • FilesetA
    • Directory1.2
      • FilesetB
What I should have done is: made the 2 new directories (1.1 and 1.2) within the first one, copied the new files into one (FilesetB), then added and committed both new directories and their contents. Then I should have run svn mv on the files in the top directory (FilesetA) to move them into the second new directory.

What I actually did was: created the two new directories, simply mv'ed FilesetA into one, copied FilesetB into the other, and then svn add'ed the two new directories and their contents. Then I had a problem. I still had FilesetA in Directory 1 in the repo, but not locally. So I needed to svn rm those, but I could not since they did not exist locally any more!

In the end, I copied FilesetA out into a temp directory, went up to the directory above my repository, checked the entire thing out again, svn rm'ed the FilesetA in Directory 1, and then needed to add FilesetA back into Directory 1.1. This was why I moved it to a temp directory. The problem was that the set had several nested directories with .svn directories in them. I had to remove these before adding them into the SVN'ed directories, else mass confusion would result. So I ran:
find . -type d | xargs -I % rm -rf %/.svn

This gem found all directories from the current directory down, and passed them to rm -rf. However, I could not just pipe the results and use plain xargs, as I needed to append "/.svn" to specify the right directories to remove. This is where -I comes in handy. With this option, xargs runs the command once for each line of output from the first command (before the pipe), replacing every instance of the string following -I with that line. In this case the string was "%", for similarity to Python syntax, but any string could be used. The man page uses "replace-str". So in this particular case, if FilesetA included Directory 1.1.1 and 1.1.2, the commands run would include "rm -rf ./Directory1.1.1/.svn" and "rm -rf ./Directory1.1.2/.svn", which is precisely what was needed!

For some reason, I had a hard time understanding the option from the man page, but Ian translated it to Python for me:
# Pseudocode: [find . -type d] stands for the list of directories find prints.
for dir in [find . -type d]:
    cmd = "rm -rf %s/.svn" % dir
    os.popen(cmd)
This was much clearer to me. What it basically comes down to is:
OPERATION1 | xargs -I replaceme ANOTHEROPERATION replaceme OTHERTHINGS

translates into
ANOTHEROPERATION OPERATION1RESULTS OTHERTHINGS
run once for each line that OPERATION1 prints.
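
For completeness, Ian's pseudocode can be turned into something that actually runs. This is my own sketch in Python 3, using os.walk and shutil.rmtree from the standard library instead of shelling out to find and rm; the function name remove_svn_dirs is made up. It does the same job as the one-liner: delete every .svn directory at or below a starting point.

```python
import os
import shutil


def remove_svn_dirs(root):
    """Delete every .svn directory under root; return the paths removed."""
    removed = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if ".svn" in dirnames:
            target = os.path.join(dirpath, ".svn")
            shutil.rmtree(target)
            # Tell os.walk not to descend into what was just deleted.
            dirnames.remove(".svn")
            removed.append(target)
    return removed
```

Calling remove_svn_dirs(".") from the temp directory would be the equivalent of the find and xargs command above. As with rm -rf, there is no undo, so it is worth pointing it at a copy first.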